Server blue screens when cold booted, BAD_POOL_CALLER 0x000000c2

So I've just been having this very irritating problem with a server I built a few months ago:

Basically, if the machine has been shut off for more than a day and is then cold booted, it reaches the Windows 7 login screen and then blue screens.
Sometimes it will keep doing this indefinitely unless I hard power it off and start it again, after which it will usually start up properly with no issue.
Sometimes the screen will also go completely black right after logging in and the cursor will freeze, requiring a hard reset.

The error is nearly every time a BAD_POOL_CALLER 0x000000c2 (0x00000007, 0x00001097, 0x00053370, 0x8cda6a40) with some variation in the values
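
For anyone reading along, here is a rough Python sketch of how a stop line like the one above can be split into its bug check code and four parameters. The function names and the one-entry lookup table are made up for illustration. The only meaning filled in is the one for parameter 1 = 0x7 as described in Microsoft's BAD_POOL_CALLER documentation, so treat anything else as unknown.

```python
import re

# Deliberately incomplete table: only the parameter-1 value seen in this thread.
# The wording is a paraphrase of Microsoft's bug check 0xC2 documentation.
BAD_POOL_CALLER_PARAM1 = {
    0x07: "attempt to free pool that was already freed",
}


def parse_bugcheck(line):
    """Split a stop line into (name, code, [params]). Hypothetical helper."""
    m = re.search(r"([A-Z_]+)\s+(0x[0-9a-fA-F]+)\s*\(([^)]*)\)", line)
    if m is None:
        raise ValueError("no bug check found in line")
    name = m.group(1)
    code = int(m.group(2), 16)
    params = [int(p.strip(), 16) for p in m.group(3).split(",")]
    return name, code, params


def describe(line):
    """Return a one-line summary; falls back to 'unknown' for anything else."""
    name, code, params = parse_bugcheck(line)
    reason = "unknown (not in this sketch's table)"
    if code == 0xC2:
        reason = BAD_POOL_CALLER_PARAM1.get(params[0], reason)
    return f"{name} (0x{code:X}): {reason}"


stop_line = ("BAD_POOL_CALLER 0x000000c2 "
             "(0x00000007, 0x00001097, 0x00053370, 0x8cda6a40)")
print(describe(stop_line))
```

A double free like this is usually blamed on a driver, but it can also come from bad memory, so it does not point at one culprit by itself.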

Here's the kicker though - other than that, I've left this server up for weeks at a time with no reboot or shutdown and no reliability issues or problems.

Here's a little background on the machine and its purpose:

I work at a video production house and the main purpose of this server is to offload P2 memory cards (essentially PCMCIA memory cards) from cameras for archival and then to pass them along over a gigabit switch to our editing workstation where they are worked on.

The only special parts this machine has that may not be familiar are the two readers used to interface with the cards, which are Amtron PCD-TP-110CS units. Basically, each is a passthrough device that interfaces PCMCIA cards through a PCI slot.

These go into a standard PCI (not express) slot and work with built in drivers that come with Win7x32.

Here's the rest of its parts:
Hitachi 500GB Boot disk
2x Western Digital 2TB Green drives for storing the video (all 3 are SATA)
Phenom II X4 925 (does not run hot, has a zalman cooler)
G.SKILL Ripjaws Series 4GB (2 x 2GB) DDR3 (PC3 12800)
MSI sata DVD-RW
OCZ 500W Modular PSU
M4A88TD-V EVO/USB3 motherboard (using onboard graphics and sound)

No special software is installed. The machine has no internet connection and has been scanned with NOD32, so there are no viruses or malware.

Other things I've tried:
checking temperatures with HWMonitor (all OK; the case is extremely well ventilated and has a lot of fans)

letting windows 7 try to repair the installation (it doesn't do anything)

running various ram test utilities and boot cds for a few days straight (ram is fine)

running various burn in tools to try and pinpoint the problem piece of hardware (no results)

reinstalling windows 7 (issue still persists)

updating windows (it's all current)

updating motherboard bios and other drivers (didn't help)

replacing the cmos battery on the motherboard (nothing)


I've attached a collection of minidumps from some of the crashes.
I would be deeply grateful if someone could help me figure out what's wrong with this system
 

Attachments

  • Minidump.zip
    185.7 KB
I actually don't remember the exact number of Memtest passes, but it was definitely left running for more than 3 days straight and didn't report any errors.
 
The reason for my question is that, after reading the five most recent of your dumps, two stand out as important.

They are both 0x0000001A: MEMORY_MANAGEMENT
This memory management error is usually hardware related.

Both specifically cited memory corruption as the issue.


Since Memtest shows no errors, find the voltage spec of your RAM and compare it to the voltage setting in your BIOS. Do they match? How about the CAS timings?
 
I re-ran Memtest for about 10 passes and got a couple thousand errors, so I called my RAM manufacturer about an RMA. They recommended I first change the timings from 9-9-9-24 (the motherboard defaults) to 8-8-8-24 and set the clock rate to 1333 MHz (it was on auto). I didn't touch the voltage settings.
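
As a side note on what those numbers mean in real time, here is a small Python sketch that converts a CAS latency in clock cycles to nanoseconds. The thread doesn't say what speed "auto" actually picked, so both DDR3-1600 (what PC3 12800 is rated for) and DDR3-1333 are shown as assumptions. The function name is just for illustration.

```python
def cas_latency_ns(cl, data_rate_mts):
    """Absolute CAS latency in nanoseconds.

    DDR memory transfers twice per clock, so one clock cycle lasts
    2000 / data_rate (in MT/s) nanoseconds.
    """
    return cl * 2000.0 / data_rate_mts


# (label, CAS cycles, data rate in MT/s); the first two rows are assumptions
# about what the board's "auto" setting may have been running.
settings = [
    ("CL9 at DDR3-1600 (rated speed, assumed)", 9, 1600),
    ("CL9 at DDR3-1333 (assumed)", 9, 1333),
    ("CL8 at DDR3-1333 (manufacturer's suggestion)", 8, 1333),
]

for label, cl, rate in settings:
    print(f"{label}: {cas_latency_ns(cl, rate):.2f} ns")
```

So whether 8-8-8-24 at 1333 MHz is actually looser or tighter than before depends on which speed the board had been running at.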

I've been running memtest again for about two hours and have passed six times with no errors yet, will post back with results later.
 
Good work. The real test will come when you are normally using this system. Please keep us updated.
 