5 months of Multiple BSODs on a brand new computer - driver or hardware error?

5 months of Multiple BSODs on a brand new computer - driver or hardware error? Mini dump analysis help please.

Hi, I’m looking for anyone who can analyse mini dumps and help me determine the source of my problems.

The system has been giving me hell from the start. The problem shows up as different driver and memory errors, and as far as I can tell the errors seem fairly unrelated. When it runs well it’s awesome and it really performs. When it starts messing up, it might BSOD on start-up 5 times in a row and then start working again for no apparent reason.

I have ruled out a faulty PSU and faulty RAM, and the system always runs cool (the hottest component being the fanless graphics card).

I have also run extensive tests on the system hard drive:
Seatools returned no errors after a firmware upgrade and 2 long tests. However, DiskCheckup keeps changing its mind about the predicted TEC date: it jumps from Jan 2010 to Aug 2011, either of which is unacceptable and well within the manufacturer's warranty.

I have reinstalled Windows countless times with different installation CDs.

So that leaves the MOBO, CPU, GPU, or maybe some software or driver issue.

I feel that it must be hardware related, but I can’t work out what it is. On top of that, the faults are often intermittent, which makes trial-and-error testing unreliable. I have collected many minidumps now, but I can’t work out whether they are related or not: does my computer have dozens of problems, or just a couple with varied symptoms?

Questions:
Q1. Can anybody help me?
Q2. How should I post minidumps and info?

Hard drive questions:
Seatools returned no errors. (I have been told that this is the definitive test for a faulty hard drive.)
Q3. PassMark DiskCheckup keeps changing its mind about the predicted TEC date. Does this indicate a problem with my hard drive?
Q4. Whenever I run chkdsk it says that it has found errors and is repairing them. Does this indicate a problem with my hard drive?
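
One possible reason a predicted TEC (Threshold Exceeded Condition) date can swing so widely: if the prediction is a straight-line extrapolation of a SMART attribute's value down to its failure threshold, a one-point change in a slowly moving value shifts the date by many months. Whether DiskCheckup works exactly this way is an assumption. The function name and the sample readings below are hypothetical and are not taken from this drive.

```python
from datetime import date, timedelta

def predict_tec(d0, v0, d1, v1, threshold):
    """Extrapolate a falling SMART value in a straight line to its threshold.

    d0/v0 and d1/v1 are two (date, value) readings, oldest first.
    Returns the predicted date, or None if the value is not falling.
    """
    days = (d1 - d0).days
    if days <= 0 or v1 >= v0:
        return None                      # no downward trend to extrapolate
    if v1 <= threshold:
        return d1                        # threshold already reached
    # days needed to lose the remaining points at the observed rate
    days_left = (v1 - threshold) * days / (v0 - v1)
    return d1 + timedelta(days=round(days_left))

# Hypothetical readings taken 60 days apart, failure threshold of 37.
first = date(2009, 1, 1)
second = first + timedelta(days=60)
print(predict_tec(first, 100, second, 97, 37))  # value dropped 3 points
print(predict_tec(first, 100, second, 98, 37))  # value dropped 2 points
```

In this made-up case a single point of difference in the second reading moves the prediction by well over a year, so a jumping date on its own does not prove the drive is failing.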

GPU questions:
The video driver has supposedly caused some of the BSODs.
The drag and drop mouse marks don’t disappear – almost as if the screen is not refreshing.
Q5. Could these be signs of a faulty graphics card?

Thanks a lot. I’m beginning to go a little bit mad from all this troubleshooting.:dead:

Jane89
 
Well, it's an inner kernel crash by the looks of it, so it's not Windows itself, or else we would all have it. I'm no expert, but it's related to caching, so I would look at 3 things.

First, check in your BIOS whether memory caching, sometimes called shadowing, is on. Some hardware doesn't like this.

Secondly, run chkdsk /f on your PC. This is quite rare, but sometimes a virtual cache can be repeatedly placed on a damaged part of the HDD. Also, does your drive support SMART, and is it running? I ask because the HDD cache itself could be damaged (you should be able to return it under warranty).

The third, and I hope most unlikely, possibility is the internal CPU caches, the L1 and L2 caches. Personally, I have no idea how to test those.

Post your PC specs too. :)
 
Thank you for your post, 04ihegba.

My System specs:
New build:
MOBO: Gigabyte GA-EP45-DS5 (BIOS F12)
CPU: E8500
PSU: Antec Truepower 650w
GPU: NVIDIA 9600 GT fanless GV-NX96T512HP (Gigabyte)
RAM: Corsair Dominator (4gig total) Twin2X4096-8500C5DF
Keyboard & mouse: Logitech Wavepro
DRIVES:
DVD: Liteon (SATA)
Card reader: Generic (internal USB)
HD1 (System): Seagate 500gig (intel ICH10)
HD2 (RAID0 striped): 2 x Seagate 500gig (intel ICH10)
HD3 WD 1000gig (GSATA)
HD4 Seagate 1000gig (GSATA)

*All the latest drivers were downloaded from Intel, Nvidia & Gigabyte before install
*BIOS is the latest version, F12
*RAID BIOS in operation

BIOS M.I.T. settings:
Everything is set to standard defaults:
Robust graphics booster: auto
Performance enhance: standard
RAM: My voltage is at 2.1 and my timings are at 5-5-5-15 at 1066 (set automatically by the F12 BIOS through the “Extreme memory profile” setting of “Auto”)
MCH core raised from 1.1v to 1.22v
CPU vcore: auto
CPU speed 333x9.5
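
As a quick sanity check on the settings above, the CPU speed follows from the base clock times the multiplier, and the real memory clock from halving the DDR2 rate. This is only a sketch of the arithmetic; the function names are illustrative, and the remark that roughly 3.16 GHz is the E8500's stock speed comes from Intel's published spec rather than from this thread.

```python
def cpu_clock_mhz(base_clock_mhz, multiplier):
    """CPU core clock = base (FSB) clock x multiplier."""
    return base_clock_mhz * multiplier

def ddr_memory_clock_mhz(effective_rate):
    """DDR memory transfers twice per clock, so the real clock is half the rated figure."""
    return effective_rate / 2

print(cpu_clock_mhz(333, 9.5))        # 3163.5 MHz, i.e. about 3.16 GHz (stock for an E8500)
print(ddr_memory_clock_mhz(1066))     # 533.0 MHz
```

So with these settings the CPU is not overclocked, while the memory is running at its rated 1066 profile.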

I will look into your suggestions and post back

Thanks again
 
Hi, 04ihegba

1. I couldn’t find anything in my BIOS about memory caching or shadowing. There is something called “no execute memory protect”, which is set to disabled. I can’t remember exactly what this does. I’m quite sure my hard drives are using their onboard cache, if that is related to caching or shadowing. (?)

2. SMART is enabled. PassMark DiskCheckup is using SMART monitoring when it “keeps changing its mind about the predicted TEC date” […] “from Jan 2010 to Aug 2011”.

I will run chkdsk /F and see if it helps.

3. Yeah sounds like a difficult test. I am unconvinced by all of the CPU tests that I have run with Hirens.

Could a faulty graphics card cause an “inner kernel crash”?
And what is an inner kernel crash?

Thanks again 04ihegba
 
Basically, Windows consists of the kernel, which is the original background code that all Windows copies share in common; most of this is inside ntdll. Drivers connect directly to the kernel, and so are described as running in "kernel mode". Everything else is in "user mode" and connects to the kernel and the drivers through DLLs etc. (pretty oversimplified tbh)

Kernel mode files take precedence over anything but the kernel itself, and each of them is given a number called an IRQL (Interrupt Request Level). See here for more info on that: http://ext2fsd.sourceforge.net/documents/irql.htm
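
To make the idea of interrupt levels a bit more concrete, here is a toy model, not real Windows code. It assumes the documented 32-bit x86 values for a few IRQLs and the usual rule that pageable memory may only be touched below DISPATCH_LEVEL. The helper names are made up for illustration.

```python
from enum import IntEnum

class Irql(IntEnum):
    """A few IRQL values as documented for 32-bit x86 Windows."""
    PASSIVE_LEVEL = 0    # ordinary thread code
    APC_LEVEL = 1
    DISPATCH_LEVEL = 2   # scheduler and DPCs
    HIGH_LEVEL = 31      # highest level on x86

def can_interrupt(new_irql, current_irql):
    """Work at a higher level interrupts work at a lower level."""
    return new_irql > current_irql

def may_touch_pageable_memory(irql):
    """Page faults can only be serviced below DISPATCH_LEVEL."""
    return irql < Irql.DISPATCH_LEVEL

print(can_interrupt(Irql.DISPATCH_LEVEL, Irql.PASSIVE_LEVEL))   # True
print(may_touch_pageable_memory(Irql.DISPATCH_LEVEL))           # False
```

When kernel-mode code breaks the second rule, or touches an invalid address at a raised level, Windows stops with an IRQL-type bug check, which is why both bad drivers and bad memory can produce that kind of BSOD.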

Basically, from looking at your last log (it was a pretty quick look because I was busy), all of the code failures that led up to an IRQL failure were from ntdll, and because all Windows copies share ntdll with you, that wasn't what was at fault.

ntdll was trying to execute caching when it failed, so I suggested checking anything cache related on your machine. But seeing as your dump before that was a different issue, I would check that all your cables and cards are fitted securely before doing anything else.

I'll have a better look right now, I just thought I would answer your queries first.
 
It may be oversimplified, but at least it makes sense. I hadn’t realised the connection between the kernel and the IRQL before, thanks. :)

Another funny anomaly is my optical drive: it will boot Windows, but it won't boot Hiren's boot disk. If I want to boot Hiren's, I have to pull the old IDE CD-ROM out of my old computer and plug it in. It's the same for other DOS-based boot utilities like Norton Ghost and Seatools. Is this related to the system running the RAID BIOS, or is it a manifestation of the hardware problem causing the other problems?

I’ll try to find out more about Caching options and settings on my Mobo

Thanks again

I look forward to finding out more about your analysis
 
The problem has been following this pattern:
1. Computer works for a while (maybe 2 hours maybe a day)
2. Computer BSODs with a random driver error (ntfs.sys, nv4, sr.sys, etc.).
3. I leave it for 5 mins or an hour and it boots up fine
4. Cycle starts again

This to me suggests overheating or something like that, but all my hardware monitoring suggests otherwise, with most components reporting temps of 30 to 40 degrees Celsius.
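
One way to see whether the crashes cluster around a few modules or are truly scattered is to note the module named in each stop message and count them. This is a minimal sketch, assuming the names are typed in by hand from the BSOD screens or dump summaries. The list in the usage example is made up for illustration and only reuses the driver names mentioned above.

```python
from collections import Counter

def tally_faulting_modules(modules):
    """Count how often each module name appears, ignoring case and stray spaces.

    Returns (module, count) pairs, most frequent first.
    """
    normalized = (m.strip().lower() for m in modules if m.strip())
    return Counter(normalized).most_common()

# Hypothetical crash log, one entry per BSOD.
crashes = ["ntfs.sys", "nv4", "sr.sys", "NTFS.sys", "nv4 "]
for module, count in tally_faulting_modules(crashes):
    print(f"{module}: {count}")
```

If one driver dominates, that points towards that driver or its device. A flat spread across unrelated drivers fits better with something underneath them all, such as memory, the motherboard or power.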

Since my last post I have tried:
1. Swapping graphics cards – BSODs persist
2. Running memory at 667 MHz – BSODs persist
3. Unplugging card reader – BSODs persist
4. Testing memory with WMD (Windows Memory Diagnostic) – Passed
5. Resetting the BIOS to defaults, turning off the default turbo acceleration and the default extreme memory profile, and turning on the RAID BIOS (essential for my system to operate) – BSODs persist
6. Searching for BIOS options that may relate in any way to caching and shadowing on the ‘F12 EP45-DS5 bios’

Today I had a new revelation:

Instead of leaving the system for 5 min after a BSOD, I booted straight into WMD (Windows Memory Diagnostic) and the memory failed every test. This is memory that has previously passed several tests: recently 2 passes of WMD, and before that 15 passes of memtest86.

So I jiggled the RAM modules a bit and rebooted. The memory passed every test in WMD.

What does this mean?
1. Do I have loose memory slots on my MOBO?
2. Do I have memory that intermittently fails?
3. Or memory slots that intermittently fail?

Whatever the problem is, I need a stable system. I am using the computer as a music workstation, so system instability is not an option. All the components are new (supposedly high quality), expensive (and supposedly compatible).

Where can I go from here to get a stable system? (Constantly jiggling and reseating RAM is not where I want to be with my brand new “beast workstation”.)

Any Ideas? Anyone?
 
Sorry I took so long to reply. At this point, your stop messages (BSoDs) are so varied that if I were you I would hope it is a memory problem. Bad memory can still pass WMD and memtest86.

I would consider buying new memory. Provided that you are not forced to go online and pay delivery costs, buy a single stick of small but compatible memory, install it, and use it for the next few days. If those days are freeze-free, then you have found your culprit. If it still gives you problems, then it is an I/O fault and your motherboard is the problem.
 