Frequent BSODs (after much debugging)

Status
Not open for further replies.

risingTide

Posts: 101   +0
Greetings.

Quick Description: My computer seizes up with BSODs at random times.

Details: First note the specs on my profile. (If I need to supply more I certainly will.) As you can see, my system is a bit outdated, but works great for my purposes at this time. A few months ago I reinstalled Windows. I made no hardware / software changes on the new install and have in fact re-installed Windows on this machine before with no glitches. Before this I was running this machine with the same configuration for literally 5 years with only a handful of BSODs ever. This time, however, after upgrading all the drivers, my Windows updates, etc., I have random BSODs from time to time…sometimes up to 5 a day. There doesn’t seem to be a pattern as to what I have open or am doing at the time of the BSOD. Sometimes the machine could have been idle for a few hours before it happens. Sometimes it happens when I shut down. Sometimes it is completely random.

Things I’ve done so far (after reading the extremely helpful posts “Troubleshooting Reboots/System Crashes”, “Before Posting Your Mini-Dumps, Please Read This”, and “A Guide to Stop Error Messages"):

· I’ve updated all my drivers to the best of my knowledge (including the latest BIOS).
· All my hardware is well-seated and connected.
· I’ve run CCleaner multiple times.
· I’ve run Ad-Aware multiple times.
· EventLog – I have a number of red items in my event log. Under Application they have to do with McAfee Security Center. Clicking on the links reveals no helpful information from Microsoft, however. I’m not sure these are the cause of the BSODs or if the BSODs are causing these. My machine continues to run after McAfee crashes most of the time. Under System I also have a red item related to McAfee similar to the above. Is there something further to look into here?
· I’ve run MemTest86+ and my one stick of 512 MB RAM is fine.
· I’ve run SpeedFan and it looks like my deviations are within 5% or 6% on the +3.3, +5, and +12V. However, I do have a little “fire symbol” by a few fans on the upper part of the GUI for that program, but the hottest one is running at 66C. Is there something further to look into here?
· I’ve recently replaced a case fan and the fan over the north bridge…but the BSODs continue.
· I’ve cleaned the dust out of the machine pretty thoroughly…but the BSODs continue.
· I’ve run chkdsk a number of times. Sometimes it finds a minor error on Stage Two and repairs it. Is there something further to look into here?
· I’ve reset my page file to be 1.5 times the size of my RAM (768 min/max) and ran a defrag.
· It should also be noted that I also receive numerous IE 7 crashes. Basically IE just shuts down with no warning and often won’t open back up again until I reboot. Is there something further I should look into here?
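The voltage check mentioned in the SpeedFan item can be made concrete. This is a minimal sketch, assuming the usual ATX tolerance of ±5% on the +3.3V, +5V and +12V rails; the function names and the sample readings are hypothetical and are not values taken from this machine.

```python
# ATX positive rails are normally specified at +/-5% of nominal.
# Everything below is illustrative; the readings are made up for the example.
ATX_TOLERANCE = 0.05


def rail_deviation(nominal_volts, measured_volts):
    """Return the signed deviation from nominal as a fraction (0.05 == 5%)."""
    return (measured_volts - nominal_volts) / nominal_volts


def rail_in_spec(nominal_volts, measured_volts, tolerance=ATX_TOLERANCE):
    """True when the measured voltage sits inside the tolerance band."""
    return abs(rail_deviation(nominal_volts, measured_volts)) <= tolerance


# Hypothetical readings: nominal voltage -> measured voltage.
sample_readings = {3.3: 3.23, 5.0: 4.85, 12.0: 11.28}
for nominal, measured in sample_readings.items():
    percent = rail_deviation(nominal, measured) * 100
    status = "ok" if rail_in_spec(nominal, measured) else "OUT OF SPEC"
    print(f"+{nominal}V rail: {measured}V ({percent:+.1f}%) {status}")
```

By this measure a rail that is 6% off would be slightly outside the band, so a reading of "5% or 6%" is worth confirming with a multimeter or the BIOS hardware monitor before trusting it.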

After all that I’m still getting BSODs. Should any of the areas above where I asked about looking further be looked into more? And if so how? What is the next step?

Also, I’ve examined my Stop errors with WinDebug and looked at their cause on the site listed in the sticky post above. However, I’m not sure where to go from there because most of the things it says to check I have already checked as you can see from the list above. My most recent Stop Errors are of this nature:
1. 1000008E – Probably caused by: mfehidk.sys
2. 1A – Probably caused by: memory_corruption
3. 10000050 (Could not read faulting driver name) – Probably caused by: ntoskrnl.exe
4. DE – Probably caused by: memory_corruption
5. 1000008E – Probably caused by: mfehidk.sys
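As a rough aid for reading a list like this, the sketch below maps each stop code to the name Microsoft documents for it and tallies the modules the debugger blamed. The 0x8E code is written with its standard eight hex digits (0x1000008E). The helper names are hypothetical, and the tally is only a way to look at the data, not a diagnosis.

```python
from collections import Counter

# Bug check names as documented by Microsoft for these stop codes.
BUG_CHECK_NAMES = {
    0x1000008E: "KERNEL_MODE_EXCEPTION_NOT_HANDLED_M",
    0x0000001A: "MEMORY_MANAGEMENT",
    0x10000050: "PAGE_FAULT_IN_NONPAGED_AREA_M",
    0x000000DE: "POOL_CORRUPTION_IN_FILE_AREA",
}

# (stop code, "probably caused by") pairs copied from the list above.
CRASHES = [
    (0x1000008E, "mfehidk.sys"),
    (0x1A, "memory_corruption"),
    (0x10000050, "ntoskrnl.exe"),
    (0xDE, "memory_corruption"),
    (0x1000008E, "mfehidk.sys"),
]


def describe(code):
    """Format a stop code with its documented name, if known."""
    return f"0x{code:08X} {BUG_CHECK_NAMES.get(code, 'UNKNOWN')}"


def tally_by_module(crashes):
    """Count how often each module was blamed across the dumps."""
    return Counter(module for _code, module in crashes)


for code, module in CRASHES:
    print(f"{describe(code)} -> {module}")
print(tally_by_module(CRASHES))
```

When several unrelated modules are blamed across a handful of dumps, as here, a common rule of thumb is to suspect hardware (often RAM) rather than any single driver, though that is a heuristic and not proof.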

I’ve attached a copy of my Everest report (see next post) and the 5 most recent BSODs. Where should I go from here?

Any help would be extremely appreciated!!
 
A five-year-old computer needs to have a lot of things replaced... Most BSODs in an older unit are caused by the CPU fan wearing out, hard drive wearing out, video graphics wearing out (or bad video graphics drivers), or power supply wearing out. It is time to replace them, starting with the CPU fan.
66C is high for a five-year-old computer, but SpeedFan is mainly useless anyway, with more gimmicks, inaccuracy, and marketing than real usefulness.
I would DEFINITELY REPLACE or add to the 512 MB of memory. That is simply not adequate any more with a machine like yours. I would also replace the power supply and the hard drive, because they will go soon if they are not defective now.
Get rid of McAfee, and switch to Avast or Antivir, Kaspersky, or NOD32. Then add MBAM MalwareBytes or SuperAntispyware, and Adaware 2008.
 
I forgot to mention that I also recently ran a drive test and my hard drive passed fine. As noted above I've replaced two fans recently. My video card has the latest drivers (which also worked fine before the windows reinstall, as did everything else). The 66C was only on "Fan 3" - the rest were much lower (but apparently that is irrelevant anyway if SpeedFan is "mainly useless"). I have a new 1 Gb stick of RAM that I've swapped out but the problem continues with it as well. And as noted above I already have Adaware and have used it.

In light of these modifications, is it possible to look further into the current problem (and at least narrow down the faulty component if that's the issue) before replacing random parts?

Many thanks!!

(Also, if SpeedFan is "mainly useless" then why is it recommended on the sticky ("Troubleshooting Reboots/System Crashes") to download and use it?)
 
Well, you have done everything that one would normally try. It is difficult to test for aging components... your test is that it doesn't work.

A five year old hard drive that still passes its drive fitness tests?
A five year old cpu fan on a board, where it should be replaced every two years? When the cost is $15 to $25 to replace.
Five years of mileage on a video graphics card, when the graphics card and driver are a close second in causes of BSOD's.
A five year old CMOS battery is two years past prime. Only $3.50 at Wal-Mart or BestBuy, or $21 at Radio Shack.
If you like SpeedFan, use SpeedFan. Is it telling you anything useful? We don't consider it reliable, as its reports are flaky from day to day and do not match the reliability of other tools. Consistency would make it a good tool. You don't see a lot of the "pros" on this forum using it.

One test I would do is install a new hard drive, since they are cheap and you will eventually need one anyway. Keep your current drive and install. Then run all these tests to see if the BSOD has gone away.
Then I would do the low cost items. New CMOS battery. New CPU fan. A good cleaning with Dust Off or other difluoroethane gas canned air. Replace all the flat cables, as they shrink and become brittle. If they are pulling out of their sockets or grips, there could be a problem right there.
 
Maybe I can help a bit. You do have a few errors, and they are all related to memory problems (e.g. RAM, L2 cache on the CPU, video RAM on the graphics card) and some incompatible software.

You have 2 0x8E errors which are caused by "mfehidk.sys" which is the Daemon Tools Driver, I would suggest removing Daemon Tools.

You have a 0x50 error which is related to Memory problems as well as Incompatible software and was caused by "ntoskrnl.exe" which is an essential Windows start up process, so my guess would be that if Daemon Tools is on start up then remove Daemon Tools altogether including the Folder in the Program Files in C: Drive.

You have a 0x1A, which is hardware related, so I would suggest running Prime95 to test your CPU. The MiniDump mentions System Cache, so test the CPU since you have already tested the HDD - http://files.extremeoverclocking.com/file.php?f=103
If that doesn't find anything try taking out your Graphics Card and running the system since you have tested just about every other piece of Hardware in your comp.

The last error is a 0xDE error, which is rarely seen but is related to pool corruption in a file area, so in this case memory corruption again. I'm not sure what to suggest for this error, but try the above.

And if none of this helps, then do as raybay said and update a few parts. But I would not see a lot wrong with a 5 year old system. I know optical drives may only last this long and fans may be going, but the essential parts (CPU, RAM, motherboard and HDD) should last a bit longer if they do what you need them to do.
 
woody1191...

Thanks for the added info. I'm going to start by uninstalling this "Daemon Tools Driver"...however, I don't remember installing it and I can't find it in the add/remove programs (or using Revo Uninstaller). What is it and how can I safely remove it?

Thanks.
 
No, it's Daemon Tools that you need to uninstall. I am referring to the MiniDump, which cites the Daemon Tools driver I listed above. Daemon Tools is used to mount ISO files and similar files. Apart from this I don't know much else about it, but here is the link to the main website - http://www.daemon-tools.cc/dtcc/announcements.php

If you haven't ever installed Daemon Tools then, having looked again on Google, the driver is also supposedly part of McAfee. So first go into the C: drive, then Program Files, and locate the Daemon Tools folder and delete it. If you can't find it, this may be a problem with McAfee, so contact them about it. The MiniDumps 071308-01.dmp and 071708-01.dmp list the driver and the 0x8E errors.
 
Ah. I've been wondering if McAfee was an offender for some time. Perhaps it is at least part of the problem. I'll start by taking your (and raybay's) advice and getting rid of it and trying another Anti-Virus. I never really liked it but it was free with Comcast so I decided to give it another go.

-----------
raybay: I want to apologize for being a bit short with you on my post. As I was reading back over the thread I could tell that some of my frustration came out in my reply to you and that was very wrong of me. I was hoping for a more "concrete answer" after having spent so much time debugging myself and it came out in my post.

I do appreciate your help and will definitely take your advice to heart...especially with replacing some of the cheaper items for starters like you recommended. And begin by getting rid of McAfee!

You were trying to help, and I was being quite dumb. Please forgive me.
 
Not a problem to me. This is such a frustrating time. You have others interpreting Event Viewer code as though they know what they are doing, when some of that is merely an event... and not a cause. But we are all making the best educated guesses based upon experience. We learn from you as much as you learn from us.
I have built and rebuilt an uncountable number of computers, and worked as a field rep for some of the biggest companies as well as governments in 27 states... and several foreign countries... but I am surprised by something new just about every day.
None of us know very much, because Windows, Mac, and Linux systems don't focus on repairing or troubleshooting... So we guess and learn, and perhaps get a little better as time goes on.
We have an event viewer, but no reliable code interpreters. We have old equipment that can be bad, but can last for years. We have lots of fun spending your money.
You will figger it out, or we will help. But I expect there will still be more surprises for this forum based upon what you learn.
 
Not a problem to me. This is such a frustrating time. You have others interpreting Event Viewer code as though they know what they are doing, when some of that is merely an event... and not a cause.

raybay, unless I'm reading this wrong, it looks quite clear that you are referring to me, and you don't seem to think my thoughts are any good for this problem.
I have looked at those MiniDumps and made my best judgment on what the problem is. That is my opinion on it, and I suggested some things to do. I know you are clearly experienced in this field, but you should know never to write anyone's thoughts off on a problem unless you can clearly say they are wrong and prove it, or the person tests it out and then says that it didn't sort the problem. Since risingTide is currently looking at the problem and hasn't yet said if what I said worked or not, you can't write my advice off. He is the person with the computer, and since I am across the Atlantic Ocean (over 4000 miles away) I offer advice based on his description of the problem and the MiniDumps he has posted.

I don't believe that it is fair to have made that comment really and find it disrespectful. So unless you can tell me otherwise don't write my comments off. I am still learning and so will not be as experienced as people like you and so you have seen quite a lot more in life than me and i will listen to your thoughts on problems because of your experiences, but to write mine off is wrong unless you can prove it.
So if you can provide me any proof on why my thoughts are wrong i would like to hear them. At this point i am quite in my rights to complain but will wait to see what you have to say about this.
 
I've done some more poking around and in all honesty can't even find a Daemon Tools folder in the Program Files at all. I did read some of the posts on McAfee and I see what woody1191 said about my error being attributed to it. But before uninstalling McAfee I decided to try something else based on one of woody1191's posts. He had posted:

You have a 0x1A, which is hardware related, so I would suggest running Prime95 to test your CPU. The MiniDump mentions System Cache, so test the CPU since you have already tested the HDD - http://files.extremeoverclocking.com/file.php?f=103

so I downloaded Prime95 and ran it. Interestingly it only ran for about 30 seconds before it found an error:

[Jul 19 12:32] Work thread starting
[Jul 19 12:32] Beginning a continuous self-test to check your computer.
[Jul 19 12:32] Please read stress.txt. Choose Test/Stop to end this test.
[Jul 19 12:32] Test 1, 4000 Lucas-Lehmer iterations of M19922945 using x87 FFT length 1024K.
[Jul 19 12:32] FATAL ERROR: Rounding was 0.4999995991, expected less than 0.4
[Jul 19 12:32] Hardware failure detected, consult stress.txt file.
[Jul 19 12:32] Torture Test ran 0 minutes - 1 errors, 0 warnings.
[Jul 19 12:32] Work thread stopped.

If I run it again it returns similar results. Could someone aid in interpreting what this means? (I can't find a stress.txt file either.)

Thank you.
 
I did a search for stress.txt on my comp after running Prime95 myself just now, and I didn't find it. From what I can tell the output is saved as "prime" if the program is still on your desktop; otherwise it is saved in the folder where the program is.
It looks like the CPU may be failing, but that report should tell you exactly what happened.
 
Hmm...I've got two files of interest in there. The first one is prime.txt and looks like this:

V24OptionsConverted=1
SendAllFactorData=1
StressTester=1
UsePrimenet=0
MinTortureFFT=8
MaxTortureFFT=4096
TortureMem=256
TortureTime=15
HideIcon=0
TrayIcon=1
Left=149
Top=191
Right=1349
Bottom=1045
W0=0 0 1190 390 0 -1 -1 -1 -1
W2=0 390 1190 780 0 -1 -1 -1 -1

[PrimeNet]
Debug=1

The second is results.txt and looks like this:

[Fri Jul 18 22:45:24 2008]
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected, consult stress.txt file.
[Fri Jul 18 22:48:29 2008]
FATAL ERROR: Rounding was 0.4999998984, expected less than 0.4
Hardware failure detected, consult stress.txt file.
[Sat Jul 19 12:32:52 2008]
FATAL ERROR: Rounding was 0.4999995991, expected less than 0.4
Hardware failure detected, consult stress.txt file.

Unfortunately neither of those seems to tell me anything more. Does this mean anything to anyone else?
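One way to read results.txt is to pull out each timestamp and rounding value and compare it against the stated limit. The sketch below does that on the text pasted above; the function and variable names are hypothetical. As background, and stated with some caution: Prime95 does its multiplications in floating point and rounds each result to the nearest integer, so a rounding error close to 0.5 means the value was about as far from an integer as it can be, which suggests the data was corrupted somewhere between the CPU, cache and RAM.

```python
import re

# Contents of results.txt as posted above.
SAMPLE = """[Fri Jul 18 22:45:24 2008]
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected, consult stress.txt file.
[Fri Jul 18 22:48:29 2008]
FATAL ERROR: Rounding was 0.4999998984, expected less than 0.4
Hardware failure detected, consult stress.txt file.
[Sat Jul 19 12:32:52 2008]
FATAL ERROR: Rounding was 0.4999995991, expected less than 0.4
Hardware failure detected, consult stress.txt file.
"""

STAMP = re.compile(r"^\[(.+)\]$")
ROUNDING = re.compile(r"Rounding was ([0-9.]+), expected less than ([0-9.]+)")


def parse_results(text):
    """Return a list of (timestamp, rounding_error, limit) tuples."""
    failures = []
    stamp = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        stamp_match = STAMP.match(line)
        if stamp_match:
            stamp = stamp_match.group(1)
            continue
        rounding_match = ROUNDING.search(line)
        if rounding_match:
            value = float(rounding_match.group(1))
            limit = float(rounding_match.group(2))
            failures.append((stamp, value, limit))
    return failures


for stamp, value, limit in parse_results(SAMPLE):
    print(f"{stamp}: rounding error {value} exceeds limit {limit}")
```

All three entries fail the same way within seconds of starting, which points to a repeatable fault and not a one-off glitch.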
 
Ok, I looked into it and found a simple way to recognise a problem on this forum - http://www.maximumpc.com/forums/viewtopic.php?t=72380

Which of the 3 tests did you run to get this error? - Small FFT's, In-place Large FFT's or Blend Test
If it was Small FFT's it is most likely a CPU problem.
If it was Large FFT's you will have to run the other 2 tests.
If it was Blend then it may be the RAM.

This is what is explained on the forum linked above. The post was by the user "00john00" on 21st January if you want to look at it.
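The rule of thumb above can be written down as a small function. This is only a sketch of the three rules as stated in this post; the function name and return strings are hypothetical, and a real diagnosis still needs parts swapped to confirm.

```python
def triage(small_fft_ok, large_fft_ok, blend_ok):
    """Map Prime95 torture test outcomes to the suspects named above.

    Each argument is True if that test ran without errors.
    Returns a list of suspects; an empty list means no test failed.
    """
    suspects = []
    if not small_fft_ok:
        # Small FFT's failing most likely points at the CPU.
        suspects.append("CPU")
    if not blend_ok:
        # Blend failing may point at the RAM.
        suspects.append("RAM")
    if not large_fft_ok and not suspects:
        # A Large FFT's failure alone does not say which part is at fault.
        suspects.append("inconclusive, run the other two tests")
    return suspects


print(triage(small_fft_ok=True, large_fft_ok=True, blend_ok=False))
print(triage(small_fft_ok=False, large_fft_ok=True, blend_ok=True))
print(triage(small_fft_ok=True, large_fft_ok=False, blend_ok=True))
```

Returning a list instead of a single answer keeps both suspects visible when more than one test fails, since the rules do not say which one wins in that case.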
 
Nice info here woody1191. I read the whole thread on that forum and it is extremely helpful. Looks like there is some investigating to do here.

For clarification, the test I ran that failed was the "blend". I'm going to try the other two tests and then post back with my results...perhaps then we'll have a better idea of where to start. There was another program mentioned called "Orthos" in that thread that is supposedly more user friendly...perhaps I should try that one as well / instead?
 
Yes, you could use that program as well. Having looked at the program website, it sounds like it makes the test logs easier to understand.
 
Welp...I ran the three tests. The first one (Small FFT's) fails after about 15 minutes. The second one (Large FFT's) fails after a few seconds, as does the final one (Blend) as posted earlier. So, I'm pretty sure this is bad...but there might be some way to fix it according to the link above that shows 00john00's post from the other forum. Any ideas on working on these errors, if possible?

In the meantime I'm going to download Orthos and try to get some better error messages from it.
 
Well, since the Large FFT's and Blend both fail, it is pointing towards RAM: Blend tests a lot of RAM and Large FFT's test a bit of RAM. Not sure why Small FFT's fails, since that focuses on overheating and stressing the CPU.

Try using some different RAM (Same Configuration) that you know works if possible since you only have 1 RAM Card in the comp. If this doesn't make the Computer more stable you might be looking at the CPU.
 
Well, I’ve had some interesting results…and good ones. I swapped out the RAM and used the new 1 GB stick to run all three tests again over the last three nights. All three ran for over 13 hours (without any errors) and never even stopped…I had to eventually just stop them manually the next day. Not only that, but I haven’t had any BSOD’s (or IE crashes for that matter) since I swapped RAM. Seems as if we have come to a solution!! Which is interesting because I'm almost positive I tried this 1 GB stick before only I still had the errors. The only thing I can think of is that instead of swapping them, I was using both at the same time. This leads to a few questions:

1) If the old 512 stick is bad and I had both the 1 GB and the 512 stick in at the same time, would that cause my system to be unstable even if the 1 GB was good? (I’m assuming yes.) I can test this by putting the old one back in and running the Prime95 tests again with both.

2) Why did my 512 stick pass the MemTest86+ test (at least 7 passes) if it failed on the Prime95? Do these two programs test different aspects of the RAM?

3) Is it possible that (as RayBay hinted at above) that I just didn’t have enough RAM with the 512 to run my system? This seems odd because I ran it for 5 years with just that 512 stick, and haven’t added anything significant since then except McAfee. If this is the case, and I add it on top of the new 1 GB it should still work, so I can test that I suppose.

If any one has any input on these 3 items or anything else I’d love to hear. I’ll try adding the old stick back in with the new one and see if my computer goes back to its old behaviour, and then post back.
 
If you use two unlike memory modules, you can have this problem. On many motherboards, you must use modules that are the same frequency and the same storage characteristics.
If you use a DDR2 PC2-4200, for instance, you may have to use a second DDR2 PC2-4200 for the second module. You cannot, in some motherboards, use a DDR PC2700 1 GB and a DDR PC2700 512 MB module.
Nor can you use a PC2700 and a PC3200 in some boards.
Other boards do not care.
But to assure the best chance of everything working together, use matched sets of memory that come in a box together. If you cannot do that, use the same size and the same frequency.
Yours may simply be too different...
Some are third tier memory, some are first tier memory.
 
Not only that, but I haven’t had any BSOD’s (or IE crashes for that matter) since I swapped RAM. Seems as if we have come to a solution!!.

Good to hear this :).

1) If the old 512 stick is bad and I had both the 1 GB and the 512 stick in at the same time, would that cause my system to be unstable even if the 1 GB was good? (I’m assuming yes.) I can test this by putting the old one back in and running the Prime95 tests again with both.

Yes you are correct it would have made it unstable if the 512mb card was in.

2) Why did my 512 stick pass the MemTest86+ test (at least 7 passes) if it failed on the Prime95? Do these two programs test different aspects of the RAM?

They both run very differently. Memtest86 tests specific addresses in ways designed to show whether the RAM is faulty (some tests are very complex and others quite simple, I believe). It is known that bad RAM can pass Memtest, and this proves it, but it seems Prime95 is a good second test for the RAM.

Prime95 searches for Mersenne primes, which means repeating a very long calculation, and in the torture test the correct result is known in advance. The CPU does the arithmetic while the working data sits in RAM, so if either one corrupts a value along the way the final check fails and Prime95 reports the error. That makes it very picky.
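The Prime95 log earlier in the thread mentions "Lucas-Lehmer iterations", which is the test used to decide whether a Mersenne number 2^p - 1 is prime. Here is a minimal sketch of that test using Python's exact integers. Prime95 itself does the squaring with floating-point FFT multiplication, which is where the rounding check comes from; this sketch swaps that for plain integer arithmetic, so it shows the calculation but not the hardware-sensitive part.

```python
def lucas_lehmer(p):
    """Return True if the Mersenne number 2**p - 1 is prime.

    p itself must be a prime number for the test to be meaningful.
    """
    if p == 2:
        return True  # 2**2 - 1 = 3 is prime; the loop below needs p > 2
    mersenne = (1 << p) - 1
    s = 4
    # Repeat s -> s*s - 2 (mod 2**p - 1) exactly p - 2 times.
    for _ in range(p - 2):
        s = (s * s - 2) % mersenne
    return s == 0


# Prime exponents whose Mersenne number passes the test.
print([p for p in (3, 5, 7, 11, 13, 17, 19, 23, 31) if lucas_lehmer(p)])
```

Every iteration feeds the next one, so a single flipped bit anywhere in the chain changes the final answer. That dependency is a plausible reason a marginal RAM stick can pass a pattern test and still fail here.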

3) Is it possible that (as RayBay hinted at above) that I just didn’t have enough RAM with the 512 to run my system? This seems odd because I ran it for 5 years with just that 512 stick, and haven’t added anything significant since then except McAfee. If this is the case, and I add it on top of the new 1 GB it should still work, so I can test that I suppose.

If any one has any input on these 3 items or anything else I’d love to hear. I’ll try adding the old stick back in with the new one and see if my computer goes back to its old behaviour, and then post back.

Since it is XP it would run on 512 MB quite easily. It would be able to run a couple of programs at the same time, say a virus scan while you continue web browsing. I don't know about McAfee; it might need a bit of RAM to run.
 
Raybay, good call. I made sure that when I bought the second stick of RAM that it was exactly the same (Kingston 2100, same ECC, etc.) even down to the part number, minus the size of course. So I don't think that was the problem...but your second idea that the board might not be able to handle two sticks that aren't of the same size could be what's going on there. Unfortunately, the only way to really test that at this point would be to buy another 512 stick and try it with the old 512, which is pretty much a waste of money at this point for me because I'd buy another 1 GB instead. But this is really informative and good to know.

woody1191, my guess is you're right that the old stick passed MemTest86+, but not Prime95. In the next few days I'm going to try to put them BOTH in and see if it BSODs again. My results (I think) would look like this:

If it works fine, then perhaps 512 MB just wasn't enough RAM (for some sudden reason that I may not ever figure out).

If it doesn't work fine, then a) the 512 MB stick is bad or b) my mobo can't handle two sticks of different size. Unfortunately I may never know the answer to that if this is the case, but if it works on the new 1 GB stick I'll still be happy!!

Will post back after testing.

Thanks for both your input!!
 
Okay...I now have both the new 1 GB stick in and the old 512 MB stick in with these results of running Prime95:

The first test (Small FFT’s) ran for over 13 hours and never crashed…I had to eventually stop it manually.

The second test (Large FFT’s) crashed after 5 minutes.

The third test (Blend) crashed in under 1 minute.

Also, the BSOD’s are back and so are the IE crashes!

So, it seems to me that one of two things is the case here:
a) The 512 stick is indeed bad.
b) It was crashing before because there wasn’t enough RAM with just the 512 stick, and it is crashing now because my mobo can’t handle two sticks that aren’t exactly the same size.

Thoughts? I’m almost positive it’s got to be option a) myself. Since there’s no way to narrow it down between a) and b) conclusively without buying another 512 stick, it looks like this is as far as the testing goes. But that’s fine for me. Either way, placing just the new 1 GB stick in solves my problems!
 