BSOD - kernel stack inpage error - maybe a couple times a week, hard to nail down

Status
Not open for further replies.

lkjhg

Posts: 12   +0
In simple terms, I have been getting the kernel stack inpage error for about a month now - occasionally twice in a day, more typically every 3-5 days. Both of my hard drives (C for boot and programs; D for data) have been imaged in the past 30 days without error or incident. Avast! sweeps the computer each night and there are no viruses. I run most of the free malware scans on the computer, and no spyware is found (and I'm a cautious user - it would be very hard for me to put malware on the computer, even by accident).

The computer was built from newly-purchased parts in March 2006 and runs Windows XP SP 3. It gets all the Windows updates as they're pushed to users.

I have attached the most recent BSOD dump file. And I ran chkdsk /f /r tonight on the C drive and this is the result I got:
Event Type: Information
Event Source: Winlogon
Event Category: None
Event ID: 1001
Date: 7/12/2010
Time: 7:38:58 PM
User: N/A
Computer: XP
Description:
Checking file system on C:
The type of the file system is NTFS.

A disk check has been scheduled.
Windows will now check the disk.
Cleaning up instance tags for file 0xcccc.
Cleaning up minor inconsistencies on the drive.
Cleaning up 1017 unused index entries from index $SII of file 0x9.
Cleaning up 1017 unused index entries from index $SDH of file 0x9.
Cleaning up 1017 unused security descriptors.
CHKDSK is verifying Usn Journal...
Usn Journal verification completed.
CHKDSK is verifying file data (stage 4 of 5)...
File data verification completed.
CHKDSK is verifying free space (stage 5 of 5)...
Free space verification is complete.
CHKDSK discovered free space marked as allocated in the
master file table (MFT) bitmap.
CHKDSK discovered free space marked as allocated in the volume bitmap.
Windows has made corrections to the file system.

58990644 KB total disk space.
18796004 KB in 74894 files.
58596 KB in 8652 indexes.
0 KB in bad sectors.
261900 KB in use by the system.
65536 KB occupied by the log file.
39874144 KB available on disk.

4096 bytes in each allocation unit.
14747661 total allocation units on disk.
9968536 allocation units available on disk.

Internal Info:
a0 57 01 00 65 46 01 00 b5 cf 01 00 00 00 00 00 .W..eF..........
92 c6 00 00 02 00 00 00 58 0f 00 00 00 00 00 00 ........X.......
4e 80 7a 07 00 00 00 00 e4 0d 8b 5a 00 00 00 00 N.z........Z....
00 62 11 32 00 00 00 00 80 f8 03 95 02 00 00 00 .b.2............
2c 8e 42 fc 01 00 00 00 56 6c ed 2c 05 00 00 00 ,.B.....Vl.,....
e0 77 ca be 00 00 00 00 b0 3b 07 00 8e 24 01 00 .w.......;...$..
00 00 00 00 00 90 37 7b 04 00 00 00 cc 21 00 00 ......7{.....!..

Windows has finished checking your disk.
Please wait while your computer restarts.


For more information, see Help and Support Center at [link deleted].
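
As a quick sanity check on the chkdsk summary above, the reported totals can be cross-checked against each other. This is only a sketch: the variable names are made up for illustration, and it assumes chkdsk's "KB" means 1024 bytes and that the 65536 KB log file is counted inside the "in use by the system" figure.

```python
# Figures copied from the chkdsk summary above.
BYTES_PER_UNIT = 4096
TOTAL_UNITS = 14747661
FREE_UNITS = 9968536

# Per-category usage in KB, as reported by chkdsk.
FILES_KB = 18796004
INDEXES_KB = 58596
BAD_SECTORS_KB = 0
SYSTEM_KB = 261900      # assumed to already include the 65536 KB log file
AVAILABLE_KB = 39874144
REPORTED_TOTAL_KB = 58990644


def units_to_kb(units, bytes_per_unit=BYTES_PER_UNIT):
    """Convert a count of allocation units to kilobytes (1 KB = 1024 bytes)."""
    return units * bytes_per_unit // 1024


total_kb = units_to_kb(TOTAL_UNITS)
free_kb = units_to_kb(FREE_UNITS)
category_sum_kb = FILES_KB + INDEXES_KB + BAD_SECTORS_KB + SYSTEM_KB + AVAILABLE_KB

print("total from allocation units:", total_kb, "KB")
print("free from allocation units: ", free_kb, "KB")
print("sum of reported categories: ", category_sum_kb, "KB")
print("all consistent:", total_kb == REPORTED_TOTAL_KB == category_sum_kb
      and free_kb == AVAILABLE_KB)
```

If the numbers agree, the volume bookkeeping is at least internally consistent after the repair. It says nothing about the health of the physical drive.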

Aside from the above, here are things I've tried to this point:
  • Reseated the SATA cables for the drives
  • Vacuumed dust from the case
  • Turned off the pagefile, defragged both HDs, turned pagefile back on
  • Updated all drivers

Given all this, any ideas as to what the problem might be? I have attached the most recent dumpfile.
 

Attachments

  • Mini071110-01.dmp
    96 KB · Views: 5
Your error code is 0x00000077: KERNEL_STACK_INPAGE_ERROR
A page of kernel data requested from the pagefile could not be found or read into memory. This message also can indicate disk hardware failure, disk data corruption, or possible virus infection.
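
For anyone reading stop codes off the blue screen, a tiny lookup like the one below can turn the hex value into its symbolic name. It is a minimal sketch covering only the two errors that come up in this thread. The 0x77 value is the one quoted above; the 0x7A value for the kernel data inpage error is taken from Microsoft's bug check reference and is not stated numerically in the thread. The function name is hypothetical.

```python
# Bug-check (stop) codes relevant to this thread.
# 0x77 is quoted in the reply above; 0x7A comes from Microsoft's
# bug check reference and is an addition for illustration.
STOP_CODES = {
    0x77: "KERNEL_STACK_INPAGE_ERROR",
    0x7A: "KERNEL_DATA_INPAGE_ERROR",
}


def stop_code_name(code_text):
    """Return the symbolic name for a stop code written like '0x00000077'."""
    code = int(code_text, 16)  # int() accepts the '0x' prefix when base is 16
    return STOP_CODES.get(code, "unknown (not in this small table)")


print(stop_code_name("0x00000077"))
print(stop_code_name("0x0000007A"))
```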

The cited probable cause of your system crashes is memory corruption so the first step you need to take is to run memtest on your RAM.

See the link below and follow the instructions. There is a newer version than what is listed; use the newer. If you need to see what the Memtest screen looks like go to reply #21. The third screen is the Memtest screen.

Step 1 - Let it run for a LONG time. The rule is a minimum of 7 Passes; the more Passes beyond 7, the better. The only exception is if you start getting errors before 7 Passes; then you can skip to Step 2.

There are 8 individual tests per Pass. Many people will start this test before going to bed and check it the next day.

If you have errors you have corrupted memory and it needs to be replaced.

Step 2 – Because of errors you need to run this test per stick of RAM. Take out one and run the test. Then take that one out and put the other in and run the test. If you start getting errors before 7 Passes you know that stick is corrupted and you don’t need to run the test any further on that stick.


Link: https://www.techspot.com/vb/topic62524.html


* Get back to us with the results.


*** If Memtest shows no errors then find the voltage specs of your RAM and compare it to the voltage setting in your BIOS. Do they match?
 
I will start running memtest overnight this evening.

How long might I expect it to run? I have 1 gig of memory (2 x 512) and a Pentium 4 processor, 3.2 GHz.
 
The more gigs of RAM, the longer it takes. The real need, however, is a minimum of 7 Passes, and if you can/will do more, so much the better. Start it before going to bed, let it run all night, and check it when you get up.

It is quite safe to let it run all night.
 
So I let memtest run for 9 hours and 39 minutes, and it did 23 passes with zero errors. Which is good, right?

My only small concern is that I watched it at the outset and the first pass took about 14 minutes. The next pass took about 25, and if you do the math, the average pass took about 25 minutes. So why was the first pass so much faster than all the other ones?
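
The pass-time arithmetic can be checked with a few lines. This is a rough sketch using only the figures quoted in this post (9 hours 39 minutes, 23 passes, a first pass of about 14 minutes); the first-pass time is an eyeballed estimate, so the result is approximate.

```python
# Figures quoted in the post; the first-pass time is approximate.
TOTAL_MINUTES = 9 * 60 + 39
PASSES = 23
FIRST_PASS_MINUTES = 14

# Average over every pass, including the fast first one.
average_all = TOTAL_MINUTES / PASSES

# Average over the remaining passes once the first is set aside.
average_later = (TOTAL_MINUTES - FIRST_PASS_MINUTES) / (PASSES - 1)

print(f"total run time:        {TOTAL_MINUTES} minutes")
print(f"average, all passes:   {average_all:.1f} minutes")
print(f"average, passes 2-{PASSES}:  {average_later:.1f} minutes")
```

Both averages land near 25 minutes, which matches the observation that only the first pass was noticeably shorter.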

Aside from the details of memtest though, what now?

Hard drives are fine - I put in the June Windows updates and imaged C with no problems, and I had imaged D a few days back with no problems.

No viruses or malware.

Pagefile defragged (HDs themselves defragged).

chkdsk /f /r did fix some seemingly minor things (output is posted at top of thread), and no BSODs in the time since that.

Drivers are all current.

So what's left? The chkdsk is the only change I have made to things since the last BSOD and maybe that will fix things. But I've had something like 6-8 of these BSODs in the past 30 days and am figuring it's not a question of "if" it's a question of "when" the next one hits me.
 
Did you do this step: *** If Memtest shows no errors then find the voltage specs of your RAM and compare it to the voltage setting in your BIOS. Do they match?
 
The short answer is that I have not done that step.

The longer answer is that there were three years from when the computer was built to when any BSOD ever appeared, and I'd be surprised (ignorantly - I don't claim knowledge of this stuff) if the voltages were different, given how relatively rare these BSODs are, even now.

Since I try to open the box as rarely as possible, I'll comb through my invoice for the original memory to find its name and specs.
 
Try running the test with one stick at a time. Bad memory has been known from time to time to pass this test.
 
The issues CHKDSK corrected *could* have induced the kernel stack inpage error.

Watch the system and document when the next error occurs.
Once restarted, rerun CHKDSK and note any further HD corruption.
 
Interesting... it's now nine days since my last BSOD. My chkdsk results for C are posted at the top of the thread, and a few days after that I ran a chkdsk on D as well. Same types of fixes as for C, although fewer fixes on a much larger drive.

None of the errors were media errors, just index entries and security descriptors.

I probably won't fully relax until I've gone a month without an error, but nine days of full uptime is a significant step forward, and at least I can say that yes, I did something substantive (the chkdsk) that might be behind the fix.
 
What a totally totally frustrating situation!

I have a friend very adept in PC hardware work, mods, optimizations, all that.

So despite my PC going 9+ days with no BSOD, I accepted his invitation to have him work on the computer.

Fundamentally, I'm glad I did. He removed the never-used floppy drive and its long cable, freeing up space and improving airflow. He moved the drives (again, two HDs and a DVD drive) so each had plenty of space (one or more empty bays) above and below it. He even discovered that the power connector plugged into my D drive was warped (possibly from all the heat), so he took a good and previously unused connector from the PS and plugged that in.

He also experimented and found that because of the positioning of the fans in the case, relative to the vent holes, it was cooler in the box with the side vents covered (basically, the hot hardware was at the front of the case, and the fans were maximizing cool airflow through the back of the case - covering the vents shifts the air draw to the front of the case).

The D drive had been running at 135-139 F and it's rated at 131 F max. After the mods, it's running 30+ degrees cooler.
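
Since drive spec sheets often quote Celsius, here is a small conversion sketch for the temperatures above. The readings (135-139 F) and the rated maximum (131 F, which is 55 C) come from this post; the function and constant names are just for illustration.

```python
def f_to_c(deg_f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32) * 5 / 9


RATED_MAX_F = 131          # rated maximum quoted in the post
OBSERVED_F = (135, 139)    # range the D drive had been running at

print(f"rated max {RATED_MAX_F} F = {f_to_c(RATED_MAX_F):.1f} C")
for temp_f in OBSERVED_F:
    over_f = temp_f - RATED_MAX_F
    print(f"{temp_f} F = {f_to_c(temp_f):.1f} C, {over_f} F over the rated max")
```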

So I came home all happy - until this morning when I had another BSOD waiting for me - first in ten days. Then on reboot, it didn't find my D drive, although it booted fine and browsed the web fine. I powered it down for a few minutes, restarted, it found D, and all was normal.

But I'm left wondering why I'm getting any BSOD (read the whole thread for a fuller history), and I'm wondering why, on a small number of occasions, it doesn't find the D drive.
 
I'm sorry for replying to my own posts, but I keep wanting to update with new information as it happens.

After the previous post, I successfully booted, went to work, and came home to find several disturbing error messages, of the type I'd never ever seen before, basically saying that this or that file on D could not be found.

Checking the event log, there were several messages to the effect that sisraid could not find the device (D). Then the O/S told me that I needed to run chkdsk on D - and in the middle of that, another BSOD. It finally ran an okay chkdsk on restart, booted fine, and then I was getting lots of clicking and BSODs.

Opened the computer, found a loose SATA cable (loose connection to MB), fixed that, had several okay boots, closed the box up, and on startup, it said it couldn't find the boot device. Restarted and it booted fine. I was on it for 15 minutes and then shut it down.

That was all last night.

This evening, it booted normally, then 20 minutes later, no warning, a BSOD.

Odd thing - all the errors in the last two days have been kernel data inpage errors. Prior to that and going back 14 months, all BSODs had been kernel stack inpage errors. What's the difference between the two, and why would I suddenly be getting the kernel data errors?

I am fearing that it's the drive controller(s) on my ASUS motherboard, and am looking at picking up an add-in SATA card for the computer.

Thoughts?
 
Minidump from most recent crash is attached. What does it mean?
 

Attachments

  • Mini072310-01.dmp
    96 KB · Views: 1
Okay, this one has some mystery to it. The driver cited as the cause of your issues is sptd6605.sys. I read the minidump file twice to make sure I wrote it down correctly.

Now here is the issue: There is absolutely nothing on the web concerning this driver.

sptd.sys, however, is a Daemon Tools driver that we have seen many people have issues with over the last four years. It could be sptd6605.sys is another driver belonging to the same software, but we don't know this with 100% assurance.

I suggest you a) temporarily uninstall Daemon Tools and b) do a full security scan.
 
Okay, I've uninstalled Daemon Tools. When you say a full security scan, are you talking AV, malware, what?

I have no memory of ever installing Daemon Tools and no idea why I ever did. I doubt I used it ever. What's it supposed to do?
 
In scanning, the system crashed again, this time with a kernel stack inpage error (previous crashes this week had been kernel data inpage errors). I have attached the latest dump.

It turns out that yes, I did install Daemon Tools almost four years ago. Never used it - it didn't do what I needed - but I didn't uninstall it either. So that was a blind alley.

At the top of the thread I stated that I am rigorous in my computing - very suspicious of what I click on in terms of web links or attachments - and I scan the computer nightly for viruses using Avast!. I also run lots of spyware scans (Ad-Aware, Spybot, SUPERAntiSpyware, etc.) and never find anything.

Today I have disconnected my D (data) drive. I am expecting that I will eventually get another BSOD. This would likely mean it's either the controller (more likely) or the C drive itself (less likely). If I'm suddenly stable, it would point to the D drive, or to the SATA cable for the drive, or something with the connection.

Again, it bears repeating that I have fully imaged the C and D drives in the past six weeks, while all this nonsense has been going on. DriveImage, the program I use, has balked before when imaging drives with a bad sector or whatever. But it sailed right through a full image of both of these drives.
 

Attachments

  • Mini072410-01.dmp
    96 KB · Views: 2
I think the problem has been identified

I brought in a friend who has helped me before on computer issues, and I think I'm moving in the right direction.

We had the D drive disconnected and it ran flawlessly for several days. Put in a new drive controller and hooked C and D to it, and in copying some files from D heard clicking. Within a few hours, we were getting errors on D.

We put a different, "known working" drive in the system in place of D, restored a recent image of D onto it, and it's now been running, full functionality, no event errors, no BSODs, anything, coming up on 72 hours.

My friend has more than 30 years of experience, professional and hobbyist, working with computers and said that this is the "strangest drive failure" he's ever seen - in that the drive didn't just seize up and fail in an instant or over days or weeks. It's been sending out signals going back 18 months that something was amiss.

I'll run the computer on the borrowed drive for another week and if no more problems, will just buy a new D drive to pop in.
 
That is strange, especially given that your minidumps cited memory corruption! :confused:
 