Intermittent BSOD Problem

Status
Not open for further replies.

DocF

Hello.

I have had an intermittent BSOD problem for a while now and I would like to get to the bottom of what is causing it, as it is slowly driving me mad.

The Situation
I build up my own PCs gradually, replacing parts when they become old or too slow. My current system is a good spec:
AMD Athlon 64 3200+
1GB PC3200 DDR
MSI K8N Neo mobo
X800XL AGP 256MB graphics
Mobo onboard sound (Realtek AC97)

The problem is this:
When playing 3D games (notably World of Warcraft and Half-Life 2) I sometimes get a BSOD which is BLANK, i.e. no text, just a blue screen.

I have to reboot and load my minidumps in WinDbg to actually find out what caused the BSOD. I have had about 30 crashes over the past 3 months and they all point to either ati2dvag.dll or ati3duag.dll.
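To back up a claim like "they all point to the same driver", the analysis text from each minidump can be tallied by its IMAGE_NAME line. The following is only a rough sketch: it assumes the WinDbg analysis output for all dumps has been pasted into one string, and the sample text and the function name tally_faulting_images are made up for illustration.

```python
import re
from collections import Counter

# Matches the "IMAGE_NAME:" line that the bugcheck analysis prints for each dump.
IMAGE_NAME_RE = re.compile(r"^IMAGE_NAME:\s*(\S+)", re.MULTILINE)


def tally_faulting_images(analysis_text: str) -> Counter:
    """Count how often each image is blamed in concatenated analysis output."""
    return Counter(name.lower() for name in IMAGE_NAME_RE.findall(analysis_text))


# Hypothetical sample: the IMAGE_NAME lines from three pasted analyses.
sample = "\n".join([
    "IMAGE_NAME: ati3duag.dll",
    "IMAGE_NAME: ati2dvag.dll",
    "IMAGE_NAME: ati3duag.dll",
])

counts = tally_faulting_images(sample)
for image, count in counts.most_common():
    print(f"{image}: {count}")
```

If one image dominates across dozens of dumps, that narrows things to the driver or the hardware it talks to, although it cannot say which of the two is at fault.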

As far as I know these are ATI driver files from the Catalyst (cat) drivers. These BSODs have been ongoing since I purchased the card and first installed cat 5.5v. Every time I install a new version I do it by the book: booting into safe mode, deleting old versions, etc. But it doesn't seem to help; after a reinstall, a blank BSOD eventually turns up again.

What I Have Tried
First of all I wanted to check the RAM, since I know BSODs are a common result of bad memory. I have run both memtest64 and prime95 for 8 hours each and both came up with zero errors.

Another possibility was my temperatures. As I'm writing this my PC has been on for about 10 hours, I have played many games, and the core is at about 50C.

I have been through the power calculation tool and it says I need no more than 300W for my setup. I have a Chieftec 350W PSU, model HPC-340-201 (about 3 years old, from an AMD 1800+ system), still in place, and after checking voltages using SystemFan the +12V seems normal, varying between 12.07 and 11.85.

I have tried switching the RAM sticks around but no joy.

What is curious is the BLANK BSOD. Is this a symptom of a specific problem? I am at a loss now and any advice would be much appreciated. As you can imagine, I would like to find the problem and not resort to buying a brand new system to relieve my torment :). As stated earlier, this is intermittent. I could go days without a BSOD and then get three within 3 hours. When the computer is idling it runs like a dream; the only problem is when I game.

Thanks in advance,

DF
 
Hi fastco, thanks for the swift reply. I have used Driver Cleaner with the more recent versions of the Cat drivers, alas to no avail. *scratches head* Is there any way to test the memory of the card?
 
These BSOD’s have been ongoing since I purchased the card and first installed cat 5.5v.
The problem could be your card. Try using your card in another system and see if it still goes to a BSOD when playing 3D games. Or it could also be your PSU, with some capacitor getting old.. tsk tsk tsk..

It's the same with my system: every time I play 3D games it restarts or a BSOD shows up. I've uninstalled the AGP card but it still restarts, so I'm thinking the PSU is broken, since when I used WinDiag to test my RAM it did not report any errors. So I've come to the conclusion that the PSU is either broken or cannot supply enough power for your video card.
 
shinobi101 said:
It's the same with my system: every time I play 3D games it restarts or a BSOD shows up. I've uninstalled the AGP card but it still restarts, so I'm thinking the PSU is broken, since when I used WinDiag to test my RAM it did not report any errors. So I've come to the conclusion that the PSU is either broken or cannot supply enough power for your video card.


Thanks for the reply, shinobi101. The difference between mine and yours is that it's not every time I play a game. I'm guaranteed at least an hour (normally more) of game play before a BSOD occurs, and like I said, I could go days without one occurring.

If it was a PSU problem I'm thinking that the BSODs would be more predictable and frequent. I think my best option is to install the card in another machine and see how it goes. I can also revert to my Nvidia 5600 (yucky :( ) and see if I get BSODs with that. I'm hoping my mobo isn't a factor in this, as I don't see how I can diagnose a problem with that. BTW, the mobo is up to date with the latest drivers and BIOS; it uses the nForce3 chipset.
 
Been a bad day today, had 3 reboots.

This one occurred while playing World of Warcraft.

*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

KERNEL_MODE_EXCEPTION_NOT_HANDLED_M (1000008e)
This is a very common bugcheck. Usually the exception address pinpoints
the driver/function that caused the problem. Always note this address
as well as the link date of the driver/image that contains this address.
Some common problems are exception code 0x80000003. This means a hard
coded breakpoint or assertion was hit, but this system was booted
/NODEBUG. This is not supposed to happen as developers should never have
hardcoded breakpoints in retail code, but ...
If this happens, make sure a debugger gets connected, and the
system is booted /DEBUG. This will let us see why this breakpoint is
happening.
Arguments:
Arg1: c0000005, The exception code that was not handled
Arg2: bf17b437, The address that the exception occurred at
Arg3: f4b5eaac, Trap Frame
Arg4: 00000000

Debugging Details:
------------------


EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at "0x%08lx" referenced memory at "0x%08lx". The memory could not be "%s".

FAULTING_IP:
ati3duag+9c437
bf17b437 8b00 mov eax,[eax]

TRAP_FRAME: f4b5eaac -- (.trap fffffffff4b5eaac)
ErrCode = 00000000
eax=65faa428 ebx=00000004 ecx=e1ceb8a8 edx=00000004 esi=e15ac010 edi=e614e988
eip=bf17b437 esp=f4b5eb20 ebp=e15ae7ec iopl=0 nv up ei ng nz na po nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010286
ati3duag+0x9c437:
bf17b437 8b00 mov eax,[eax] ds:0023:65faa428=????????
Resetting default scope

CUSTOMER_CRASH_COUNT: 3

DEFAULT_BUCKET_ID: DRIVER_FAULT

BUGCHECK_STR: 0x8E

LAST_CONTROL_TRANSFER: from e614a4d0 to bf17b437

STACK_TEXT:
WARNING: Stack unwind information not available. Following frames may be wrong.
e15ae7ec e614a4d0 00000001 00000003 00000003 ati3duag+0x9c437
e15ae7f0 00000000 00000003 00000003 00000001 0xe614a4d0


STACK_COMMAND: .bugcheck ; kb

FOLLOWUP_IP:
ati3duag+9c437
bf17b437 8b00 mov eax,[eax]

FAULTING_SOURCE_CODE:


SYMBOL_STACK_INDEX: 0

FOLLOWUP_NAME: MachineOwner

SYMBOL_NAME: ati3duag+9c437

MODULE_NAME: ati3duag

IMAGE_NAME: ati3duag.dll

DEBUG_FLR_IMAGE_TIMESTAMP: 44d11f6d

FAILURE_BUCKET_ID: 0x8E_ati3duag+9c437

BUCKET_ID: 0x8E_ati3duag+9c437

Followup: MachineOwner
---------

Again while playing World of Warcraft a few hours earlier.

*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

KERNEL_MODE_EXCEPTION_NOT_HANDLED_M (1000008e)
This is a very common bugcheck. Usually the exception address pinpoints
the driver/function that caused the problem. Always note this address
as well as the link date of the driver/image that contains this address.
Some common problems are exception code 0x80000003. This means a hard
coded breakpoint or assertion was hit, but this system was booted
/NODEBUG. This is not supposed to happen as developers should never have
hardcoded breakpoints in retail code, but ...
If this happens, make sure a debugger gets connected, and the
system is booted /DEBUG. This will let us see why this breakpoint is
happening.
Arguments:
Arg1: c0000005, The exception code that was not handled
Arg2: bf17b437, The address that the exception occurred at
Arg3: a3b8aaac, Trap Frame
Arg4: 00000000

Debugging Details:
------------------


EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at "0x%08lx" referenced memory at "0x%08lx". The memory could not be "%s".

FAULTING_IP:
ati3duag+9c437
bf17b437 8b00 mov eax,[eax]

TRAP_FRAME: a3b8aaac -- (.trap ffffffffa3b8aaac)
ErrCode = 00000000
eax=61754d40 ebx=00000002 ecx=e10cf8a8 edx=00000004 esi=e4c23010 edi=e5d45988
eip=bf17b437 esp=a3b8ab20 ebp=e4c25724 iopl=0 nv up ei ng nz na po nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010286
ati3duag+0x9c437:
bf17b437 8b00 mov eax,[eax] ds:0023:61754d40=????????
Resetting default scope

CUSTOMER_CRASH_COUNT: 2

DEFAULT_BUCKET_ID: DRIVER_FAULT

BUGCHECK_STR: 0x8E

LAST_CONTROL_TRANSFER: from e5c0a988 to bf17b437

STACK_TEXT:
WARNING: Stack unwind information not available. Following frames may be wrong.
e4c25724 e5c0a988 00000001 00000003 00000003 ati3duag+0x9c437
e4c25728 00000000 00000003 00000003 00000001 0xe5c0a988


STACK_COMMAND: .bugcheck ; kb

FOLLOWUP_IP:
ati3duag+9c437
bf17b437 8b00 mov eax,[eax]

FAULTING_SOURCE_CODE:


SYMBOL_STACK_INDEX: 0

FOLLOWUP_NAME: MachineOwner

SYMBOL_NAME: ati3duag+9c437

MODULE_NAME: ati3duag

IMAGE_NAME: ati3duag.dll

DEBUG_FLR_IMAGE_TIMESTAMP: 44d11f6d

FAILURE_BUCKET_ID: 0x8E_ati3duag+9c437

BUCKET_ID: 0x8E_ati3duag+9c437

Followup: MachineOwner
---------

This one happened in the early hours, again while playing WoW.

*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

KERNEL_MODE_EXCEPTION_NOT_HANDLED_M (1000008e)
This is a very common bugcheck. Usually the exception address pinpoints
the driver/function that caused the problem. Always note this address
as well as the link date of the driver/image that contains this address.
Some common problems are exception code 0x80000003. This means a hard
coded breakpoint or assertion was hit, but this system was booted
/NODEBUG. This is not supposed to happen as developers should never have
hardcoded breakpoints in retail code, but ...
If this happens, make sure a debugger gets connected, and the
system is booted /DEBUG. This will let us see why this breakpoint is
happening.
Arguments:
Arg1: c0000005, The exception code that was not handled
Arg2: bf17b437, The address that the exception occurred at
Arg3: a4ba2aac, Trap Frame
Arg4: 00000000

Debugging Details:
------------------


EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at "0x%08lx" referenced memory at "0x%08lx". The memory could not be "%s".

FAULTING_IP:
ati3duag+9c437
bf17b437 8b00 mov eax,[eax]

TRAP_FRAME: a4ba2aac -- (.trap ffffffffa4ba2aac)
ErrCode = 00000000
eax=6174cd40 ebx=00000003 ecx=e1100730 edx=00000004 esi=e14fb010 edi=e2a1c988
eip=bf17b437 esp=a4ba2b20 ebp=e14fd788 iopl=0 nv up ei ng nz na po nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010286
ati3duag+0x9c437:
bf17b437 8b00 mov eax,[eax] ds:0023:6174cd40=????????
Resetting default scope

CUSTOMER_CRASH_COUNT: 2

DEFAULT_BUCKET_ID: DRIVER_FAULT

BUGCHECK_STR: 0x8E

LAST_CONTROL_TRANSFER: from 00000000 to bf17b437

STACK_TEXT:
e14fd788 00000000 00000001 00000003 00000003 ati3duag+0x9c437


STACK_COMMAND: .bugcheck ; kb

FOLLOWUP_IP:
ati3duag+9c437
bf17b437 8b00 mov eax,[eax]

FAULTING_SOURCE_CODE:


SYMBOL_STACK_INDEX: 0

FOLLOWUP_NAME: MachineOwner

SYMBOL_NAME: ati3duag+9c437

MODULE_NAME: ati3duag

IMAGE_NAME: ati3duag.dll

DEBUG_FLR_IMAGE_TIMESTAMP: 44d11f6d

FAILURE_BUCKET_ID: 0x8E_ati3duag+9c437

BUCKET_ID: 0x8E_ati3duag+9c437

Followup: MachineOwner
---------
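A quick way to compare the three dumps above is to look at the pointer being dereferenced. The faulting instruction is `mov eax,[eax]` at the same address each time, so eax holds the address that could not be read. The sketch below only does arithmetic on values copied from the trap frames; the 0x80000000 boundary is an assumption (the default 2 GB user / 2 GB kernel split on 32-bit Windows), and the names are made up for illustration.

```python
KERNEL_SPACE_START = 0x80000000  # assumption: default 2 GB user / 2 GB kernel split

FAULTING_IP = 0xBF17B437   # Arg2, identical in all three dumps
MODULE_OFFSET = 0x9C437    # from "ati3duag+9c437"

# eax at the time of the fault, copied from the three trap frames.
bad_pointers = {
    "dump 1": 0x65FAA428,
    "dump 2": 0x61754D40,
    "dump 3": 0x6174CD40,
}


def address_range(addr: int) -> str:
    """Classify a 32-bit address as falling in the user or kernel range."""
    return "kernel" if addr >= KERNEL_SPACE_START else "user"


# Load address of the driver = faulting address minus the offset into the module.
module_base = FAULTING_IP - MODULE_OFFSET
print(f"ati3duag appears to be loaded at 0x{module_base:08x}")

for label, addr in bad_pointers.items():
    print(f"{label}: tried to read 0x{addr:08x} ({address_range(addr)} range)")
```

Under that assumption, all three crashes are the same driver instruction reading through a pointer that lands in the user-mode range, which looks like a corrupted or stale pointer. That pattern fits a driver bug as well as bad video memory or unstable power, so on its own it does not settle the question.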
 
Is that the same video card? If so I would definitely swap it out for something else.
 
Yes, it's the ATI, system unchanged.

Thanks for the advice, I think that's the only way forward at the moment.
 
Another thing I have noticed is that when the computer is playing games the +12V rail varies a lot, between 12.07V and 11.87V, while when the computer is idling it hardly changes. Is this normal behavior?

EDIT:
Been playing this game now for about 20 mins and it's dropped to a constant 11.85V, hmmmmm.

UPDATE:
After about 2 hours of gaming in WoW it's dropped to 11.80V and there are a few spikes down to 11.72V. And now I've quit the game it's gone back up to 12V.
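To put those readings side by side, the sag under load can be expressed in millivolts and as a percentage of the idle reading. This is just a sketch using the numbers quoted above, with 12.00V taken as the idle value; the function name sag is made up for illustration.

```python
IDLE_V = 12.00  # reading after quitting the game ("gone back up to 12V")

# Load readings quoted in the post.
load_readings = [
    ("after ~20 min of gaming", 11.85),
    ("after ~2 h of gaming", 11.80),
    ("lowest spike", 11.72),
]


def sag(idle_v: float, load_v: float) -> tuple[float, float]:
    """Return the drop in volts and as a percentage of the idle reading."""
    drop = idle_v - load_v
    return drop, 100.0 * drop / idle_v


for label, volts in load_readings:
    drop, pct = sag(IDLE_V, volts)
    print(f"{label}: {volts:.2f} V, down {drop * 1000:.0f} mV ({pct:.1f}%)")
```

Keep in mind these are software readings from the motherboard sensor, so the trend over time is probably more trustworthy than the absolute values.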
 
Those sound like symptoms of a power supply that is having trouble keeping up with the power demands of your system. Do you have a spare to swap? What is its amps on the +12v?
 
Hi mailpup,

Direct from the label on the side, the +12V is rated at 15A.

PSU model: Chieftec HPC-340-201 ATX12V (with PFC). Max load of 340W.

Unfortunately I don't have another. It's about 3 years old, kept from my older system with an AMD XP 1800+ and a GeForce 5600.
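For reference, the label figures translate into watts like this. It is only the arithmetic on the numbers quoted above (15A on +12V, 340W max load); it says nothing about what the card or CPU actually draw.

```python
RAIL_VOLTS = 12.0      # nominal +12V rail
RAIL_AMPS = 15.0       # rating from the PSU label
PSU_MAX_WATTS = 340.0  # max load from the PSU label

# Power = voltage x current.
rail_watts = RAIL_VOLTS * RAIL_AMPS
share = rail_watts / PSU_MAX_WATTS

print(f"+12V rail can supply up to {rail_watts:.0f} W")
print(f"That is {share * 100:.1f}% of the PSU's rated {PSU_MAX_WATTS:.0f} W")
```

So the total wattage figure can look adequate while the +12V rail, which has to feed the components that draw from it, is limited to a little over half of it.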
 
15A is not enough for a system like that (which is pretty good, actually). The X800XL is a somewhat power-hungry card.

I'd recommend a new PSU like THIS for your computer.
 
Hi. I've been happily living with this problem for some time now. I've been checking the PSU rails using both SpeedFan and my multimeter under load. The +12V rail goes to about 11.85 when playing games, which I think is acceptable.

I haven't got a new PSU yet as I'm in the process of getting a new case.

However, I attach a screenshot from SpeedFan regarding my 3.3V rail. This was during normal operation (i.e. browsing the internet, word processing). Is that fluctuation acceptable? It didn't cause a BSOD, but don't AGP graphics cards use the 3.3V mobo line? Even when running in normal operation I feel it is low at 3.10V.

Thanks.
DF
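One way to judge readings like these is against the tolerance band for each rail. The sketch below assumes the commonly quoted +/-5% tolerance for the positive ATX rails (an assumption here, so check it against the PSU's own documentation), and the function name check_rail is made up for illustration.

```python
ATX_TOLERANCE = 0.05  # assumption: +/-5% on the +3.3V, +5V and +12V rails


def check_rail(nominal: float, measured: float, tol: float = ATX_TOLERANCE):
    """Return (in_range, low_limit, high_limit) for a measured rail voltage."""
    low = nominal * (1 - tol)
    high = nominal * (1 + tol)
    return low <= measured <= high, low, high


# Readings quoted in the post above.
for nominal, measured in [(12.0, 11.85), (3.3, 3.10)]:
    ok, low, high = check_rail(nominal, measured)
    verdict = "within" if ok else "OUTSIDE"
    print(f"+{nominal}V rail at {measured:.2f} V: {verdict} {low:.3f}-{high:.3f} V")
```

Under that assumption 11.85V is comfortably inside the band, while 3.10V falls just below the 3.135V lower limit. Since SpeedFan reads a motherboard sensor, it would be worth confirming the 3.3V figure with the multimeter before drawing conclusions.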
 