TechSpot

Intermittent BSOD Problem

By DocF
Aug 27, 2006
  1. Hello.

    I have had an intermittent BSOD problem for a while now and I would like to get to the bottom of what is causing it, as it is slowly driving me mad.

    The Situation
    I build my own PCs, upgrading parts gradually as they become old or too slow. My current system has a good spec:
    AMD Athlon 64 3200+
    1GB PC3200 DDR
    MSI K8N Neo mobo
    X800XL AGP 256MB Graphics
    Mobo Onboard sound feature Realtek AC97

    The problem is this:
    When playing 3D games (notably World of Warcraft and Half-Life 2) I sometimes get a BSOD which is BLANK, i.e. no text, just a blue screen.

    I have to reboot and load up my mini-dumps using WinDbg to actually find out what caused the BSOD. I have had about 30 crashes over the past 3 months and they all either point to ati2dvag.dll or ati3duag.dll.

    As far as I know these are ATI driver files from the Catalyst drivers. These BSODs have been ongoing since I purchased the card and first installed Catalyst 5.5. Every time I install a new version I do it by the book: going into Safe Mode, deleting old versions, etc. But it doesn't seem to help; after a reinstall, a blank BSOD is eventually inevitable.

    What I Have Tried
    First of all I wanted to check the RAM, since I know BSODs are a common result of bad memory. I have run both memtest64 and prime95 for 8 hours each and both came up with zero errors.

    Another possibility was temperature. As I'm writing this, my PC has been on for about 10 hours, I have played many games, and the core is at about 50C.

    I have been through the power calculation tool and it says I need no more than 300W for my setup. I have a Chieftec 350W PSU (model HPC-340-201, about 3 years old, from an AMD 1800+ system) still in place, and after checking voltages using SystemFan the +12V rail seems normal, varying between 12.07V and 11.85V.

    I have tried switching the RAM sticks around, but no joy.

    What is curious is the BLANK BSOD: is this a symptom of a specific problem? I am at a loss now, and any advice would be much appreciated. As you can imagine, I would like to find the problem rather than resort to buying a brand new system to relieve my torment :). As stated earlier, this is intermittent: I could go days without a BSOD, then get three within 3 hours. When the computer is idling it runs like a dream; the only problem is when I game.

    Thanks in advance,

    DF
     
  2. fastco

    fastco TS Booster Posts: 1,122

    Going into safe mode is good but try using Driver Cleaner (in safe mode) to remove the old drivers completely, if that is what is causing the crashes. Could actually be the Video Card memory that is faulty.
    http://www.drivercleaner.net/professional.php
     
  3. DocF

    DocF TS Rookie Topic Starter

    Hi fastco, thanks for the swift reply. I have used Driver Cleaner with the more recent versions of the Catalyst drivers, alas to no avail. *scratches head* Is there any way to test the memory of the card?
     
  4. shinobi101

    shinobi101 TS Rookie

     
  5. DocF

    DocF TS Rookie Topic Starter


    Thanks for the reply, shinobi101. The difference between mine and yours is that it's not every time I play a game. I'm guaranteed at least an hour (normally more) of gameplay before a BSOD occurs, and like I said, I could go days without one occurring.

    If it was a PSU problem, I'm thinking that the BSODs would be more predictable and frequent. I think my best option is to install the card in another machine and see how it goes. I can also revert to my Nvidia 5600 (yucky :( ) and see if I get BSODs with that. I'm hoping my mobo isn't a factor in this, as I don't see how I can diagnose a problem with that. By the way, the mobo is up to date with the latest drivers and BIOS; it uses the nForce3 chipset.
     
  6. DocF

    DocF TS Rookie Topic Starter

    It's been a bad day today: I had 3 reboots.

    This one occurred while playing World of Warcraft.

    *******************************************************************************
    * *
    * Bugcheck Analysis *
    * *
    *******************************************************************************

    KERNEL_MODE_EXCEPTION_NOT_HANDLED_M (1000008e)
    This is a very common bugcheck. Usually the exception address pinpoints
    the driver/function that caused the problem. Always note this address
    as well as the link date of the driver/image that contains this address.
    Some common problems are exception code 0x80000003. This means a hard
    coded breakpoint or assertion was hit, but this system was booted
    /NODEBUG. This is not supposed to happen as developers should never have
    hardcoded breakpoints in retail code, but ...
    If this happens, make sure a debugger gets connected, and the
    system is booted /DEBUG. This will let us see why this breakpoint is
    happening.
    Arguments:
    Arg1: c0000005, The exception code that was not handled
    Arg2: bf17b437, The address that the exception occurred at
    Arg3: f4b5eaac, Trap Frame
    Arg4: 00000000

    Debugging Details:
    ------------------


    EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at "0x%08lx" referenced memory at "0x%08lx". The memory could not be "%s".

    FAULTING_IP:
    ati3duag+9c437
    bf17b437 8b00 mov eax,[eax]

    TRAP_FRAME: f4b5eaac -- (.trap fffffffff4b5eaac)
    ErrCode = 00000000
    eax=65faa428 ebx=00000004 ecx=e1ceb8a8 edx=00000004 esi=e15ac010 edi=e614e988
    eip=bf17b437 esp=f4b5eb20 ebp=e15ae7ec iopl=0 nv up ei ng nz na po nc
    cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010286
    ati3duag+0x9c437:
    bf17b437 8b00 mov eax,[eax] ds:0023:65faa428=????????
    Resetting default scope

    CUSTOMER_CRASH_COUNT: 3

    DEFAULT_BUCKET_ID: DRIVER_FAULT

    BUGCHECK_STR: 0x8E

    LAST_CONTROL_TRANSFER: from e614a4d0 to bf17b437

    STACK_TEXT:
    WARNING: Stack unwind information not available. Following frames may be wrong.
    e15ae7ec e614a4d0 00000001 00000003 00000003 ati3duag+0x9c437
    e15ae7f0 00000000 00000003 00000003 00000001 0xe614a4d0


    STACK_COMMAND: .bugcheck ; kb

    FOLLOWUP_IP:
    ati3duag+9c437
    bf17b437 8b00 mov eax,[eax]

    FAULTING_SOURCE_CODE:


    SYMBOL_STACK_INDEX: 0

    FOLLOWUP_NAME: MachineOwner

    SYMBOL_NAME: ati3duag+9c437

    MODULE_NAME: ati3duag

    IMAGE_NAME: ati3duag.dll

    DEBUG_FLR_IMAGE_TIMESTAMP: 44d11f6d

    FAILURE_BUCKET_ID: 0x8E_ati3duag+9c437

    BUCKET_ID: 0x8E_ati3duag+9c437

    Followup: MachineOwner
    ---------

    This one also occurred while playing World of Warcraft, a few hours earlier.

    *******************************************************************************
    * *
    * Bugcheck Analysis *
    * *
    *******************************************************************************

    KERNEL_MODE_EXCEPTION_NOT_HANDLED_M (1000008e)
    This is a very common bugcheck. Usually the exception address pinpoints
    the driver/function that caused the problem. Always note this address
    as well as the link date of the driver/image that contains this address.
    Some common problems are exception code 0x80000003. This means a hard
    coded breakpoint or assertion was hit, but this system was booted
    /NODEBUG. This is not supposed to happen as developers should never have
    hardcoded breakpoints in retail code, but ...
    If this happens, make sure a debugger gets connected, and the
    system is booted /DEBUG. This will let us see why this breakpoint is
    happening.
    Arguments:
    Arg1: c0000005, The exception code that was not handled
    Arg2: bf17b437, The address that the exception occurred at
    Arg3: a3b8aaac, Trap Frame
    Arg4: 00000000

    Debugging Details:
    ------------------


    EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at "0x%08lx" referenced memory at "0x%08lx". The memory could not be "%s".

    FAULTING_IP:
    ati3duag+9c437
    bf17b437 8b00 mov eax,[eax]

    TRAP_FRAME: a3b8aaac -- (.trap ffffffffa3b8aaac)
    ErrCode = 00000000
    eax=61754d40 ebx=00000002 ecx=e10cf8a8 edx=00000004 esi=e4c23010 edi=e5d45988
    eip=bf17b437 esp=a3b8ab20 ebp=e4c25724 iopl=0 nv up ei ng nz na po nc
    cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010286
    ati3duag+0x9c437:
    bf17b437 8b00 mov eax,[eax] ds:0023:61754d40=????????
    Resetting default scope

    CUSTOMER_CRASH_COUNT: 2

    DEFAULT_BUCKET_ID: DRIVER_FAULT

    BUGCHECK_STR: 0x8E

    LAST_CONTROL_TRANSFER: from e5c0a988 to bf17b437

    STACK_TEXT:
    WARNING: Stack unwind information not available. Following frames may be wrong.
    e4c25724 e5c0a988 00000001 00000003 00000003 ati3duag+0x9c437
    e4c25728 00000000 00000003 00000003 00000001 0xe5c0a988


    STACK_COMMAND: .bugcheck ; kb

    FOLLOWUP_IP:
    ati3duag+9c437
    bf17b437 8b00 mov eax,[eax]

    FAULTING_SOURCE_CODE:


    SYMBOL_STACK_INDEX: 0

    FOLLOWUP_NAME: MachineOwner

    SYMBOL_NAME: ati3duag+9c437

    MODULE_NAME: ati3duag

    IMAGE_NAME: ati3duag.dll

    DEBUG_FLR_IMAGE_TIMESTAMP: 44d11f6d

    FAILURE_BUCKET_ID: 0x8E_ati3duag+9c437

    BUCKET_ID: 0x8E_ati3duag+9c437

    Followup: MachineOwner
    ---------

    This one happened in the early hours, again while playing WoW.

    *******************************************************************************
    * *
    * Bugcheck Analysis *
    * *
    *******************************************************************************

    KERNEL_MODE_EXCEPTION_NOT_HANDLED_M (1000008e)
    This is a very common bugcheck. Usually the exception address pinpoints
    the driver/function that caused the problem. Always note this address
    as well as the link date of the driver/image that contains this address.
    Some common problems are exception code 0x80000003. This means a hard
    coded breakpoint or assertion was hit, but this system was booted
    /NODEBUG. This is not supposed to happen as developers should never have
    hardcoded breakpoints in retail code, but ...
    If this happens, make sure a debugger gets connected, and the
    system is booted /DEBUG. This will let us see why this breakpoint is
    happening.
    Arguments:
    Arg1: c0000005, The exception code that was not handled
    Arg2: bf17b437, The address that the exception occurred at
    Arg3: a4ba2aac, Trap Frame
    Arg4: 00000000

    Debugging Details:
    ------------------


    EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at "0x%08lx" referenced memory at "0x%08lx". The memory could not be "%s".

    FAULTING_IP:
    ati3duag+9c437
    bf17b437 8b00 mov eax,[eax]

    TRAP_FRAME: a4ba2aac -- (.trap ffffffffa4ba2aac)
    ErrCode = 00000000
    eax=6174cd40 ebx=00000003 ecx=e1100730 edx=00000004 esi=e14fb010 edi=e2a1c988
    eip=bf17b437 esp=a4ba2b20 ebp=e14fd788 iopl=0 nv up ei ng nz na po nc
    cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010286
    ati3duag+0x9c437:
    bf17b437 8b00 mov eax,[eax] ds:0023:6174cd40=????????
    Resetting default scope

    CUSTOMER_CRASH_COUNT: 2

    DEFAULT_BUCKET_ID: DRIVER_FAULT

    BUGCHECK_STR: 0x8E

    LAST_CONTROL_TRANSFER: from 00000000 to bf17b437

    STACK_TEXT:
    e14fd788 00000000 00000001 00000003 00000003 ati3duag+0x9c437


    STACK_COMMAND: .bugcheck ; kb

    FOLLOWUP_IP:
    ati3duag+9c437
    bf17b437 8b00 mov eax,[eax]

    FAULTING_SOURCE_CODE:


    SYMBOL_STACK_INDEX: 0

    FOLLOWUP_NAME: MachineOwner

    SYMBOL_NAME: ati3duag+9c437

    MODULE_NAME: ati3duag

    IMAGE_NAME: ati3duag.dll

    DEBUG_FLR_IMAGE_TIMESTAMP: 44d11f6d

    FAILURE_BUCKET_ID: 0x8E_ati3duag+9c437

    BUCKET_ID: 0x8E_ati3duag+9c437

    Followup: MachineOwner
    ---------
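
The three analyses above share the same faulting instruction, so it can help to line the numbers up. A minimal sketch in Python, using only values copied from the dumps; the helper names are hypothetical, and the 0x80000000 user/kernel boundary assumes the default 2 GB split on 32-bit Windows (no /3GB switch). It derives the driver's load address from the fault address and offset, decodes DEBUG_FLR_IMAGE_TIMESTAMP as a Unix time (the "link date" the bugcheck text says to note), and checks whether the pointer in eax was a user-range address.

```python
from collections import Counter
from datetime import datetime, timezone

# Values copied from the three bugcheck analyses in the post above.
crashes = [
    {"fault_ip": 0xBF17B437, "offset": 0x9C437, "eax": 0x65FAA428,
     "bucket": "0x8E_ati3duag+9c437"},
    {"fault_ip": 0xBF17B437, "offset": 0x9C437, "eax": 0x61754D40,
     "bucket": "0x8E_ati3duag+9c437"},
    {"fault_ip": 0xBF17B437, "offset": 0x9C437, "eax": 0x6174CD40,
     "bucket": "0x8E_ati3duag+9c437"},
]

# Assumption: default 2 GB user / 2 GB kernel split on 32-bit Windows.
KERNEL_SPACE_START = 0x80000000

# DEBUG_FLR_IMAGE_TIMESTAMP from the dumps, as seconds since the Unix epoch.
IMAGE_TIMESTAMP = 0x44D11F6D


def module_base(fault_ip, offset):
    """Load address of the module: faulting address minus offset into the module."""
    return fault_ip - offset


def link_date(timestamp):
    """Decode a PE image timestamp into a UTC datetime."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


bucket_counts = Counter(c["bucket"] for c in crashes)
bases = {hex(module_base(c["fault_ip"], c["offset"])) for c in crashes}
eax_in_user_range = [c["eax"] < KERNEL_SPACE_START for c in crashes]

print("failure buckets:", dict(bucket_counts))
print("ati3duag base address(es):", bases)
print("driver link date (UTC):", link_date(IMAGE_TIMESTAMP).date())
print("eax below kernel space in every crash:", all(eax_in_user_range))
```

All three crashes land in one failure bucket at the same instruction, with a different bad pointer in eax each time. That pattern on its own does not distinguish a driver bug from faulty hardware feeding the driver bad data, so it is a starting point rather than a diagnosis.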
     
  7. fastco

    fastco TS Booster Posts: 1,122

    Is that the same video card? If so, I would definitely swap it out for something else.
     
  8. DocF

    DocF TS Rookie Topic Starter

    Yes, it's the ATI; the system is unchanged.

    Thanks for the advice. I think that's the only way forward at the moment.
     
  9. DocF

    DocF TS Rookie Topic Starter

    Another thing I have noticed is that when the computer is running games the +12V rail varies a lot, between 12.07V and 11.87V, while when the computer is idling it hardly changes. Is this normal behavior?

    EDIT:
    I've been playing this game for about 20 minutes now and it has dropped to a constant 11.85V, hmmmmm.

    UPDATE:
    After about 2 hours of gaming in WoW it has dropped to 11.80V, with a few dips to 11.72V. Now that I've quit the game it has gone back up to 12V.
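
To put the drop in proportion, the readings above can be expressed as a percentage of the nominal 12V and as sag relative to the idle reading. A rough sketch in Python; the labels and function names are made up for illustration, and the numbers are software sensor readings from the post, which may not match what a multimeter would show.

```python
NOMINAL_V = 12.0

# (label, volts) as reported in the post above.
readings = [
    ("idle after quitting", 12.00),
    ("~20 min gaming", 11.85),
    ("~2 hours gaming", 11.80),
    ("lowest dip", 11.72),
]


def percent_of_nominal(volts, nominal=NOMINAL_V):
    """Signed deviation from the nominal rail voltage, in percent."""
    return (volts - nominal) / nominal * 100.0


def sag_from_idle(volts, idle):
    """How far the rail has dropped compared with the idle reading, in volts."""
    return idle - volts


idle_v = readings[0][1]
for label, volts in readings:
    print(f"{label:>20}: {volts:.2f} V  "
          f"{percent_of_nominal(volts):+.2f}%  "
          f"sag {sag_from_idle(volts, idle_v):.2f} V")
```

A slow sensor poll like this only shows the average level. It cannot show ripple or very short transients, so a steady-looking reading does not by itself clear the PSU.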
     
  10. mailpup

    mailpup TS Special Forces Posts: 6,979   +362

    Those sound like symptoms of a power supply that is having trouble keeping up with the power demands of your system. Do you have a spare to swap in? How many amps is it rated for on the +12V rail?
     
  11. DocF

    DocF TS Rookie Topic Starter

    Hi mailpup,

    Direct from the label on the side, the +12V is rated at 15A.

    PSU model: Chieftec HPC-340-201 ATX12V (with PFC). Max load of 340W.

    Unfortunately I don't have another. It's about 3 years old, kept from my older system with an AMD 1800+ XP and a GeForce 5600.
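
The label figures convert to watts with P = V x I. A small sketch in Python using only the numbers quoted from the label (15A on +12V, 340W max load); the function names are hypothetical, and no component power draws are assumed here because none were measured in this thread.

```python
RAIL_VOLTS = 12.0
RAIL_AMPS = 15.0       # +12V rating from the PSU label
PSU_MAX_WATTS = 340.0  # max load from the PSU label


def rail_watts(volts, amps):
    """Maximum power a rail can deliver: P = V * I."""
    return volts * amps


def amps_needed(watts, volts):
    """Current a load of the given wattage draws from a rail: I = P / V."""
    return watts / volts


capacity = rail_watts(RAIL_VOLTS, RAIL_AMPS)
share = capacity / PSU_MAX_WATTS * 100.0

print(f"+12V rail capacity: {capacity:.0f} W")
print(f"share of the PSU's rated max load: {share:.1f}%")
```

So only about half of the unit's rated wattage is available on +12V. A calculator that looks at total watts alone can therefore say the PSU is adequate even if the +12V rail is the limiting factor.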
     
  12. wolfram

    wolfram TechSpot Paladin Posts: 1,967   +9

    15A is not enough for a system like that (which is pretty good, actually). The X800XL is a somewhat power-hungry card.

    I'd recommend a new PSU like THIS for your computer.
     
  13. DocF

    DocF TS Rookie Topic Starter

    Hi. I've been happily living with this problem for some time now. I've been checking the PSU rails under load using both SpeedFan and my multimeter. The +12V rail goes to about 11.85V when playing games, which I think is acceptable.

    I haven't got a new PSU yet as I'm in the process of getting a new case.

    However, I attach a screenshot from SpeedFan of my 3.3V rail. This was during normal operation (i.e. browsing the internet, word processing). Is that fluctuation acceptable? It didn't cause a BSOD, but don't AGP graphics cards use the 3.3V mobo line? Even in normal operation I feel it is low at 3.10V.

    Thanks.
    DF
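
One way to judge whether a reading is "acceptable" is to compare it with the ATX tolerance, which is plus or minus 5% on the +3.3V, +5V and +12V rails. A minimal sketch in Python with hypothetical helper names, applied to the two readings quoted in this post.

```python
ATX_TOLERANCE = 0.05  # +/-5% on the +3.3V, +5V and +12V rails


def limits(nominal, tolerance=ATX_TOLERANCE):
    """Return the (low, high) allowed voltages for a rail."""
    return nominal * (1 - tolerance), nominal * (1 + tolerance)


def check(nominal, measured):
    """Classify a reading as 'below', 'within' or 'above' the allowed band."""
    low, high = limits(nominal)
    if measured < low:
        return "below"
    if measured > high:
        return "above"
    return "within"


for nominal, measured in [(12.0, 11.85), (3.3, 3.10)]:
    low, high = limits(nominal)
    print(f"{nominal:>4} V rail: measured {measured:.2f} V, "
          f"allowed {low:.3f} to {high:.3f} V -> {check(nominal, measured)}")
```

By this yardstick the +12V reading is inside the band, while 3.10V sits just under the 3.135V lower limit for the 3.3V rail. Motherboard sensors can read off by a few percent, so a multimeter check of the 3.3V line would be needed to confirm it.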
     
Topic Status:
Not open for further replies.
