GPU Driver or Hardware Issue?

Hello folks,



Hopefully I am in the right section, as this issue appeared after my GPU upgrade from an RTX 2080 Ti to an RTX 3090.

I am encountering an issue with one of my most played games (World of Tanks): the client crashes to a black screen at random times (sound and other programs keep running in the background; I can still use TeamSpeak, for example).

This issue only happens in World of Tanks and is reproducible while in battle; it does not happen while idle or just browsing around in the garage interface. No other game I have tried so far has crashed, DX11 or DX12.

*I did notice a weird thing that went away after the last driver update (v461.09): the in-game gamma settings were greyed out, so I thought I'd mention it anyway.

I am at a loss here, as I have run out of ideas and I can't figure out whether this is a hardware issue or somehow tied to the way the game works.

The crashes began in mid-November, and since then I have contacted the player and technical support teams at Wargaming (the company behind World of Tanks), creating multiple tickets that have led nowhere so far.

My rig config – pc part picker link here.

Age of the computer (hardware): ranging from new (the GPU) to one and a half years old for most of the system hardware.

OS-wise, the system was installed around 08/19.



Here is a list of things I tried after reading around on the internet on various forums and asking all my tech-savvy friends for help:

Updated and rolled back the GPU drivers using DDU in Safe Mode (not connected to the internet, Windows driver updates off). Tried these versions (Studio and Game Ready): 442.74 (did not recognize my GPU, as expected), 456.38, 457.30, 460.97 (beta) and the latest drivers as well. Now running Game Ready 461.09.

Made sure my temps are low and there is no overheating: at an ambient temperature of roughly 23-24 °C, I get a maximum of 71 °C on the GPU and 68-70 °C on the CPU during a stress test, and lower temps under intense gaming (CoD: Cold War / Metro Exodus, for example), with the CPU in the high 50s °C and the GPU in the low 60s °C.

Cleaned the registry using CCleaner.

Increased the page file size and then reverted to the Windows default (32 GB of physical RAM should be more than enough, but hey, I gave it a shot anyway).

Updated to the latest chipset drivers from the AMD site.

Updated the GPU firmware and latest Aorus software.

Updated to the latest BIOS (now running F31j)

Disabled the XMP Profile in BIOS / and using all stock settings in BIOS.

Updated to the latest Creative Sound Blaster sound card drivers

Used the windows default recommended sound settings (suggested by the Wargaming Support team)

Rebooted my router & network devices for a few minutes (suggested by the Wargaming Support team)

In Nvidia Control Panel set the game to use Maximum Performance mode as some user recommended in a similar issue with Apex Legends.

Disabled G-Sync and tried with V-Sync both on and off (using a G-Sync monitor at 1440p, 165 Hz)

Downclocked my GPU using the Aorus tool to the reference RTX 3090 specs.

Enabled and tested Low Latency mode

Tried running the game with no Antialiasing / Off in NVCP as well.

Launched the game in Windowed - Windowed borderless - Fullscreen

Lowered the resolution from 1440p to 1080p and 720p

Tried to run the game with the lowest graphical settings in all the above resolutions

Since the game client has the option, tried both the 32-bit and 64-bit clients

Tried Compatibility mode Windows 7 & 8

Launched the game in safe mode

Used both HD and SD client (lower end machines version)

Ran the game integrity check (suggested by the Wargaming Support team)

Added the game as a firewall exception (suggested by the Wargaming Support team)

Removed the Appdata WoT folder that contains all the game settings (suggested by the Wargaming Support team)

Reinstalled the game multiple times

Made sure the DirectX Suite is up to date

Uninstalled Easy Tune Engine Service/ Team Viewer (as suggested by the Wargaming Support team)

Disabled & Uninstalled iCue / Aorus Software / RGB Fusion 2 (suggested by the Wargaming Support team)

Clean booted - Windows Default Only

Deleted the cache Nvidia Shader Cache

Disabled onboard sound in Device Manager

Used TDR Manipulator to disable and re-enable TDR (Timeout Detection and Recovery).

Updated Windows 10 to the latest version (20H2)

Replaced my PSU fairly recently (end of 2019)

Ran a 3DMark benchmark (Fire Strike demo) while logging with HWiNFO64 in sensors-only mode. Log here (3DMark results + HWiNFO64 log), as advised by the user Greybear on the Nvidia Forums.

After the latest tests (here), the user Greybear pointed out an anomaly in the log file in my post on the Nvidia Forums: you can see it in the images, where the 12V LOW column shows "0".

Open the Google Drive link above, open the .csv file and scroll across to columns LV to MD to see all the 0 entries. Notice that for some of the 0 readings in the LV column, the 12V readings appear in the wrong column.

He recommended that I set the rail mode on my PSU to single rail.
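
For anyone who wants to find those 0 entries without scrolling through the spreadsheet, here is a minimal sketch that scans a sensor CSV for zero readings. The column label "+12V [V]" and the sample rows are assumptions made up for illustration; the real HWiNFO64 log will have its own header names, so adjust the hint to match. It only locates the rows. It cannot tell you whether a zero is a real dropout on the rail or just a logging glitch.

```python
import csv
import io

def find_zero_readings(csv_text, column_hint="+12V"):
    """Return (row_number, column_name) pairs where a matching column reads 0."""
    reader = csv.DictReader(io.StringIO(csv_text))
    hits = []
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        for name, value in row.items():
            # Skip overflow cells (name is None) and unrelated columns.
            if name is None or column_hint not in name:
                continue
            try:
                reading = float(value)
            except (TypeError, ValueError):
                continue  # blank or non-numeric cell
            if reading == 0.0:
                hits.append((row_number, name))
    return hits

# Hypothetical sample, NOT taken from the actual log.
sample = """Time,GPU Temperature [C],+12V [V]
20:01:01,65,12.096
20:01:02,66,0
20:01:03,66,12.048
"""
print(find_zero_readings(sample))  # -> [(3, '+12V [V]')]
```

To run it on a real log, read the file into a string first and pass that in place of the sample.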



My knowledge of PSUs is quite limited. Any thoughts on that?



*I will redo the benchmarks and log the info as soon as I get home from work.*

After digging for any leads in my crash logs, I found the following information in the game's crash log and the Windows Event Log:



World of Tanks client Python log (log here):

INFO: [Info] FATAL ERROR: RenderSystemDeviceAccess::genericInterfaceCallFailureHandler - device in unsupported state: The device has been removed.



Timestamp-wise, this seems to line up with the report from the Windows Event Log that points to the nvlddmkm driver stopping working:



" Event id - 1062600691 in Source "nvlddmkm" cannot be found. The local computer may not have the necessary registry information or message DLL files to display the message, or you may not have permission to access them. The following information is part of the event: \Device\Video3 Graphics Exception on (GPC6, PPC 0)"

I have tried capturing a screenshot right after the black screen crash to see if it shows any information, but the screenshots are black.
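
In case it helps anyone checking the same thing, the timestamp match between the game log and the nvlddmkm events can be checked with a few lines instead of by eye. This is only a sketch: the timestamps below are made up, the "YYYY-MM-DD HH:MM:SS" format is an assumption, and you would need to copy the real times out of the game log and the Event Viewer by hand.

```python
from datetime import datetime, timedelta

def match_events(game_times, driver_times, tolerance_seconds=5):
    """Pair each game crash time with driver events logged within the tolerance."""
    fmt = "%Y-%m-%d %H:%M:%S"  # assumed timestamp format
    tolerance = timedelta(seconds=tolerance_seconds)
    driver = [datetime.strptime(t, fmt) for t in driver_times]
    matches = []
    for raw in game_times:
        crash = datetime.strptime(raw, fmt)
        near = [d.strftime(fmt) for d in driver if abs(d - crash) <= tolerance]
        matches.append((raw, near))
    return matches

# Hypothetical timestamps for illustration only.
game = ["2021-01-10 20:15:42", "2021-01-11 19:03:10"]
driver = ["2021-01-10 20:15:40", "2021-01-11 21:30:00"]
for crash, near in match_events(game, driver):
    print(crash, "->", near or "no driver event nearby")
```

A crash with no driver event nearby would suggest that particular crash has a different cause than the "device has been removed" ones.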



I am aware that this is a long post, but I still hope there is a way to find a fix for this.

Thank you very much for your time and patience.

Any help is highly appreciated.
 

Kshipper

Extensive trial and error. It's worth just setting everything in BIOS to default, but turn off Fast Boot and turn on UEFI only (CSM off). Run the RAM at 2133 MHz for the test. Put a fresh drive in and rebuild Windows on it: do the chipset driver from AMD first (reboot), then any and all drivers from the motherboard maker, then the Nvidia driver, and load GeForce Experience. If GeForce Experience detects your WoT, let it set your settings. Turn off Fast Startup in Windows.

I had a crashing GTAV on a brand new RTX 2080 rig and the GeForce Experience set my graphics settings and all that went away. It was magical.

I only suggest the fresh build on a different drive as a test, so it doesn't have to be a fancy, fast, expensive drive. Anything will do. I'm just hoping to preserve your current drive and have you do a fresh build with fresh drivers.

I'm also wondering about the PSU. Is it large enough for the 3090? I hear it draws a lot, so I'm thinking you want to be at 850 W or more...
 
Hello Kshipper, thank you for your input.
I just ordered an SSD for this test; I only had old IDE hard drives lying around, nothing usable.
I will follow your recommendation this weekend, and hopefully it will help. It didn't dawn on me to turn off Fast Boot or CSM and revert to lower RAM speeds.
Very interesting lead, fingers crossed.

On the subject of the PSU, do you think I need a bigger one? I am currently using the Corsair RM1000x.
Cheers Kshipper,
 

Avro Arrow

Honestly, you've done far more than you should have bothered with. I would be talking to Gigabyte at this point, because if I were to pay €2049 (OMFG!) for a video card, it had better work perfectly out of the box or I'd be on the phone with Gigabyte about an RMA immediately.

You want to know the easiest way to tell if it's the new video card? Put the RTX 2080 Ti back in. If the problems cease, then it's the new card because faulty drivers would cause instability on the RTX 2080 Ti as well. If the problems continue, then it's not the new card, it's the drivers. I had a problem with the RX 5700 XT that I bought. The way that I knew it was the card itself was the fact that when I popped one of my R9 Furies back in, all problems immediately ceased even though the driver package was the same.

That was all I had to tell XFX for them to approve the RMA.
 

Kshipper

I hope it helps too...no one likes to RMA anything, and one more clean build should at least settle the question of whether something got borked in all that trying and testing....and 1000 W should be enough =)
 
Unfortunately I no longer have my 2080 Ti. I have an RX 5700 XT with some driver issues in my secondary system; it would crash occasionally, but since it is not my main rig and it's out of warranty by now, I didn't take this route. But yeah, I will give that a go too. At this point there is nothing to lose anymore.
 

Kshipper

It's been a couple weeks now...you get it all sorted out?
 
Hey there Kshipper, I know it has been a long time since my last post. Unfortunately the issue is not fixed, and the Gigabyte support team redirected me to my local vendor, who offered me a refund but not a replacement due to low stock, so I'd be at a loss given the price hike.

Meanwhile, after following the recommendation of a fellow user, JoeRambo at the AnandTech Forums, to use RivaTuner Statistics Server (RTSS), and after even more testing over the past 3 months with Aorus Engine, I did end up with a stable GPU while playing World of Tanks.

With RTSS I only had partial success, so I kept looking for a solution and ran different clocks and configurations with Aorus Engine.

Stable settings I am currently using with an alternate profile in Aorus Engine: GPU Clock 1625 / Memory Clock 19502 / Fan Speed Auto / Power Target 100 at a target temp of 83 degrees Celsius.

The issue has occurred only once since then, while playing League of Legends, and has not reproduced since. At the time I was using this custom World of Tanks profile, not the standard Aorus Extreme OC (GPU Boost 1860 MHz / Memory Clock 19500).
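
To put the two profiles side by side, here is a quick bit of arithmetic on the numbers above. It assumes the 1625 and 1860 figures refer to the same clock setting in Aorus Engine, which may not be exactly true since one is labelled GPU Clock and the other GPU Boost.

```python
def percent_change(stock, custom):
    """Relative change from the stock value to the custom value, in percent."""
    return (custom - stock) / stock * 100.0

# Values from the two profiles quoted above (MHz).
core = percent_change(1860, 1625)      # Extreme OC boost vs. custom GPU clock
memory = percent_change(19500, 19502)  # memory clock is essentially unchanged

print(f"core: {core:+.1f}%  memory: {memory:+.2f}%")
# -> core: -12.6%  memory: +0.01%
```

So under that assumption the stable profile is roughly a 12.6% core underclock with the memory clock left alone.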

No other games or workloads such as rendering have crashed my GPU, so I decided to keep the card a while longer, at least until the availability issues are resolved.

Hopefully this workaround will help somebody else out there.