RTX 4090 has a meltdown after proper installation and only one year of use

Cal Jeffrey

A hot potato: It's hard to believe that Nvidia's RTX 4090 turned one year old today. Maybe that's because every few months, Meltgate rears its ugly head. Yes. Another user has reported that his 4090 committed suicide after working just fine for a year.

Just when you thought it was safe to use your 12VHPWR cable with your RTX 4090 again, another incident of GPU meltdown pops up in the forums. Redditor Byogore reports that he bought an Asus 4090 in Germany a year ago because of US supply issues. The card worked totally fine until it self-immolated two days ago. This incident is unusual because typically failures have happened much sooner.

Other Redditors quickly questioned whether Byogore had the connectors seated securely since "user error" was one of Nvidia's excuses when the issue arose shortly after the RTX4090 launch. Byogore defended his ability to properly attach a computer component.

"Clearly, I did..." he pointed out. "If this was a seating issue, it would've died ages ago. I've used it a lot."

He also pointed to a video from a California repair shop, NorthridgeFix, which said it has seen many 4090s spontaneously combusting. Based on internal tests, NorthridgeFix is adamant that the problem is not the cables, connectors, bending, PSUs, or user error.

"The fact that the 90-degree cable mod adapter [to prevent bending] was plugged in fully to the connector and the connector still melted, then we know that we have a problem with the card. We have a problem with the card. We do not have a problem with the cable. It's not user error, it's not a cable not plugged in properly problem, it's not a cable mod problem, but it's a problem with the design and the engineering of the card."

NorthridgeFix also points out that even if the problem was caused by user error, it's still Nvidia's fault. For instance, if a user is plugging in a cable as they do with any other, past or present, and it leaves a 1mm gap that causes failure, that is not the user's fault. When there is a tolerance issue like that, it is the manufacturer's responsibility to fix it or design a mechanism to prevent it, not blame the user for not plugging it in correctly and doing nothing more about it.

Byogore says that the meltdown happened while he was playing Battlefield 2042. The screen turned black, but the audio continued. Then, the computer rebooted itself. As it started up, he could smell burnt plastic. The card still worked, but only briefly before crashing and burning again. Byogore mentioned that he has a 1,000W Corsair PSU and that the 4090 was undervolted at the time of the catastrophic failure. However, he didn't note whether his PSU was ATX 3.0. The problem only seems to occur on older ATX 2.0 power supplies.

Fortunately, Asus has been very empathetic and generous to Byogore's plight. When he contacted customer service, they offered to upgrade him with a new Strix card (from a TUF model) or give him a full refund. Presumably, the card was still under warranty since it was two days shy of its first anniversary.

Nvidia has been reluctant to accept any blame for the 4090's hot-button woes. It initially blamed users for not securing the cable tightly to the socket. Later, it said that "poorly designed" 12VHPWR adapters were to blame. The last time the issue popped up enough to make news was last May. The ongoing problem has even sucked Nvidia into a class-action lawsuit.


 
Who would have thought that pumping more power through connectors with fewer cables and less surface area on the contacts would lead to an increase in heat.

All you need is a tiny bit of thermal expansion to create a feedback loop of increased resistance and heat.
 
Well, I don't understand what you're saying. What I do understand is that 2,000 euros burned in a few seconds, like on the first day he bought the card! Not me! Not buying! I paid 930 for an RTX 2080 four years ago and it is still working. Poor fellow had a very bad day! If my GPU burns, I'm certainly not buying something that expensive only to get "user fault" as an answer. I'm waiting for the 4090 to drop to 900 euros. Maybe that never happens, but at least I will never have to explain to my wife what the f..ck is burning and why it cost 2,000 euros! Ouch!
 
New 12-volt 16-pin replacement power cables that void your warranty should fix the problem, fingers crossed, yet to be determined, we promise 🤪.
 
I'll put it less sarcastically. The only way the 12-pin connector will be safe is if they put a breaker on it for overcurrent protection. Resistance goes up, voltage goes down, but amperage (current) goes up because the card still needs the same wattage to function. Watts = Volts * Amps.

But while I was writing that, something occurred to me. Why haven't we seen any failures of the 12-pin connector on the power supply side?
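
To put rough numbers on the Watts = Volts * Amps point, here is a minimal Python sketch. It treats the card as a constant-power load fed through one lumped series resistance. The 12 V supply, the 450 W load, and the resistance values are illustrative assumptions, not measurements from any real card or connector, and `load_current` and `series_heat_w` are hypothetical helper names.

```python
import math


def load_current(supply_v, series_r, load_w):
    """Current drawn by a constant-power load behind a series resistance.

    Solves load_w = (supply_v - I * series_r) * I and returns the smaller
    root, which is the normal operating point.
    """
    if series_r == 0:
        return load_w / supply_v
    disc = supply_v ** 2 - 4 * series_r * load_w
    if disc < 0:
        raise ValueError("load cannot be supplied through this resistance")
    return (supply_v - math.sqrt(disc)) / (2 * series_r)


def series_heat_w(current_a, series_r):
    """Heat dissipated in the series resistance itself (P = I^2 * R)."""
    return current_a ** 2 * series_r


if __name__ == "__main__":
    SUPPLY_V = 12.0  # assumed nominal rail voltage
    LOAD_W = 450.0   # assumed card power draw
    for r in (0.001, 0.005, 0.010):  # assumed resistances in ohms
        amps = load_current(SUPPLY_V, r, LOAD_W)
        volts_at_card = SUPPLY_V - amps * r
        heat = series_heat_w(amps, r)
        print(f"R={r:.3f} ohm  I={amps:.2f} A  "
              f"V_card={volts_at_card:.2f} V  heat={heat:.2f} W")
```

Under these assumptions the current rises only slightly as resistance grows, while the heat dissipated in the resistance grows roughly in proportion to it, so the heating is the part to worry about.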
 
Anyone who has ATX 3.0 12VHPWR connectors deserves a factory recall on them. They all need to be replaced, free of charge, with the newer ATX 3.1 revision, including adapter cables.
 
I remember there was one reported incident on the PSU side.

 
I wouldn't even buy a used 4090 before this is finally resolved.

"...it's a problem with the design and the engineering of the card."

However, he didn't note whether his PSU was ATX 3.0. The problem only seems to occur on older ATX 2.0 power supplies.

I'm confused. Is the card the problem or the PSU/spec?
 
The new version of the connector checks to see if the resistance has gone up and reduces the amount of wattage going to the card, basically throttling it. They didn't actually fix the problem; it just checks whether there is a problem and throttles the card so the connector doesn't overheat.
 
The 4090 is a real liability: you spend all that money, and if it dies outside of warranty due to a well-known issue...
Yeah, I'm never buying a top-end GPU from Nvidia again. An x80 is my max now. I'm not getting burned like this (literally).
 
A bit unrelated
A wild guess: cheap cable. Those cables are the things these companies like to save on to the max.
Maybe it was badly made, stayed hot, and eventually died.
I recently found that one of my CPU cables is super hot. Nothing else, just one cable.
I suspect low quality there as well.
 
Does anyone know if a reading of 12.4 volts on the 12 V 16-pin monitor in TPU's GPU-Z is healthy for my 4090 Suprim Liquid (with a Seasonic Vertex 1000 ATX 3.0 PSU)? I finally overclocked the beast at +150 MHz on the core and +150 MHz on the memory. No voltage tweaks or power adjustments. Without the OC it's 12.3 volts with no deviation. We 4090 owners are the beta testers for future GPU power management and path-tracing technology in games like Cyberpunk 2077 2.0 and its Phantom Liberty DLC, with tweaks to get a subjectively playable framerate. 😑
The overclock improved my timings in Vermintide 2 from 5 ms to 4 ms, even with vsync on and the frame rate capped at 120 Hz, using only 270 watts of power.
I am just curious whether 12.4 volts on the 12 V 16-pin reading is healthy.
 
Check the standby voltage and then check the voltage under load. +/- 5% is usually considered safe.
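
The +/- 5% rule of thumb is easy to check by hand, but as a small Python sketch: the 12.0 V nominal value and the 5% band are assumptions taken from the reply itself, and `rail_deviation_pct` and `rail_within_tolerance` are hypothetical helper names.

```python
def rail_deviation_pct(measured_v, nominal_v=12.0):
    """Signed deviation of a reading from the nominal rail voltage, in percent."""
    return (measured_v - nominal_v) / nominal_v * 100.0


def rail_within_tolerance(measured_v, nominal_v=12.0, tolerance=0.05):
    """True if the reading sits inside nominal +/- tolerance (assumed 5%)."""
    low = nominal_v * (1.0 - tolerance)
    high = nominal_v * (1.0 + tolerance)
    return low <= measured_v <= high


if __name__ == "__main__":
    for reading in (12.3, 12.4):  # the readings quoted in the question
        print(f"{reading} V: {rail_deviation_pct(reading):+.1f}% "
              f"within band: {rail_within_tolerance(reading)}")
```

Under that assumption the band around 12 V spans roughly 11.4 V to 12.6 V, so both the 12.3 V and 12.4 V readings would fall inside it.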
 
It’s unplanned obsolescence, not planned. If it were the latter, and the obsolescence division hadn't screwed up, all the connectors would have blown up simultaneously while Leather Jacket was taking the brand-new 5090 out of his oven.
 
Ya know? That is a great point. I have never thought to check my cables while in use for heat, especially after a few months. Next time I open my rig up I will check that.

Thanks, toooooot!
 
"Other Redditors quickly questioned whether Byogore had the connectors seated securely since "user error" was one of Nvidia's excuses when the issue arose shortly after the RTX4090 launch."

It's not an excuse when it's the only way anyone has been able to recreate the failure, this being a failure that affects less than 1% of owners of the product. If it was something inherent to the connector, in the sense that it can fail when properly seated, we would see more instances of this and someone would have been able to reproduce the failure with one plugged in. Anyone trying to call user error an "excuse" has nothing but supposition and speculation to defend their argument. Get some concrete proof.

We don't know what this user's computer has gone through, if it's been moved, if he's messed around inside the case, etc. We just have his word to go on, and people often don't want to admit they could have made a mistake or been the cause of a failure. It may not even be the end user avoiding responsibility; they may simply have forgotten something they did with the system or not realized they did something that could affect it. There are too many variables at play that rely on someone's recollections.


"The fact that the 90-degree cable mod adapter [to prevent bending] was plugged in fully to the connector and the connector still melted, then we know that we have a problem with the card. We have a problem with the card. We do not have a problem with the cable. It's not user error, it's not a cable not plugged in properly problem, it's not a cable mod problem, but it's a problem with the design and the engineering of the card."

Then reproduce the error NorthridgeFix.


"NorthridgeFix also points out that even if the problem was caused by user error, it's still Nvidia's fault."

Convenient. User abuses the product, but it's not their fault. The only difference is that there is enough power flowing through the connector for a 1mm gap to matter unlike in the past. It's still on the user since they didn't properly verify it was assembled correctly. If I don't fully tighten/torque down the bolts holding my brake caliper guide in place and the caliper malfunctions, it's not the car manufacturer's fault that my brakes failed.


"Fortunately, Asus has been very empathetic and generous to Byogore's plight."

Of course. After their complete customer service failure earlier this year, they're going to chase as much goodwill as they can.


"Nvidia has been reluctant to accept any blame for the 4090's hot-button woes."

Because no one has proven any failure mode other than user error. If you want to hold their feet to the fire over the connector, you need to reliably prove a different failure mode.


"The ongoing problem has even sucked Nvidia into a class-action lawsuit."

Waste of time. Even if there is a judgement against Nvidia, the sum will be too small to matter and only the lawyers are getting paid anything of significance.

"I'll put it less sarcastically. The only way the 12 pin connector will be safe is if they put a breaker on it for over current protection. Resistance goes up, voltage goes down but amperage(current) goes up because the card still needs the same wattage to function. Watts=Volts*Amps.

But when I was writing that something occurred to me. Why haven't we seen any failures of the 12 pin connector on the side of the powersupply?"

Realistically, they need to add a circuit trip that triggers above a certain amount of draw, but the amount also varies so much from AIB to AIB that you'd have to make it adjustable, and that just puts the failure back into the hands of the user. Or it would have to be a certain expected window, with options for PSUs that don't have the circuit breaker for OC-specific and other higher-draw boards.

I've seen one failure reported PSU side and that was an ATX 3.0 I think, though I'm having trouble finding the article at the moment.
 
Connectors are meant to eliminate user error. It's why you don't wire your lamp directly into an electrical socket.

If user error is so common, then why isn't the 8-pin connector having as many incidents of "user error"?
 