AMD confirms Radeon RX 7900 XTX temperature issue is related to cards' thermal solution

midian182

What just happened? Following weeks of complaints, AMD has finally confirmed that the high temperatures and unexpected throttling in the Radeon RX 7900 XTX reference cards are being caused by issues related to their thermal solution.

AMD's reputation has been tarnished by reports of some Radeon RX 7900 XTX reference graphics cards experiencing thermal issues, with GPU hotspot temperatures (the maximum temperature read by the sensor) reaching as high as 110C.

AMD initially said 110C is the normal junction temperature, which is why it refused some buyers' RMA requests and advised those users who are experiencing unexpected thermal throttling to contact AMD support.

A recent investigation by Roman 'der8auer' Hartung led to the overclocker placing the blame for the overheating issues on the Radeon RX 7900 XTX's vapor chamber; it's believed that some batches of the cards lack sufficient fluid levels.

Igor Wallossek of Igor'sLAB fame received an email from a system integrator stating that 4 to 6 batches of Radeon RX 7900 XTX MBA (Made By AMD) cards, which covers thousands of units, are affected by the issue.

AMD has finally given an official statement that seems to confirm the investigations. "We are working to determine the root cause of the unexpected throttling experienced by some while using the AMD Radeon RX 7900 XTX graphics cards made by AMD. Based on our observations to-date, we believe the issue relates to the thermal solution used in the AMD reference design and appears to be present in a limited number of the cards sold," the company wrote.

"We are committed to solving this issue for impacted cards. Customers experiencing this unexpected throttling should contact AMD Support (https://www.amd.com/en/support/contact-call)."

It does appear that only reference and AMD-manufactured Radeon RX 7900 XTX GPUs sold by AMD and its partners are affected. Aftermarket cards featuring custom cooler designs seem to be safe. And not every one of these MBA cards is overheating—the one we used in our review was fine.

However, an issue impacting thousands of units is not an insignificant problem. We'll have to see if AMD issues a mass recall/RMA.

The controversy comes at a bad time for AMD. Its Zen 4-powered Ryzen 7000 CPUs aren't selling as well as expected, especially in Germany, where sales of Zen 3 chips are outpacing their successor by around five to one. The reports have also taken the spotlight away from rival Nvidia's melting 16-pin 12VHPWR adapter issue, which AMD gleefully mocked in a tweet before reports of overheating Radeon RX 7900 XTX cards arrived.

 
It sounds rather serious. As in potentially way, way more people affected than the 50-odd who did not insert their 12VHPWR connectors carefully.

We probably won't see anyone trying to downplay this one.
 
These companies don't care about quality. They wanna get their expensive baubles out ASAP because they know that no matter how expensive they are, their gullible consumers will buy them immediately... then defend them so they don't feel bad about not waiting for any info on how the product actually functions.
 
How did they not catch this? I know that AMD contracts with someone to manufacture these coolers for them and they aren't directly responsible, but a little QC would go a long way.

At least the partner cards don't have this issue, but AMD has this weird following with their reference card design.
 
Luckily mine (Sapphire MBA7900XTX) appears to be fine... but I am keeping a close eye on it and have torture tested with Furmark using the same method as der8auer.

The concern I have is over resale value in a few years - even if my particular card is fine, the stigma around it will make it more difficult to sell on.
 
QC usually does a random, limited quantity test. Grab 10 cards run them up to temp, ok no problem. Let's ship them! The question is, was this a manufacturing defect (I think so) or did they not spec enough fluid in the vapor chamber in the first place? Or maybe there is something that allows the fluid to leak or evaporate out. Hard to say for sure at this point.
 
I feel so bad for them....I mean they were selling it for such a reasonable price.
 
I'm going to hazard a guess that there isn't enough fluid to match the specs. It's just water and methyl alcohol in a vapor chamber; it's not expensive, so using less wouldn't be a cost-cutting measure. It would take some extreme carelessness for an engineer to mess this up. Also, the fact that this isn't impacting all cards leads me to believe it's a manufacturing defect.

However, I would like to note that AMD likely went with the cheapest contractor and there are costs associated with going with the cheapest shop.
 
I agree it's likely a defect. It's rather odd: AMD has been making vapor chamber coolers since the 7970, so why would they screw this one up? It's not complicated tech. Leads me to believe it's chinesium rearing its ugly head again.
 
It's important that we look at the 7900 XT, too. It's not having these issues while having a nearly identical interconnect and the same temperature specs. It uses a different cooler, so the tooling used in the factory is likely to be 100% different.
 
Assembly line people aren't engineers. They're organic machines bereft of all independent thought trained to do a task, much like insects in a hive. If something was out of calibration, even marginally, such as the apparatus that fills and pressurizes the vapor chamber, it will only be noticed either at its next calibration (if that happens) or by a proper QC regimen. It seems that the boards are assembled by a 3rd party and another 3rd party makes the vapor chamber assembly. It would seem that there is a definite lack of QC going on, and at more than one vendor in the chain. The end partner that gets the parts to assemble should be the last line of defense, but alas.
 
When people want to blame AMD for everything, they surely ignore facts.

1- So far, looks like a good amount of XTX are affected, but NOT all of them.
2- So far, nobody has reported the same issue with the 7900 XT (stupid name).
3- It took 2 weeks but they are taking responsibility.

So, bad batch is the possible cause?

Maybe they are using more than one provider and one of them screwed up.
If that's the case (most likely), maybe they should use that as an excuse, since Nvidia tried that route and was forgiven right away.

Two things really suck in this situation:

1- The recall will be hard for the affected users, since they might be without a GPU for a while.

2- Media bias keeps showing that AMD deserves all hate and nvidia all forgiveness.
 
At least AMD is stepping up. This reporting is better than the clickbait titles on YouTube, i.e. it notes that aftermarket cards are probably not affected.
With CPU heatsinks, we know that pea size, broad bean size, lines, dots, or spread evenly makes no real difference.

You would think the engineers would know the critical range for these vapor tubes.
Would weighing a card pick it up? Everything else should be fixed (well, apart from the contact paste, I suppose).

It's the same reason you want a skilled person to set up your heat pump, air conditioning, etc. Otherwise there's a huge efficiency loss.
 
To me, the number of people affected does not change the seriousness of the issue. Any teething issue is undesirable, for sure. But when you have a connector that is melting, for whatever reason, that's a fire hazard. A cooler malfunction, on the other hand, is annoying, but modern chips are designed to shut down or throttle if they exceed a temperature threshold.
In any case, teething issues at launch are not uncommon, regardless of product, and they may crop up from time to time due to bad batches. It is uncommon to hear of a defect in an air cooling solution, but this case shows it is not an impossible scenario.
 
It is a bit too early to say definitely how it's going, but AMD do seem to be handling this well.

I must say, I'm disappointed that the likes of der8auer have gone so clickbaity on this. Yes, it's a serious issue and AMD should be held to account for it and pressure kept up to make sure they address it, but showing a GPU in flames, etc. is just poor journalism.
 
I agree, hard to think this is a cost-cutting measure and not just a manufacturing defect. The problem is, how do you know which cards are impacted? Are all these cards made in the same factory by the same company or are there multiple suppliers? And assuming it's a manufacturing problem, how hard is it going to be to go back and determine when the problem first occurred and whether it's still going on? I'd bet good money they've shut down the production line and are inspecting the vapor chamber assembly portion of the build.
 