Nvidia addresses failing GeForce RTX 2080 Ti cards

LemmingOverlrd

Why it matters: When you dole out $1,200 for a graphics card and it fails within a month, you're bound to give the vendor some grief, even if it replaces the card in record time. This is what's happening right now with failing RTX 2080 Ti Founders Edition cards sold by Nvidia, and it gives consumers yet another reason to look elsewhere or skip the series entirely.

Over the past few weeks, reports have popped up of GeForce RTX cards dying for no apparent reason. The problem revolved mostly around Founders Edition RTX 2080 Ti cards, although some AIB partner cards showed similar issues. The failure manifested as on-screen graphics artifacts just before the user's PC froze, requiring a cold boot to get it back up and running.

Several media outlets ran their own investigations into the matter (including Gamers Nexus, which has asked owners of dead RTX 2080 Tis to ship them its way), with many users pointing an accusing finger at Micron's GDDR6 memory modules as the culprit. This was further (accidentally) fueled by Nvidia, which in the meantime started shipping a new batch of RTX 2080 Ti cards using Samsung-sourced GDDR6. This was later written off as 'business as usual', as the supply of GDDR6 chips from Micron was now non-exclusive, allowing Samsung (and maybe SK Hynix, down the line) to join the party. As time went on, it became clearer that the issue related almost exclusively to Founders Edition cards, which, adding insult to injury, are the ones that carry a $200 premium over the reference pricing.

After weeks of silence from Nvidia, with users screaming on Reddit and the GeForce Forums until their faces turned red, not to mention the humiliation of spending $1,200 on a borked gaming GPU, the company has finally addressed the issue... sort of.

In a rather short and low-key forum post, Nvidia wrote the following:

"Limited test escapes from early boards caused the issues some customers have experienced with RTX 2080 Ti Founders Edition. We stand ready to help any customers who are experiencing problems. Please visit www.nvidia.com/support to chat live with the NVIDIA tech support team (or to send us an email) and we’ll take care of it."

While the statement is fairly succinct, we can read a bit into it. First of all, it's a sure bet Nvidia knows by now what is going on and that new batches will no longer have the issue; otherwise it would just be inviting a world of hurt on itself.

According to users on Reddit, their RMAs are arriving with new RTX 2080 Ti cards that work just fine. Secondly, the issue seems restricted to the RTX 2080 Ti, which points to some feature that is not shared with the 2080. Since the RTX 2080 Ti and non-Ti cards share the same memory controller and chips, it's highly unlikely this was caused by a hardware defect in those components (I have a feeling these are power issues). Third, "test escapes" is a way of saying that manufacturing missed some step of the quality assurance process and a batch of cards went untested (most likely due to machine error, as human error would end up affecting a smaller-than-reported number of cards).

We've reached out to Nvidia for comment on the situation, but so far no luck, so check in later for possible updates. If you're one of the unlucky souls who've had their RTX 2080 Ti crash on them, only to find yourself on the phone with Nvidia tech support, let us know your experience in the comments. Nvidia seems to be resolving the issue for customers without much fanfare, and these things do happen when something is rushed out the door.


 
I'm still happy with my 1080ti over here. I'm feeling even happier knowing I made the correct decision to not upgrade. It probably won't be another year or two at the earliest before I upgrade.
I'm with you. I upgraded last year and I'm not regretting it one bit.
 
I wonder if Nvidia was wondering why their yields were so high for the 2080 Ti vs expected yields from the 2080 and 2070?

Well, honestly, it seems that it was an internal error during manufacturing, like the Stage II binning process was backward. After all, it's all the same chip that the RTX Quadros are using, just cut down by having select micro components disabled by laser to make the lower-bracket products through the entire product line. They are using the same process for the 2070 and 2080, respectively. However, if the automatic indexing calibration was off by the time the assembly got to the microlaser, it would be modifying the chip as if it was faulty using the improper index number, and progressively they would get bigger and bigger. Since they would be binned with the incorrect index number for internal references, they put them to use in products that do not match their capabilities. It would work initially and pass every test unless they torture tested it over the long term, which they do not do because of manufacturing time. So I'm sure what we're seeing now is the initial runs before they caught the internal error. But if you think about it, if they had to do a recall, it's no longer just those cards; it would be every TU104 chip they manufactured until they fixed the issue. I'm compiling lists of defective batches; for sure batches 0323 and 0324 are all faulty in some way. That means they have two batches of cards that have faulty chips, minimum. So you can probably safely say batches 0322 and 0325 can be ruled defective as well, for reassurance. So at minimum we know without a doubt there are four batches of TU104 chips that need to be replaced. I'll try to cross-reference the manufacture dates of those batches with other TU104 cards from around the same date. Luckily, these cards aren't used for anything actually noteworthy yet. Imagine installing a complete Tesla server for medical and scientific research and finding out all the cards were from those batches, so it would invalidate all the research and development that was done.
 
I know this is all fresh and evolving news, but guys, if you did a little research you'd find that the unusual issue outside of regular failures was solved last week via a driver update; it was not a hardware problem at all. It was patched by Nvidia and that was that. It seemed to be an issue with certain monitors (or monitor features) and the bad driver. So yes, it could have and should have been caught in testing by Nvidia, but it was solved so quickly this is already fading into the past. This article is worded so weirdly.
 
software fixes hardware dude. driver updates are software that modifies the hardware. hardware doesn't work without software. therefore it's a hardware issue, caused by the settings in said software that controls the hardware. if you did a little research................oh you didn't.
 
Seems too many products these days ship faulty, if I started a list of tech items shipping with fairly major problems it'd be too long to post...
 
With you on this one. So glad I got the 1080Ti when I did. Sad they're not available anymore as they now look like the best performance/price buy around. Max every game even at 1440p. Just a beast of a card.
 

Not just small tech, but all modern production. For example, look at how many cars get recalled because of faulty parts (e.g., the recent worldwide recall of many different brands of cars for faulty and potentially lethal airbags).
It seems like the rush to cater to our insatiable desire to have the latest and greatest right now, fueled by corporate marketing (ad examples: "Be the first one on the street" or "Don't be left behind") and credit cards (and of course greed), is creating a culture of release it ASAP and deal with the problems later.
 