Alienware vBIOS abducts CUDA cores from the RTX 3070

Molematt

What just happened? A number of Alienware m15 R5 owners have found fewer cores than expected in their RTX 3070s when checking with GPU-Z and other monitoring tools. Users claim to have fixed it by flashing the vBIOS from the R4 model, but Dell is advising against doing so until it can put together a proper fix.

Things haven't been going well for Dell's Alienware gaming notebooks of late. With the Alienware Graphics Amplifier external GPU recently killed off in favor of Thunderbolt 4 solutions, and a class-action lawsuit now being brought over the advertising of its Area 51m R1 laptops, the company could be forgiven for wanting some good press right now.

Unfortunately, users finding the GeForce RTX 3070 graphics cards in their Alienware m15 R5 gaming laptops to be distinctly below par is anything but good press.

The laptop version of the RTX 3070 should sport 5,120 CUDA cores, already cut down by 768 from the desktop model's 5,888, but a number of m15 R5 owners have seen their laptops report even fewer than that, with only 4,608 cores showing up in GPU-Z, HWiNFO, and even Nvidia's own control panel.

That's a loss of 512 CUDA cores, and a remaining total that's lower than the 4,864 of the desktop RTX 3060 Ti.

Compounding things, the RTX 3070 also seems to have only 144 of the expected 160 TMUs and tensor cores, and 36 of the expected 40 ray tracing cores, giving the appearance that a full 10% of the entire graphics chip is simply unavailable, or even outright unusable. Contrasting with those numbers, however, are the 96 ROPs reported in GPU-Z and HWiNFO: more than the expected 80, and the maximum possible on a GA104-based GPU, a count that should only be seen on the mobile RTX 3080 and desktop RTX 3070.
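For owners who want to sanity-check their own unit, the SM count the driver reports is enough to estimate the total, since all of the core figures above are just multiples of it. Below is a minimal CUDA C sketch, under the assumption that Ampere GA10x parts like GA104 carry 128 FP32 CUDA cores per SM (the filename is hypothetical): a full mobile RTX 3070 should report 40 SMs (5,120 cores), while an affected m15 R5 would report 36 (4,608).

    // check_cores.cu (hypothetical filename) -- build with: nvcc check_cores.cu -o check_cores
    #include <stdio.h>
    #include <cuda_runtime.h>

    int main(void) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
            fprintf(stderr, "No CUDA device found\n");
            return 1;
        }
        // Assumption: 128 FP32 CUDA cores per SM on Ampere GA10x parts
        // (compute capability 8.6); other architectures use other ratios.
        const int coresPerSM = 128;
        printf("GPU: %s (compute capability %d.%d)\n", prop.name, prop.major, prop.minor);
        printf("SMs reported: %d\n", prop.multiProcessorCount);
        printf("Estimated CUDA cores: %d\n", prop.multiProcessorCount * coresPerSM);
        return 0;
    }

This only estimates shader cores, but the TMU, tensor, and RT core counts scale with the same SM count, which is why all of them drop by 10% together here.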

In search of a fix, some users have taken to flashing over the vBIOS from the previous model, the m15 R4. This does appear to restore the full RTX 3070 CUDA core count, not to mention the other missing sub-components, although the abnormally high ROP count remains.

However, the user who shared screenshots of the restored core count later reported instability and hanging issues that forced him to revert to the stock vBIOS, potentially due to the higher TDP the R4's vBIOS targets. In any case, a fix like this is always risky business: flashing the laptop's vBIOS carries the danger of bricking the GPU, especially when the image comes from an entirely different unit, and the soldered nature of laptop GPUs doesn't leave much room for second chances.
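For anyone experimenting with a flash regardless of the risk, the active vBIOS version and the enforced power limit can both be read back through Nvidia's NVML library, which ships with the driver, making a before-and-after comparison straightforward. Here's a minimal sketch with the error handling pared down (the filename is hypothetical; link against -lnvidia-ml, and note the power-limit query can come back unsupported on some mobile GPUs):

    // vbios_check.c (hypothetical filename) -- build with: gcc vbios_check.c -o vbios_check -lnvidia-ml
    #include <stdio.h>
    #include <nvml.h>

    int main(void) {
        if (nvmlInit() != NVML_SUCCESS) {
            fprintf(stderr, "NVML init failed\n");
            return 1;
        }
        nvmlDevice_t dev;
        nvmlDeviceGetHandleByIndex(0, &dev);

        char vbios[NVML_DEVICE_VBIOS_VERSION_BUFFER_SIZE];
        nvmlDeviceGetVbiosVersion(dev, vbios, sizeof vbios);

        unsigned int limitMw = 0;  // NVML reports power in milliwatts
        nvmlDeviceGetEnforcedPowerLimit(dev, &limitMw);

        printf("Active vBIOS: %s\n", vbios);
        printf("Enforced power limit: %.1f W\n", limitMw / 1000.0);

        nvmlShutdown();
        return 0;
    }

nvidia-smi exposes the same fields, so this is purely a convenience for logging the two configurations side by side; none of it makes the flash itself any safer.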

Dell seems to agree, telling HotHardware:

We have been made aware that an incorrect setting in Alienware’s vBIOS is limiting CUDA Cores on RTX 3070 configurations. This is an error that we are working diligently to correct as soon as possible. We’re expediting a resolution through validation and expect to have this resolved as early as mid-June. In the interim, we do not recommend using a vBios from another Alienware platform to correct this issue. We apologize for any frustration this has caused.

We'll have to see how Dell goes about fixing this issue, but with graphics TDP varying between units, and this generation's mobile GPU silicon more heavily cut down from its desktop counterparts than before, another case of laptops not quite performing as expected is unlikely to go down well with the gaming crowd.


 
Let me guess - it only affects their 10th-gen Intel and Ryzen-based models, but not the 11th-gen based laptops?
 
Did pre-release review units have a different, correct vBIOS? If so, I wonder how and when it got switched to the wrong one. If not, why didn't all the reviewers notice and ask about it? I think I saw a Dave2D review of these units. He didn't love them for other reasons, but as far as I remember he didn't mention this issue.

Either way, it's not a great look for Dell/Alienware. Either they don't test that each unit can hit its pre-defined performance target, or they accepted a qualification target that was too low for the hardware they were selling.
 
Probably affects the performance less than all the Dell malware on the machines!

I had Dell laptops all the way back to 2006. My last two laptops were JUNK. The screen/hinge broke on one, and on the other the stupid audio kept screwing up. Ditched it and bought another brand.
Yeah, the Dell crapware! First thing I would do with a new laptop was set it up, then pull the hard drive, put a label on it with the warranty expiration date, stick in a new drive, and set it up minus all the crapware.
 
You could sum up the entire article as: "Dell makes another junk product and screws it up with broken firmware."
Except it wasn't a screw-up, and it never is; that's just how Dell and companies like it roll. It was possibly a test to see whether or not they'd get any pushback, or if anyone would even notice. They will absolutely deliver the least they can get away with, as their business model, all behind a brand whose name hasn't carried any weight in years.
 
No doubt Dell tries to cost-engineer every last penny out of the build cost, but this is a strange case. None of the possible scenarios I can think of make sense to me:

1. They bought and paid for real 3070s - but if so, this plot doesn't involve engineering that cost out.

2. Nvidia was in on it, and they mutually agreed Dell would try to sell chips that didn't actually qualify as 3070s, as 3070s. But this seems too legally risky for even a stingy penny-pincher, and even if it wasn't, Dell would not be in a position to promise a vBIOS fix in two weeks.

3. They are real 3070s, but Dell realized too late that its cooling solution wouldn't hold up and disabled 10% of the cores to make it work. This feels like a possible problem for an OEM to encounter, but again, they wouldn't be able to promise a vBIOS fix in two weeks. And there are subtler ways to address that scenario anyway.

4. The theory above about a rogue engineer (or Dell intentionally) stealing cycles for crypto mining. This might actually explain the significant performance drop Gamers Nexus attributed to the pre-installed bloatware on their Dell prebuilt. Like #2, while a fun conspiracy theory, it doesn't feel like something a sane person would actually attempt in the clear light of day.

If it's not those four, I'm out of theories that don't involve some sort of mistake...
 
I had Dell laptops all the way back to 2006. My last two laptops were JUNK. The screen/hinge broke on one, and on the other the stupid audio kept screwing up. Ditched it and bought another brand.
Yeah, the Dell crapware! First thing I would do with a new laptop was set it up, then pull the hard drive, put a label on it with the warranty expiration date, stick in a new drive, and set it up minus all the crapware.
Had an Alienware 17 R4 (2016 model) die on me after 4 years (so out of warranty) with a motherboard problem that costs the price of a new laptop to fix (even on eBay I can't find a replacement for less than $700). Yet my M17x R4 (2012) is still going strong. Seems they've gotten worse over the last few years.
 
My MSI laptop is still going strong after 7 years. I don't know if their quality has changed, but I'll be in the market for a new laptop soon. I'm going to wait for prices to come down, and until there are 12- and 16-core options, which will hopefully bring down the price of the 8-core models I'll be looking at.
 
Hopefully this is an accident, and not a deliberate move. I feel the latter is possible, as an attempt to keep the GPU cooler, given how much weight and size these high-end gaming laptops have shed over the years.
 
Hopefully this is an accident, and not a deliberate move. I feel the latter is possible, as an attempt to keep the GPU cooler, given how much weight and size these high-end gaming laptops have shed over the years.
It only affects their AMD CPU-based model; the Intel ones are fine. Since this is Dell, "the best friend money can buy", I very, very, very much doubt this was an accident.
 
Dell's software, shovelware, and BIOS are all abominations. Worse than most viruses.
If you insist on buying a Dell PC/laptop (and I'd strongly recommend you don't), the first thing you should do is reinstall Windows from scratch.
 
It only affects their AMD CPU-based model; the Intel ones are fine. Since this is Dell, "the best friend money can buy", I very, very, very much doubt this was an accident.
Not arguing, but asking out of genuine curiosity: what could the motivation be? A deal with Intel to make total system performance look better on the Intel flavor than the AMD flavor? I don't put it past Intel to try to advantage itself, but I don't see how such a deal would do that if consumers don't know about the difference ahead of time (and as soon as they do find out about it, post-purchase, they're just going to demand a fix, as indeed happened here).
 
No doubt Dell tries to cost-engineer every last penny out of the build cost, but this is a strange case. None of the possible scenarios I can think of make sense to me:

1. They bought and paid for real 3070s - but if so, this plot doesn't involve engineering that cost out.

2. Nvidia was in on it, and they mutually agreed Dell would try to sell chips that didn't actually qualify as 3070s, as 3070s. But this seems too legally risky for even a stingy penny-pincher, and even if it wasn't, Dell would not be in a position to promise a vBIOS fix in two weeks.

3. They are real 3070s, but Dell realized too late that its cooling solution wouldn't hold up and disabled 10% of the cores to make it work. This feels like a possible problem for an OEM to encounter, but again, they wouldn't be able to promise a vBIOS fix in two weeks. And there are subtler ways to address that scenario anyway.

4. The theory above about a rogue engineer (or Dell intentionally) stealing cycles for crypto mining. This might actually explain the significant performance drop Gamers Nexus attributed to the pre-installed bloatware on their Dell prebuilt. Like #2, while a fun conspiracy theory, it doesn't feel like something a sane person would actually attempt in the clear light of day.

If it's not those four, I'm out of theories that don't involve some sort of mistake...
How about:
5. They intentionally wanted to make the laptop worse than it could be. Disabling GPU cores via a special vBIOS should actually involve more work than using the standard one that doesn't disable cores.

This is, after all, Dell, which seems to historically put great value on good relations with one vendor in particular.
 
OK, let's hypothesize that Intel paid Dell enough money to make this worth Dell's while. That leaves the question of how this actually helps Intel:

- Dell does not publish any of this up front, so it's not going to impact the Intel vs. AMD purchase ratio at Dell.

- Intel did not get other manufacturers to play along, so gamer/market perception (and hard data reviews) are going to correctly show this is an Alienware problem and not an AMD problem. Customers whose research or preferences led them to want AMD will just buy from another vendor.

- Consumers with no information/research were already being steered to Intel anyway.

So, bottom line for this hypothesis: Intel gets an extra expense, an extra PR risk, few to no additional units sold, and no overall change in its industry-wide rankings/reviews. Why would they bother?

I'd also ask where Nvidia stands in all this. A typical master contract would contain standard protections for Nvidia's brand and product names, so that Dell couldn't trash them by using the 3070 brand on a product that was configured to be something less. Or is Intel paying Nvidia too in this hypothesis?

I'm starting to lean towards my #3, where they found they had a heat and/or power problem until they turned off some cores. That plays into Dell's original response saying its engineers made considered platform decisions (later walked back with the "no, it's a mistake, we've got a patch coming"). It may be that the unit won't perform any better (or will even perform worse) with the extra cores turned back on, but at least it won't smell as bad to the public.

 
Hopefully this is an accident, and not a deliberate move. I feel the latter is possible, as an attempt to keep the GPU cooler, given how much weight and size these high-end gaming laptops have shed over the years.
It is much easier to lower TGP than to disable 10% of the CUDA cores.
 