Enthusiast proves AMD's RX 7900 XTX can reach RTX 4090-like performance levels

That's a 4090 that's had its BIOS flashed to one with a 500W power limit -- standard 4090s are capped at 450W, before throttling kicks in. TechPowerUp's power analysis of the Founders Edition version shows how such a card behaves.
Some BIOS will let you set power limits above 100%. The point is the card is being artificially throttled to 450W. It can draw more power, and average power doesn't take into account any spikes that may occur, as we have seen with GPUs in the past.
The 3090 Ti has a 450W TDP, so the jump isn't as large as it might seem at face value.
No question that the 40 series is more power efficient than the 30 series, but power requirements are increasing at the GPU, CPU, and system level. Less than five years ago I was running a 450W power supply for a top-end system. Now I'm looking at a minimum of 750W, and 1000W to ensure I have enough for next-gen devices. Since the 16 Series, released in 2019, we have seen high-end GPU power go from 120W to 450W. That's nearly a fourfold increase in power consumption, and we've seen almost exactly a fourfold increase in performance. So it doesn't seem that efficiency has really improved that much.

1660Ti versus 4090

Average 1080p Performance: 365.0 FPS (RTX 4090) vs 96.3 FPS (GTX 1660 Ti)
Average 1440p Performance: 279.3 FPS vs 71.1 FPS
(Ultrawide) Average 1440p Performance: 243.5 FPS vs 60.8 FPS
Average 4K Performance: 176.2 FPS vs 42.5 FPS
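If you want to sanity-check that claim, here's a rough back-of-the-envelope sketch in Python using the figures above and the two cards' rated TDPs (120W for the 1660 Ti, 450W for the 4090 -- rated board power, not measured draw):

[CODE]
# Rough check of the "roughly 4x the power for roughly 4x the performance" point,
# using the average FPS figures above and the rated TDPs of the two cards.

fps_4090   = {"1080p": 365.0, "1440p": 279.3, "1440p UW": 243.5, "4K": 176.2}
fps_1660ti = {"1080p": 96.3,  "1440p": 71.1,  "1440p UW": 60.8,  "4K": 42.5}

power_ratio = 450 / 120  # ~3.75x the rated board power

for res in fps_4090:
    perf_ratio = fps_4090[res] / fps_1660ti[res]
    print(f"{res:>8}: {perf_ratio:.2f}x the performance for {power_ratio:.2f}x the power")
[/CODE]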
 

1660Ti versus 4090

[TABLE]
[TR]
[TH]Average 1080p Performance[/TH]
[TD]365.0 FPS[/TD]
[TD]96.3 FPS[/TD]
[TD][/TD]
[/TR]
[TR]
[TH]Average 1440p Performance[/TH]
[TD]279.3 FPS[/TD]
[TD]71.1 FPS[/TD]
[TD][/TD]
[/TR]
[TR]
[TH](Ultrawide) Average 1440p Performance[/TH]
[TD]243.5 FPS[/TD]
[TD]60.8 FPS[/TD]
[TD][/TD]
[/TR]
[TR]
[TH]Average 4K Performance[/TH]
[TD]176.2 FPS[/TD]
[TD]42.5 FPS[/TD]
[TD][/TD]
[/TR]
[/TABLE]
OK why didn't my table display properly? When editing it looked fine.
 
Some BIOS will let you set power limits above 100%. The point is the card is being artificially throttled to 450W. It can draw more power, and average power doesn't take into account any spikes that may occur, as we have seen with GPUs in the past.
All cards are 'artificially' limited and could easily consume more power without a limit. Not sure why this is relevant, though.

Since the 16 Series, released in 2019, we have seen high-end GPU power go from 120W to 450W.
The 16 series isn't high-end, though. If one is going to make a comparison against the 4090's performance-per-watt, it would surely make sense to pick something like the 2080 Ti -- that has a TDP of 250W. If one uses TechPowerUp's analysis again, in the gaming test (a run of CP2077 at 1440p with Ultra settings and no RT), the 4090 averaged 346W and the 2080 Ti was 265W.

The performance figures for that test were 134 fps for the 4090 and 57 fps for the 2080 Ti. That's an fps/W ratio of 0.39 for the 4090 and 0.22 for the 2080 Ti, so the 'efficiency' has improved by roughly 77%.
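If one wants to check the arithmetic, it's simply this (a minimal sketch; the raw figures actually give closer to 80%, and the 77% comes from rounding the two ratios to two decimal places first):

[CODE]
# Efficiency comparison from the TechPowerUp figures quoted above
# (CP2077, 1440p Ultra, no RT -- average power and average frame rate).
cards = {
    "RTX 4090":    {"fps": 134, "watts": 346},
    "RTX 2080 Ti": {"fps": 57,  "watts": 265},
}

eff = {name: d["fps"] / d["watts"] for name, d in cards.items()}
for name, value in eff.items():
    print(f"{name}: {value:.2f} fps/W")

gain = eff["RTX 4090"] / eff["RTX 2080 Ti"] - 1
print(f"Improvement: {gain:.0%}")  # ~80% from raw values; 0.39/0.22 rounds to ~77%
[/CODE]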

Oh, and tables aren't supported in the forum.
 
Well, the 1660 was the high end of the time. I'm trying to show a historical perspective, not a single point in time. However, I did look up the 2080Ti versus 4090. You have about a 2:1 performance increase at a power increase of about 1.73, so a small gain. But that only means the slope of the increase has declined; it's still increasing. Let's see what AMD has to do to get 4090 performance. I'll hazard a guess it will take more power than the 4090.
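To put a rough number on 'a small gain': taking those two ratios at face value, the implied performance-per-watt improvement works out to around 16%:

[CODE]
# Using the ratios stated above (a TDP-style 2080 Ti vs 4090 comparison):
perf_ratio  = 2.0   # ~2:1 performance increase
power_ratio = 1.73  # ~1.73x power increase

efficiency_gain = perf_ratio / power_ratio - 1
print(f"Implied perf/W improvement: {efficiency_gain:.0%}")  # ~16%
[/CODE]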
 
Well, the 1660 was the high end of the time.
Can't agree with that statement. I couldn't imagine any review of it, at the time, classed it as a high-end graphics card -- mid-range to lower end of mid-range, sure, but in the Turing era, the likes of the 2070, 2070 Super, 2080, 2080 Super, etc were the high-end cards.

I'm trying to show a historical perspective, not a single point in time.
A sensible approach, but you've selected one graphics card from one period in Nvidia's history to compare the 4090's efficiency against. If one is looking for a historical perspective, then a far broader sample, comprising multiple cards, is required.

However, I did look up the 2080Ti versus 4090. You have about a 2:1 performance increase at a power increase of about 1.73, so a small gain.
Are you just looking at the stated TDPs or measured power consumption for the cards, during those specific performance tests? At the risk of repeating myself, here are the results from the direct measurement of power against performance for multiple cards:


One can throw all kinds of valid criticism at the RTX 4090, compared to anything in the Turing era, but suggesting that it only has a small gain in efficiency isn't one of them.

Let's see what AMD has to do to get 4090 performance.
In terms of raw metrics, the FP32 throughput and texture fill rate of the 4090 are 34% higher than the 7900 XTX's; the latter is only ahead in pixel fill rate, and only by a few percent. AMD's solution is simple -- the GCD simply needs to be bigger (i.e. comprise more compute units) and, if power consumption is a concern, it needs to shift fabrication onto a more efficient version of TSMC's N5 family (e.g. use N4, the same as Nvidia) or, if it can get sufficient wafers, N3.
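For reference, that 34% figure is just the usual shader count × 2 ops per clock × boost clock arithmetic; a rough sketch using the commonly listed specifications (boost clocks vary by board, so treat them as approximate):

[CODE]
# Peak FP32 throughput = shader count x 2 ops/clock (FMA) x boost clock.
def tflops(shaders, boost_ghz):
    return shaders * 2 * boost_ghz / 1000

rtx_4090   = tflops(16384, 2.52)     # AD102: 16384 CUDA cores
rx_7900xtx = tflops(6144 * 2, 2.5)   # Navi 31: 6144 SPs with dual-issue FP32

print(f"RTX 4090:    {rtx_4090:.1f} TFLOPS")
print(f"RX 7900 XTX: {rx_7900xtx:.1f} TFLOPS")
print(f"Ratio:       {rtx_4090 / rx_7900xtx:.2f}x")  # ~1.34x, i.e. ~34% higher
[/CODE]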
 
Here's an interesting article regarding 4090 power draw. As you can see from the chart, some games are pushing well above 400W, and close to 500W on average. Look at the peak power draw and you have games drawing over 600W, and in some cases over 700W. Even Tom's Hardware stated that to get that last 5% out of the 4090 you're increasing power consumption by 15-20%. Average power draw is all well and good, but a good design accounts for those peaks, otherwise your PSU could shut down.
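To illustrate why those peaks matter for PSU sizing, here's a rough sketch -- the component draws and the 20% margin below are placeholder assumptions for illustration, not measurements:

[CODE]
# Illustrative only: size the PSU against the GPU's transient peaks,
# not its average draw. All wattage figures here are placeholders.

def psu_ok(psu_watts, cpu, gpu_peak, rest_of_system, margin=0.2):
    """True if the PSU covers the worst-case draw plus a safety margin."""
    worst_case = cpu + gpu_peak + rest_of_system
    return psu_watts >= worst_case * (1 + margin)

# A GPU averaging ~450W but spiking toward ~650W, per the article above:
print(psu_ok(850,  cpu=250, gpu_peak=650, rest_of_system=75))   # False
print(psu_ok(1200, cpu=250, gpu_peak=650, rest_of_system=75))   # True
[/CODE]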

This is not to say the 4090 is a bad GPU; I'm just trying to show the correlation between power and performance, and high-end GPUs are drawing a lot of power. Nvidia seems focused on monolithic chip designs, which may be optimized but still draw significant power. AMD went the chiplet route, but they don't seem to have optimized the power draw yet.

I think to get that next level of performance we are going to see a jump in power. We saw that from the 3090 (350W) to the 4090 (450W) and I can only imagine what the 5090 is going to require. I don't think this is a sustainable path for GPUs.
The 4090 is great, isn't it? 350W. Lower if undervolted. Something you don't see every day. Good chat.
 
Can't agree with that statement. I couldn't imagine any review of it, at the time, classed it as a high-end graphics card -- mid-range to lower end of mid-range, sure, but in the Turing era, the likes of the 2070, 2070 Super, 2080, 2080 Super, etc were the high-end cards.
You are right, I got my release dates reversed and thought the 16 series came before the 20. But even with the 20 series, you see much lower power compared to the 40 series.
A sensible approach, but you've selected one graphics card from one period in Nvidia's history to compare the 4090's efficiency against. If one is looking for a historical perspective, then a far broader sample, comprising multiple cards, is required.
Well, I looked at the high-end GPU of the 20, 30, and 40 series, so not a single card, just a single model out of each series. Didn't have time to plot out every card in the family.
Are you just looking at the stated TDPs or measured power consumption for the cards, during those specific performance tests? At the risk of repeating myself, here are the results from the direct measurement of power against performance for multiple cards:
If the card is rated as being capable of drawing X watts, I would not design my PSU for a lower anticipated power draw. Also, while most people here talk about gaming performance, I also consider non-gaming performance, and I know that different apps, as well as different games, will draw more or less power. We see that in some of the benchmarks that push the 4090 up to the 450W limit and beyond, if allowed.

One can throw all kinds of valid criticism at the RTX 4090, compared to anything in the Turing era, but suggesting that it only has a small gain in efficiency isn't one of them.
I'm not criticizing the 4090, just pointing out it draws a lot of power and the trend, so far, is that more performance appears to require more power. The 3090 to 3090 Ti was a 30% jump in power, but wasn't the Ti version only about 10-15% faster at 1080p?
In terms of raw metrics, the FP32 throughput and texture fill rate of the 4090 are 34% higher than the 7900 XTX's; the latter is only ahead in pixel fill rate, and only by a few percent. AMD's solution is simple -- the GCD simply needs to be bigger (i.e. comprise more compute units) and, if power consumption is a concern, it needs to shift fabrication onto a more efficient version of TSMC's N5 family (e.g. use N4, the same as Nvidia) or, if it can get sufficient wafers, N3.
We will see what AMD comes up with. I suspect we will see higher power consumption with the 7950XTX (assuming that is a real product).
 
the trend, so far, is that more performance appears to require more power.
Which has pretty much always been the case, despite process node improvements. For example, if one compares TSMC's claims for power improvements from N16 to N4, the latter consumes 80% less power for the same circuit at the same clock. On the other hand, if one keeps the power usage the same, N4 is supposed to offer 69% higher clocks.

Naturally, GPU designers will aim for a combination of the two improvements but favor power gains because it gives one more scope for adding transistors. Faster clocks are good to have, of course, but GPUs are all about improving the amount of parallelism.

For example, the biggest N16 chip Nvidia made was the GP102 and in the form of the Titan Xp, it had a boost clock of 1.58 GHz for a TDP of 250 W. The current AD102 in the 4090 is 2.52 GHz for a TDP of 450 W. That's a clock-per-W of 6.3 MHz/W for the GP102 and 5.6 MHz/W for the AD102, but the latter contains 6.5 times more transistors, resulting in 4 times more CUDA cores, 11 times more L1 cache, and 24 times more L2 cache.
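For anyone wanting to reproduce the clock-per-watt figures:

[CODE]
# Boost clock per watt of TDP, from the figures above.
chips = {
    "GP102 (Titan Xp)": {"boost_mhz": 1580, "tdp_w": 250},
    "AD102 (RTX 4090)": {"boost_mhz": 2520, "tdp_w": 450},
}

for name, d in chips.items():
    print(f"{name}: {d['boost_mhz'] / d['tdp_w']:.1f} MHz/W")
# ~6.3 MHz/W vs ~5.6 MHz/W -- but the AD102 carries ~6.5x the transistors,
# which is where the extra power budget goes.
[/CODE]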

450 W might sound like a lot (and, well, it is!) but historically it's not awful, relative to the competition -- AMD's water-cooled Radeon Vega had a TDP of 375 W at a time when Nvidia's 1080 Ti was just 250 W.
 
The point I'm trying to make, somewhat ineffectively, is that I do not think we can continue down a path of ever increasing power requirements just to get the next increase in performance.
 
The point I'm trying to make, somewhat ineffectively, is that I do not think we can continue down a path of ever increasing power requirements just to get the next increase in performance.
No GPU engineer just wants to throw more power at their designs in order to improve performance -- it increases the amount of heat the system has to dissipate, increasing the costs of the associated cooling and power management/delivery systems. However, this has to be balanced against die fabrication costs.

AMD, Intel, and Nvidia could easily create an enormous chip, packed full of cores and cache, and topped off with huge amounts of memory bandwidth, and then set the clocks to be very conservative to keep the power levels right down. But then that would reduce wafer yields, making the whole thing less profitable. All three companies charge a small fortune for their data center GPUs, which are precisely what's been described above, but nobody in their right mind is going to pay $5k+ for a standard graphics card.

The consumer market is far more concerned over prices and performance than it is about power consumption. If you want the first two to be within the range that one desires, then the last aspect has to be accepted as the caveat.
 
snip

The consumer market is far more concerned over prices and performance than it is about power consumption. If you want the first two to be within the range that one desires, then the last aspect has to be accepted as the caveat.
I sense this is changing, perhaps more outside the US than within, but I do see people mentioning this in various forums. It also matters for DIY builders since it means either starting with a larger PSU or having to upgrade the PSU more often than in the past. Plus there is a cost element as well.
 
Well, I saw one of those "Is this too good to be true?" YouTube videos where someone saw a suspiciously cheap (like $190-200) RTX 4xxx card and decided to get one. They assumed it'd be a total scam. Well, yes and no -- it turned out they put the *mobile* GPU onto a PCIe card! He pointed out the performance was decent, and it was nice to have FAR lower power draw and heat dissipation -- he double-checked that they did not mention "mobile" anywhere in the description and were therefore lying, but suggested they should probably list it HONESTLY as what it is, and they would probably have plenty of takers who don't want the high power draw and heat dissipation of the "full fat" card.

For me, *I* was concerned about power consumption for one simple reason -- I had my previous (Sandy Bridge) system croak and got a replacement (Ivy Bridge) system for $40 (so I could move the RAM and HDDs etc. straight over), only to find there were NO extra power connectors left when I was done, so I could NOT plug in the extra power connector on the GTX 650! So I got the GTX 1650 variant that will run solely off 75W PCIe slot power (I think the GTX 1650 naturally peaks at like 80W, so a few vendors underclock it by like 1% and it stays below 75W). No complaints! Frankly the GPU is overpowered for this system; other than GravityMark I haven't had anything get the GPU above like 80% usage (games either hit the FPS/monitor refresh cap, or are CPU limited), but the FPS has been fine in every game I've run so far anyway, so OK. Well, OK, second reason: I didn't want it to sound like there's a vacuum cleaner running in my case, or to invest in water cooling... this thing has a big ol' set of fans that barely turn over (I can't hear them at all over the very quiet system fans) while keeping the GPU under 40C.

Agreed, maybe 75W isn't a hard limit, but I was surprised when I looked for something that would keep power low enough that my choices were A) GTX 1650, B)... that's about it! There are still GeForce4-era cards being sold for like $20-40 (I don't know if they are "new old stock" or if a few are actually still being produced)... clearly meant for the "my GPU/integrated GPU died and I need something to plug my monitor into" market, though!
 