Nvidia RTX 5090 reportedly has 32GB of RAM on a 512-bit bus

Daniel Sims

Rumor mill: The rumored specifications of Nvidia's upcoming next-generation flagship graphics card have shifted back and forth over the last several months. However, the latest report from a verified leaker suggests that it will feature more memory than prior estimates indicated. The new performance king might also draw an unprecedented amount of power.

Trusted tipster @kopite7kimi recently shared the basic specifications for the top two products in Nvidia's next series of desktop GPUs, codenamed Blackwell. Both appear to present substantial upgrades over their predecessors in multiple areas.

The lineup's top card, presumably called the GeForce RTX 5090, will include 32GB of GDDR7 VRAM on a 512-bit memory bus. This configuration is significantly larger and faster than that of the most powerful GPU currently available – the 4090 – which features 24GB of GDDR6X VRAM on a 384-bit bus. Earlier reports suggested that the 5090 might have 28GB on a 448-bit bus, and the specifications could shift again before launch.
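The bandwidth side of that upgrade can be sketched with a quick calculation (the 4090's 21 Gbps GDDR6X is its shipping spec; the 28 Gbps GDDR7 rate is an assumption, since final clocks are unconfirmed):

```python
def peak_bandwidth_gb_s(bus_width_bits: int, gbps_per_pin: float) -> float:
    """Peak memory bandwidth: bus width in bits divided by 8 bits-per-byte,
    multiplied by the per-pin data rate in Gbps."""
    return bus_width_bits / 8 * gbps_per_pin

# RTX 4090: 384-bit bus, GDDR6X at 21 Gbps per pin
print(peak_bandwidth_gb_s(384, 21))  # 1008.0 GB/s
# Rumored RTX 5090: 512-bit bus, GDDR7 assumed at 28 Gbps per pin
print(peak_bandwidth_gb_s(512, 28))  # 1792.0 GB/s
```

Even at the same per-pin speed as GDDR6X, the wider bus alone would be a 33 percent bandwidth jump; GDDR7's higher data rates push it well past that.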

Kopite's projected CUDA core count offers another dramatic leap, increasing to 21,760 from the 4090's 16,384. The leaker also reaffirmed his previous report that the 5090 will consume 600W. The 4090 was feared to be a 600W card before its launch but only draws 450W. When MSI's new high-end power supplies were discovered to feature two 16-pin power connectors, speculation arose that the 5090 might need both, which would necessitate a new PSU purchase for many consumers.

Meanwhile, the second Blackwell GPU, the RTX 5080, will feature 10,752 CUDA cores compared to the 4080's 9,728. As previously rumored, its TDP grows to 400W from its predecessor's 320W.

However, for some models, Blackwell might see Nvidia continue its unpopular practice of skimping on VRAM. The 5080, like the 4080, will include only 16GB of memory on a 256-bit bus, with the shift to GDDR7 being the only memory upgrade. Prior reports indicated that the RTX 5070 will also employ GDDR7 RAM, but lower-tier variants will continue using GDDR6X.

Other important details that remain under wraps include clock speeds, tensor and ray-tracing core counts, pricing, and more. Overall, the 5090 is rumored to offer a 70 percent performance improvement over the 4090.

Nvidia is expected to unveil Blackwell at CES 2025 in January. AMD might also introduce its next-generation RDNA 4 GPUs at the event, but the lineup will only feature mid-range and mainstream products, which sell more units. Meanwhile, Intel might begin shipping its upcoming Arc Battlemage series in late 2024.


 
Based on the specs and TSMC's own 4N numbers, I am guesstimating the 5080's performance is going to be about 90% of the vanilla 4090. The delta between the 4080 and 4090 is 25% on average at GPU-bound resolutions.


Update: look out for paid trolls postulating high prices for likes in the comments.
 
Can it play Crysis?
It depends on resolution, but it can probably play 20 instances of Crysis simultaneously at max settings with 100 fps at 1080p, and 4 to 6 at 4K. I believe a 4090 can play 2 or 3 instances at 4K. Oh, wait, you jest...!
 
Perhaps the 5080 is being cut down so much from the 5090 in order to comply with export restrictions? If Nvidia wants to sell it in China, the 5080 would have to be on par with or slower than the 4090D.
 
Perhaps the 5080 is being cut down so much from the 5090 in order to comply with export restrictions? If Nvidia wants to sell it in China, the 5080 would have to be on par with or slower than the 4090D.
Just a reminder: this is the same leaker who said the 5090 is going to sip 600 watts in a two-slot design 🤪!
 
I'm really sorry to say that with only 32GB of VRAM it's not going to be very useful for AI inference. For example, if we run inference on an LLM with 70 billion parameters at Q6 quantisation, it will be about 60GB in size. For every token of inference (a token is a little word like "the"), all of that data has to move through memory. So if we need to process 10 tokens per second, we'll need a memory bandwidth of around 600GB per second. Unfortunately, DDR5 tops out at around 60GB/sec, which means it's not practical to run an LLM without VRAM (it's practically impossible in system RAM).

So basically, the availability of VRAM is pretty much synonymous with the ability of people to access the information.

The A100 had 80GB of VRAM back in 2020! It is technically possible to put 512GB of VRAM on the 5090 (which has a memory bandwidth of ~1800GB/sec, enough for 10 tokens/sec on a 400B model at Q3 with a size of ~200GB), so it could run inference on models with ~400B parameters. They don't do it because they want to sell such cards at a higher price to companies that write them off as an expense and deduct them from their taxes, so they effectively get them for free. I understand why they do it, but I think we need some local intervention, because it would increase the efficiency of the economy. I don't think playing with paper (money) is very effective.

They are leaving a big strategic hole. Competitors have a great opportunity to meet this demand and boost their sales. All they have to do is put over 256GB of cheap GDDR7 VRAM on their cards. Nvidia will be selling an inferior card for $3,000 while competitors offer three times the performance for one third of the price. I think we can all agree that these cards are mostly about AI inference (for gaming, most people run at 1080p 60Hz, for which a 3060 is more than enough). This market opportunity exists because of the VRAM paradox, which is basically that every token needs the whole model to pass through memory. It's not about advancing technology; it's about the nature of the data structure, which simply needs fast VRAM in close proximity.
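The arithmetic in this comment can be sketched as a quick check (the 60GB model size and 10 tokens/sec target are the commenter's rough figures, and the streaming-weights model ignores KV-cache traffic and batching):

```python
def required_bandwidth_gb_s(model_size_gb: float, tokens_per_sec: float) -> float:
    """Lower-bound bandwidth estimate for single-stream LLM inference:
    every generated token streams the full set of weights through memory
    once, so required bandwidth = model size * token rate."""
    return model_size_gb * tokens_per_sec

# 70B-parameter model at Q6 quantisation, roughly 60 GB of weights,
# at a 10 tokens-per-second target:
needed = required_bandwidth_gb_s(60, 10)
print(needed)  # 600.0 GB/s -- an order of magnitude beyond typical DDR5
```

This is why the bottleneck is VRAM capacity plus VRAM bandwidth rather than compute: once the model fits, token rate is roughly bandwidth divided by model size.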
 
Perhaps the 5080 is being cut down so much from the 5090 in order to comply with export restrictions? If Nvidia wants to sell it in China, the 5080 would have to be on par with or slower than the 4090D.
I saw people speculating that GB202 is just two GB203 chips spliced together, either in MCM (like the RX 7900 cards) or some other form of interconnect. That would explain the huge disparity in sizes.
 
Meanwhile, AMD (and Intel for that matter) are focused on releasing their next-gen cards to compete with past-gen 4060/4070 cards. Nvidia could (though I doubt they will) drop a 5070 (or 5070 Ti) at a solid price point with ease if they want to hold the middle ground, making things even more painful for AMD and Intel than they are today. AMD and Intel don't need to match or beat the 90 series (or Titan), but they do need to be in a class with the 80 series (not the generation before) and, like it or not, be slightly better priced for a generation or two. AMD can; they just don't want to, since the interest is in selling that grade of GPU for a lot more cash as an AI card. While costly, at least Nvidia continues to provide cards for gaming and enthusiasts.
 
I think the sole purpose of the xx90 cards is about brand image, the mid-upper tier has always been the sweet spot over the decades for price vs performance.
 
Meanwhile, AMD (and Intel for that matter) are focused on releasing their next-gen cards to compete with past-gen 4060/4070 cards. Nvidia could (though I doubt they will) drop a 5070 (or 5070 Ti) at a solid price point with ease if they want to hold the middle ground, making things even more painful for AMD and Intel than they are today. AMD and Intel don't need to match or beat the 90 series (or Titan), but they do need to be in a class with the 80 series (not the generation before) and, like it or not, be slightly better priced for a generation or two. AMD can; they just don't want to, since the interest is in selling that grade of GPU for a lot more cash as an AI card. While costly, at least Nvidia continues to provide cards for gaming and enthusiasts.
The blind spot in AMD's RDNA 4 strategy of not competing in the high-end/enthusiast segment is that for every card they have sold so far, almost nine Nvidia cards are sold. With that market share, AMD will be competing not only with Intel from the bottom and AMD's own RDNA 3 parts, but also with a significant second-hand (used) market of Nvidia parts. Will gamers purchase a used 4080/Super, 4070 Ti/Super, or 4070/Super, or a new RDNA 4 GPU?
In order to gain market share they need a competitive software stack and an underlying price advantage.
Another blind spot is giving free advertising to Nvidia's flagship when showing off its gaming CPUs, like the 4090/7800X3D and soon the 5090/9800X3D. (I wonder if AMD is launching Zen 5 X3D sooner; it's rumored to line up with the 5090 🤔.)
 
Meanwhile, AMD (and Intel for that matter) are focused on releasing their next-gen cards to compete with past-gen 4060/4070 cards.

That makes no sense and is contradicted by statements from AMD, as they said they'll be competing in the midrange segments. You will notice that the 4060 is entry level and the 4070 is low midrange.

For the previous generation.

People can imagine what they want, but a reasonable option from a market perspective would be something like RDNA 1: no halo product like the 7900 XT/XTX series, but models for all the segments below that.
 
I'm really sorry to say that with only 32GB of VRAM it's not going to be very useful for AI inference. For example, if we run inference on an LLM with 70 billion parameters at Q6 quantisation, it will be about 60GB in size. For every token of inference (a token is a little word like "the"), all of that data has to move through memory. So if we need to process 10 tokens per second, we'll need a memory bandwidth of around 600GB per second. Unfortunately, DDR5 tops out at around 60GB/sec, which means it's not practical to run an LLM without VRAM (it's practically impossible in system RAM).

So basically, the availability of VRAM is pretty much synonymous with the ability of people to access the information.

The A100 had 80GB of VRAM back in 2020! It is technically possible to put 512GB of VRAM on the 5090 (which has a memory bandwidth of ~1800GB/sec, enough for 10 tokens/sec on a 400B model at Q3 with a size of ~200GB), so it could run inference on models with ~400B parameters. They don't do it because they want to sell such cards at a higher price to companies that write them off as an expense and deduct them from their taxes, so they effectively get them for free. I understand why they do it, but I think we need some local intervention, because it would increase the efficiency of the economy. I don't think playing with paper (money) is very effective.

They are leaving a big strategic hole. Competitors have a great opportunity to meet this demand and boost their sales. All they have to do is put over 256GB of cheap GDDR7 VRAM on their cards. Nvidia will be selling an inferior card for $3,000 while competitors offer three times the performance for one third of the price. I think we can all agree that these cards are mostly about AI inference (for gaming, most people run at 1080p 60Hz, for which a 3060 is more than enough). This market opportunity exists because of the VRAM paradox, which is basically that every token needs the whole model to pass through memory. It's not about advancing technology; it's about the nature of the data structure, which simply needs fast VRAM in close proximity.

The A100 is in Nvidia's Tesla series of GPUs, which are meant for enterprise. The 5090 is a consumer-grade card and serves an entirely different purpose. If you want high amounts of VRAM, buy a bunch of H200s and use NVLink to pool their memory, and stay away from the 5000 series altogether.
 
I am calling a 30-45% uplift from the 4090.

Unlike last time, Nvidia is not moving from Samsung to TSMC; they are going from TSMC to TSMC.
 
Meanwhile, AMD (and Intel for that matter) are focused on releasing their next-gen cards to compete with past-gen 4060/4070 cards. Nvidia could (though I doubt they will) drop a 5070 (or 5070 Ti) at a solid price point with ease if they want to hold the middle ground, making things even more painful for AMD and Intel than they are today. AMD and Intel don't need to match or beat the 90 series (or Titan), but they do need to be in a class with the 80 series (not the generation before) and, like it or not, be slightly better priced for a generation or two. AMD can; they just don't want to, since the interest is in selling that grade of GPU for a lot more cash as an AI card. While costly, at least Nvidia continues to provide cards for gaming and enthusiasts.
RDNA 4's main focus was the PS5 Pro, which is surprising.

The PS5 Pro is about a 4070 Super in terms of performance, so I doubt we are going to see anything stronger than a 7900 XT from RDNA 4 on desktop.

AMD cannot do a chiplet-design GPU since they need their CoWoS allocation at TSMC for the MI300 family.
 
I am calling a 30-45% uplift from the 4090.

Unlike last time, Nvidia is not moving from Samsung to TSMC; they are going from TSMC to TSMC.

Probably less than that: going from the 4080 to the 4090 was 68% more cores and 50% more bandwidth for a 25% FPS improvement. The 5090 will only get about 33% more cores and ~50% more bandwidth, so maybe...

20% faster.
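The core-count ratios behind this estimate can be checked quickly (the 4080 and 4090 figures are launch specs; the 5090 count is the rumored one from the article):

```python
cores_4080, cores_4090 = 9728, 16384
cores_5090_rumored = 21760

gain_4090 = cores_4090 / cores_4080 - 1          # 4080 -> 4090 core increase
gain_5090 = cores_5090_rumored / cores_4090 - 1  # 4090 -> rumored 5090 increase

print(f"4080 -> 4090: +{gain_4090:.0%} cores")  # +68% cores
print(f"4090 -> 5090: +{gain_5090:.0%} cores")  # +33% cores
```

If the ~25% FPS gain Ada got from 68% more cores is any guide, cores alone would point to a low-teens uplift, leaving the bandwidth, clock, and architectural gains to carry the rest.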
 
The blind spot in AMD's RDNA 4 strategy of not competing in the high-end/enthusiast segment is that for every card they have sold so far, almost nine Nvidia cards are sold. With that market share, AMD will be competing not only with Intel from the bottom and AMD's own RDNA 3 parts, but also with a significant second-hand (used) market of Nvidia parts. Will gamers purchase a used 4080/Super, 4070 Ti/Super, or 4070/Super, or a new RDNA 4 GPU?
In order to gain market share they need a competitive software stack and an underlying price advantage.
Another blind spot is giving free advertising to Nvidia's flagship when showing off its gaming CPUs, like the 4090/7800X3D and soon the 5090/9800X3D. (I wonder if AMD is launching Zen 5 X3D sooner; it's rumored to line up with the 5090 🤔.)
They already are, but Nvidia's mind share is absolutely impossible to penetrate. AMD's driver interface (formerly Catalyst, now Adrenalin) is way ahead of the Nvidia Control Panel.

As for drivers, it is just stupid fanboy gaslighting. I have been using both brands for years and there is absolutely no difference in terms of stability or functionality.

Finally, the 7900 XTX makes the whole 4070 Ti, 4080, and 4080 Super look like a joke, and it's not even a value proposition. But even there it didn't matter; people are brainwashed into buying Nvidia, so they do, even though, aside from the 4090 (whose price is worse than a disaster), the whole 4000 series is a disaster.
 
Probably less than that: going from the 4080 to the 4090 was 68% more cores and 50% more bandwidth for a 25% FPS improvement. The 5090 will only get about 33% more cores and ~50% more bandwidth, so maybe...

20% faster.
Didn't even know that... you might be right then.

But they do seem to use that full 600W power envelope...
 
Probably less than that: going from the 4080 to the 4090 was 68% more cores and 50% more bandwidth for a 25% FPS improvement. The 5090 will only get about 33% more cores and ~50% more bandwidth, so maybe...

20% faster.
Blackwell should naturally bring some IPC gains, and rumors suggest 20 to 25% clock improvements. RT deltas will probably be the biggest improvements, along with dedicated software exclusive to Blackwell, as we have seen with previous generations. I wonder why performance doesn't scale linearly with core count within the same generation 🤔?
 
This is all cool, but where are the games? I feel we don't have AAA games that require this level of power anymore. Even Nvidia's website lists only two good games, it looks like: Black Myth: Wukong from this year, and Cyberpunk is the only other worthy mention. I could just get the PS5 Pro at this point and at least enjoy the Sony exclusives at release. Ahh. Gaming has sucked post-pandemic.
 
This is all cool, but where are the games? I feel we don't have AAA games that require this level of power anymore. Even Nvidia's website lists only two good games, it looks like: Black Myth: Wukong from this year, and Cyberpunk is the only other worthy mention. I could just get the PS5 Pro at this point and at least enjoy the Sony exclusives at release. Ahh. Gaming has sucked post-pandemic.

Not much recently other than Cyberpunk (the finally-fixed version). I couldn't get into it. So repetitive. The Witcher 3 I replayed three times, so it's not the genre.

Will eventually get to Elden Ring, Dead Space, Resident Evil 4, and Red Dead 2 at some point. I wish they'd finally release Demon's Souls on PC. Might just get a Switch 2 next year. Not much draw to spending massive amounts of money on a new PC build. The only draw now is a nice 32" OLED as a second monitor. Still waiting for a proper DP 2.1 UHBR20 video card, though, before making the jump.
 
Good points, let's hope the competition takes the opportunity to finally differentiate and capture the non-elite market.

I'm really sorry to say that with only 32GB of VRAM it's not going to be very useful for AI inference. For example, if we run inference on an LLM with 70 billion parameters at Q6 quantisation, it will be about 60GB in size. For every token of inference (a token is a little word like "the"), all of that data has to move through memory. So if we need to process 10 tokens per second, we'll need a memory bandwidth of around 600GB per second. Unfortunately, DDR5 tops out at around 60GB/sec, which means it's not practical to run an LLM without VRAM (it's practically impossible in system RAM).

So basically, the availability of VRAM is pretty much synonymous with the ability of people to access the information.

The A100 had 80GB of VRAM back in 2020! It is technically possible to put 512GB of VRAM on the 5090 (which has a memory bandwidth of ~1800GB/sec, enough for 10 tokens/sec on a 400B model at Q3 with a size of ~200GB), so it could run inference on models with ~400B parameters. They don't do it because they want to sell such cards at a higher price to companies that write them off as an expense and deduct them from their taxes, so they effectively get them for free. I understand why they do it, but I think we need some local intervention, because it would increase the efficiency of the economy. I don't think playing with paper (money) is very effective.

They are leaving a big strategic hole. Competitors have a great opportunity to meet this demand and boost their sales. All they have to do is put over 256GB of cheap GDDR7 VRAM on their cards. Nvidia will be selling an inferior card for $3,000 while competitors offer three times the performance for one third of the price. I think we can all agree that these cards are mostly about AI inference (for gaming, most people run at 1080p 60Hz, for which a 3060 is more than enough). This market opportunity exists because of the VRAM paradox, which is basically that every token needs the whole model to pass through memory. It's not about advancing technology; it's about the nature of the data structure, which simply needs fast VRAM in close proximity.
 