PCI Express Bandwidth Test: PCIe 4.0 vs. PCIe 3.0 Gaming Performance & Limited VRAM Memory...

We have to see. The cheapest GTX 1650 on Newegg is $324 right now, but it performs a good bit below an RX 570.

So in the end what matters is a) street price vs. performance and b) what you want to use the card for. We'll see very shortly.
GTX 1650 4GB GDDR6 cards have appeared on Amazon quite a few times over the past few months. I even managed to snag an Asus TUF Gaming OC model at around 220 USD before taxes. There are cheaper models with no extra 6-pin power connector going for close to 200 bucks.
I expect the RX 6500 XT to run faster than the GTX 1650 GDDR6, but that assumes the PCI-E bottleneck doesn't come into play. It feels like AMD's engineers either thought the Infinity Cache was truly infinite, or mistook 16 MB for 16 GB, when they decided to gimp the PCI-E interface to x4. To me, the problem is not so much the x4 link itself, but the fact that you need at least PCI-E 4.0 x4 for this card to perform reasonably well. And yet this is meant to be a budget card, which means most buyers will be dropping it into a PCI-E 3.0 system. It is funny that AMD seems to be on top of backward compatibility for their CPUs, yet shows little consideration for backward compatibility on GPUs with this sort of cutback. Sure, it will run, but with a significant performance hit.
 
I find it amusing that people genuinely believe AMD was trying to do gamers a favour by cutting costs and only bundling 4GB of VRAM with the 6500 XT. The execs at AMD must be laughing like pirates!

It feels to me that AMD is trying to capitalise on the current GPU shortage to see how much cost they can cut before people start rejecting their products. They will certainly be able to sell these GPUs to OEMs, which are in dire need of GPUs, but it's very negative publicity among tech-savvy people. And worst of all, just like Nvidia, they like to paint a beautiful picture of themselves as knights in shining armour here to "save gamers". In actual fact, it is always their bottom line they are concerned about. They are profit-seeking companies, after all, but they could save their money by not marketing the pretence. In this case, it is like saying, "Let me find gamers some GPUs by making them so bad that miners won't want them." Ironically, miners are getting the better cards.
 
The reason for this release is fairly simple: OEMs and system builders. They won't be packing a 2000-series CPU in there with only PCI-E 3.0, but a more modest 3000- or even 5000-series (AMD), or an Intel equivalent, with PCI-E 4.0.

 
I love AMD and Lisa Su (long-time PC builder and shareholder here), but I'm shocked she let this go through. And who is that woman in the picture on stage? Is she new to AMD? If she was hired because of diversity and that stupid lib crap, then I understand the stupid reasoning behind this failure of a product.

Come on INTEL, give us something! No matter what INTEL makes, they really can't fail in this GPU market, lol! Just wow, with how politics are going in America and now this, THIS HAS TO BE A BAD DREAM.
 
I wish reviewers would recall, when covering this debacle, that AMD chose to prematurely discontinue driver updates for Fiji cards during the chip shortage. I am the not-so-proud owner of a Fury X that should be more performant but has been locked out of contention since July.

Those parts are faster than these terrible 4GB things being unleashed onto the market, as if AMD's management is just as crass and duplicitous as it was with the FX 9000 series CPUs.

Tech enthusiasts should never forget how much those cost and how they were sold on nonsense. Motherboards that caught fire were only part of that joy. The famous overclocker 'The Stilt' said that GF would have had to send the chips that became the 9000 series to the crusher for having leakage that exceeded even the low-expectations specification of first-generation Piledriver. Instead, AMD created a new, much worse spec for a CPU that cost something like $600 in 'back then' money, and companies like ASRock gave us motherboards that caught fire.

The 'Polaris Forever' strategy was bad enough (a device used to inflate prices with Nvidia and consoles) but now they're actually releasing parts weaker than weakened Polaris? Oh, the inhumanity of corporate love.
 
In my opinion, there is nothing wrong with 4GB VRAM cards, especially if the price is right.
Large gaming companies have a special relationship with chip makers like Nvidia and AMD. It is in their interest to inflate VRAM requirements. Is it accidental that some of the games tested require just over 4GB—just enough to cause problems?

A good developer would be sure to make it possible to have medium settings fit into 4GB. But money talks.

It talks enough for AMD to release something as outrageous as these things, things that don't even take advantage of old PCI-e 3... things worse than Polaris, which wasn't impressive when it debuted.

The GPU situation for gamers can be summed with one word: facepalm.
 
AMD taking another dump on consumers in order to save a few dollars on manufacturing. Everyone who ends up with a 6500 XT is going to hate Radeon/AMD when they get it home and get hit with dire performance alongside the usual Radeon driver issues. AMD fans should be upset that this abomination is being unleashed on the unsuspecting public, inevitably tarnishing their own brand.
 
What we learned: do not ever buy a 4GB card. I honestly can't believe it's a thing in 2022 on a card with an RRP of US$200. Even worse, the 64-bit bus of the 6500 XT is a sad, pathetic joke IMO. I'm amazed the 6400 doesn't have a 32-bit bus, 2GB, and 4MB of Infinity Cache.

No, what we've learned is that gimping a card by reducing its bus bandwidth will cripple a 4GB card. Give a 4GB card a full PCIe 3.0 x16 slot's bandwidth and it's still quite viable for the majority of games, as long as you're willing to dial the visuals back, of course.

This is simply the time-honoured tradition of gimping entry-level cards to the point where they're useless for gaming, and it has been going on since both Nvidia and AMD started producing different tiers for each generation. Usually it was done by giving the entry card a ridiculously narrow memory bus, making it impossible to really utilise the card's VRAM. I guess they couldn't do that this time, so they limited the PCIe link instead.

So IMHO it's not so much a reason to avoid cards with only 4GB of VRAM as it is a reason why gamers should always avoid the lowest-tier cards. Again, IMHO, those cards are meant more for OEM builders, so they can say their systems come with a genuine AMD/Nvidia graphics card, than anything else...
 
They could at least have made the connector foot shorter so that it would fit in a wider variety of PCI Express slots. :laughing:
 
I'm wondering how much influence Infinity Cache has on this. Theoretically, it should reduce data transfer over the PCI-e bus, so the performance drop should be smaller.

Additionally, there might be an unusually large difference between Resizable BAR enabled and disabled for this card. You should test that too.
This was entirely wrong, apparently xD
 
"It’s been widely reported that the 6500 XT is restricted to PCI Express 4.0 x4 bandwidth" This statement in the review implies 8GB/sec bandwidth restriction not a lane restriction which is very different.

This review is much to do about nothing. Chicken little and the sky is falling when its not.

PCIe 3.0 x4 = 4 GB/s, which, yes, can bottleneck.

But no one runs their cards at that speed, because virtually all motherboards since 2012 or so have had PCIe 2.0 x16 (8 GB/s) or PCIe 3.0 x16 (16 GB/s).

They are comparing 4 GB/s to 8 GB/s, and yes, with a 4GB card the slower 4 GB/s transfer rate can bottleneck if you run out of VRAM. Nothing new here, folks.

Let's say you have a B450 Pro4 motherboard (their example of horror), as it's PCIe 3.0 only.
That board has two PCIe 3.0 x16 slots, so each slot does 16 GB/s, not 4 GB/s. Why would you run your card at 4 GB/s when the motherboard supports 16 GB/s?

The 6500 XT is restricted to PCI Express 4.0 x4 bandwidth (8 GB/s). In their testing they did not use 8 GB/s via PCIe 3.0 x8; they used PCIe 3.0 x4 (4 GB/s). Why would they do this?

Anyone who has an older 2012-era PCIe 3.0 motherboard can do PCIe 3.0 x8 (8 GB/s) and even PCIe 3.0 x16 (16 GB/s).

The real test should have been PCIe 4.0 x4 (8 GB/s) vs PCIe 3.0 x8 (8 GB/s); then you have apples to apples.

Again, those older motherboards can all do PCIe 3.0 x8 and PCIe 3.0 x16, so why test at PCIe 3.0 x4?

The new AMD card is limited to 8 GB/s over the PCIe slot. Almost any motherboard made from 2012 onward can do that.

The Sky is not falling. Unless..........

Their initial statement is in error, and rather than a bandwidth limitation it's a physical lane restriction. That is far different.
If the card only has 4 physical lanes, then you are looking at a physical connection restriction, not a bandwidth one. So then you need to look at what 4 lanes delivers on various motherboards.
4 lanes:
PCIe 5.0 = 16 GB/s
PCIe 4.0 = 8 GB/s
PCIe 3.0 = 4 GB/s
PCIe 2.0 = 2 GB/s

Anything 8 GB/s and higher does not seem to hurt or bottleneck performance.
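For anyone who wants to sanity-check those round numbers, here's a minimal Python sketch (mine, not from the review) that works them out from the published per-lane transfer rates and line encodings; real-world throughput is a little lower once protocol overhead is counted.

```python
# Rough per-direction PCIe bandwidth, from the published per-lane transfer
# rates (GT/s) and line encodings (8b/10b for gens 1-2, 128b/130b from gen 3).
GT_PER_SEC = {1: 2.5, 2: 5.0, 3: 8.0, 4: 16.0, 5: 32.0}            # per lane
ENCODING   = {1: 8/10, 2: 8/10, 3: 128/130, 4: 128/130, 5: 128/130}

def pcie_bandwidth_gb_s(gen: int, lanes: int) -> float:
    """Approximate one-way bandwidth in GB/s for a PCIe generation and lane count."""
    return GT_PER_SEC[gen] * ENCODING[gen] * lanes / 8   # Gb/s -> GB/s

if __name__ == "__main__":
    for gen in (2, 3, 4, 5):
        print(f"PCIe {gen}.0 x4 ~ {pcie_bandwidth_gb_s(gen, 4):.1f} GB/s")
    # prints roughly 2.0, 3.9, 7.9 and 15.8 GB/s, matching the 2/4/8/16 list above
```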
 
Could someone explain to me why the card with less memory suffers more from reduced bus bandwidth?

I'm under the impression that the 4 GB card wouldn't be able to saturate the bus as well, because its own throughput would be inherently lower. I would expect a higher-performing card to suffer from reduced bandwidth sooner.
 
The reasons for this will vary from game to game, but if we take Shadow of the Tomb Raider as an example, we can get a good idea of what's going on.

For each frame presented on the monitor, there are an awful lot of separate processes that have to take place. Let's assume all of the required assets (vertex buffers, index buffers, and texture buffers) from the frame are already loaded into the local memory - the graphics card's RAM. SotTR doesn't do much in the way of asset streaming, which is why this game is a good example to use here.

To render the frame, the GPU first draws out the scene's depth values, at the monitor's resolution, into a buffer that's stored in the RAM. The scene is then rendered again, generating two textures in the process: one containing the scene's normal map and a shadow mask, the other a velocity map. Both of these are also stored in the RAM, but the important thing to note here is that they are larger than the monitor resolution.

Then the primary lighting and shadowing are done, all of which gets rendered into another set of buffers, again stored in the RAM (the shadow map buffer is especially large). After that, the scene is rendered for the third time, generating three additional textures (HDR, albedo + roughness, normal + metallic). A fourth rendering pass follows to generate the SSAO (screen space ambient occlusion), and yet more buffers are required for this.

A fifth rendering pass takes place after the SSAO pass to produce the screen space reflection texture (stored in RAM), before the final pass does the post-processing (blur, bloom, DoF, TAA, tone mapping, etc.) and the UI. All of this produces a final buffer that's actually HDR, so if the monitor is SDR, another buffer is required to map the HDR output into.

So, TL;DR version: the available space in the RAM is under serious demand during all of the rendering needed to create a single frame. The buffers generated during the passes can't be stored in system RAM (well, they can, but no developer would ever let this happen), so if there isn't sufficient room, then assets need to be swapped out. They're not sent back anywhere, as they're always kept in system RAM; it's simply a case of that video RAM being flagged as available, and if an asset is required and it's not in local memory, another asset may need to be reallocated and the required one copied over.

This is why low-RAM cards suffer the most: once assets start flying about during gameplay, the PCIe interface gets hit.
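To put some rough numbers on the above, here's a small illustrative Python sketch; the buffer list, pixel formats, and the 1.25x scale factor are assumptions invented for the example, not SotTR's actual allocations, but they show how quickly per-frame render targets add up before a single texture or mesh is counted.

```python
# Illustrative VRAM footprint of per-frame render targets (assumed formats).
def target_mib(width, height, bytes_per_pixel, scale=1.0):
    """Size in MiB of a render target at (width*scale) x (height*scale)."""
    return (width * scale) * (height * scale) * bytes_per_pixel / (1024 ** 2)

W, H = 2560, 1440  # hypothetical monitor resolution

targets = {
    "depth (32-bit)":              target_mib(W, H, 4),
    "normals + shadow mask":       target_mib(W, H, 8, scale=1.25),  # larger than screen
    "velocity":                    target_mib(W, H, 8, scale=1.25),
    "shadow map (4096^2, 32-bit)": 4096 * 4096 * 4 / (1024 ** 2),
    "HDR colour (RGBA16F)":        target_mib(W, H, 8),
    "albedo + roughness":          target_mib(W, H, 4),
    "normal + metallic":           target_mib(W, H, 4),
    "SSAO (8-bit)":                target_mib(W, H, 1),
    "SSR (RGBA16F)":               target_mib(W, H, 8),
    "post-process / SDR output":   target_mib(W, H, 4),
}

for name, mib in targets.items():
    print(f"{name:28s} {mib:7.1f} MiB")
print(f"{'total':28s} {sum(targets.values()):7.1f} MiB  (before any textures or geometry)")
```

Even with these made-up formats, the per-frame targets alone run into the hundreds of MiB at 1440p, which is why a 4GB card has so little headroom left for actual assets.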
 
Wow! Thanks for taking the time to write such a detailed reply. I'm still not sure I'm totally clear. Is this right: Because there's less ram on the video card, more data has to be sent back and forth to system memory, and that eats up bandwidth? Like, the video card runs out of memory and has to send stuff to system memory over the PCIe bus to free up space to continue working on the scene?

Also, I just have to say, I've dabbled in understanding the 3D rendering process over the years, each time I end up cross-eyed :) Kudos to all the smart folks who come up with this stuff. You make the rest of us look like chimps :D
 
Is this right: Because there's less ram on the video card, more data has to be sent back and forth to system memory, and that eats up bandwidth?
Yes, that’s exactly what’s going on. Games with complex rendering can really struggle with low amounts of VRAM, but with clever asset management (e.g. Doom Eternal) the problem can be greatly reduced.
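A quick back-of-the-envelope sketch of why that swapping hurts; the swap sizes and the 60 fps frame budget here are purely illustrative assumptions, not measurements from the article.

```python
# Time to move a chunk of assets over the PCIe link vs a 60 fps frame budget.
FRAME_BUDGET_MS = 1000 / 60                      # ~16.7 ms per frame at 60 fps

def transfer_ms(megabytes: float, link_gb_s: float) -> float:
    """Milliseconds to copy `megabytes` over a link of `link_gb_s` GB/s."""
    return megabytes / (link_gb_s * 1000) * 1000

for mb in (64, 128, 256):                        # texture data swapped in one frame
    t_gen3 = transfer_ms(mb, 4)                  # PCIe 3.0 x4, ~4 GB/s
    t_gen4 = transfer_ms(mb, 8)                  # PCIe 4.0 x4, ~8 GB/s
    print(f"{mb:3d} MB swap: {t_gen3:5.1f} ms on 3.0 x4 vs {t_gen4:4.1f} ms on 4.0 x4 "
          f"(frame budget {FRAME_BUDGET_MS:.1f} ms)")
```

Even a modest 64 MB swap eats an entire frame's budget on a 4 GB/s link, which is roughly why the frame-time spikes appear once VRAM overflows.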
 
I know this article is almost two years old, but I found it to be very informative still today.

I have a secondary machine running a Ryzen 3 3100 (Matisse 4C/8T) with the aforementioned GPU, Radeon RX 6500 XT (4GB, PCIe v4.0 x4), and was looking for a little CPU upgrade on a $100 budget as this machine has been doing more video encoding than previously. I almost pulled the trigger on the Ryzen 5 5500 (Cezanne 6C/12T) after just comparing the static benchmarks: single thread ratings of 2422 vs 3061, and CPU Marks of 11633 vs 19532, respectively. On paper, it seemed like a decent CPU upgrade.

Then I realized that the 5500 (and all budget Cezannes) only supports PCIe v3.0. This article does a great job of illustrating the video performance hit my machine would likely take by downgrading from PCIe v4.0 to PCIe v3.0, especially with only 4GB of GDDR6 VRAM. Although this machine does a lot of unattended video encoding, I might play a vintage game of CS:S, L4D, etc., while waiting for a job to render. So, video performance on this machine does matter to me a little.

I'm going to run my own benchmarks by limiting the PCIe link to v3.0 in the BIOS, as Steven mentions in the article, though I have no reason to believe I'll see anything vastly different from the benchmarks above. Then I'll decide if the 50% price difference warrants stepping up to the Ryzen 5 5600 (Vermeer).
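If it helps anyone else checking the same thing: on a Linux box you can read the negotiated link straight out of sysfs. The snippet below is a rough sketch; the card0 path may be card1 on some systems, and on Windows GPU-Z's Bus Interface field reports similar information.

```python
# Read the GPU's negotiated PCIe link from sysfs (Linux only). Adjust the
# card index if your GPU shows up as card1/card2.
from pathlib import Path

dev = Path("/sys/class/drm/card0/device")

for attr in ("current_link_speed", "max_link_speed",
             "current_link_width", "max_link_width"):
    node = dev / attr
    if node.exists():
        print(f"{attr}: {node.read_text().strip()}")

# For a 6500 XT, current_link_width should read 4; current_link_speed shows
# the negotiated rate (8 GT/s on a PCIe 3.0 board, 16 GT/s on PCIe 4.0).
```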
 
The Sky is not falling. Unless..........

Their initial statement is in error, and rather than a bandwidth limitation it's a physical lane restriction. That is far different.
If the card only has 4 physical lanes, then you are looking at a physical connection restriction, not a bandwidth one. So then you need to look at what 4 lanes delivers on various motherboards.
4 lanes:
PCIe 5.0 = 16 GB/s
PCIe 4.0 = 8 GB/s
PCIe 3.0 = 4 GB/s
PCIe 2.0 = 2 GB/s

Anything 8 GB/s and higher does not seem to hurt or bottleneck performance.
I have the Radeon RX 6500 XT and this is the case with this card. It is x16 form factor, but x4 electrical. So, yep, only 4 lanes or, as you point out, about 8 GB/s on PCIe v4.0.

As you also mention, 8 GB/s and above should not be a bottleneck, but dropping to PCIe v3.0 (around 4 GB/s) puts you below that threshold, which could have you trading quality settings for gameplay frame rates.
 
Agree, Maggot! It makes it confusing for a person with an older PCIe v3.0 motherboard, not realizing they will not get full bandwidth. It's not terrible, but there is a difference in gameplay of around 10 fps, give or take, which, like you said, might cause one to drop a setting or two.
Take a look at this video comparing them in games.
 