Nvidia GeForce RTX 3060 8GB: Why You Should Avoid It

When it wastes die area? More cost and less availability for almost nothing, since the APUs it was meant to be paired with supported PCIe 4.0 anyway.
But those same APUs (e.g. the Ryzen 6000M series) have 8 PCI Express lanes exclusively for GPU connections (20 in total: 8x GPU, 4x chipset, 4x NVMe, 4x NVMe or SATA). The integrated GPU in those processors uses Infinity Fabric to communicate with the rest of the system, so it's not like some of those PCIe lanes are needed for that.

Admittedly, Navi 24 is very small (107 mm² in size), so AMD clearly decided to cut the number of lanes because of the restriction the die's perimeter length creates. You can see just how tight it all is in the launch promo images:

Navi_24.png


That said, the amount of die area the PCIe system takes up is tiny compared to the rest of the components. One set of 4 lanes is less than half the area of a single memory controller (the two green bars on the far bottom left of the above die).
 
Well, the problem in this case is that, as Steve pointed out, the card suffers considerably with PCI-Express 3.0 motherboards.

Preventing terrible reviews which result in low sales and a bad brand image is NEVER a waste of die area, especially when AMD's I/O silicon is made on an older (thus cheaper) process node.

The I/O silicon should have been modified for x8. Steve and Tim would both agree. Note that I'm only talking about modding the original mobile GPU for use on a discrete desktop video card. In craptops, the x4 is fine because it won't encounter an older version of PCI-e. If the card had been x8, it would've sold damn well.
It does suffer, but again, it was made for PCIe 4.0 systems from the start.

The 6500 XT was only released on desktops because there was a huge GPU shortage at the time and AMD decided to release something. The problem with that "modding" is that it would have taken at least around nine months, and releasing the 6500 XT only a couple of months ago would have been too late, since the 6600 had basically taken that spot.
Of course that die saving is pretty small. But if the GPU supports PCIe x8, you would expect the system to also use an x8 connection between it and the CPU, and that adds cost too. If you are pairing the 6500 XT with a PCIe 4.0 capable system, what difference does it make whether it's x4 or x8? I doubt it's anything more than a very minor one. Basically:

Engineer: We want to make this GPU support PCIe x8.
Marketing: But why?
E: Because we are pairing it with a CPU that supports x8.
M: Is it cheaper to make it x4?
E: Yes.
M: Does it make any performance difference if it's x4 and not x8?
E: No.
M: Then make it x4.

Just like the 6600 supports only PCIe x8, not x16. Since x8 saves on chip size and the card is a bit simpler to manufacture, while the performance difference is around 1-2%, it makes sense not to support x16.

The question remains: why should the 6500 XT have supported PCIe x8 when it was intended for use on PCIe 4.0 systems, where the performance difference between x4 and x8 would have been minimal? Making predictions like "when the chip is ready a year from now, we'll also be using it on desktops, so let's put some more stuff in here" is pretty much 🙄
 
By those specs Intel cards are the best. By far.
Indeed, but because Intel is using a SIMD8 arrangement, compared to AMD's SIMD32 and Nvidia's SIMD16/32 (depending on the data format), all that hardware only gets a chance to shine when there's a shed ton of threads, maximizing the occupancy of the shaders. And even then, issues with bandwidth scaling do their best to hobble all of that silicon. Forget drivers and whatnot, this is what Intel really needs to fix for the Battlemage architecture if they want to compete on even ground with AMD and Nvidia.
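To put rough numbers on that occupancy point, here's a back-of-the-envelope sketch; the lane count, SIMD widths and latency-hiding factor are illustrative assumptions rather than exact figures for any particular SKU:

```cpp
// Rough occupancy arithmetic for the SIMD-width point above.
// All unit counts below are illustrative assumptions, not exact SKU figures.
#include <cstdio>

int main() {
    // Assume two hypothetical GPUs with the same number of FP32 lanes.
    const int fp32_lanes = 4096;

    // Intel-style: SIMD8 execution units.
    const int simd8_width  = 8;
    const int simd8_units  = fp32_lanes / simd8_width;   // 512 units
    // AMD-style: SIMD32 units running 32-wide wavefronts.
    const int simd32_width = 32;
    const int simd32_units = fp32_lanes / simd32_width;  // 128 units

    // Just to issue one instruction everywhere per cycle, you need one resident
    // thread/wave per unit; hiding memory latency needs several per unit.
    const int latency_hiding_factor = 8;                  // assumed, for illustration

    printf("SIMD8 : %d units -> %d threads minimum, ~%d for latency hiding\n",
           simd8_units, simd8_units, simd8_units * latency_hiding_factor);
    printf("SIMD32: %d units -> %d waves minimum, ~%d for latency hiding\n",
           simd32_units, simd32_units, simd32_units * latency_hiding_factor);
    return 0;
}
```

With the same number of FP32 lanes, the SIMD8 layout needs roughly four times as many scheduler-visible threads in flight just to stay busy, which is exactly the occupancy problem described above.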
 
The thing is, the 6500 XT's GPU wasn't originally meant to be used on a discrete video card. As a laptop GPU paired with an APU, the x4 was fine, as you rightly pointed out, but that's only because they controlled the rest of the components around it. The issue came from putting it on a discrete video card, because they lose that control: it can be put into anything. As a bottom-end card, the odds of it being put on a PCI-Express 3.0 motherboard were quite high, because PCI-Express 4.0 was brand-new at the time (of the A520, B550 & X570 chipsets, only B550 and X570 support it) and people buying brand-new motherboards with Ryzen 5000-series CPUs were far more likely to be able to afford video cards that cost more than the RX 6500 XT (to this day, I can't understand what the XT is for on that card :laughing: ).

When I first read the review, I too was confused. I had to mull it over before I realised that it's the x4 that's the problem here: the card's x4 link means a PCI-Express 3.0 motherboard runs it at 3.0 x4, which is like PCI-Express 2.0 x8 or PCI-Express 1.0 x16, a bandwidth that would limit even an old Radeon HD 7870. While I'm sure the 7.9GB/s of PCI-Express 4.0 x4 was just fine, it was also pretty close to the minimum bandwidth required. Being forced down to PCI-Express 3.0 reduced that to 3.9GB/s, a bandwidth that was clearly insufficient.
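For anyone who wants to sanity-check those figures, the theoretical per-direction numbers fall out of lane count × transfer rate × line-encoding efficiency; a quick sketch (ignoring packet and protocol overhead beyond the encoding):

```cpp
// Theoretical per-direction PCIe bandwidth: transfer rate * encoding efficiency * lanes.
#include <cstdio>

double pcie_gbps(double gt_per_s, double encoding, int lanes) {
    // GT/s * payload fraction / 8 bits per byte = GB/s per lane, times lane count.
    return gt_per_s * encoding / 8.0 * lanes;
}

int main() {
    printf("PCIe 1.0 x16: %.1f GB/s\n", pcie_gbps(2.5,  8.0 / 10.0,   16)); // ~4.0
    printf("PCIe 2.0 x8 : %.1f GB/s\n", pcie_gbps(5.0,  8.0 / 10.0,    8)); // ~4.0
    printf("PCIe 3.0 x4 : %.1f GB/s\n", pcie_gbps(8.0,  128.0 / 130.0, 4)); // ~3.9
    printf("PCIe 4.0 x4 : %.1f GB/s\n", pcie_gbps(16.0, 128.0 / 130.0, 4)); // ~7.9
    printf("PCIe 4.0 x8 : %.1f GB/s\n", pcie_gbps(16.0, 128.0 / 130.0, 8)); // ~15.8
    return 0;
}
```

It also puts numbers on the earlier x4-vs-x8 argument: on PCIe 4.0 the x4 link already provides about 7.9GB/s, while dropping to 3.0 halves that to about 3.9GB/s.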

Cut that roughly 8GB/s in half and you're left with under 4GB/s. What the engineers didn't stop to ponder was "What if this card goes into a motherboard that's older than X570?", which is something they should have thought of, because that's exactly what happens when people use the AM4 platform as advertised. Most AM4 motherboards only run PCI-Express 3.0, and that is what the engineers should have accounted for but didn't. Now, if this weren't being presented as a gaming card but as the glorified display adapter that it was, it wouldn't be so bad, but the cost was high and the performance was weak on PCI-Express 4.0 and pretty much unusable with the bandwidth cut in half.
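If anyone wants to check what link their own card actually negotiated, the Linux kernel exposes it through sysfs; a minimal sketch (the device address below is just a placeholder, substitute the one lspci reports for your GPU):

```cpp
// Read the negotiated PCIe link speed and width of a device from sysfs (Linux).
#include <fstream>
#include <iostream>
#include <string>

static std::string read_attr(const std::string& path) {
    std::ifstream f(path);
    std::string value;
    std::getline(f, value);
    return f ? value : "unavailable";
}

int main() {
    // Example address only; find your GPU's with lspci.
    const std::string dev = "/sys/bus/pci/devices/0000:03:00.0";
    std::cout << "current link speed: "  << read_attr(dev + "/current_link_speed") << "\n";
    std::cout << "current link width: x" << read_attr(dev + "/current_link_width") << "\n";
    std::cout << "max link speed    : "  << read_attr(dev + "/max_link_speed")     << "\n";
    std::cout << "max link width    : x" << read_attr(dev + "/max_link_width")     << "\n";
    return 0;
}
```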

At the end of the day, if the card had supported x8 connectivity (which is still half of the standard x16), none of these problems would have existed and Steve probably would've given it a better review than he did. Not only was the card overpriced, offering inferior value compared to the RX 6600, it was also hamstrung by the decision made about its connectivity. It was just a bone-headed mistake by ATi's engineers, one that I hope they never make again.
 
You are missing the point from the beginning: the 6500 XT was never meant to be used in discrete cards. Not just "not originally meant", but never meant. It is missing almost everything when it comes to video encoding/decoding. Why? For exactly the same reason it only supports PCIe x4: it was meant to be paired with an APU that has PCIe 4.0 support AND video decoders/encoders, so putting any of those on the die is a complete waste of silicon.
Again, PCIe 4.0 x4 is enough, and that's the only situation the 6500 XT was ever designed for. That means the 6500 XT is not bad by design; it just looks bad when used in ways it was never meant to be used.
The engineers were told the 6500 XT would be paired with APUs that support PCIe 4.0, so they designed it that way. Nothing more complicated than that.
Like I pointed out, Steve had no clue the card was never meant to be a discrete solution (the x4 interface and lack of media capabilities are a clear sign of that) and totally missed the availability issues. As TechSpot later admitted, there is simply nothing better available at the price the 6500 XT goes for. Therefore the 6500 XT was much better than nothing.

AMD's engineers did basically nothing wrong. You just don't design a chip for a certain purpose and then add extra stuff just because there's some minimal possibility the chip will also have other uses. The 6500 XT is an OK chip for its intended use. I totally agree with AMD's design choices.
 

Ah, now I see why that grouping is so important.
 
Thank you TechSpot for keeping it real and bringing this to consumers' attention! I would've just thought the difference was in memory size, but this is really shady behaviour on Nvidia's part. As a shareholder, it worries me that they're resorting to these tactics to increase sales. It's a strong indication that things aren't going well at the company, and the quarterly report will probably reflect this... seems like it's time to sell Nvidia shares.
 
It's things like this that have made me really hate nVidia over the years. I personally never got fleeced by them because the last GeForce card I ever owned was a Palit 8500 GT 1GB. After that, I realised that Radeons were always superior at every price point they had a card for. Since I've never been one to buy halo products, that meant I always bought Radeons. I have however seen nVidia fleecing so many people that I actually started to hate them. This latest situation with the RTX 3060 8GB is just another chapter in the horrible history of nVidia. Check this out, I can guarantee you that it's 100% accurate based on my own experiences over time:
 
This is the first source I've seen with the 3060 8GB's L2 cache at 2MB. Can you provide an onscreen specs confirmation from your system? For example, the TechPowerUp database has all its 3060 8GB cards, both GA104 and GA106 based, at 3MB of L2 cache.
https://www.techpowerup.com/gpu-specs/geforce-rtx-3060-8-gb.c3937
You'd have to ask Steve directly and I'm not sure he'd have the time to put the 3060 8GB card into a test machine and run the relevant CUDA device enumeration to get the exact L2 cache figure.
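For reference, the enumeration itself is trivial; a minimal sketch using the CUDA runtime API (it obviously has to run on a machine with the card installed):

```cpp
// Minimal CUDA runtime query to report the memory bus width and L2 cache size
// of each installed GPU. Build with something like: nvcc l2query.cu -o l2query
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        printf("No CUDA devices found.\n");
        return 1;
    }
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop{};
        cudaGetDeviceProperties(&prop, dev);
        // l2CacheSize is reported in bytes.
        printf("%s: %d-bit bus, %.1f MB L2\n",
               prop.name, prop.memoryBusWidth, prop.l2CacheSize / (1024.0 * 1024.0));
    }
    return 0;
}
```

cudaDeviceProp reports the L2 size in bytes alongside the memory bus width, which is all that's needed to settle the 2MB-versus-3MB question.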

It's worth noting that Nvidia's document on the Ampere architecture, specifically page 11, says that "512 KB of L2 cache is paired with each 32-bit memory controller", which is done via a crossbar. The image below shows the L2 cache layout in the GA102, which I've quickly added some markings to:

GA102_L2cacheslice.jpg

(Source for original die shot).

It's possible that AIB vendors were using GA106 chips with a complete 3MB L2 cache but with two MCs disabled, but that seems unlikely. The 3080 10GB, for example, has 10 MCs and 5MB of L2 cache, whereas the 3090 has 12 MCs and 6MB of L2 cache. Another reason why it's unlikely is that if the 3060 8GB dies had 6 L2 cache slices but just 4 MCs, the controllers would need to contend with read-write operations from 6 data ports, which would cause all kinds of horrible thread stalls.
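Following the 512 KB-per-32-bit-controller pairing quoted above, the expected L2 sizes fall straight out of the bus widths, assuming no extra L2 slices are left enabled beyond the active controllers:

```cpp
// Expected Ampere L2 size if each active 32-bit memory controller carries 512 KB of L2.
#include <cstdio>

int main() {
    const int kb_per_controller = 512;
    // 3060 8GB, 3060 12GB, 3080 10GB, 3090
    const int bus_widths[] = {128, 192, 320, 384};
    for (int bits : bus_widths) {
        int controllers = bits / 32;
        printf("%3d-bit bus -> %2d MCs -> %d KB (%.0f MB) L2\n",
               bits, controllers, controllers * kb_per_controller,
               controllers * kb_per_controller / 1024.0);
    }
    return 0;
}
```

A 128-bit card lands on 2MB and a 192-bit card on 3MB, which also matches the 3080 10GB and 3090 examples above.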
 