For the dissatisfied customers, let’s re-analyze this with ZERO speculation. (Which is a reasonable request.)
Navi’s cores are floating-point Stream Processors (SPs), sixty-four of which make a Compute Unit (CU). There are two of those in a Workgroup Processor (WGP) and four or five of those in an Asynchronous Compute Engine (ACE) depending on the GPU’s configuration. Then there are two ACEs per Shader Engine (SE) and varying amounts of those per GPU.
The biggest Navi GPU, the Radeon RX 5700 XT, has two SEs and five WGPs per ACE. This makes for a total of forty CUs or 2560 cores.
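For anyone who wants to check that arithmetic, here is a minimal sketch that multiplies out the hierarchy described above. The function name and parameters are made up for illustration and are not from any AMD tool or document.

```python
def navi_cores(shader_engines, wgps_per_ace,
               aces_per_se=2, cus_per_wgp=2, sps_per_cu=64):
    """Return (compute_units, stream_processors) for a Navi-style layout.

    The defaults mirror the hierarchy described in the text:
    64 SPs per CU, 2 CUs per WGP, 2 ACEs per SE.
    """
    cus = shader_engines * aces_per_se * wgps_per_ace * cus_per_wgp
    return cus, cus * sps_per_cu


# Radeon RX 5700 XT: two SEs, five WGPs per ACE
cus, cores = navi_cores(shader_engines=2, wgps_per_ace=5)
print(cus, cores)  # 40 2560
```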
Each Navi core in a 5700 XT is more or less equivalent to an Nvidia CUDA core: in TechSpot’s testing, the 5700 XT is only 2% slower than the RTX 2070 Super, which has the same core count. This implies that for an AMD GPU to be ~30% faster than an RTX 2080 Ti, it would need at least 30% more cores than the Ti’s 4352, or roughly 5660. Potentially even more, if it is running at engineering-sample clock speeds.
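That estimate can be sketched in a few lines, under the simplifying assumption the paragraph relies on: performance scales linearly with core count at equal clocks. The helper name is hypothetical.

```python
import math

RTX_2080_TI_CORES = 4352
SPS_PER_CU = 64  # stream processors per Navi compute unit


def cores_needed(baseline_cores, speedup):
    """Cores required for `speedup` x the baseline, assuming linear scaling."""
    return math.ceil(baseline_cores * speedup)


target = cores_needed(RTX_2080_TI_CORES, 1.30)
print(target)                          # 5658
print(math.ceil(target / SPS_PER_CU))  # 89 compute units, rounded up
```

Real GPUs rarely scale perfectly with core count, so treat this as a floor rather than a prediction.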
The Navi architecture is obviously designed to be scalable, so let’s scale it up. To get to roughly 6000 cores, there are a couple of possible configurations. With six SEs and four WGPs per ACE, you get to 6144 cores. Or with five SEs (an unusual number) and five WGPs per ACE, you get 6400 cores.
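To see which layouts land near that target, the sketch below enumerates Navi-style configurations. The search window (5700 to 6500 cores) and the range of SE counts are assumptions chosen for illustration, not anything AMD has published.

```python
SPS_PER_CU = 64
CUS_PER_WGP = 2
ACES_PER_SE = 2


def candidate_layouts(min_cores, max_cores,
                      se_range=range(2, 9), wgp_options=(4, 5)):
    """List (shader_engines, wgps_per_ace, cores) within a core-count window."""
    layouts = []
    for se in se_range:
        for wgps in wgp_options:
            cores = se * ACES_PER_SE * wgps * CUS_PER_WGP * SPS_PER_CU
            if min_cores <= cores <= max_cores:
                layouts.append((se, wgps, cores))
    return layouts


# The window bounds are an assumption: "roughly 6000 cores"
for se, wgps, cores in candidate_layouts(5700, 6500):
    print(f"{se} SEs x {wgps} WGPs per ACE -> {cores} cores")
```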
That’s roughly two and a half times the cores the 5700 XT has. How likely is it that AMD will more than double the size of its GPUs? In the six months since the 5700 XT was released?
We can take the same approach with Nvidia. Turing puts sixty-four CUDA cores into a Streaming Multiprocessor (SM), and two of those in a Texture Processing Cluster (TPC). There are either four or six TPCs in a Graphics Processing Cluster (GPC), and varying amounts of them per GPU.
The RTX 2080 Ti’s die has six GPCs and six TPCs per GPC, for seventy-two SMs, of which sixty-eight are enabled. If Ampere has similar performance per core to Turing (but cheaper prices, better RTRT, whatever makes it marketable), then again, we need ~30% more cores. Well, just add two more GPCs and boom, 6144 cores.
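The same multiplication works for Turing. This sketch uses the hierarchy described above; the helper name is made up. Note that six GPCs of six TPCs give seventy-two SMs on the full die, while the 2080 Ti’s 4352 cores correspond to sixty-eight enabled SMs.

```python
CORES_PER_SM = 64
SMS_PER_TPC = 2


def turing_sms(gpcs, tpcs_per_gpc):
    """Number of SMs in a Turing-style layout."""
    return gpcs * tpcs_per_gpc * SMS_PER_TPC


full_die_sms = turing_sms(6, 6)            # full six-GPC die
enabled_sms = 4352 // CORES_PER_SM         # SMs active on the RTX 2080 Ti
scaled_cores = turing_sms(8, 6) * CORES_PER_SM  # hypothetical eight-GPC part

print(full_die_sms, enabled_sms, scaled_cores)  # 72 68 6144
```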
That’s just going from six to eight GPCs, which is not an outrageous leap. Particularly since Nvidia has had fifteen months to develop new GPUs. Alternatively, a new architecture coupled with the 7nm node could improve performance per core and reduce the number of cores required to reach this performance level.
So, without doing any guesswork at all, compare how likely AMD and Nvidia each are to have a GPU this powerful. Then you can factor that into your personal probability assessment of who’s more likely to be testing with an unreleased AMD APU.