Memory Performance Tests

Okay, so why was the 2990WX so disappointing in a lot of these tests? Well, as I alluded to earlier, it comes down almost entirely to memory bandwidth, much more so than core-to-core latency or memory latency.

Here we can see the sustained memory bandwidth for each processor. You’ll notice that the 2990WX is a little down on the 2950X, and that’s due to the added latency incurred by the dies without memory controllers. It’s a 7% drop in bandwidth, but that alone doesn’t explain the performance issues we’ve seen.

Just to confirm those results I also tested with AIDA64, and here is the memory copy performance. Again the 2990WX was down 7% on the 2950X, but that doesn’t explain the miserable performance in the encoding, compression and encryption benchmarks.

For that we need to look at memory bandwidth per core, that is, what each individual core gets rather than the total for the entire processor. Arranging these results by the single-thread result, we see that with just one core active the Ryzen CPUs enjoy tremendous bandwidth.

Now please note the performance of each core within the CPU is measured individually, and the result you see here is the average bandwidth across all the individual cores. So the 2700X and 2950X, both 2nd gen parts, deliver the same 29 GB/s. Then the 1st gen Ryzen parts deliver between 24 and 25 GB/s, and then we have the 2990WX at 20 GB/s. This is why we saw a slight drop in total memory bandwidth in the previous test. The margin is amplified here, showing the 2990WX to be almost 30% slower, as we’re not limited by the DDR4 memory in this instance.

The single core bandwidth is down because 16 of the 32 cores aren’t connected directly to the memory and therefore suffer increased latency.

Finally we see that almost all the Skylake-X parts are limited to just 14 GB/s, though this is less of an issue as 14 GB/s per core is essentially overkill, and here is why.

If we rearrange this graph by the ‘all-threads active’ result, the arrangement changes quite a bit. Now for these results all CPU cores are actively accessing system memory and we’re showing the average throughput of an individual core. Essentially with the CPU running at full steam in a memory intensive workload, this is the typical amount of bandwidth each core has at its disposal.

This here is the problem. The 2950X enjoys a bandwidth of 4.4 GB/s per core when maxed out, and this is why the 14 GB/s we saw with just a single core active on the Intel CPUs isn’t an issue. Since the maximum sustained bandwidth of the Skylake-X CPUs is around 64 GB/s, with 5 cores active in an extremely memory intensive workload you’re going to use up all that bandwidth, and once you start adding more cores you start to see a drop in efficiency as they aren’t fed data fast enough.
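To make that arithmetic concrete, here is a rough sketch of the saturation point using the rounded Skylake-X figures quoted above (about 64 GB/s total, 14 GB/s for a single core). The function names and the even-split model are assumptions made purely for illustration. Real memory controllers don’t divide bandwidth this neatly, so treat the numbers as a back-of-the-envelope estimate rather than a measurement.

```python
import math


def cores_to_saturate(total_gbps: float, single_core_gbps: float) -> int:
    """Smallest core count whose combined peak demand reaches the total bandwidth."""
    return math.ceil(total_gbps / single_core_gbps)


def per_core_share(total_gbps: float, single_core_gbps: float, active_cores: int) -> float:
    """Idealised per-core bandwidth (assumption: the total is split evenly).

    Each core is capped by what a single core can pull on its own,
    and by an equal share of the total sustained bandwidth.
    """
    return min(single_core_gbps, total_gbps / active_cores)


# Rounded figures from the article text, in GB/s.
SKYLAKE_X_TOTAL = 64.0
SKYLAKE_X_SINGLE_CORE = 14.0

saturation_point = cores_to_saturate(SKYLAKE_X_TOTAL, SKYLAKE_X_SINGLE_CORE)
print(f"Cores needed to saturate memory: {saturation_point}")

# Past the saturation point every extra core just shrinks each core's share.
for cores in (1, 4, 5, 8, 16):
    share = per_core_share(SKYLAKE_X_TOTAL, SKYLAKE_X_SINGLE_CORE, cores)
    print(f"{cores:2d} active cores: {share:.1f} GB/s per core")
```

Under this simple model, anything up to four cores can run at the full 14 GB/s each, the fifth core tips the total over 64 GB/s, and by 16 active cores each one is down to about 4 GB/s.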

Naturally, the more cores you have the worse off you’re going to be in this test without increasing the overall memory bandwidth. With octa-channel memory the 2990WX would indeed be able to match the 4.4 GB/s per core of the 2950X. But with just quad-channel memory that figure is halved, well, a little more than halved due to the increased latency, so it’s a bit of a double whammy. In the end, just shy of 2 GB/s of bandwidth per core just isn’t enough, and we see the problem this causes when running memory sensitive applications like VeraCrypt, for example.

Okay, so before we move on to overclocking, power consumption and a few other tests, let’s quickly go over gaming performance.