In a recent hardware deep-dive, we took a look at how CPU cores and cache impacted gaming performance. To do that, we used three Intel Core processors and compared their core scaling performance, revealing that in today's games the performance gains you see when going from a Core i5-10600K to a Core i7-10700K or even Core i9-10900K are largely due to the increased L3 cache capacity.
This was an interesting analysis because most people who upgraded from an older-generation Intel part such as the Core i7-8700K to the newer Core i9-10900K and saw a strong performance uplift in games tend to believe it came as a result of the 67% increase in cores, but for the most part it’s actually due to the 67% increase in L3 cache -- at least that's the case for today’s most demanding games.
Coming away from that testing, many of you wanted to know how much difference the L3 cache capacity makes with just 4 cores active, wondering if the margins would be even greater. So we’ve gone back and retested a 4-core, 8-thread configuration while adding three more games to the list along with a fourth processor, the 4C/8T Core i3-10105F.
Because the Core i3 models are locked, we're unable to use a 4.5 GHz clock frequency. Instead, the 10105F ran at 4.2 GHz, which is the spec all-core frequency, so it’s going to be running at a 7% lower frequency than the K-SKU parts. The Core i3-10325 would have been more suitable for this test, but we were unable to secure one in time. Even though the Core i3 part is running at a disadvantage, it should remain an interesting addition given the much more limited 6MB L3 cache.
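The percentages quoted so far follow directly from the specifications. As a quick sanity check, here is a minimal Python sketch. The helper name `pct_change` is our own, and the core counts and cache sizes used for the 8700K (6 cores, 12MB) and 10900K (10 cores, 20MB) are the parts' standard specs rather than figures measured in this test.

```python
def pct_change(new, old):
    """Percentage change when going from `old` to `new`."""
    return (new - old) / old * 100

# Core i7-8700K -> Core i9-10900K
core_gain = pct_change(10, 6)     # physical cores
cache_gain = pct_change(20, 12)   # L3 cache in MB

# Core i3-10105F at 4.2 GHz versus the K-SKU parts locked to 4.5 GHz
clock_deficit = pct_change(4.2, 4.5)

print(f"cores: {core_gain:+.0f}%")      # roughly +67%
print(f"cache: {cache_gain:+.0f}%")     # roughly +67%
print(f"clock: {clock_deficit:+.1f}%")  # a little under -7%
```

Cores and cache grow by the same proportion between those two parts, which is why it's easy to credit the wrong one for the uplift.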
The fastest quad-core CPU that Intel has produced to date is the Core i7-7700K, or one of the higher clocked Comet Lake Core i3s, such as the Core i3-10325, both of which feature an 8MB L3 cache, or 2MB more than the 10105F. So it’s going to be interesting to see what kind of gains would be possible using a quad-core CPU with 20MB of L3 cache, something we can achieve with the 10900K by disabling half a dozen cores on that part.
To put this test together we used the Gigabyte Z590 Aorus Xtreme motherboard, clocking the three Intel K-SKU CPUs at 4.5 GHz with a 45x multiplier for the ring bus and using DDR4-3200 CL14 dual-rank, dual-channel memory with all primary, secondary and tertiary timings manually configured. The Core i3-10105F used the same spec memory.
The bulk of the testing was run with the Radeon RX 6900 XT as it’s the fastest 1080p gaming graphics card you can buy, although we've included some results with the RTX 3090 for a look at Nvidia’s overhead, which we suspect will affect the quad-core configurations.
We’ll start with Rainbow Six Siege where we previously saw the most significant performance differences between cache capacities. For example, with all CPUs locked to 6 cores, we saw a massive 18% increase from 12MB to 20MB of L3 cache.
Then with only 4 cores enabled, the margin from the 10600K to the 10900K is actually reduced to 13%, quite a bit less than the margin seen with 6 cores enabled. This gives us a hint that the larger L3 cache becomes less effective at boosting performance with fewer cores available to utilize it.
This is best illustrated by the Core i3-10105F, which was just 9% slower than the 10600K, and remember it is clocked 7% lower, so presumably at least half that margin is down to the difference in clock speed. It's still interesting to note that it was possible to improve on the 4-core Core i3's performance by 24% when using the 10900K with just 4 cores active. That’s a massive increase given both CPUs use the same Comet Lake architecture, with the only differences being a 7% clock frequency variation and the L3 cache capacity, which is over 3x larger for the i9 part.
The takeaway here is that not all cores are equal. Even if the cores are physically the same, a difference in cache capacity can make all the difference. Cramming more cache into a CPU is still beneficial with just 4 cores active, but it’s less effective than what we saw with 6 cores.
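To get a rough feel for how the Core i3's deficit splits between clock speed and cache, the sketch below normalises the result for frequency. It assumes performance scales linearly with clock speed, which is an optimistic upper bound on the clock's contribution rather than something we measured, and the function name `clock_normalised` is our own.

```python
CLOCK_RATIO = 4.2 / 4.5  # 10105F all-core clock versus the locked K-SKU clock

def clock_normalised(relative_perf, clock_ratio=CLOCK_RATIO):
    """Relative performance at matched clocks, ASSUMING perfectly linear clock scaling."""
    return relative_perf / clock_ratio

# Rainbow Six Siege, 4 cores: the Core i3 was 9% slower than the 10600K
i3_vs_10600k = 1 - 0.09
adjusted = clock_normalised(i3_vs_10600k)
residual_deficit_pct = (1 - adjusted) * 100  # what is left for the 6MB vs 12MB cache

# Cross-check of the quoted 24% figure: the 10900K was 13% faster than the 10600K
i9_over_i3_pct = (1.13 / i3_vs_10600k - 1) * 100

print(f"deficit after clock normalisation: {residual_deficit_pct:.1f}%")
print(f"4-core 10900K over Core i3: {i9_over_i3_pct:.0f}%")
```

Under that assumption only a couple of percent remains attributable to the smaller cache in the i3 versus 10600K comparison. Real games rarely scale perfectly with frequency, so the true cache contribution is likely somewhat higher.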
Next up we have Assassin’s Creed Valhalla which is a heavily GPU-bound title, especially when using the ‘Ultra High’ quality preset, even with a Radeon RX 6900 XT at 1080p. As a result the benchmark relies almost entirely on the GPU and only the Core i3 sees a small dip in 1% low performance, suggesting that frame time performance isn’t as good with the 6MB L3 cache, but not significantly worse either.
A new addition to this testing is Battlefield V. Here we can see that the 20 MB L3 cache of the 10900K is very beneficial, even when limited to 6 cores. Dialing down to 4 cores creates a CPU bottleneck that can’t be solved with more cache. 1% low performance was improved by 9% when going from 16 to 20 MB which isn’t nothing, but it’s less than the 13% gain we saw with 6 cores enabled.
Now, with only 4 cores enabled, the 10900K was comparable to the stock 10600K. In fact, the frame time performance was better, improving 1% lows by a 12% margin. When it came to 1% low performance the Core i3 struggled a little, dipping down to 71 fps, making it 19% slower than the 10600K and a massive 30% slower than the 10900K. Again that’s an incredible difference that’s almost entirely down to the difference in L3 cache capacity.
The F1 2020 results are similar to what we’ve seen so far, where the L3 cache capacity made more of a difference with 6 cores, though we do see some difference with 4 cores active. The 10700K was just 3% slower than the 10900K, while the 10600K was a further 3% slower than the 10700K. So pretty consistent scaling there. The Core i3 part though was 8% slower than the 10600K, or 11% slower if we look at the 1% low results.
We can estimate that at most ~5% of that margin could be due to the difference in clock speed, and if that's the case, scaling is still roughly in line with what we saw for the K-SKUs.
Hitman is a CPU intensive title, and we can see a 9% increase in performance when going from a 12MB to 20MB L3 cache with 6 cores enabled. That margin was reduced with just 4 cores enabled, this time to 6%. Then we saw a 9% drop in performance from the 10600K to the 10105F.
Horizon Zero Dawn is a lot like Assassin’s Creed Valhalla in the sense that it’s primarily a GPU bound title, even at 1080p with a 6900 XT. As a result, the 4-core configurations weren’t a great deal slower than what we saw with 6, 8 and 10 cores enabled. The Core i3 part also performed well relative to the higher-end Core i5, i7 and i9 models.
Cyberpunk 2077 is a very CPU intensive game and you will see a huge benefit when upgrading to a modern 6-core processor, such as the Core i5-10600K. That said, had the 10600K been armed with a 20MB L3 cache it would be 13% faster in this title when looking at the 1% low performance. That margin remained fairly consistent with just 4 cores enabled as the 10900K configuration was 12% faster than the 10600K.
Interestingly, the 4-core 10600K configuration was just 6% faster than the Core i3-10105F on average, but 19% faster when comparing the 1% low performance. This is where the Core i3 really struggled, with just 64 fps.
When looking at the Core i3 part we see a massive 56% performance disparity between the 1% low and average frame rate, whereas the 10600K sees just a 39% margin, which suggests much more consistent frame time performance.
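The disparity figure is simply how far the average frame rate sits above the 1% low. The sketch below reconstructs it from the percentages quoted above; only the 64 fps 1% low is a measured value, the other frame rates are implied by those percentages, and the helper name `disparity_pct` is our own.

```python
def disparity_pct(avg_fps, low_fps):
    """How much higher the average frame rate is than the 1% low, in percent."""
    return (avg_fps / low_fps - 1) * 100

# Core i3-10105F in Cyberpunk 2077
i3_low = 64.0              # measured 1% low
i3_avg = i3_low * 1.56     # implied by the quoted 56% disparity

# 4-core 10600K: 19% faster 1% low, 6% faster average (implied values)
k_low = i3_low * 1.19
k_avg = i3_avg * 1.06

print(f"Core i3 disparity: {disparity_pct(i3_avg, i3_low):.0f}%")
print(f"10600K disparity:  {disparity_pct(k_avg, k_low):.0f}%")
```

The 10600K figure lands at roughly 39%, so the quoted margins are consistent with one another. A smaller gap between the two numbers means fewer frames that take far longer than the typical one.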
It’s worth noting that the 10900K with just 4 cores active played very well, offering smooth and consistent performance, despite being 25% slower than the 6, 8 and 10-core configurations. The 10105F though suffered from noticeable stuttering and again that was reflected in the much weaker 1% low performance. So while a 10th-gen quad-core with a fat L3 cache can play the game just fine, it is still much slower than a 6-core equivalent, while going to 8 cores offers no improvement in average frame rate performance or frame time performance.
Next up we have Shadow of the Tomb Raider, which is another CPU-demanding game, but again, with just 4 cores enabled, the 10th-gen CPUs aren’t able to take advantage of that extra L3 cache like they are with 6 cores. It’s not until the L3 cache is cut to 6MB with the Core i3 part that performance starts to fall away, dropping the 1% low performance by 16% when compared to the 10600K.
Testing with the RTX 3090
As we were wrapping up the quad-core testing, we thought it might be interesting to re-run a few of these using the GeForce RTX 3090. With 6 cores enabled in SoTR, the 10600K was 13% slower than the 10900K using the Radeon and 15% slower with the GeForce, so not a huge change there.
Then with 4 cores enabled, the 10600K was 4% slower than the 10900K with the Radeon, while we see radically different results with the RTX 3090. Here the 10600K is 14% slower than the 10900K for the average frame rate and 13% slower for the 1% low. This is the result of the added overhead of Nvidia's driver, which uses the CPU for much of its GPU scheduling.
However, it’s the Core i3 part that's truly crippled by the software scheduling, tanking 1% low performance to just 54 fps, making it 28% slower than the 10600K, whereas it was 16% slower before. Had we used the RTX 3090 for all testing, the margins between the 4-core configurations in Hitman, Cyberpunk 2077, Battlefield V, and so on might have been much larger.
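Because "X% slower" and "X% faster" are not symmetric, it can help to convert between them. The sketch below backs out the 10600K's implied 1% low from the Core i3's 54 fps and restates the gap from the faster part's point of view. Both function names are our own, and the 10600K frame rate is derived from the quoted percentage rather than read from the charts.

```python
def reference_fps(slower_fps, pct_slower):
    """Frame rate of the faster part, given a result that is `pct_slower` percent below it."""
    return slower_fps / (1 - pct_slower / 100)

def slower_to_faster(pct_slower):
    """Restate 'A is X% slower than B' as 'B is Y% faster than A'."""
    return (1 / (1 - pct_slower / 100) - 1) * 100

# Shadow of the Tomb Raider, RTX 3090, 4 cores: Core i3 1% low of 54 fps, 28% slower
implied_10600k_low = reference_fps(54, 28)

print(f"implied 10600K 1% low: {implied_10600k_low:.0f} fps")
print(f"28% slower is the same gap as {slower_to_faster(28):.0f}% faster")
print(f"16% slower is the same gap as {slower_to_faster(16):.0f}% faster")
```

Seen from the 10600K's side, the gap grows from about 19% with the Radeon to about 39% with the GeForce.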
We also tested Watch Dogs Legion with both the Radeon and GeForce high-end GPUs. Previously, with the Radeon, we didn’t see much of a difference between the K-SKU parts: with 6 cores enabled the results were much the same, and 8 cores only offered a small boost. However, dropping to 4 cores reduces performance quite considerably. The 10600K, for example, was 17% slower with 4 cores enabled, and we saw similar margins with the 10700K and 10900K. That said, performance between the various 4-core configurations is similar and even the Core i3 part manages to hang in there.
Using the RTX 3090 doesn’t do much to the 4-core results. The 10600K was 19% slower with just 4 cores enabled when compared to its stock 6-core configuration. Unexpectedly though, the margins with 6 and 8 cores enabled are quite different and it seems that the cache plays a larger role with the RTX 3090 installed, presumably because the CPU is having to do more work.
It’s also interesting that we don’t see the same scaling with 4 cores enabled. It appears that this is simply too few cores to take advantage of the increased L3 cache capacity.
What We Learned
This was an interesting look at CPU performance but probably not what many of you were expecting. We believe the expectation was that with fewer cores, the larger L3 cache of the 10900K would play an even greater role, but for the most part that doesn’t appear to be the case.
It’s clear that for mid to high-end gaming, quad-cores are officially out. We’ve known this for some time, which is why AMD and Intel have stopped producing mid-range quad-core processors. For lower-end systems, however, quad-cores still work well, though it should be clarified that when we say quad-cores, we mean 4-core/8-thread processors that support simultaneous multi-threading.
Frame time performance can be seen suffering in demanding titles such as Battlefield V, Shadow of the Tomb Raider and Cyberpunk 2077, for example, but in most games if you’re using a budget GPU, such as the Radeon RX 5500 XT, GeForce GTX 1650 Super, or anything slower, a decent quad-core will enable an acceptable level of performance.
Frame capping to 60 fps will help smooth out frame rates in a title like Cyberpunk 2077 as it reduces CPU load as well as frame-to-frame variance. So if you’re running into stuttering issues, try capping the frame rate to something more sustainable as that can help.
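To illustrate why a cap reduces frame-to-frame variance, here is a deliberately simplified Python sketch. The frame times are made-up values, not measurements from our testing, and the model only holds fast frames back to the cap interval. It ignores the reduced CPU load, so if anything it understates the benefit.

```python
import statistics

CAP_FPS = 60
CAP_INTERVAL_MS = 1000 / CAP_FPS  # about 16.7 ms per frame

# Hypothetical uncapped frame times in milliseconds, with two slow frames mixed in
uncapped_ms = [8, 9, 25, 10, 8, 30, 9]

# A cap delays any frame that finishes early; slow frames are left unchanged
capped_ms = [max(t, CAP_INTERVAL_MS) for t in uncapped_ms]

uncapped_spread = statistics.pstdev(uncapped_ms)
capped_spread = statistics.pstdev(capped_ms)

print(f"uncapped frame time spread: {uncapped_spread:.1f} ms")
print(f"capped frame time spread:   {capped_spread:.1f} ms")
```

The average frame rate drops with the cap in place, but the frame times are packed closer together, which is what makes the game feel smoother.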