A few months back we investigated CPU core misconceptions, explaining how overall processor performance is affected not only by how many cores a CPU has, but other factors including cache levels and capacity. This was an interesting and unique look at Intel’s 10th-gen series in an article we titled "How CPU Cores & Cache Impact Gaming Performance." Basically what we did was compare the Core i9-10900K, Core i7-10700K and Core i5-10600K at the same 4.2 GHz frequency, with the same memory, memory timings, ring bus frequency, and so on.
We then compared the three CPUs with only 6 cores / 12 threads enabled to see how much of a difference the L3 cache capacity made when it came to gaming performance. After that, we compared that data to the 10700K and 10900K with 8 cores enabled, and finally the 10900K with all of its 10 cores turned on.
Long story short, it turns out that in almost all games, it's not the core count but the L3 cache capacity that is responsible for the improved performance seen across the higher-end Intel parts. Of course, down the track the extra cores will see those higher-end parts pull even further ahead, but at least on today's games it’s all about the L3 cache.
That investigation later expanded into a quad-core version that included Core i3 models, followed by a similar take for AMD CPUs, where we looked at 10 years of AMD CPU progress, and then back to Intel for the same.
To wrap up that content, we thought we should add the new Intel Alder Lake 12th-gen CPUs to the data pool, so here we are, and it's been a more involved process than we first imagined. Whereas all other CPU architectures had one, two, or maybe three different configurations, 12th-gen Core has three per CPU.
For example, the 10th-gen CPUs had a 20 MB L3 cache with the Core i9 model, 16 MB for the i7 and 12 MB for the i5 models. Alder Lake’s cache capacity is segmented in a similar fashion: 20 MB of L3 for the Core i5, 25 MB for the i7 and 30 MB for the i9. But then on top of that we had to work out what kind of core configuration we should test. Four P-cores, four E-cores or a mixture of both? The correct answer was of course all three configurations, and that’s provided us with a wealth of juicy data to go over.
To be clear, with four P-cores enabled we were using Hyper-Threading, so this is a 4-core/8-thread configuration. Basically, SMT was enabled wherever it was supported for all test configurations. Because the E-cores don’t support SMT, the four E-core configuration was 4 cores with 4 threads, and the mixed configuration, which featured two P-cores and two E-cores, was a 4-core/6-thread configuration.
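The thread counts for the three test configurations follow directly from which cores support SMT. Here is a minimal Python sketch of that arithmetic; the function and constant names are purely illustrative and don't come from any Intel tooling.

```python
# Threads per core, as described in the text: P-cores run two threads
# with Hyper-Threading enabled, E-cores have no SMT and run one.
P_THREADS_PER_CORE = 2
E_THREADS_PER_CORE = 1


def thread_count(p_cores: int, e_cores: int) -> int:
    """Return the number of hardware threads for a given core mix."""
    return p_cores * P_THREADS_PER_CORE + e_cores * E_THREADS_PER_CORE


# The three 4-core configurations tested in this article.
configs = {
    "4 P-cores": (4, 0),
    "2 P-cores + 2 E-cores": (2, 2),
    "4 E-cores": (0, 4),
}

for name, (p, e) in configs.items():
    print(f"{name}: {p + e} cores / {thread_count(p, e)} threads")
```

The same helper also covers a full 12900K with 8 P-cores and 8 E-cores, which works out to 24 threads.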
For testing we’ve used the MSI Z690 Tomahawk Wi-Fi DDR4 as we wanted to use the same DDR4-3200 CL14 low-latency memory that was used to test all the other CPU architectures that support DDR4. In our testing, DDR5-6000 has not been shown to be any faster for gaming, but most importantly we wanted to keep the data as apples to apples as possible for this feature. Finally, all configurations were tested using the Radeon RX 6900 XT. Let’s dive into the data.
Starting with Rainbow Six Siege, there’s quite a bit to go over, so bear with me. First let’s just look at the Core i9-12900K. With four P-cores enabled and locked at 4.2 GHz, this configuration was good for 510 fps, just 3% faster than AMD’s Zen 3 architecture.
Then with two P-cores and two E-cores enabled, performance dropped by 15%, which is a fairly significant reduction. With just four E-cores enabled, performance dropped by a further 12%, which isn’t that much and not nearly the decline I was expecting. Quite shockingly, in this title four E-cores were able to match the performance of the Core i9-11900K. Granted, the 11th-gen architecture does suck in this title, but I still didn’t expect to see results like this.
When comparing the various 12th-gen processors, we see that from the 12600K to the 12700K the additional L3 cache boosts performance by 4% with just the P-cores enabled, or 7% with just the E-cores. Then from the 12700K to the 12900K we’re looking at a further 5% performance boost for the P-cores and a rather substantial 10% boost for the E-cores.
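Because each step is quoted relative to the previous CPU, the total gap between the 12600K and the 12900K comes from multiplying the steps rather than adding them. A small Python sketch using the percentages above shows this. The helper name is illustrative, and the results are derived from the rounded figures quoted here, not separately measured.

```python
def compound_gain(*step_gains_pct: float) -> float:
    """Combine successive percentage gains into one overall percentage gain."""
    factor = 1.0
    for gain in step_gains_pct:
        factor *= 1 + gain / 100
    return (factor - 1) * 100


# 12600K -> 12700K -> 12900K in Rainbow Six Siege, per the text.
p_core_total = compound_gain(4, 5)    # P-cores only
e_core_total = compound_gain(7, 10)   # E-cores only

print(f"P-cores: 12900K roughly {p_core_total:.1f}% ahead of the 12600K")
print(f"E-cores: 12900K roughly {e_core_total:.1f}% ahead of the 12600K")
```

On these numbers the extra cache is worth about twice as much to the E-cores as it is to the P-cores in this title.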
If we compare all the data we have, we see that the 12th-gen CPUs with just their E-cores enabled are comparable to Skylake as Intel claimed, at least when looking at the 12900K data. It’s also interesting to note that with two P-cores and two E-cores the 12900K was quite a bit slower than four Zen 3 cores. This suggests that a part like the 5950X will end up being much faster than the 12900K for gaming once games heavily utilize 16 cores... in like 10 years from now.
Moving on to the Battlefield V results, we gain a few interesting insights. The first is that the E-cores suck big time in this title: not only is the average frame rate almost halved compared to what we see when just using the P-cores, but the 1% low performance is shattered.
We’re looking at a 22% reduction in performance with the 12900K when going from 4 P-cores to 2 P-cores and 2 E-cores. Then we see a further 31% reduction when moving to E-cores exclusively. Worse, that means the P-cores were 87% faster when looking at the average frame rate and 170% faster when looking at the 1% low. So those efficient cores are devastatingly slow in this game, and anything but efficient.
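To see how two successive reductions turn into that "faster" figure, here is a rough Python sketch of the conversion. The function name is illustrative, and because it works from the rounded percentages quoted above, it lands within about a point of the 87% figure rather than exactly on it.

```python
def speedup_from_reductions(*reductions_pct: float) -> float:
    """Given successive percentage drops from a baseline, return how much
    faster (in percent) the baseline is than the final result."""
    remaining = 1.0
    for drop in reductions_pct:
        remaining *= 1 - drop / 100
    return (1 / remaining - 1) * 100


# Battlefield V on the 12900K: 4 P-cores -> 2P+2E is a 22% drop,
# then 2P+2E -> 4 E-cores is a further 31% drop.
p_over_e = speedup_from_reductions(22, 31)
print(f"P-cores roughly {p_over_e:.0f}% faster than E-cores")
```

The key point is that a reduction and the matching increase are not the same number: the E-core result is about 46% lower than the P-core result, yet the P-cores are well over 80% faster.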
We also see that when the E-cores are enabled, the larger L3 cache capacity of the i7 and i9 models doesn’t result in any extra performance, or at least very little in the way of extra performance. However with just the P-cores the 12700K was 6% faster than the 12600K and then the 12900K was 7% faster than the 12700K.
If we compare that data with the rest of the CPU architectures we’ve tested, there are a few noteworthy comparisons to be made. When compared to Zen 3, Alder Lake is up to 12% faster, seen when comparing the 12900K with the 5800X. That said, the smaller 20 MB cache of the 12600K meant it was 2% slower, while the 25 MB i7 was just 4% faster. So it’s that larger 30 MB L3 cache that gets the Core i9 firmly over the line.
That said, if we were to force Intel to utilize the E-cores for gaming, we see that the mixed 2 P-cores/2 E-cores configurations fall behind Zen 3. Then if you were to use E-cores exclusively, performance falls off a cliff and now we’re talking nowhere near Skylake levels of gaming performance, think more Sandy Bridge.
Moving on to F1 2020, we see that the E-cores are nowhere near as bad as what we saw in Battlefield V. We’re looking at a 65% performance increase with the E-cores when looking at the 12900K and a 43% increase with the 12600K. The 12600K does appear to be choked by its smaller 20 MB L3 cache given the 12700K was 18% faster when comparing P-core performance, while the 12900K was just 4% faster than the 12700K.
Compared to Zen 3, Alder Lake is slower when limited to a 20 MB L3 cache, then up to 10% faster with 25 MB and 12% faster with 30 MB. As for the E-core only configuration, Alder Lake is comparable to Ivy Bridge in F1 2020 and a long way behind Skylake. The 7700K, for example, was 33% faster than the 12900K’s E-core configuration.
The NPC-heavy Hitman 2 test crushes the E-cores. This is similar to what was seen when testing with Battlefield V. Performance across all three 12th-gen parts is similar and that means we’re looking at a 41% performance improvement with 2 P-cores and 2 E-cores compared to just using the E-cores. Actually, if we look at the 1% low performance, it’s closer to a 134% jump, which is crazy.
Then we see when just using the P-cores, the average frame rate is improved by 27% when compared to the mixed core configuration.
So again, if we compare the E-core only configurations to older CPU architectures, we see that performance is nowhere near Skylake. The 1% low performance was as bad as what we saw with AMD’s Bulldozer, while the average frame rate was much closer to Ivy Bridge than it was to Skylake.
Even Horizon Zero Dawn, which isn’t particularly CPU intensive, will eat up 4 cores/4 threads, especially if they’re slow, and the E-core only configurations struggled here. If we look at 1% low performance we see a 104% increase from 4 E-cores to 2 E-cores plus 2 P-cores, while going from the mixed core configuration to 4 P-cores only boosted performance by a further 14%. We’re also seeing very little performance difference between the various L3 cache capacities in this game.
If we compare with the older CPU architectures, we find that Alder Lake’s E-cores aren’t much better than AMD’s FX series again. The 1% low performance was almost identical and that means we’re miles away from Skylake here.
Cyberpunk 2077 is yet another game where the E-cores can’t push 1% lows to 60 fps, not even close. As a result we see a 100% performance improvement with the 12900K when comparing E-cores to the mixed core configuration, and then just a further 12% boost when only using the P-cores. Interestingly, the mixed core configuration of the 12900K is quite good, whereas we do see a noticeable drop off with the 12700K and 12600K.
Comparing with past CPU architectures, we see that the E-cores are much slower than even first generation Ryzen, and worlds slower than Skylake. We’re looking at Sandy Bridge level of performance here.
Finally, we have Shadow of the Tomb Raider and here we see very little difference between the various Alder Lake CPUs, so the cache capacity has almost no tangible influence here, at least for these core configurations. We had found previously with the 10th-gen series that the larger L3 cache is of greater utility when more cores are available.
When compared to older CPU architectures, the E-cores struggle with gaming on their own, with average frame rate performance that’s comparable to Ivy Bridge and 1% low performance that’s only comparable to AMD’s FX series. On the other hand, when only using the P-cores, Alder Lake is a beast beating Zen 3 by 11% in this game.
That was eye-opening to say the least. Those E-cores don’t do well at gaming and there’s a good reason why, which we’ll get to in a moment. For now let’s take a look at the 7-game average we collected.
Across the 7 games tested we see that the 12900K was just 3% faster than the 12700K with just the P-cores active, and 8% faster than the 12600K and those margins are entirely down to the difference in L3 cache capacity. The margins with two P-cores and two E-cores enabled are similar and the same is also true with just four E-cores.
Of course, the interesting story is the difference in performance between the various core configurations on the same CPU. So take the 12900K, for example, we saw a 44% increase in average frame rate when going from 4 E-cores to a mix of P and E cores, and an 81% increase in 1% low performance. Then from the mix of P and E cores to just P-cores, the average frame rate was boosted by a further 20% and the 1% low by 21%.
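These steps can be chained, and flipped around to express the same gaps as losses rather than gains. The Python sketch below does that for the 12900K’s average frame rate. The helper and variable names are illustrative, and the outputs are derived from the rounded percentages above rather than from raw frame rates.

```python
def gain_to_reduction(gain_pct: float) -> float:
    """Convert 'B is gain_pct% faster than A' into 'A is X% slower than B'."""
    return (1 - 1 / (1 + gain_pct / 100)) * 100


# 12900K average frame rate across the 7 games, per the text.
e_to_mixed_gain = 44   # 4 E-cores -> 2 P-cores + 2 E-cores
mixed_to_p_gain = 20   # 2 P-cores + 2 E-cores -> 4 P-cores

# Successive gains multiply rather than add.
e_to_p_gain = ((1 + e_to_mixed_gain / 100) * (1 + mixed_to_p_gain / 100) - 1) * 100

print(f"4 P-cores vs 4 E-cores: about {e_to_p_gain:.0f}% faster")
print(f"Mixed vs 4 P-cores: about {gain_to_reduction(mixed_to_p_gain):.0f}% slower")
print(f"4 E-cores vs 4 P-cores: about {gain_to_reduction(e_to_p_gain):.0f}% slower")
```

In other words, the 20% gain from the mixed configuration to P-cores only is the same gap as a roughly 17% loss going the other way, while dropping all the way to E-cores costs a little over 40%.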
Obviously, you’d never run a 12th-gen CPU with just the E-cores, but even mixing them in left the 12900K trailing its P-core-only result by a 20% margin. Let’s go deeper into the analysis in our conclusion...
What We Learned
Intel’s 12th-gen hybrid core design is really interesting and it does bring some obvious benefits for productivity workloads and will no doubt prove very beneficial in the mobile space. Now you’re probably thinking, "sure, I saw the benchmarks, I get that E-cores don't perform well for gaming on their own, but why?"
The answer is simple, and it’s the same reason why first-gen Ryzen was down on Intel for gaming when matched at the same core count. Core-to-core latency is much worse on the E-cores: we’re talking about a 54% increase on average.
Typically, P-cores take 37 ns to communicate with one another whereas the E-cores take 57 ns, and this cripples performance in games and in any other workload that relies heavily on core crosstalk.
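The 54% figure follows directly from those two latency numbers. A quick Python check, with constant names chosen purely for illustration:

```python
P_CORE_LATENCY_NS = 37  # typical P-core to P-core latency quoted above
E_CORE_LATENCY_NS = 57  # typical E-core to E-core latency quoted above

# Relative increase in latency when going from P-cores to E-cores.
increase_pct = (E_CORE_LATENCY_NS / P_CORE_LATENCY_NS - 1) * 100
print(f"E-core latency is about {increase_pct:.0f}% higher")
```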
The reason Intel has limited the interconnect between the E-cores is to make them more efficient, both in terms of power usage and the amount of die space they require. For workloads like rendering, where each thread works largely independently and there’s very little core-to-core communication, the E-cores work well, and this is why Intel used SPECrate2017 to make their Skylake efficiency claim.
If we look at the broader picture, the hybrid design even on the desktop makes sense, at least for Intel. A part like the Core i9-12900K can claim to house "16 total" cores with 24 threads, because technically that’s what it packs, even if not all cores are equal.
On paper, the 12900K looks comparable to the Ryzen 9 5950X, and when put to the test in applications that can leverage these core-heavy desktop parts, the 12900K still looks great. The E-core weakness, core-to-core communication, isn’t emphasized by those workloads, with Blender being one such example.
Then when it comes to gaming, the 12900K still shines because not a single game requires more than 8 of Alder Lake’s P-cores. Even if a game can spread the load across 16 cores, that won’t be an issue. Even in the case of the 12600K, its cores are more than powerful enough to deal with the load. If they weren’t, the game would only be playable using a high-end CPU like the 12900K or 5950X, and that’s not going to happen this decade.
Now obviously, you’d never run a 12th-gen CPU with just the E-cores, but there will be a point in time when you’ll have to call on the E-cores for gaming and this could reduce performance by 20% or more, at least based on what we’ve seen here. But again I don’t expect that time will come within the realistic lifespan of this series.
Another reason why E-cores suck for gamers is the compatibility issue with DRM, and I ran into that with this benchmark test. Previously I’d tested all CPU architectures using Watch Dogs Legion and Assassin’s Creed Valhalla, but both games failed for this testing. Watch Dogs Legion worked with just E-cores, or just P-cores, but the mix crashed the game, which is strange given the 12th-gen CPUs work just fine at stock settings. Then Assassin’s Creed Valhalla failed to load due to the DRM detection issue with the hybrid 12th-gen architecture.
In short, the E-cores are a mistake for gaming, and if called upon they will reduce frame rates. So for gamers the 12900K and 12700K are 8-core/16-thread CPUs and nothing more. The E-cores might be able to help with background tasks, but frankly on the desktop those tasks would be better taken care of by two more P-cores. There is no argument gamers can make for the existence of E-cores; you’d always be much better off replacing them with two extra P-cores.