What is aggregate CPU frequency and why is it wrong?


Editor’s Note:
Matt Bach is the head of Puget Labs and has been part of Puget Systems, a boutique builder of gaming and workstation PCs, since the early days. This article was originally published on the Puget blog.

While not yet a common term, over the past year or so we have started to see a rise in the usage of the term "Aggregate CPU Frequency" as a way to estimate the performance between different CPU models. This term appears to be used most often when people are discussing high core count Xeon or dual Xeon CPU configurations, but lately we have seen it used when looking at CPUs with as few as just four cores.

At first glance, this term seems reasonable enough: it simply takes the frequency of the CPU (how fast it can complete calculations) and multiplies it by the number of cores (the number of simultaneous calculations it can perform) to arrive at a total or "aggregate" frequency for the processor. For example, below is the number of cores and base frequency for four different CPU models along with their calculated "aggregate frequency":

CPU                        No. of cores   Base frequency   Aggregate frequency
Intel Core i7 6700K        4 cores        4.0 GHz          4 * 4.0 = 16 GHz
Intel Xeon E5-1650 V3      6 cores        3.5 GHz          6 * 3.5 = 21 GHz
Intel Core i7 6850K        6 cores        3.6 GHz          6 * 3.6 = 21.6 GHz
2x Intel Xeon E5-2690 V4   28 cores       2.6 GHz          28 * 2.6 = 72.8 GHz
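Spelled out, the calculation behind the table above is trivial. A quick sketch in Python, using the CPU specs from the table:

```python
# Naive "aggregate frequency": number of cores multiplied by base clock.
# The CPU specs below are the ones listed in the table above.
def aggregate_frequency(cores, base_ghz):
    """Return the 'aggregate frequency' in GHz (cores * base clock)."""
    return cores * base_ghz

cpus = [
    ("Intel Core i7 6700K", 4, 4.0),
    ("Intel Xeon E5-1650 V3", 6, 3.5),
    ("Intel Core i7 6850K", 6, 3.6),
    ("2x Intel Xeon E5-2690 V4", 28, 2.6),
]

for name, cores, ghz in cpus:
    print(f"{name}: {aggregate_frequency(cores, ghz):.1f} GHz")
```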

Unfortunately, in the majority of cases trying to estimate the relative performance of a CPU in this manner is simply going to give you inaccurate results. So before this term becomes more common, we wanted to explain why "aggregate frequency" should not be used and give some examples showing just how (in)accurate it really is.

There are quite a few reasons why "aggregate frequency" is an inaccurate representation of CPU performance, but the primary ones are the following:

It is typically calculated using the advertised base frequency

Most modern Intel CPUs have a wide range of frequencies they run at including the base frequency (the frequency that is advertised in the model name) and the various Turbo frequencies. Turbo Boost allows the CPU to run at higher frequencies depending on three main factors: the number of cores being used, the temperature of the cores, and the amount of power available to the CPU. On modern desktop systems with quality components, however, the cooling and power considerations are pretty much non-factors which means that the frequency of an Intel CPU should only be limited by the number of cores that are being used. In fact, Turbo Boost is so reliable on modern CPUs that, except in a few edge cases, every system that ships out our door is checked to ensure that it is able to maintain the all-core Turbo Frequency even when the system is put under an extremely heavy load.

How big of a difference would it make to use the all-core Turbo Boost frequency instead of the base frequency? If you were to calculate the "aggregate frequency" for an Intel Xeon E5-2690 V4 CPU, you would get a result of 36.4 GHz since that CPU has 14 cores and a base frequency of 2.6 GHz. However, if you instead use the all-core Turbo frequency of 3.2 GHz (which any well-designed and adequately cooled workstation should be able to sustain indefinitely), the aggregate frequency changes to 44.8 GHz - a difference of about 23%.
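The arithmetic can be double-checked in a few lines (core count and clocks are the E5-2690 V4 figures quoted above):

```python
cores = 14             # Intel Xeon E5-2690 V4
base_ghz = 2.6         # advertised base frequency
turbo_ghz = 3.2        # all-core Turbo Boost frequency

agg_base = cores * base_ghz    # 36.4 GHz
agg_turbo = cores * turbo_ghz  # 44.8 GHz

# Relative difference between the two "aggregate" figures
diff_pct = (agg_turbo - agg_base) / agg_base * 100
print(f"{agg_base:.1f} GHz vs {agg_turbo:.1f} GHz: {diff_pct:.0f}% higher")
```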

It does not take the rest of the CPU and system specs into account, including the amount of cache, the architecture, and the chipset

Processors are extremely complex, and just looking at the number of cores and frequency ignores everything else that can make one CPU faster or slower than another. This can include the amount of cache (whether it is L1, L2, L3, or Smart Cache), the bus type and speed, and the type and speed of memory it can use. However, more than almost anything else it ignores the architecture and manufacturing process that was used to produce the CPU.

While the amount of difference all of these other specs can make varies from application to application, as an example we saw up to a 35% difference in SOLIDWORKS between a Skylake CPU and a Haswell-E CPU when both were operating with 4 cores at 4.0 GHz.

It assumes that programs can make perfect use of all the CPU cores

More than anything else, this is the main problem with aggregate frequency. Using the base frequency can throw things off, but in most cases probably only by a maximum of about 10-30%. Likewise, as long as you only compare CPUs from the same product family, the architecture of the CPUs likely won't come into play. But working under the assumption that a program will make perfect use of all of a CPU's cores is so wrong that it makes using an "aggregate frequency" less accurate in most cases than simply choosing a CPU at random.

It appears that most people who use this term understand that some programs are single threaded (parametric CAD programs are a prime example), but many of our articles have shown over and over that even if a program tries to use all the available cores, how effectively it can do so varies wildly. The reason depends on a number of factors including how the program is coded, how well the task lends itself to multi-threading, and how much the other components in the system (including the hard drive, GPU, and RAM) affect performance. There are some programs that are very effective at utilizing multiple CPU cores in parallel, but even the best of them (such as offline rendering) are at best only ~99.5% efficient, and often as low as 90%. That is extremely good, but still low enough to throw off any attempt to use an "aggregate frequency" to estimate performance.

Unfortunately, the only way to know how well a program can use multiple cores is to do comprehensive testing on that specific application. We have tested a number of programs including Premiere Pro, After Effects, Photoshop, Lightroom, SOLIDWORKS, Keyshot, Iray, Mental Ray, and Photoscan, but this is only a tiny drop in a giant bucket compared to the number of programs that exist today.

Examples

We can talk all day about why we believe "aggregate frequency" is wildly inaccurate, but there is no substitute for actual benchmark data. To help prove our point, we are going to look at a number of different applications and compare, for a variety of CPUs, the performance predicted by the "aggregate frequency" against the performance measured in reality.

We like to be fair, so to give this term the best chance possible we are only going to use CPUs with the same architecture (Broadwell-E/EP). If you were to mix older and newer architectures (such as a Core i7 6700K versus a Core i7 6850K, or a Xeon V3 versus a Xeon V4), expect the "aggregate frequency" to become even more inaccurate.

To make it easier to see how close or far from reality the "aggregate frequency" is, whenever the expected performance using the "aggregate frequency" is within 10% of the actual performance, we will color the results in green. Anything that is 10-50% off will be in orange, and anything more than 50% off will be in red.
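As a sketch, that color scheme amounts to bucketing the gap between expected and actual results. The `accuracy_color` helper below is hypothetical, purely to make the thresholds concrete:

```python
def accuracy_color(expected_pct, actual_pct):
    """Bucket a prediction by how far it lands from the measured result.

    Both arguments are performance relative to the baseline CPU, in
    percent; "off by" is the percentage-point gap between the two.
    """
    off = abs(expected_pct - actual_pct)
    if off <= 10:
        return "green"   # within 10% of the actual performance
    elif off <= 50:
        return "orange"  # 10-50% off
    return "red"         # more than 50% off
```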

Example 1: Cinema4D CPU Rendering

Offline rendering of 3D images and animations is among the most efficient tasks you can run on a CPU which makes rendering engines like those found in Cinema4D exceptional at using high numbers of CPU cores. This also makes it a best-case scenario for the term "aggregate frequency":

CineBench R15 Multi (performance relative to the Core i7 6850K)

CPU                        Specs                                   Aggregate Frequency   Expected   Actual
Intel Core i7 6850K        6 cores, 3.6 GHz (3.7-4.0 GHz Turbo)    21.6 GHz              100%       100%
Intel Core i7 6950X        10 cores, 3.0 GHz (3.4-4.0 GHz Turbo)   30 GHz                139%       156% (off by 17%)
2x Intel Xeon E5-2630 V4   20 cores, 2.2 GHz (2.4-3.1 GHz Turbo)   44 GHz                204%       207% (off by 3%)
2x Intel Xeon E5-2690 V4   28 cores, 2.6 GHz (3.2-3.5 GHz Turbo)   72.8 GHz              337%       358% (off by 21%)

We run CineBench R15 on nearly every system that goes out our door, and the results above are taken directly from our benchmark logs. Comparing expected to actual performance, in every case the "aggregate frequency" predicted lower performance than each CPU actually achieved. The most accurate result was the dual Xeon E5-2630 V4, where the expected difference versus the i7 6850K was only off by about 3%. However, the other two results were off by roughly 20%, which means that although "aggregate frequency" has a chance of being fairly accurate, it also has a good chance of being off by a moderate amount.
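For reference, the "expected" column is just each configuration's aggregate frequency divided by the i7 6850K's 21.6 GHz. The CineBench numbers reproduce like this:

```python
ref_agg = 21.6  # Core i7 6850K baseline, in GHz

# (aggregate frequency in GHz, measured performance vs. baseline in %)
results = {
    "Core i7 6950X": (30.0, 156),
    "2x Xeon E5-2630 V4": (44.0, 207),
    "2x Xeon E5-2690 V4": (72.8, 358),
}

for name, (agg, actual) in results.items():
    expected = agg / ref_agg * 100
    print(f"{name}: expected {expected:.0f}%, actual {actual}%, "
          f"off by {actual - round(expected)} points")
```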

Example 2: Premiere Pro

We have found in previous testing that Premiere Pro is decently effective at using a moderate number of CPU cores, but with modern hardware there is little need for something like a dual CPU workstation. Still, there are many recommendations on the web to use a dual Xeon workstation for Premiere Pro, so let's take a look at how the actual performance you would see in Premiere Pro compares to what you would expect from the "aggregate frequency":

Premiere Pro (performance relative to the Core i7 6850K)

CPU                        Specs                                    Aggregate Frequency   Expected   Actual
Intel Core i7 6850K        6 cores, 3.6 GHz (3.7-4.0 GHz Turbo)     21.6 GHz              100%       100%
Intel Core i7 6950X        10 cores, 3.0 GHz (3.4-4.0 GHz Turbo)    30 GHz                139%       123% (off by 16%)
2x Intel Xeon E5-2643 V4   12 cores, 3.4 GHz (up to 3.7 GHz Turbo)  40.8 GHz              189%       117% (off by 72%)
2x Intel Xeon E5-2690 V4   28 cores, 2.6 GHz (3.2-3.5 GHz Turbo)    72.8 GHz              337%       111% (off by 226%)

The results in the chart above are taken from our Adobe Premiere Pro CC 2015.3 CPU Comparison article, where we looked at exporting and generating previews in Premiere Pro with a variety of codecs and resolutions. While the results aren't too far off for somewhat similar CPUs, the "aggregate frequency" predicted the i7 6950X would be 39% faster than the i7 6850K, when in reality it is only about 23% faster. This isn't completely out in left field, but the difference between a 39% improvement and a 23% improvement from a CPU that is more than twice as expensive is likely to matter quite a bit if you are deciding which CPU to purchase.

For the dual CPU options, the "aggregate frequency" was much further from reality, being about 72% off on the dual Xeon E5-2643 V4 and a huge 226% off on the dual Xeon E5-2690 V4. In fact, while the "aggregate frequency" predicted the dual E5-2690 V4 CPUs to be the fastest option, they were actually slower than the dual E5-2643 V4 CPUs (or even the Core i7 6950X) while costing significantly more.

Example 3: 3ds Max

3ds Max is a 3D modeling and animation program that is primarily single threaded, so you would expect it to be a worst-case scenario for "aggregate frequency". You may argue that no one should use this term for these types of lightly threaded tasks, but we have started to see it pop up even in discussions of single or lightly threaded workloads, so we wanted to show just how inaccurate "aggregate frequency" can be when it is used as a catch-all term for CPU performance:

3ds Max (performance relative to the Core i7 6850K)

CPU                        Specs                                   Aggregate Frequency   Expected   Actual
Intel Core i7 6850K        6 cores, 3.6 GHz (3.7-4.0 GHz Turbo)    21.6 GHz              100%       100%
Intel Core i7 6950X        10 cores, 3.0 GHz (3.4-4.0 GHz Turbo)   30 GHz                139%       102% (off by 37%)
2x Intel Xeon E5-2690 V4   28 cores, 2.6 GHz (3.2-3.5 GHz Turbo)   72.8 GHz              337%       89% (off by 248%)

The results in the chart above are taken from our Autodesk 3ds Max 2017 CPU Performance article, where we looked at animations, viewport FPS, and scanline rendering across a variety of projects. As expected for a mostly single-threaded application, the "aggregate frequency" was very optimistic in every case. Depending on which CPU you look at, the expected performance was anywhere from 37% to 248% off! On the extreme end - with a pair of high core count Xeons - that means that instead of the more than 3x increase in performance over an i7 6850K that the "aggregate frequency" predicts, in reality you would actually see a 10% decrease.

Example 4: After Effects

After Effects is an interesting application because it used to be very well threaded and benefited greatly from high core count workstations. However, in the 2015 version Adobe shifted its focus from multi-threading to GPU acceleration. In the long term this should greatly improve performance for AE users, but the result is that with modern hardware there is little need for a CPU with more than 6-8 cores, and higher core count or dual CPU setups often actually decrease performance. So while you might expect a mostly single-threaded application like 3ds Max to be the worst case for the term "aggregate frequency", After Effects is even worse:

After Effects (performance relative to the Core i7 6850K)

CPU                        Specs                                    Aggregate Frequency   Expected   Actual
Intel Core i7 6850K        6 cores, 3.6 GHz (3.7-4.0 GHz Turbo)     21.6 GHz              100%       100%
Intel Core i7 6950X        10 cores, 3.0 GHz (3.4-4.0 GHz Turbo)    30 GHz                139%       96% (off by 43%)
2x Intel Xeon E5-2643 V4   12 cores, 3.4 GHz (up to 3.7 GHz Turbo)  40.8 GHz              189%       90% (off by 99%)
2x Intel Xeon E5-2690 V4   28 cores, 2.6 GHz (3.2-3.5 GHz Turbo)    72.8 GHz              337%       86% (off by 251%)

The results in the chart above are taken from the 2D Animation portion of our Adobe After Effects CC 2015.3 CPU Comparison article which tested rendering and timeline scrubbing across six different projects. As you can see, trying to use an "aggregate frequency" to estimate the difference between different CPU models is going to be wildly inaccurate. Compared to the i7 6850K, the other CPU choices - which should be anywhere from 39% faster to over 3 times faster - are instead all slower than the Core i7 6850K. In fact, the faster the "aggregate frequency" predicted a CPU configuration to be, the slower it ended up being in reality!

The allure of a catch-all specification like "aggregate frequency" is something we completely understand. It would be great if there were an easy way to know which CPU will be faster than another and by roughly how much, but unfortunately there is no magic bullet. To be completely fair, for highly threaded tasks like rendering the "aggregate frequency" should be close enough that you at least wouldn't end up spending more money for lower performance, but it still isn't going to tell you precisely how much of a performance increase you would see from one CPU over another.

Outside of rendering and a few other highly parallel applications, however, there is no way to know whether the "aggregate frequency" is going to be accurate without detailed benchmarking. For example, simulations are often touted as being highly parallel (which means they should be perfect for this term), but we have found that performing simulations in SOLIDWORKS is only moderately efficient - worse in many cases than Premiere Pro! Other simulation packages like ANSYS or COMSOL should be more efficient, but without specific testing there is no way to know for sure.

So if "aggregate frequency" is not accurate, what should people use to decide which CPU to purchase? Like we said earlier, there is no magic bullet for this. If your application is CPU-bound (the GPU, HD, and RAM don't impact performance significantly), you could use Amdahl's Law, which takes into account the parallel efficiency of the program to calculate the theoretical performance difference between two CPUs. If you are interested in this, we recommend reading our guide on how to use Amdahl's Law. You are still limited to CPUs of the same architecture, it doesn't take into account things like CPU cache, and you have to do a lot of testing up front to determine the parallel efficiency of the program - but this method should be much more accurate than simply multiplying together a CPU's cores and frequency.
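As a rough sketch of the Amdahl's Law approach (with two loud simplifying assumptions: both CPUs share the same architecture, and per-core speed scales with clock frequency):

```python
def amdahl_speedup(n_cores, parallel_fraction):
    """Amdahl's Law: speedup of a program on n_cores when
    parallel_fraction of its runtime can be parallelized."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n_cores)

def relative_performance(cores_a, ghz_a, cores_b, ghz_b, parallel_fraction):
    """Hypothetical helper: estimate how much faster CPU A is than CPU B
    for one program, assuming identical architecture and per-core
    throughput proportional to (all-core Turbo) clock speed."""
    perf_a = ghz_a * amdahl_speedup(cores_a, parallel_fraction)
    perf_b = ghz_b * amdahl_speedup(cores_b, parallel_fraction)
    return perf_a / perf_b

# e.g. a 95%-parallel program on 28 cores at 3.2 GHz (dual E5-2690 V4)
# vs. 6 cores at 3.7 GHz (i7 6850K) works out to roughly 2.1x faster,
# not the ~3.4x that the base-clock "aggregate frequency" suggests
print(relative_performance(28, 3.2, 6, 3.7, 0.95))
```

Note how strongly the answer depends on the parallel efficiency you measured up front: with `parallel_fraction` at 0.8 instead of 0.95, the same pair of CPUs comes out much closer together.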

If your application does utilize the GPU to improve performance, or if you want to compare CPUs with different architectures, there is really no easy way to estimate which CPU will be faster and by how much. In these situations, the only reliable method is good old-fashioned benchmarking. Again, we wish there was a better method that was still accurate - it would save us so much time! - but that is simply the reality. This is why we at Puget Systems have started to benchmark different CPUs on as many professional applications as we have the time and expertise to handle, to ensure that we are recommending exactly the right CPU to our customers. We unfortunately can't test every program we wish we could (or really even a majority), but keep an eye on our article list as we expand our testing across more and more applications.


 
I thought people stopped using aggregated CPU frequency after we saw the failure that was AMD's QuadFX (also known as 4x4) computers.
 
My boss tried to use it once to explain why he was buying one CPU over another, not realizing he was comparing different models with different cache, Turbo Boost, etc. I tried to correct him and explain it, but I'm not sure he was convinced. The problem is he's a very smart engineer and should have known better to begin with.
 
In my country, it is common for sellers through local channels similar to eBay (Mercado Libre) to advertise the CPUs with aggregate frequency. I thought only stupid people used that to lure ignorant people, I didn't even know it had a name.
Now I know that arguably not stupid people use that and that has a name.
 
Great article, but basically you can also tell that the differences when the actual performance is lower than expected are caused by the software used. The synthetic test was the most accurate of your tests, so your assumption that Adobe can use all the cores available to the system was kinda wrong, and you should do some more real-life tests.
 
What a ridiculous article! You may as well simply state that comparing clock frequency on ANY CPU is pointless because some applications are memory or IO bound and hence are unable to even utilise all of a single core.
CPU clock frequency is an indicator of potential performance, and that applies to the aggregate frequency too; it is just an indicator of potential. The same CPU architecture (cores, cache, process, etc.) running at a higher frequency WILL always have the potential to do more. However, the days of comparing CPUs by GHz alone are long gone, as should this article be.
 
Also, frequencies vary depending on the CPU and how many cores are in use - not only because of Turbo but also thermal constraints - so not even the base frequency of a CPU is enough to make a comparison.
 
In my country, it is common for sellers through local channels similar to eBay (Mercado Libre) to advertise the CPUs with aggregate frequency. I thought only stupid people used that to lure ignorant people, I didn't even know it had a name.
Now I know that arguably not stupid people use that and that has a name.
No, smart people use it to lure ignorant people because they know bigger numbers look better to the average person.
 
Most Xeon CPUs with over 4 cores are mostly used for server virtualization. The Hyper-V and VMware CPU schedulers can handle all the CPU cores. Consider that today a powerful virtualization host can run 150 VMs with an average of 4 virtual cores per VM, for a total of 600 cores entitled on the host machine. You need a multi-core Xeon CPU with a bunch of RAM. I use hosts with 4x 18-core CPUs and 2 TB of RAM to handle about 150 VMs per host. Can you imagine a 4-core i7 with a maximum of 64 GB of RAM handling server virtualization? LOL. Matt, high-power Xeons are not meant to run desktop applications, remember.
 
Basically it's a marketing term that the majority of the market doesn't understand.

I really miss the days when we just had "Pentium 1", "Pentium 2" and so forth and so on.

OK: this Pentium 4 is better than this Pentium 2.

Why?

Because 4 is better than 2.

The average buyer shops with two things in mind: product design and price.

People know Apple Mac stuff is gonna cost considerably more - but they are willing to pay it (if they can) because they know it's powerful, they know it's popular and they know it's exclusive. They may have even used one and decided they want the "luxurious" design.

But when you take a PC and just slap words on it like "dual core" and "core 2 duo" NOBODY understands that nonsense other than computer pros...so what does the consumer do?

They look at the PRICE!

OK - this one is $900 and this one is $500. The one that's 900 has a Core i7, but the one that's $500 has a Core i5.

DUH... 7 is better than 5...amirite?


Games demand more cores and a good GPU. Most buyers don't understand that when they walk into Walmart, BJ's and Costco. Those with more cash will pick the pricier system expecting to get "more".
Those who can barely afford to be there shopping, will take the lowest end model: Acer Mini notebook.
 
Good argument, but I think it misses the point. What your tests show is: it is informative to look at aggregate frequency if you are using multi-threaded applications. (Viewport performance in 3ds Max is mostly GPU-dependent, while scanline rendering is an artifact of the past, only still available to maintain compatibility with older scenes.) So in essence, aggregate frequency is as useful as base frequency - together with other specifications it can be used to inform an optimal purchase.
 
Yes, I remember this discussion back in the day when dual cores first came out and people weren't seeing the expected % increase...

I thought it was dead. Clearly, it needs to be dead.
 
Lol..... so you're one of the people who falls for this...

So you're thinking the AMD 8370 with 8 cores running at 4 GHz (an aggregate of 32 GHz) has the potential to outperform an Intel Skylake 6700 running 4 cores at 4 GHz (aggregate of 16 GHz) by a factor of 2?

You show me ANY benchmark showing that and maybe I start believing in aggregate CPU...
 
On the other hand, the Windows, Linux and MacOS operating systems all can take advantage of multiple cores and threads to run many processes and programs more effectively than if there were fewer cores. It is way more difficult to construct a benchmark running lots of programs and active processes than a benchmark of possibly well-designed single programs. So it would seem to me that the next step in examining aggregate frequency as a measure of system power would be to design and run a repeatable multi-program benchmark. I challenge TechSpot to do so, but in the meantime I will not hold my breath. I'm very happy with the smooth performance of my 6-core Xeon system with a lot of memory (yet another factor that influences benchmark results).
 