Apple M1 Macs "feel faster" than they really are thanks to QoS optimizations

nanoguy

Why it matters: Apple has received a lot of praise for creating the M1 SoC, whether for its performance compared to Intel and AMD CPUs in the same class, or for its relatively cool and battery-friendly operation. However, it's easy to forget that Apple achieves this partly thanks to being a vertically-integrated hardware maker and using that advantage to prioritize responsiveness over raw performance.

This week, rumors emerged about Apple's much-anticipated successor to the M1 chip, which is set to debut later this year in new MacBooks. The upcoming SoC is said to feature a slightly different architecture and could come in two variants aimed at casual and professional users, respectively.

In the meantime, it's worth looking at the reasons why almost everyone buying the new M1-powered Macs is praising them for feeling faster than their Intel-powered counterparts. As you may remember, Apple only showed a few vague graphs comparing performance and efficiency between the M1 and the "latest PC laptop chip," but later those claims were more or less confirmed by independent tests.

Earlier this year, Intel started an all-out ad campaign against M1-powered Macs in an effort to prove they're not as special as you may have heard. Intel's take revolves around being able to play more games on Intel-powered laptops, which also happen to come in a wide variety of form factors, including hybrids between clamshells and tablets that Apple has no intention to build. Intel also hired Justin Long, the "I'm a Mac" actor from Apple's famous "Get a Mac" campaign.

Howard Oakley, a developer behind several Mac applications, has done some digging into the magic sauce that makes the M1 chip feel so fast, and his conclusion may not surprise long-time Apple fans. The short of it is that Apple optimizes the software experience using Quality of Service (QoS), or intelligent task scheduling.

Intel and AMD typically market their products using claims about throughput: simply put, the number of operations or tasks that can be completed in a given amount of time. In some scenarios, like data centers, that's a useful metric that helps companies decide on the best solution for their needs. However, a consumer usually doesn't perceive the raw speed of a device so much as its latency, something reviewers often describe as "feeling fast."

In his analysis, Oakley compared an M1-powered MacBook Pro and Mac mini with an eight-core Intel Xeon-powered Mac Pro, all of them running macOS Big Sur. The idea was to test how these systems behave when you throw tasks of different priority (QoS) levels at them. By default, macOS decides the importance of a task on its own, but developers can also assign one of four QoS levels: background (lowest), utility, userInitiated, and userInteractive (highest).
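
For context, these are the same QoS classes macOS exposes to developers through Grand Central Dispatch. A minimal Swift sketch of assigning work to each level might look like this (the queue labels and print statements are just placeholders):

```swift
import Foundation

// The four documented QoS classes, from lowest to highest priority.
let levels: [(name: String, qos: DispatchQoS)] = [
    ("background",      .background),      // maintenance work the user never notices
    ("utility",         .utility),         // long-running work, often with a progress bar
    ("userInitiated",   .userInitiated),   // work the user asked for and is waiting on
    ("userInteractive", .userInteractive), // work tied directly to UI responsiveness
]

let group = DispatchGroup()
for level in levels {
    // A queue created with a QoS class passes that hint to the scheduler,
    // which on the M1 also influences whether the work lands on the
    // efficiency cores or the performance cores.
    let queue = DispatchQueue(label: "demo.\(level.name)", qos: level.qos)
    queue.async(group: group) {
        print("Running a \(level.name) task")
    }
}
group.wait() // block until all four work items have finished
```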

Oakley used his Cormorant app, a compressor-decompressor utility that lets you set the QoS level, to compress a 10-gigabyte test file. What he found is that on an x86 Mac with no other apps running, the compression task is scheduled across all cores so that it completes in the shortest possible time, regardless of the QoS setting. When running two compression tasks, one at a high priority level and one at a low level, the first finished in the normal amount of time while the other took several times longer to complete.
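
Cormorant's own code isn't shown here, but the shape of the experiment is easy to reproduce: submit the same CPU-heavy job twice, once at a high QoS and once at a low one, and time both. A rough, hypothetical Swift sketch, with a busy loop standing in for real compression:

```swift
import Foundation

/// Stand-in for a compression job: it just burns CPU for a while.
func fakeCompress(label: String) {
    let start = Date()
    var checksum: UInt64 = 0
    for i in 0..<300_000_000 { checksum &+= UInt64(i) }
    let elapsed = Date().timeIntervalSince(start)
    print("\(label) finished in \(String(format: "%.2f", elapsed))s (checksum \(checksum))")
}

let group = DispatchGroup()

// High-priority job: eligible for the performance (Firestorm) cores.
DispatchQueue.global(qos: .userInteractive).async(group: group) {
    fakeCompress(label: "userInteractive")
}

// Low-priority job: per Oakley's results, an M1 keeps this on the
// efficiency (Icestorm) cores even when the performance cores are idle.
DispatchQueue.global(qos: .background).async(group: group) {
    fakeCompress(label: "background")
}

group.wait()
```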

By contrast, an M1 Mac behaves quite differently: macOS will schedule a low-priority compression task on the chip's high-efficiency Icestorm cores, even if there's no competing task. This leaves the higher-performing Firestorm cores free to quickly take on higher-priority tasks, but has the side effect of making the compression task slower on the M1 than it would be on an Intel-based Mac.

When Oakley set the priority of the compression task to userInitiated or userInteractive, he found that it would get scheduled across all eight of the M1's cores. Progressively adding lower-priority compression tasks resulted in them being allocated only to the high-efficiency cores and taking virtually the same amount of time as running them sequentially.

What this means is that on the new Macs, Apple prioritizes responsiveness in much the same way it does on the iPhone and iPad. Low-priority tasks always run on the high-efficiency cores, which lets the high-performance cores stay idle and save power. When you fire up an app, those high-performance cores are ready to execute it with an almost imperceptible delay, which is why the machine "feels faster" than an Intel-based Mac.
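
From a developer's point of view, opting into this behavior mostly means labeling work honestly. A hypothetical sketch using OperationQueue (the queue names and workloads are illustrative): housekeeping gets marked .background so it stays on the efficiency cores, while user-requested work gets a high QoS so the idle performance cores can pick it up right away.

```swift
import Foundation

// Long-running housekeeping that the user never waits on.
let maintenanceQueue = OperationQueue()
maintenanceQueue.name = "com.example.maintenance"  // illustrative name
maintenanceQueue.qualityOfService = .background    // hint: keep it off the fast cores

maintenanceQueue.addOperation {
    // e.g. re-indexing a library, pruning caches, pre-fetching thumbnails
    print("housekeeping running quietly on the efficiency cores")
}

// Work the user just asked for and is actively waiting on.
let urgentQueue = OperationQueue()
urgentQueue.qualityOfService = .userInitiated
urgentQueue.addOperation {
    print("user-requested work is eligible for the performance cores")
}

maintenanceQueue.waitUntilAllOperationsAreFinished()
urgentQueue.waitUntilAllOperationsAreFinished()
```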

Theoretically, Apple could recreate this behavior to some degree on existing Intel-based Macs if it wanted to, by dedicating some processor cores to background tasks and allowing only high-priority tasks to run on the remaining ones. This also speaks to Apple's vertical integration, where the software is designed to take maximum advantage of the hardware at hand, as well as to the willingness of many developers to replicate Apple's approach when designing their apps.
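
macOS doesn't let third-party apps pin threads to specific cores, so one rough way to approximate "reserving" cores on an Intel Mac is simply to cap how many background operations may run at once, leaving the rest of the machine's capacity for high-priority work. A hypothetical sketch (the one-quarter fraction is an arbitrary choice for illustration):

```swift
import Foundation

let coreCount = ProcessInfo.processInfo.activeProcessorCount

// Let background work occupy at most a quarter of the cores at any time,
// so the remaining capacity stays available for user-facing tasks.
let backgroundQueue = OperationQueue()
backgroundQueue.qualityOfService = .background
backgroundQueue.maxConcurrentOperationCount = max(1, coreCount / 4)

for job in 0..<8 {
    backgroundQueue.addOperation {
        // placeholder for a long-running, low-priority job
        print("background job \(job) running")
    }
}
backgroundQueue.waitUntilAllOperationsAreFinished()
```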

Many tech companies routinely look to Apple for a sense of direction, so we'll probably see similar optimizations on Windows in the near future. Last year, someone showed that an ARM64 build of Windows 10 ran better in a virtual machine on an M1 Mac than it did on Microsoft's Surface Pro X, even though the latter is equipped with a Snapdragon 8cx SoC with a similar configuration of four high-efficiency and four high-performance cores.

Intel's upcoming Alder Lake CPUs could be the first mainstream x86 desktop chips to feature a big.LITTLE-style hybrid architecture, assuming they land in desktop PCs by the end of this year as promised. They will combine low-power Gracemont cores with high-performance Golden Cove cores built on the company's 10 nm SuperFin process node, which hopefully means they'll be fairly power-efficient even when fully stressed.

Power consumption will be an important selling point, as it will no doubt draw comparisons to the Apple M1 SoC, which is up to three times more efficient than the Intel processors it replaced.

AMD could follow in Intel's footsteps with Zen 5, too, which is said to feature eight high-performance and eight high-efficiency x86 cores. The company is even exploring the idea of building a direct, Arm-based competitor to the Apple M1 that could come with integrated RAM, but details on that project are still scarce. Even Microsoft is cooking up a similar concept for its future Surface PCs and Azure servers, so there are exciting times ahead.


 
I don't think this is "making them feel faster than they really are." I expect it to work like this; it's not some kind of trick or cheat. In my experience it doesn't feel noticeably different from an Intel Mac, but power usage is much lower. My Mac laptop is much slower than my AMD Radeon 7 desktop, but it feels comparable to an Intel laptop while having better power efficiency and staying silent all the time.
 
This is interesting. I wouldn't have thought the time needed to migrate a running program from the big cores to the little ones would be long enough to be noticeable to the user. So while optimizations to make the system more responsive are, in general, a good idea, one that wastes so much of the computer's potential ability to do productive work... is not.
At least not unless the system is running on battery, where using the big cores on a less important task is worth avoiding.
So I think this optimization ought to be configurable in the computer's power management settings: allow low-priority programs to use the big cores always, never, or only when plugged in.
 
Microsoft, Intel and AMD need to get together to optimise the OS; it benefits everyone.
Except that they can't. These are different organizations with different goals and objectives, so it will be difficult for them to collaborate towards a common goal. It is either impossible, or it will take a long time.
 
Microsoft, Intel and AMD need to get together to optimise the OS; it benefits everyone.
That implies Windows isn't already optimised. Do we actually know whether Windows is optimised, and how much better can it even get? It's easy to just say that.

Also, Microsoft pretty much has a monopoly in the market; they don't need to improve to get you on it.

These companies want profit, and spending resources on optimising Windows won't help them do that, at least in the short term; it would do exactly the opposite.
 
Windows isn't optimized. Not by a long shot. The more cores you add, the clunkier it becomes. The extreme case is of course the 3990X, which lags behind Linux, not as terribly as the 1xxx/2xxx Threadrippers did, but it still lags. And TR 3xxx hackintoshes simply obliterate the Intel-based Mac Pro running natively. Mano a mano, every CPU under Linux is more efficient and better utilized than under Windows.

Windows doesn't really know how to deal with 64 cores, not to mention all the threads on top of them.

Beyond the high-core-count issues, NTFS is an antiquated file system, ill-suited to NAND and ever-faster NVMe storage.

In truth, it would be very prudent for M$, Intel and AMD to get together and fix the most pressing issues. x64 doesn't have long to live.
 
That implies Windows isn't already optimised. Do we actually know whether Windows is optimised, and how much better can it even get? It's easy to just say that.

Also, Microsoft pretty much has a monopoly in the market; they don't need to improve to get you on it.

These companies want profit, and spending resources on optimising Windows won't help them do that, at least in the short term; it would do exactly the opposite.

It's much easier to optimise software for a few devices like Apple rather than billions of different combinations of hardware from tens of thousands of different manufacturers.
 
This is interesting. I wouldn't have thought the time needed to migrate a running program from the big cores to the little ones would be long enough to be noticeable to the user. So while optimizations to make the system more responsive are, in general, a good idea, one that wastes so much of the computer's potential ability to do productive work... is not.
At least not unless the system is running on battery, where using the big cores on a less important task is worth avoiding.
So I think this optimization ought to be configurable in the computer's power management settings: allow low-priority programs to use the big cores always, never, or only when plugged in.
Generally, when people design products they do so in an intelligent way, far surpassing a layman's analysis of what should happen. So my guess is that the scheduler works smartly and efficiently, doing exactly what you suggest but without manual intervention.
 
It's much easier to optimise software for a few devices like Apple rather than billions of different combinations of hardware from tens of thousands of different manufacturers.

Precisely why Apple is a better choice, unless you need/want those thousands of legacy items. Study after study finds that even when people can upgrade, most don't. For the tinkerers, well yeah, they would use that stuff.

For me, I don't want blue screens of death, I don't want a gazillion security updates every week, tons of malware, or programs crashing. So I'm done with Windows and MS products (although the new CEO is reportedly fixing the mess). Last time I did some simple maintenance for a friend on a Windows machine it was so frustrating. Apple is significantly better engineered, and if that is, as you say, because of all the legacy crap in Windows that I don't need, so be it.
 
Precisely why Apple is a better choice, unless you need/want those thousands of legacy items. Study after study finds that even when people can upgrade, most don't. For the tinkerers, well yeah, they would use that stuff.

For me, I don't want blue screens of death, I don't want a gazillion security updates every week, tons of malware, or programs crashing. So I'm done with Windows and MS products (although the new CEO is reportedly fixing the mess). Last time I did some simple maintenance for a friend on a Windows machine it was so frustrating. Apple is significantly better engineered, and if that is, as you say, because of all the legacy crap in Windows that I don't need, so be it.

While I own a MacBook Air M1 and several Windows PCs, I can't say I fully agree with you. First, I rarely have any crashes on Windows. I cannot remember the last time an application crashed and I haven't seen a BSOD in years, maybe decades.

My M1 MacBook is a great machine and I like it a lot. But, it's far from perfect. I've actually had Safari crash on me a couple of times lately. Could be the web site or could be Safari, hard to say but I haven't had this problem on my PCs.

As for updates, Apple has rolled out quite a few MacOS updates over the past couple of months. I really don't find updating MacOS or Windows to be that big of a deal. It's more or less automatic and transparent to me. Personally, I don't mind security updates, I'd rather be sure the holes are plugged than risk having my identity or data stolen.

Overall I am happy with my Mac, but I really don't see that it's significantly better (or worse) than Windows. I switch back and forth on a daily basis, depending on what I'm doing.
 
It's much easier to optimise software for a few devices like Apple rather than billions of different combinations of hardware from tens of thousands of different manufacturers.
Billions? 'tens of thousands'? Really? So many? Wow! I think we can do without the extreme numerical superlatives and define conditions more realistically as being numerous combinations of hardware from various manufacturers.
 
Well, perception is everything in normal everyday usage, so if it feels faster, that's good. Reminds me of the famous Eurostar train example by an expert in consumer behavior, Rory Sutherland:


(regarding the wifi, the train does have wifi now, but it didn't have it at the time this talk was recorded)
 