In brief: Arm's latest mobile CPU and GPU designs are official, and the big focus this year is on boosting energy efficiency through various microarchitectural improvements. The company is also leaving the AArch32 (32-bit) instruction set behind after seeing that more than 90 percent of mobile apps are now distributed as 64-bit binaries.

Arm this week showed off new mobile CPU and GPU designs that manufacturers will use to power future phones, tablets, and ultraportable laptops. The company seems to have settled on a yearly cadence when iterating on its microarchitectures, and this time we're also seeing a complete transition to 64-bit computing for the company's Total Compute Solutions (TCS).

Speaking of 64-bit computing, companies like Qualcomm and MediaTek have yet to fully drop 32-bit support from their custom designs. Notably, the Qualcomm Snapdragon 8 Gen 2 that launched last year doesn't strictly follow Arm's reference design, and the same is true of MediaTek's Dimensity 9200 and 9200+ chipsets: these chips retain the ability to run 32-bit apps on their efficiency cores.

Beyond the 64-bit transition, Arm's newest generation of CPU cores is designed to improve both performance and energy efficiency over previous designs. The new Cortex-X4 CPU core should deliver 15 percent higher performance than the Cortex-X3 and 40 percent better power efficiency. The speed improvement was made possible by increasing the L2 cache to two megabytes and enabling clock speeds of up to 3.4 GHz.

As for the Cortex-A720 and Cortex-A520 CPUs, they should offer efficiency boosts of 20 percent and 22 percent, respectively, over previous-generation designs. Of course, these are estimates for reference designs, and as such they're not indicative of the actual gains we'll see in custom mobile chips from Qualcomm, Samsung, and MediaTek.

All the new cores are based on the Armv9.2 architecture, which adds the QARMA3 algorithm for pointer authentication. Pointer authentication is a memory security feature that makes it harder for malicious apps to create valid memory pointers and exploit buffer overflows and memory corruption vulnerabilities.
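To illustrate the general idea behind pointer authentication, here is a minimal conceptual sketch in Python. It is not Arm's implementation: HMAC-SHA256 stands in for QARMA3, and the 48-bit address width, 16-bit signature width, and all function names are assumptions made purely for illustration. On real hardware, signing and checking are done by dedicated instructions using keys that apps cannot read.

```python
import hashlib
import hmac

# Assumed sizes for this toy model; real layouts depend on the platform.
ADDR_BITS = 48                      # low bits hold the actual address
ADDR_MASK = (1 << ADDR_BITS) - 1    # mask that keeps only the address bits


def compute_pac(address: int, context: int, key: bytes) -> int:
    """Derive a 16-bit signature from the address, a context value, and a secret key.

    HMAC-SHA256 is only a stand-in here for a hardware cipher such as QARMA3.
    """
    message = address.to_bytes(8, "little") + context.to_bytes(8, "little")
    digest = hmac.new(key, message, hashlib.sha256).digest()
    return int.from_bytes(digest[:2], "little")


def sign_pointer(pointer: int, context: int, key: bytes) -> int:
    """Store the signature in the unused upper bits of the pointer."""
    address = pointer & ADDR_MASK
    return (compute_pac(address, context, key) << ADDR_BITS) | address


def authenticate_pointer(signed: int, context: int, key: bytes) -> int:
    """Recompute the signature and return the plain address only if it matches."""
    address = signed & ADDR_MASK
    if (signed >> ADDR_BITS) != compute_pac(address, context, key):
        raise ValueError("pointer authentication failed")
    return address
```

In this model, an attacker who overwrites a stored pointer without knowing the key has to guess the 16-bit signature for the new address, so a corrupted pointer is very likely to be rejected before it is used.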

While pointer authentication has been available for a while, manufacturers have thus far been reluctant to enable it because it comes at a cost to overall performance. Now that the CPU overhead has been reduced to around one percent, we're hoping to see more companies use it alongside features like the Memory Tagging Extension (MTE), since both are meant to reduce the attack surface for hackers looking to steal your data or passwords.

Another important announcement is that Arm is migrating from a reference phone SoC design with a 1+3+4 core cluster to a 1+5+2 layout. The idea behind this is to swap out two of the smallest CPU cores for two medium ones, which should result in 27 percent more performance on Android 13 and later versions. For flagship phones, Arm envisions a nine-core chip with a 1+4+4 configuration.
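As a quick sanity check on those layouts, the short Python sketch below tallies the core counts. The tier labels (big, medium, small) and the dictionary keys are assumptions added for illustration; the article only gives the three numbers for each layout.

```python
# Core counts per tier as (big, medium, small); tier labels are illustrative.
layouts = {
    "previous reference (1+3+4)": (1, 3, 4),
    "new reference (1+5+2)": (1, 5, 2),
    "flagship (1+4+4)": (1, 4, 4),
}

# Total number of cores in each layout.
totals = {name: sum(cores) for name, cores in layouts.items()}

# Change per tier when moving from the previous reference to the new one.
old = layouts["previous reference (1+3+4)"]
new = layouts["new reference (1+5+2)"]
tier_delta = tuple(n - o for n, o in zip(new, old))

for name, total in totals.items():
    print(f"{name}: {total} cores")
print("tier change (big, medium, small):", tier_delta)
```

Both reference layouts add up to eight cores, with two cores moving from the small tier to the medium tier, while the flagship layout adds a ninth core.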

Accompanying the two phone SoC designs is the company's "most powerful cluster ever built." The new DSU-120 compute cluster features ten Cortex-X4 cores and four Cortex-A720 cores with 32 megabytes of shared L3 cache. Companies have yet to build a DSU mega chip, however, despite Arm touting higher single-threaded performance than Intel's Core i7 mobile CPUs. Some, like Qualcomm, are more interested in developing in-house solutions for laptops and 2-in-1s, and Arm is actively trying to prevent that from turning into a trend.

In the GPU department, Arm revealed the latest flagship Immortalis-G720 design alongside Mali-G720 and Mali-G620 GPUs aimed at mid-range and entry-level mobile devices. These are based on the company's 5th-gen architecture, which brings improvements in key areas like memory power usage, HDR rendering, and geometry-related memory accesses.

The Mali-G620 design tops out at five cores, while the Mali-G720 and Immortalis-G720 max out at nine and 16 cores, respectively. Only the Immortalis-G720 includes a ray-tracing unit, but all three GPUs should bring 15 percent higher sustained gaming performance while using 40 percent less memory bandwidth compared to previous designs. These improvements were made possible by deferred vertex shading, a key feature of Arm's 5th-gen GPU architecture.

Arm says it expects to see the first commercial products based around these new CPU and GPU designs hit the market in early 2024.