What just happened? Arm says its new Lumex platform marks the sixth consecutive year of double-digit improvements in instructions per cycle, reinforcing its strategy to keep CPUs central to AI innovation. The company is presenting this leap not as a routine hardware update, but as a shift in how consumer devices will handle advanced computing.

Unveiled this week, the Lumex Compute Subsystem (CSS) is designed to run AI directly on the device rather than offloading tasks to the cloud. Arm says Lumex delivers major gains in CPU performance, AI acceleration, and graphics capabilities while maintaining power efficiency across a range of devices, from flagship smartphones to ultra-low-power wearables.

At the core of the Lumex architecture is support for AI-heavy workloads such as real-time speech recognition, translation, and generative applications. This is enabled by Arm's integration of Scalable Matrix Extension v2 (SME2), an instruction set extension developed to accelerate the matrix operations that underpin modern AI models.

Instead of relying on separate neural processors, SME2 enables the CPU to perform AI acceleration directly. Arm claims this shift brings up to five times faster AI performance compared with the previous generation, cuts latency for speech tasks by nearly a factor of five, and triples the speed of audio generation.
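For readers wondering what kind of math this refers to, most AI inference boils down to matrix multiplication, and Arm's matrix extension is organized around accumulating outer products into a tile of results. The sketch below shows that idea in plain Python. It is not SME2 code and uses no Arm API; the function names are hypothetical and chosen purely for illustration.

```python
# Illustrative only: plain Python, not SME2 instructions or any Arm API.
# Function names are hypothetical.

def outer_product_accumulate(acc, col, row):
    """Add the outer product of col and row into the accumulator tile, in place."""
    for i, a in enumerate(col):
        for j, b in enumerate(row):
            acc[i][j] += a * b


def matmul_by_outer_products(A, B):
    """Compute the matrix product A @ B as a sum of rank-1 (outer product) updates."""
    m, k, n = len(A), len(B), len(B[0])
    acc = [[0] * n for _ in range(m)]          # accumulator tile, starts at zero
    for p in range(k):
        col = [A[i][p] for i in range(m)]      # p-th column of A
        row = B[p]                             # p-th row of B
        outer_product_accumulate(acc, col, row)
    return acc


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul_by_outer_products(A, B))  # [[19, 22], [43, 50]]
```

In hardware, each pass of the inner update would be handled by wide vector and tile operations rather than by nested loops, which is where the claimed speedups would come from.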

The Lumex CPU family introduces four core types, each tailored to different performance and efficiency requirements. At the high end, the C1-Ultra offers a 25 percent boost in single-thread performance over Arm's previous flagship core, the Cortex-X925. It is intended for compute-intensive workloads such as large language model inference and computational photography.

The C1-Premium features a smaller footprint – approximately 35 percent smaller than the Ultra – while retaining much of its muscle, making it well-suited for mainstream smartphones where cost and space constraints are key. The C1-Pro strikes a balance between efficiency and power, delivering a 16 percent performance boost for sustained workloads, such as video streaming or background assistant apps.

At the most efficient tier, the C1-Nano reduces energy use by up to 26 percent, making it a viable option for wearables such as smartwatches and rings. Arm states that a cluster combining two C1-Ultras with six C1-Pros demonstrates a full fivefold increase in AI performance, showcasing the platform's scalability across devices.

Software support is central to this effort. Arm has built its KleidiAI libraries directly into leading frameworks such as PyTorch ExecuTorch, Google's LiteRT, Alibaba's MNN, and Microsoft's ONNX Runtime. As a result, developers can take advantage of SME2's acceleration without needing to rewrite their code. Arm points to early examples of adoption, including Google Photos, Gmail, and YouTube, which already show improved performance with the framework integration, as well as Alipay, which has demonstrated large language models running entirely on-device using Lumex prototypes.

Gaming also received significant focus with the launch of the Mali G1-Ultra GPU. This new graphics processor introduces a second-generation Ray Tracing Unit, which doubles ray tracing performance compared to the prior Immortalis G925 GPU.

The 14-core G1-Ultra is said to deliver a 20 percent graphics performance boost while reducing energy use by 9 percent per frame. It also accelerates GPU-based AI inference by 20 percent, highlighting Arm's effort to spread AI workloads across the entire system-on-chip. Games such as Arena Breakout, Fortnite, Genshin Impact, and Honkai: Star Rail are expected to see noticeable improvements. For midrange and efficiency-focused devices, scaled-back models – the G1-Premium and G1-Pro – bring similar architectural gains while extending battery life.

Several major technology players have already pledged support for Lumex. Samsung, MediaTek, Alibaba, and vivo have been among the first to highlight improvements in responsiveness for applications like voice interactions, summarization, and translation. Google has emphasized SME2's role in making AI more consistent across Android devices and Windows on Arm laptops, while Meta has pointed to performance gains in its applications built on PyTorch ExecuTorch.

Nevertheless, entering the US market could prove challenging. Qualcomm remains the leading designer of processors for Android phones sold in the US and is currently locked in litigation with Arm over licensing of Arm's technology. That fight may determine how soon Lumex-based devices become widely available outside Asia, leaving open the question of how quickly Arm's latest CPUs will shape the broader market.