Editor's take: Nvidia became the world's most valuable company as Big Tech players scrambled to buy every GPU and AI accelerator they could get. Many of these corporations are now turning their attention to developing their own accelerators, with Microsoft reportedly leading the pack in efficiency and performance.

Microsoft recently announced Maia 200, a new AI accelerator specifically designed for inference workloads. According to Redmond, Maia 200 can deliver "dramatic" improvements for AI applications and is already deployed in select US data centers on the Azure platform.
The company highlighted the chip's impressive specifications: Maia 200 is built on TSMC's 3nm process and features native FP8/FP4 tensor cores, a new memory system with 216 GB of HBM3e memory, and a massive 272 MB of on-chip SRAM. Microsoft claims that Maia 200 offers the highest performance of any custom silicon design currently used by a hyperscaler.
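To put the memory figure in context, here is a rough back-of-the-envelope sketch of how many model parameters could fit in 216 GB at different precisions. It assumes decimal gigabytes and counts weights only; KV cache, activations, and quantization scale factors are ignored, so real limits would be lower. The function name and the FP16 row are illustrative additions, not figures from Microsoft.

```python
# Bytes needed to store one parameter in each number format (weights only).
BYTES_PER_PARAM = {"FP16": 2.0, "FP8": 1.0, "FP4": 0.5}

# 216 GB of HBM3e, interpreted as decimal gigabytes (an assumption).
HBM_BYTES = 216e9


def max_params_billions(fmt: str, hbm_bytes: float = HBM_BYTES) -> float:
    """Upper bound on parameters (in billions) whose weights alone fit in HBM."""
    return hbm_bytes / BYTES_PER_PARAM[fmt] / 1e9


for fmt in BYTES_PER_PARAM:
    print(f"{fmt}: up to ~{max_params_billions(fmt):.0f}B parameters")
```

Under these assumptions, halving the precision doubles the number of weights a single chip can hold, which is one reason native FP4 support matters for inference.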
The chip is said to deliver up to three times the performance of Amazon's third-generation Trainium at 4-bit precision (FP4) and to surpass Google's seventh-generation TPU at 8-bit precision (FP8). It is also more cost-efficient, delivering 30 percent better performance per dollar than Microsoft's previous accelerator, Maia 100.
Maia 200 is currently deployed in Microsoft's US data center region in Iowa, with additional regions expected to come online soon. The chip integrates seamlessly with the Azure cloud platform and is also being used to generate "synthetic" data for training next-generation AI models.
Microsoft's overview of compute/memory specs across Maia 200, Amazon Trainium3, and Google's TPU v7
Despite the potential feedback-loop effects of training models on machine-generated data, major corporations are exploring synthetic data as an alternative source, as they anticipate that human-generated content will eventually be fully consumed by large language models and other machine learning tools.
Microsoft confirms that Maia 200 is a massive chip, packing more than 140 billion transistors into a 750 W TDP envelope. Performance is rated at over 10 petaFLOPS at FP4 and over 5 petaFLOPS at FP8. The SoC is capable of running today's most powerful AI models and has been designed to support even larger models in the future.
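The quoted figures allow a quick estimate of peak throughput per watt. The sketch below is only illustrative arithmetic: it treats the "over 10" and "over 5" petaFLOPS ratings as floors and assumes the chip draws its full 750 W TDP at peak, which may not match real workloads. The function name is hypothetical.

```python
# Figures quoted by Microsoft; the peak ratings are stated as "over", so treat them as floors.
TDP_WATTS = 750.0
PEAK_PFLOPS = {"FP4": 10.0, "FP8": 5.0}


def tflops_per_watt(pflops: float, watts: float = TDP_WATTS) -> float:
    """Convert a peak petaFLOPS rating into teraFLOPS per watt of TDP."""
    return pflops * 1000.0 / watts


for precision, pflops in PEAK_PFLOPS.items():
    print(f"{precision}: at least ~{tflops_per_watt(pflops):.1f} TFLOPS per watt")
```

With these inputs the FP4 rating works out to roughly 13 TFLOPS per watt and the FP8 rating to roughly half that, reflecting the 2:1 ratio between the two quoted peak numbers.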
The chip also features a new network design for moving vast amounts of data. Based on standard Ethernet technology, the solution includes a custom transport layer and an integrated NIC for improved performance and reliability. In practical terms, the network interface in each Maia 200 SoC can reach 2.8 TB/s of bidirectional bandwidth.
Finally, Microsoft is inviting developers and AI startups to sign up for the official Maia 200 software development kit once it becomes available. The SDK includes a compiler, PyTorch support, low-level programming tools, a Maia simulator, and more.

