Windows Server 2025 gets VM-based GPU partitioning for AI workloads

zohaibahd

Staff
Forward-looking: The next major refresh of Microsoft's server operating system will include an important feature to boost AI workloads. When Windows Server 2025 arrives, it will debut a new capability called GPU Partitioning that allows multiple virtual machines to share and utilize the power of a single GPU.

Microsoft's new GPU Partitioning (GPU-P) tech aims to change how virtual machines leverage GPU resources. The feature allows Windows to split a single physical GPU into separate partitions, each getting a slice of the graphics card's overall capabilities. Users can then assign those partitions to individual VMs on the same server. So, instead of having to dedicate an entire GPU to a single VM, multiple VMs can efficiently share the power of one as if each partition were a discrete GPU.
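
To make the idea concrete, here is a minimal Python sketch of the bookkeeping involved. It is a toy model, not Microsoft's implementation: the `PhysicalGpu` class, the even split of video memory, and the VM names are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PhysicalGpu:
    """Toy model of one physical GPU split into equal partitions (assumption)."""
    name: str
    vram_mb: int
    partition_count: int
    # Maps partition index -> VM name for partitions currently in use.
    assignments: dict = field(default_factory=dict)

    @property
    def vram_per_partition_mb(self) -> int:
        # Assumes memory is divided evenly; real hardware may differ.
        return self.vram_mb // self.partition_count

    def assign(self, vm_name: str) -> int:
        """Give the first free partition to a VM and return its index."""
        for index in range(self.partition_count):
            if index not in self.assignments:
                self.assignments[index] = vm_name
                return index
        raise RuntimeError(f"no free partition on {self.name}")


# Hypothetical 24 GB card shared by three VMs, with one partition left over.
gpu = PhysicalGpu(name="gpu0", vram_mb=24576, partition_count=4)
for vm in ["vm-a", "vm-b", "vm-c"]:
    slot = gpu.assign(vm)
    print(f"{vm} -> partition {slot} ({gpu.vram_per_partition_mb} MB)")
```

Each VM sees only its own slice, and a fifth request on this hypothetical card would be rejected because all four partitions would already be taken.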

The technology also works with failover clustering. If a VM on one server node encounters a hardware fault or needs to be migrated, it can restart on another node in the cluster and use a GPU partition on that other server.
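
The failover decision can be pictured with a short Python sketch. This is only an illustration under stated assumptions: the `pick_failover_node` helper, the node names, and the free-partition counts are hypothetical, and the real cluster logic is not described in the source.

```python
def pick_failover_node(failed_node, free_partitions):
    """Return another node with a free GPU partition, or None if there is none.

    free_partitions maps node name -> number of unassigned GPU partitions.
    The chosen node's count is decremented to reserve a partition for the VM.
    """
    for node, free in free_partitions.items():
        if node != failed_node and free > 0:
            free_partitions[node] = free - 1
            return node
    return None


# Hypothetical three-node cluster; node1 has failed.
cluster = {"node1": 0, "node2": 2, "node3": 1}
target = pick_failover_node("node1", cluster)
print(f"restart VM on {target}")
```

The sketch simply takes the first surviving node with spare capacity; a production scheduler would weigh more factors than this.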

Microsoft is also building centralized management tools to make it easier for admins to configure and oversee this new GPU virtualization setup. The Windows Admin Center UI will provide a unified console to view GPU partition details across an entire cluster environment and assign those partitions to VMs as needed.

With the high cost of GPUs, especially the high-end models suited for AI tasks, businesses can leverage GPU-P to maximize their investments.

Microsoft says its engineers developed GPU-P in close collaboration with Nvidia. Green Team's enterprise VP, Bob Pette, praised the feature, highlighting the security, efficiency, and raw performance. Enabling the feature allows customers to "run their key AI workloads to achieve next-level efficiencies."

Of late, cloud platforms have stolen much of the AI spotlight. However, for businesses that must keep specific AI workflows on-site for regulatory, security, or other reasons, having granular GPU sharing capabilities baked right into Windows Server could prove compelling.

Beyond this feature, Microsoft revealed several other Windows Server 2025 improvements in April. Most notably, users can receive security updates without rebooting, thanks to a clever in-memory code modification. There are also significant performance improvements for NVMe SSDs, boasting a 70-percent increase in IOPS. Plus, the OS will be available via subscription or a one-time license.

This is Hyper-V? Because ESXi already has SR-IOV, no?
Hyper-V also has SR-IOV support and has done for a long time. It's mainly used to enable network traffic to bypass the software switch layer of the Hyper-V virtualization stack.

SR-IOV doesn't help with GPU partitioning though; this needs to be developed separately. Yes, you have been able to partition GPUs in VMware for a little while now. However, from memory (it's been a long time since I last looked into it), you needed a very expensive Nvidia license to make it possible.

Edit: GPU passthrough to a VM has been possible but a little bit janky at times, and you couldn't fail over to another node, even if it was identical hardware. It feels like Microsoft has had Hyper-V development on hold since Server 2016, so it's nice to see them actually doing something with it. It seems a little coincidental that they've restarted development of Hyper-V right as VMware starts to collapse.
 