Forward-looking: A new generation of exascale-class supercomputers is coming, and machines that perform a billion billion calculations per second will need to move a lot of data, fast. CPU and GPU technology is ready for the task, but the networking infrastructure requires a substantial overhaul, albeit one that stays conservative at the most fundamental level by keeping Ethernet as its foundation.
Hosted by The Linux Foundation, the Ultra Ethernet Consortium (UEC) has just been announced with the goal of building a complete, Ethernet-based communication stack architecture for high-performance networking applications. Founding members of the new organization include AMD, Intel, Broadcom, Cisco, HPE, Meta, and Microsoft, which, according to The Linux Foundation, have decades of collective experience with networking deployments in cloud, AI, and HPC environments.
The UEC aims to minimize communication stack changes while maintaining and promoting Ethernet interoperability. The Consortium is developing specifications, APIs, and source code to define protocols, electrical and optical signaling characteristics, link-level and end-to-end network transport protocols, management mechanisms, and software, storage, and security constructs.
The UEC will work at every level of the network communication stack, following a systematic approach with modular, compatible, and interoperable layers that will be tightly integrated to provide a "holistic improvement for demanding workloads." Four working groups (Physical Layer, Link Layer, Transport Layer, and Software Layer) will be "seeded" with highly valuable contributions from the founding companies, the Consortium says.
The Linux Foundation highlights how AI and HPC workloads are rapidly evolving, requiring best-in-class functionality, performance, and interoperability without sacrificing developer and end-user friendliness. Ethernet's ubiquity and flexibility make it well suited to the task, so the new Ultra Ethernet solution will build on the historic standard to improve workload management in the era of exascale computing.
J Metz, Chair of the Ultra Ethernet Consortium, says that the new initiative is about tuning Ethernet to improve efficiency for "workloads with specific performance requirements." Mark Papermaster, Chief Technology Officer and Executive Vice President at AMD, says that the company is proud to be a founding member and a key contributor to UEC's working groups.
For the Santa Clara fabless chipmaker, Ultra Ethernet is a natural progression of the high-performance products the company sells, including AMD EPYC server processors, AMD Instinct MI Series accelerators, and Alveo SmartNIC devices. UEC will create the necessary specifications for successful AI and HPC clusters, Papermaster states.