AMD's take on AI: Instinct MI300X combines GPU cores with 192 GB of HBM3 memory

Bob O'Donnell

The big picture: One of the undisputed beneficiaries of the generative AI phenomenon has been the GPU, a chip that first made its mark as a graphics accelerator for gaming. As it happens, GPUs have proven remarkably adept at training large foundation models and executing AI inferencing workloads.

Up until now, the big winner in the generative AI GPU game has been Nvidia, thanks to a combination of strong hardware and a large installed base of CUDA software tools. However, at an event in San Francisco this week, AMD came out with new GPU and CPU hardware and important new software partnerships and updates. Together, AMD believes these developments will allow it to secure a larger share of a datacenter AI accelerator market it predicts will reach $150 billion by 2027.

AMD introduced the new Instinct MI300X chip as a dedicated generative AI accelerator. It leverages the same basic chiplet design as the previously announced Instinct MI300A (which AMD is now sampling), but the MI300X replaces the 24 Zen 4 CPU cores of the MI300A with additional CDNA 3 GPU cores and High Bandwidth Memory (HBM).

Indeed, this new chip – with a total of 153 billion transistors – boasts 192 GB of HBM and offers 5.2 TB/second of memory bandwidth. Those numbers represent a 2.4x increase in memory capacity and a 1.6x improvement in throughput compared to Nvidia's current H100 accelerator. While such figures may be overkill for most applications, large language models (LLMs) run most efficiently when they fit entirely in memory, so the extra capacity and bandwidth should translate into solid real-world performance when the chip begins sampling in the third quarter of this year.
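For context on where those multipliers come from: taking Nvidia's publicly listed H100 SXM figures of 80 GB of HBM3 and roughly 3.35 TB/s of bandwidth (numbers that are not part of AMD's presentation, so treat this as a back-of-the-envelope check), the ratios work out as shown below.

```python
# Back-of-the-envelope check of AMD's stated multipliers against Nvidia's
# publicly listed H100 SXM specs (80 GB of HBM3, ~3.35 TB/s). The H100
# figures are assumptions taken from Nvidia's spec sheet, not from AMD's event.
mi300x_memory_gb = 192
mi300x_bandwidth_tb_s = 5.2
h100_memory_gb = 80
h100_bandwidth_tb_s = 3.35

print(f"Memory capacity ratio:  {mi300x_memory_gb / h100_memory_gb:.1f}x")            # ~2.4x
print(f"Memory bandwidth ratio: {mi300x_bandwidth_tb_s / h100_bandwidth_tb_s:.1f}x")  # ~1.6x
```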

On the software front, AMD made several significant announcements. First, the company detailed the latest version of its ROCm platform for AI software development, ROCm 5. The platform comprises low-level libraries, compilers, development tools, and a runtime that lets AI-related workloads run natively on AMD's Instinct line of GPU accelerators, and it serves as the foundation for AI development frameworks like PyTorch, TensorFlow, and ONNX. AMD also announced a new partnership with the PyTorch Foundation at the event: from PyTorch 2.0 onwards, AI models and applications developed with PyTorch will run natively on AMD Instinct accelerators running ROCm 5.4.2.
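To illustrate what native support looks like in practice, here is a minimal sketch assuming a PyTorch 2.x wheel built with ROCm support; on such builds the HIP backend is exposed through PyTorch's existing torch.cuda interface, so device-selection code written for Nvidia GPUs carries over unchanged.

```python
import torch

# On a ROCm build of PyTorch, torch.version.hip is set and AMD Instinct
# accelerators are enumerated through the familiar torch.cuda interface.
print(torch.__version__)          # e.g. a 2.x ROCm wheel
print(torch.version.hip)          # non-None on ROCm builds, None on CUDA builds
print(torch.cuda.is_available())  # True when an Instinct GPU is visible

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = torch.nn.Linear(1024, 1024).to(device)
x = torch.randn(8, 1024, device=device)
print(model(x).shape)             # forward pass runs on the accelerator if present
```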

This development is significant as many AI models are built with PyTorch and, until this announcement, most could only run on Nvidia's GPUs. Now, developers and major cloud computing providers can opt to use AMD Instinct accelerators directly or replace Nvidia accelerators with AMD ones.

Another major software announcement involved Hugging Face, which has rapidly become the go-to site for open-source AI models. As part of their new partnership, Hugging Face will collaborate with AMD to ensure compatibility of existing and new open-source models posted on its site with Instinct accelerators. They also plan to work on compatibility across other AMD processors, including Epyc and Ryzen CPUs, Radeon GPUs, Alveo DPUs, and Versal FPGAs (or adaptive processors, as AMD refers to them). This should help position AMD as a viable alternative to Nvidia in several AI datacenter environments.
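In practical terms, the aim of that compatibility work is for stock Hugging Face code paths to run unchanged on AMD hardware. A minimal sketch, assuming the transformers library on top of a ROCm-enabled PyTorch install (the "gpt2" checkpoint is only a small placeholder, not a model either company singled out):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder for any causal-LM checkpoint on the hub
device = "cuda" if torch.cuda.is_available() else "cpu"  # "cuda" maps to HIP on ROCm builds

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).to(device)

inputs = tokenizer("Generative AI accelerators are", return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```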

On the datacenter CPU front, AMD announced their new "Bergamo" and "Genoa X" versions of their fourth-generation Epyc processors, and hinted at another version, named "Siena," to be announced later this year. Bergamo is optimized for cloud computing workloads, using a new, smaller Zen 4c core so that more of these cores fit onto the chip (up to 128). The refined architecture facilitates operations such as running more containers simultaneously, leading to impressive benchmarks that partners including AWS, Azure, and Meta were keen to discuss.

The Genoa X model combines AMD's 3D V-Cache technology, first introduced with their third-generation "Milan" series, with the fourth-generation Genoa CPU design. It's optimized for technical and high-performance computing (HPC) workloads that require more and faster on-die cache memories.

These CPU developments, as well as variations on the Instinct MI300 accelerator, reflect AMD's growing diversity of designs optimized for specific applications. All these can leverage key AMD technologies, including their chiplet-based design and the Infinity Fabric interconnect technology. It's a prime example of how forward-looking designs can significantly influence overall strategies.

AMD also unveiled the Instinct Platform at their event, which, like Nvidia's similar offering, integrates eight of AMD's GPU-based accelerators (MI300Xs) into a single, compact hardware design.
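From the software side, a platform like that simply shows up as multiple devices to the frameworks. A minimal sketch of enumerating and spreading work across them, again assuming a ROCm build of PyTorch on such a node (DataParallel is used here only for brevity; real training would typically use DistributedDataParallel):

```python
import torch

# Each accelerator in the platform appears as one device index.
n = torch.cuda.device_count()
print(f"Visible accelerators: {n}")
for i in range(n):
    print(i, torch.cuda.get_device_name(i))

# Replicate a small module across all visible devices and run one batch.
if n > 0:
    model = torch.nn.DataParallel(torch.nn.Linear(4096, 4096).cuda())
    out = model(torch.randn(64, 4096).cuda())
    print(out.shape)
```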

To be clear, Nvidia still maintains a significant lead in generative AI training and holds a strong position in inferencing. However, once these new accelerators and software partnerships begin to influence the market, they could make a notable difference for AMD. While many companies appreciate what Nvidia has achieved for generative AI, no one relishes a market dominated by a single entity. Consequently, several companies will likely welcome AMD's emergence as a strong alternative in this sector.

On the datacenter CPU front, AMD has already established itself as a solid alternative to Intel. It's not difficult to imagine the company developing a similar profile in datacenter AI accelerators. While the journey won't be easy, it's bound to make things more exciting.


 
Am I crazy for wanting the form factor of these modules to be used on regular computers/motherboards?
 
I don't understand why AMD and Intel have not pushed harder in the area of replacing CUDA for ML. CUDA is what nearly all the training and inferencing models are built on, and it is what has given Nvidia almost all of the ML marketshare. If AMD and Intel can eliminate this stranglehold, then maybe, just maybe, Nvidia can go back to making GPUs that gamers want and can actually afford.
 
making GPUs that gamers want and can actually afford.
There is no hope for humanity....

Nvidia has succeeded in turning what was an open platform (PC) into a closed one thanks to DLSS and you want more of the same...

Actually, AMD and Intel should sell all of their IPs to Nvidia, close up shop, and we can all rejoice under the unlimited love that Nvidia and Dear Leader Jensen will give us.

Hint: your beloved GPUs will start at US$4k... that's how much he loves us. :)
 
That simple heatsink in the main pic looks very good. I dig simple designs like that. I wish more PC hardware makers kept it simple and minimalistic.
 
There is no hope for humanity....

Nvidia has succeeded in turning what was an open platform (PC) into a closed one thanks to DLSS and you want more of the same...

Actually, AMD and Intel should sell all of their IPs to Nvidia, close up shop, and we can all rejoice under the unlimited love that Nvidia and Dear Leader Jensen will give us.

Hint: your beloved GPUs will start at US$4k... that's how much he loves us. :)
we need to get our glorious leader, Jensen, more leather jackets
 
Am I crazy for wanting the form factor of these modules to be used on regular computers/motherboards?
This design comes with 10-20K RPM fans and a temperature- and moisture-controlled environment. 750 W at this density is not easy to cool down.


 
This design comes with 10-20K RPM fans and a temperature- and moisture-controlled environment. 750 W at this density is not easy to cool down.


Well, I'm not expecting 8 modules like the one used in the article, maybe 2: one for the CPU and one for the GPU.

Also, have you seen some of the cases people are using lately?

Those things have close to ten 140mm fans.

I think they can move enough air with those. :)
 
I don't understand why AMD and Intel have not pushed harder in the area of replacing CUDA for ML. CUDA is what nearly all the training and inferencing models are built on, and it is what has given Nvidia almost all of the ML marketshare. If AMD and Intel can eliminate this stranglehold, then maybe, just maybe, Nvidia can go back to making GPUs that gamers want and can actually afford.
Because ~5yrs ago, generative AI was a curiosity limited to university labs and a few online services that only the terminally online were even aware of. CUDA was mainly there to accelerate graphics rendering, and AMD felt it wasn't enough of a 'killer feature' to warrant the R&D to match it.

But my guess is that 2-3yrs ago, when generative AI started spreading beyond labs and a few internet curiosities, AMD began to take notice of the advantage that CUDA provided for these workloads and started working on the designs they are now putting out for sampling. Note: 2-3yrs (or even 5yrs) is an insanely quick turnaround time from concept to initial manufacturing. Either AMD had this in their back pocket for a while and couldn't justify the extra costs of developing the manufacturing processes until now, or their chiplet designs are so flexible that they can be combined pretty much at will, with the design of the individual chiplets almost entirely decoupled from their integration into a processor.

Either way, yes, hopefully this spreads demand out over AMD as well as Nvidia, and both start competing on price for all markets and segments. I want to see the top-tier GPU dip back below $750, rather than us pretending that pushing $1k is somehow acceptable for mid-tier cards.
 