AMD says its MI300X AI accelerator is faster than Nvidia's H100

Alfonso Maruccia

A hot potato: AMD is fighting back against Nvidia's claims about the H100 GPU accelerator, which according to Team Green is faster than the competition. Team Red says Nvidia didn't tell the whole story, and has provided further benchmark results based on industry-standard inferencing workloads.

AMD has finally launched its Instinct MI300X accelerators, a new generation of server GPUs designed to provide compelling performance for generative AI workloads and other high-performance computing (HPC) applications. The MI300X is faster than the H100, AMD said earlier this month, but Nvidia tried to refute its competitor's statements with new benchmarks released a couple of days ago.

Nvidia tested its H100 accelerators with TensorRT-LLM, an open-source library and SDK designed to efficiently accelerate generative AI algorithms. According to the GPU company, with proper optimizations an H100 running TensorRT-LLM was able to complete the workload 2x faster than AMD's MI300X.

AMD is now providing its own version of the story, refuting Nvidia's statements about H100 superiority. According to AMD, Nvidia used TensorRT-LLM on the H100 instead of the vLLM library used in AMD's benchmarks, and compared the FP16 datatype on the Instinct MI300X against the FP8 datatype on the H100. Furthermore, Team Green inverted AMD's published performance data, converting relative latency numbers into absolute throughput.
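To see why that last point matters: latency and throughput are only interchangeable if batch size and every other serving parameter are held equal, so converting a relative-latency chart into absolute throughput quietly bakes in assumptions. A minimal sketch with made-up numbers (these figures are illustrative and do not reproduce either vendor's results):

```python
# Illustrative only: shows how relative latency numbers relate to
# throughput under an assumed fixed batch size. The figures here are
# invented and do not reproduce AMD's or Nvidia's benchmark data.

BATCH_SIZE = 16  # assumed constant across both accelerators


def throughput(latency_s: float, batch: int = BATCH_SIZE) -> float:
    """Requests served per second at a given per-batch latency."""
    return batch / latency_s


# Suppose a vendor reports *relative* latency: GPU B takes 0.7x the
# time of GPU A. Turning that into absolute throughput requires an
# absolute baseline, which a relative chart alone does not provide.
baseline_latency_a = 2.0   # seconds (assumed baseline for GPU A)
relative_latency_b = 0.7   # GPU B's latency as a fraction of A's

latency_b = baseline_latency_a * relative_latency_b
print(f"A: {throughput(baseline_latency_a):.1f} req/s")
print(f"B: {throughput(latency_b):.1f} req/s")
# The throughput ratio equals the inverse latency ratio only when
# batch size and all other serving parameters are held equal.
```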

AMD suggests that Nvidia tried to rig the game, while it is still busy identifying new ways to unlock performance and raw power on Instinct MI300 accelerators. The company provided the latest performance levels achieved by the Llama 2 70B chatbot model on the MI300X, showing an even larger edge over Nvidia's H100.

By using the vLLM inference library for both accelerators, the MI300X was able to achieve 2.1x the performance of the H100, thanks to the latest optimizations in AMD's ROCm software stack. The company had highlighted a 1.4x advantage over the H100 (with an equivalent datatype and library setup) earlier in December. vLLM was chosen for its broad adoption within the community and its ability to run on both GPU architectures.
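For context, part of vLLM's appeal is that the same Python serving code runs on both CUDA and ROCm builds of the library. Below is a minimal sketch of this kind of workload; the model name and parameters are illustrative assumptions, not AMD's actual benchmark harness:

```python
# Minimal vLLM inference sketch. Runs on either an Nvidia (CUDA) or
# AMD (ROCm) build of vLLM; the Python API is identical on both.
# Model and parameters are illustrative, not AMD's benchmark config.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-2-70b-chat-hf",  # a 70B model needs multiple GPUs
    tensor_parallel_size=8,                  # shard across 8 accelerators
    dtype="float16",                         # FP16, as in AMD's comparison
)

params = SamplingParams(temperature=0.8, max_tokens=128)
outputs = llm.generate(["Explain what an AI accelerator does."], params)
for out in outputs:
    print(out.outputs[0].text)
```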

Even when pitting TensorRT-LLM on the H100 against vLLM on the MI300X, AMD still measured a 1.3x latency advantage. And when the H100 ran lower-precision FP8 with TensorRT-LLM while the MI300X ran higher-precision FP16 with vLLM, AMD's accelerator was seemingly still able to demonstrate an advantage in absolute latency.
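Latency in these comparisons is essentially wall-clock time for a request of fixed prompt and output shape. A rough sketch of how one might time a single request, reusing the llm and params objects from the sketch above (this is illustrative, not the methodology either vendor published):

```python
# Rough single-request latency measurement around vLLM's generate().
# Illustrative only; real benchmarks control prompt length, output
# length, batch size, and warm-up, none of which is modeled here.
import time


def time_request(llm, prompt: str, params) -> float:
    """Return wall-clock seconds for one generate() call."""
    start = time.perf_counter()
    llm.generate([prompt], params)
    return time.perf_counter() - start


# Warm-up run first so one-time compilation/caching doesn't skew the
# measurement, then time the real request:
# time_request(llm, "warm-up", params)
# latency = time_request(llm, "What is the MI300X?", params)
# print(f"end-to-end latency: {latency:.2f} s")
```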

vLLM doesn't support FP8, AMD explained, and the FP16 datatype was chosen for its popularity. AMD said its results show that the MI300X running FP16 is comparable to the H100 even at the latter's best performance settings, with the FP8 datatype and TensorRT-LLM.


 
There are just too many variables at play. Everybody is picking something that works best for them.
That was my thought, different workloads work better on different architectures. AMD is only the second big player in the AI GPU space, and there will be many more to follow. Nvidia's monopoly on AI workloads will soon be at an end. I'm sure they'll still be a dominant player, but they will be far from the only player.
 
Well, it doesn't matter; the industry will buy all they can anyway, since there are no other competitors in the field besides Nvidia and AMD when it comes to top-of-the-line AI accelerators.
 
Intel was claiming they will have the bee's knees soon. I think Google is working on something too (they already have some experience designing this stuff and, like the other big guys, have bought out emerging players).
 
Google uses its own tensor processing unit.
Basically, it's easier to design a compute GPU than a gaming GPU, which is why:
a. Google has its own TPU and Microsoft is making its own AI GPU
b. Intel Arc has a higher FP32 compute spec than the RX 6800 but is way below it in gaming performance
c. AMD is still behind Nvidia in gaming driver optimization