Intel says its Gaudi2 accelerator is more than a match for the Nvidia A100


Posts: 632   +123
Staff member
In brief: Intel has drummed up a rivalry between its new Gaudi2 accelerator and the now two-year-old market leader, the Nvidia A100. In two benchmarks suited to its niche, the new gaudily named accelerator pulls ahead.

Gaudi2 is made for Intel by Habana Labs, an Israeli company that it acquired at the end of 2019 for $2 billion. Habana makes two types of specialized accelerators: some for training neural networks, like Gaudi2, and others for running them (i.e., "inferencing"), such as Goya and Greco.


Habana and Intel launched Gaudi2 in May but waited until last week to upload its benchmark scores to the public MLPerf database. In their graphs, they compare the scores of their Gaudi2 system against the public scores of A100-equipped systems from Nvidia and Dell.

ResNet-50 tests hardware's ability to train an AI to classify images. Habana's Gaudi2 system took just 18 minutes to train the AI well enough for it to pass the test, easily surpassing Nvidia's A100 system, which needed almost half an hour.

Habana's Gaudi2 system took just 17 minutes to train the BERT model, beating Nvidia's A100 system's time by about a minute. BERT is a natural language processing model, and in this test, it trains itself with Wikipedia articles.
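To put the two results side by side, here is a rough sketch of the speedup arithmetic. The Gaudi2 times come from the figures above; the A100 times are approximations read off the wording ("almost half an hour" is treated as an upper bound of 30 minutes, and "about a minute" slower as 18 minutes), so the ratios are ballpark values, not official MLPerf numbers.

```python
def speedup(baseline_minutes: float, candidate_minutes: float) -> float:
    """Return how many times faster the candidate is than the baseline."""
    return baseline_minutes / candidate_minutes


# Gaudi2 training times in minutes, as quoted in the article.
GAUDI2_RESNET_MIN = 18.0
GAUDI2_BERT_MIN = 17.0

# A100 times are ASSUMED approximations, not exact MLPerf scores:
# "almost half an hour" is capped at 30, "about a minute" slower gives 18.
A100_RESNET_MIN_UPPER = 30.0
A100_BERT_MIN_APPROX = 18.0

resnet_speedup = speedup(A100_RESNET_MIN_UPPER, GAUDI2_RESNET_MIN)
bert_speedup = speedup(A100_BERT_MIN_APPROX, GAUDI2_BERT_MIN)

print(f"ResNet-50: at most roughly {resnet_speedup:.2f}x faster")
print(f"BERT: roughly {bert_speedup:.2f}x faster")
```

On these rounded inputs, the ResNet-50 lead is substantial while the BERT lead is only a few percent.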

For both benchmarks, all the systems used eight accelerators/GPUs. Habana's system paired its accelerators with two 40-core Intel Xeon 8380 CPUs, while Nvidia's used two 64-core AMD Epyc 7742 CPUs.


Gaudi2 features 24 TPCs (tensor processor cores) and two MMEs (matrix multiplication engines) that run partially in parallel. It supports a broad array of data types, including FP32, TF32, BF16, FP16, and FP8. It also has a dedicated media engine for processing audio and visual media as inputs.

For memory, Gaudi2 has six 16 GB stacks of HBM2e, for a total of 96 GB of capacity and 2.45 TB/s of memory bandwidth. Inside, it has a 48 MB cache. For connectivity, it uses a PCIe 4.0 x16 connection and has 24 100 Gbps RoCE v2 (RDMA over Converged Ethernet, version 2) ports.
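The memory and connectivity figures can be sanity-checked with some quick arithmetic. The sketch below assumes each Ethernet port runs at 100 Gbps (100 Gigabit Ethernet), and uses the standard PCIe 4.0 parameters of 16 GT/s per lane with 128b/130b encoding; the variable names are purely illustrative.

```python
# Memory: six HBM2e stacks of 16 GB each.
HBM_STACKS = 6
GB_PER_STACK = 16
TOTAL_MEM_BANDWIDTH_TBPS = 2.45  # TB/s, as quoted

total_hbm_gb = HBM_STACKS * GB_PER_STACK
per_stack_bandwidth_tbps = TOTAL_MEM_BANDWIDTH_TBPS / HBM_STACKS

# Networking: 24 RoCE ports, ASSUMED to be 100 Gbps each.
ROCE_PORTS = 24
GBPS_PER_PORT = 100

roce_total_gbps = ROCE_PORTS * GBPS_PER_PORT
roce_total_gb_per_s = roce_total_gbps / 8  # bits to bytes

# Host link: PCIe 4.0 x16, 16 GT/s per lane, 128b/130b line encoding.
PCIE_LANES = 16
PCIE_GT_PER_LANE = 16
pcie_gb_per_s = PCIE_LANES * PCIE_GT_PER_LANE * (128 / 130) / 8  # per direction

print(f"HBM2e capacity: {total_hbm_gb} GB")
print(f"Bandwidth per HBM2e stack: {per_stack_bandwidth_tbps:.3f} TB/s")
print(f"Aggregate RoCE bandwidth: {roce_total_gbps} Gbps ({roce_total_gb_per_s:.0f} GB/s)")
print(f"PCIe 4.0 x16: {pcie_gb_per_s:.1f} GB/s per direction")
```

Under these assumptions, the on-board Ethernet offers far more aggregate bandwidth than the PCIe host link, which is consistent with a design that scales out across accelerators over the network.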


Habana has clearly created a real A100 competitor for Intel. Its timing could be better, given that Nvidia announced the H100 three months ago, but the two are such different products that even though they might compete in benchmarks, they might not really be competing for motherboard slots.

Whereas the A100 and H100 are versatile behemoths, Gaudi2 is a streamlined accelerator trying to do something different, and it'll be fascinating to see whether it's successful or not.



Hooda Thunkett

Posts: 22   +30
Don't know which one is worse, people still believing Intel's claims or sites spreading their false claims.

I say false because, as stated, Intel always exaggerates its numbers, either by cheating or by straight-up lying.
I mean, paying companies not to carry your competitors products, and then spreading Fear, Uncertainty, and Doubt (FUD) about them is fair and completely within corporate ethics, right? /S


Posts: 4,983   +6,466
Comparing a product that is dedicated to such tasks with more general hardware and barely winning in cherry picked tests... ouch.
Frankly if the product is cheaper and has better performance per watt then data centers who buy several thousand of these won't care.

Mr Majestyk

Posts: 1,570   +1,477
Can we run an article in 6 months to see if it's actually being sold? Ponte Vecchio is barely in the wild and Intel is hyping its successor already. Arc is now about 12 months late and still no criticism. Media love to do Intel's PR. Maybe we should have an Arrow Lake article to really ram home how great Intel is.