ChatGPT was made possible thanks to tens of thousands of Nvidia GPUs, which Microsoft...

Daniel Sims

Forward-looking: A new report has revealed the enormous number of Nvidia GPUs Microsoft used, and the engineering it took to arrange them, to help OpenAI train ChatGPT. The news comes as Microsoft announces a significant upgrade to its AI supercomputer to further its own generative AI initiatives.

According to Bloomberg, OpenAI trained ChatGPT on a supercomputer Microsoft built from tens of thousands of Nvidia A100 GPUs. Microsoft announced a new array utilizing Nvidia's newer H100 GPUs this week.

The challenge facing the companies began in 2019, after Microsoft invested $1 billion in OpenAI and agreed to build an AI supercomputer for the startup. However, Microsoft didn't have in-house hardware that could do what OpenAI needed.

After acquiring Nvidia's chips, Microsoft had to rethink how it arranged such a massive number of GPUs to prevent overheating and power outages. The company won't say precisely how much the endeavor cost, but executive vice president Scott Guthrie put the number above several hundred million dollars.

Also read: Has Nvidia won the AI training market?

Simultaneously running all the A100s forced Redmond to reconsider how it placed them and their power supplies. It also had to develop new software to increase efficiency, ensure the networking equipment could withstand massive amounts of data, design new cable trays it could manufacture independently, and use multiple cooling methods. Depending on the local climate, the cooling techniques included evaporation, swamp coolers, and outside air.

Since the initial success of ChatGPT, Microsoft and some of its rivals have started work on similar AI models for search engines and other applications. To speed up its generative AI efforts, the company has introduced the ND H100 v5 VM, a virtual machine that can scale from eight to thousands of Nvidia H100 GPUs.
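
As a rough sketch of what a single such VM looks like from inside the guest, the snippet below enumerates the visible GPUs. It assumes PyTorch with CUDA support is installed on the instance; the device names and counts it prints are illustrative, not taken from the article.

```python
# Minimal sketch: enumerate the GPUs a single ND H100 v5 guest exposes.
# Assumes PyTorch with CUDA support is installed; on an 8-GPU VM this
# should report eight H100-class devices.
import torch

if not torch.cuda.is_available():
    raise SystemExit("No CUDA devices visible to this process")

count = torch.cuda.device_count()
print(f"Visible GPUs: {count}")

for i in range(count):
    props = torch.cuda.get_device_properties(i)
    # total_memory is reported in bytes; convert to GiB for readability
    print(f"  GPU {i}: {props.name}, {props.total_memory / 2**30:.0f} GiB")
```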

The H100s connect through NVSwitch and NVLink 4.0, with 3.6 TB/s of bisection bandwidth among the eight local GPUs within each virtual machine. Each GPU gets 400 Gb/s of bandwidth through Nvidia Quantum-2 CX7 InfiniBand and a 64 GB/s PCIe Gen5 connection, and each virtual machine manages 3.2 Tb/s through a non-blocking fat-tree network. Microsoft's new system also features 4th-generation Intel Xeon processors and 16-channel 4,800 MHz DDR5 RAM.
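
The per-VM fabric figure follows from the per-GPU one: eight GPUs at 400 Gb/s each works out to 3.2 Tb/s. A quick sanity check of the quoted numbers (unit conversions only; note the NVLink figure is in terabytes per second, the InfiniBand figures in terabits):

```python
# Back-of-the-envelope check of the quoted ND H100 v5 figures.
GPUS_PER_VM = 8
IB_PER_GPU_GBPS = 400          # Quantum-2 CX7 InfiniBand per GPU (gigabits/s)
NVLINK_BISECTION_TBPS = 3.6    # NVLink/NVSwitch bisection bandwidth per VM (terabytes/s)
PCIE5_X16_GBYTES = 64          # PCIe Gen5 x16 host link per GPU (gigabytes/s)

ib_per_vm_tbps = GPUS_PER_VM * IB_PER_GPU_GBPS / 1000   # gigabits -> terabits
print(f"InfiniBand per VM: {ib_per_vm_tbps:.1f} Tb/s")  # 3.2 Tb/s, matching the fat-tree figure

print(f"NVLink bisection per VM: {NVLINK_BISECTION_TBPS} TB/s "
      f"(= {NVLINK_BISECTION_TBPS * 8} Tb/s)")
print(f"PCIe Gen5 x16 per GPU: {PCIE5_X16_GBYTES} GB/s")
```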

Microsoft plans to use the ND H100 v5 VM for its new AI-powered Bing search engine, the Edge web browser, and Microsoft Dynamics 365. The virtual machine is available now in preview and will become a standard offering in the Azure portfolio. Prospective users can request access.


 
Just to be clear, machine learning won't come cheap: millions of hours of supervised learning, plus billions in infrastructure. Ideally, these models have to be retrained regularly to fit our specific needs: new information, new science, etc...
 
"According to Bloomberg, OpenAI trained ChatGPT on a supercomputer Microsoft built from tens of thousands of Nvidia A100 GPUs. Microsoft announced a new array utilizing Nvidia's newer H100 GPUs this week"

I can't get "tens of thousands" out of my head. Who else had the finances to burn to make nGreedia's datacenter revenue that big?

James Cameron got the name wrong; it's not Skynet that will bring us down.
 
Actually, no. The A100 has no graphics rendering; it is strictly a compute card. The same goes for the H100.
Technically, they could, despite having no video output sockets. All (and I use that word very loosely) that would be required is for the drivers to be fully rewritten to support Direct3D, and for the game to be recoded to copy the final frame buffer to system memory and then display it through the video output of the host machine (because it will have one). However, an RTX 4090 would leave them both in the dust for game rendering.
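
To make that frame-buffer detour concrete, here is a toy sketch of just the copy-back step. It assumes PyTorch with CUDA; the blank 4K RGBA buffer stands in for a rendered frame, it isn't output from any real renderer.

```python
import torch

if not torch.cuda.is_available():
    raise SystemExit("Needs a CUDA-capable GPU")

device = torch.device("cuda:0")

# Pretend this is the finished frame sitting in GPU memory (3840x2160, RGBA8).
frame_gpu = torch.zeros((2160, 3840, 4), dtype=torch.uint8, device=device)

# The device-to-host copy described above: pull the pixels back over PCIe so
# the host's own display path (whatever it is) can present them.
frame_cpu = frame_gpu.cpu().numpy()
print(frame_cpu.shape, f"{frame_cpu.nbytes / 2**20:.1f} MiB per frame")
```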
 
A-haaaaa..... so, that's where all the GPUs have disappeared. We were blaming the price increase on crypto gold diggers, while all the time it was Microsoft !!
 
"Technically, they could do, despite having no video output sockets. All (and I use that word very loosely) that would be required is for the drivers to be fully rewritten to support Direct3D and then the game recoded to copy the final frame buffer to system memory, and then use the video output of the mainframe used (because it will have one) to display the frame. However, an RTX 4090 would leave them both in the dust for game rendering."
It doesn't even provide graphics rendering for virtualization in a VDI cluster. Again, strictly a compute engine. I just had a conference call with nVidia about this very issue, since we need both visualization and FP64 compute for our simulations. I understand what you are saying, but that is not really feasible. I am not happy that cards costing tens of thousands of dollars cannot perform simple visualization in addition to the FP64 compute. Cards that can do visualization, such as an A16 or L40, are great at FP32 and have enough RAM for CUDA, but smoothed-particle hydrodynamics needs FP64 :( And putting two different cards in a single VDI node, say an L40 and an A100, is not possible either, according to nVidia.
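
For anyone wondering why FP64 matters here: single precision starts dropping small contributions once the running total gets large, which is the kind of accumulation an SPH density or force loop performs. A toy NumPy illustration of the rounding effect only, not an SPH code:

```python
import numpy as np

# One million unit-sized contributions added to a large running total,
# loosely mimicking an SPH-style accumulation.
base, tiny, n = 1.0e8, 1.0, 1_000_000

acc32 = np.float32(base)
for _ in range(n):
    # 1.0 is below half the float32 spacing (~8) at 1e8, so it rounds away.
    acc32 += np.float32(tiny)

acc64 = np.float64(base) + n * np.float64(tiny)

print(f"float32: {float(acc32):,.1f}")   # 100,000,000.0 -- the contributions vanish
print(f"float64: {acc64:,.1f}")          # 101,000,000.0
```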
 