ChatGPT was made possible thanks to tens of thousands of Nvidia GPUs, which Microsoft...

Daniel Sims

Forward-looking: A new report has revealed the enormous number of Nvidia GPUs Microsoft used, and the engineering it took to arrange them, to help OpenAI train ChatGPT. The news comes as Microsoft announces a significant upgrade to its AI supercomputer to further its own generative AI initiatives.

According to Bloomberg, OpenAI trained ChatGPT on a supercomputer Microsoft built from tens of thousands of Nvidia A100 GPUs. Microsoft announced a new array utilizing Nvidia's newer H100 GPUs this week.

The challenge facing the companies began in 2019, after Microsoft invested $1 billion in OpenAI and agreed to build an AI supercomputer for the startup. However, Microsoft didn't have in-house hardware that could do what OpenAI needed.

After acquiring Nvidia's chips, Microsoft had to rethink how it arranged such a massive number of GPUs to prevent overheating and power outages. The company won't say precisely how much the endeavor cost, but executive vice president Scott Guthrie put the number above several hundred million dollars.

Also read: Has Nvidia won the AI training market?

Simultaneously running all the A100s forced Redmond to reconsider how it placed them and their power supplies. It also had to develop new software to increase efficiency, ensure the networking equipment could withstand massive amounts of data, design new cable trays it could manufacture independently, and use multiple cooling methods. Depending on the local climate, the cooling techniques included evaporation, swamp coolers, and outside air.

Since the initial success of ChatGPT, Microsoft and some of its rivals have started work on similar AI models for search engines and other applications. To speed up its generative AI efforts, the company has introduced the ND H100 v5 VM, a virtual machine that can scale from eight to thousands of Nvidia H100 GPUs.
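
As a rough sketch of what a single such VM looks like from inside the guest, the snippet below enumerates the visible GPUs. It assumes PyTorch with CUDA support is installed on the instance; the device names and counts it prints are illustrative, not taken from the article.

```python
# Minimal sketch: enumerate the GPUs a single ND H100 v5 guest exposes.
# Assumes PyTorch with CUDA support is installed; on an 8-GPU VM this
# should report eight H100-class devices.
import torch

if not torch.cuda.is_available():
    raise SystemExit("No CUDA devices visible to this process")

count = torch.cuda.device_count()
print(f"Visible GPUs: {count}")

for i in range(count):
    props = torch.cuda.get_device_properties(i)
    # total_memory is reported in bytes; convert to GiB for readability
    print(f"  GPU {i}: {props.name}, {props.total_memory / 2**30:.0f} GiB")
```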

The H100s connect through NVSwitch and NVLink 4.0, with 3.6 TB/s of bisection bandwidth among the eight local GPUs within each virtual machine. Each GPU gets 400 Gb/s of bandwidth through Nvidia Quantum-2 CX7 InfiniBand and a 64 GB/s PCIe Gen5 connection, and each virtual machine manages 3.2 Tb/s through a non-blocking fat-tree network. Microsoft's new system also features 4th-generation Intel Xeon processors and 16-channel 4,800 MHz DDR5 RAM.
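
The per-VM fabric figure follows from the per-GPU one: eight GPUs at 400 Gb/s each works out to 3.2 Tb/s. A quick sanity check of the quoted numbers (unit conversions only; note the NVLink figure is in terabytes per second, the InfiniBand figures in terabits):

```python
# Back-of-the-envelope check of the quoted ND H100 v5 figures.
GPUS_PER_VM = 8
IB_PER_GPU_GBPS = 400          # Quantum-2 CX7 InfiniBand per GPU (gigabits/s)
NVLINK_BISECTION_TBPS = 3.6    # NVLink/NVSwitch bisection bandwidth per VM (terabytes/s)
PCIE5_X16_GBYTES = 64          # PCIe Gen5 x16 host link per GPU (gigabytes/s)

ib_per_vm_tbps = GPUS_PER_VM * IB_PER_GPU_GBPS / 1000   # gigabits -> terabits
print(f"InfiniBand per VM: {ib_per_vm_tbps:.1f} Tb/s")  # 3.2 Tb/s, matching the fat-tree figure

print(f"NVLink bisection per VM: {NVLINK_BISECTION_TBPS} TB/s "
      f"(= {NVLINK_BISECTION_TBPS * 8} Tb/s)")
print(f"PCIe Gen5 x16 per GPU: {PCIE5_X16_GBYTES} GB/s")
```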

Microsoft plans to use the ND H100 v5 VM for its new AI-powered Bing search engine, the Edge web browser, and Microsoft Dynamics 365. The virtual machine is available now in preview and will become a standard offering in the Azure portfolio. Prospective users can request access.


 
Just to be clear, machine learning won't come cheap: millions of hours of supervised learning, plus billions in infrastructure. Ideally, these models have to be retrained regularly to fit our specific needs: new information, new science, etc...
 
"According to Bloomberg, OpenAI trained ChatGPT on a supercomputer Microsoft built from tens of thousands of Nvidia A100 GPUs. Microsoft announced a new array utilizing Nvidia's newer H100 GPUs this week"

I can't get "tens of thousands" out of my head. Who else had the finances to burn to make nGreedia's datacenter revenue that big?

James Cameron got the name wrong; it's not Skynet that will bring us down.
 
Actually, no. The A100 has no graphics rendering; it is strictly a compute card. The same goes for the H100.
Technically, they could, despite having no video output sockets. All (and I use that word very loosely) that would be required is for the drivers to be fully rewritten to support Direct3D, and for the game to be recoded to copy the final frame buffer to system memory and then display it through the video output of the host machine (because it will have one). However, an RTX 4090 would leave them both in the dust for game rendering.
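
To make that frame-buffer detour concrete, here is a toy sketch of just the copy-back step. It assumes PyTorch with CUDA; the blank 4K RGBA buffer stands in for a rendered frame, it isn't output from any real renderer.

```python
import torch

if not torch.cuda.is_available():
    raise SystemExit("Needs a CUDA-capable GPU")

device = torch.device("cuda:0")

# Pretend this is the finished frame sitting in GPU memory (3840x2160, RGBA8).
frame_gpu = torch.zeros((2160, 3840, 4), dtype=torch.uint8, device=device)

# The device-to-host copy described above: pull the pixels back over PCIe so
# the host's own display path (whatever it is) can present them.
frame_cpu = frame_gpu.cpu().numpy()
print(frame_cpu.shape, f"{frame_cpu.nbytes / 2**20:.1f} MiB per frame")
```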
 
A-haaaaa..... so, that's where all the GPUs have disappeared. We were blaming the price increase on crypto gold diggers, while all the time it was Microsoft !!
 
"Technically, they could do, despite having no video output sockets. All (and I use that word very loosely) that would be required is for the drivers to be fully rewritten to support Direct3D and then the game recoded to copy the final frame buffer to system memory, and then use the video output of the mainframe used (because it will have one) to display the frame. However, an RTX 4090 would leave them both in the dust for game rendering."
It doesn't even provide graphics rendering for virtualization in a VDI cluster. Again, strictly a compute engine. I just had a conference call with nVidia about this very issue, since we need both visualization and FP64 compute for our simulations. I understand what you are saying, but that is not really feasible. I am not happy that cards costing tens of thousands of dollars cannot perform simple visualization in addition to the FP64 compute. Cards that can do visualization, such as an A16 or L40, are great at FP32 and have enough RAM for CUDA, but smoothed-particle hydrodynamics needs FP64 :( And putting two different cards in a single VDI node, say an L40 and an A100, is not possible either, according to nVidia.
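
For anyone wondering why FP64 matters here: single precision starts dropping small contributions once the running total gets large, which is the kind of accumulation an SPH density or force loop performs. A toy NumPy illustration of the rounding effect only, not an SPH code:

```python
import numpy as np

# One million unit-sized contributions added to a large running total,
# loosely mimicking an SPH-style accumulation.
base, tiny, n = 1.0e8, 1.0, 1_000_000

acc32 = np.float32(base)
for _ in range(n):
    # 1.0 is below half the float32 spacing (~8) at 1e8, so it rounds away.
    acc32 += np.float32(tiny)

acc64 = np.float64(base) + n * np.float64(tiny)

print(f"float32: {float(acc32):,.1f}")   # 100,000,000.0 -- the contributions vanish
print(f"float64: {acc64:,.1f}")          # 101,000,000.0
```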
 