AMD CPUs and GPUs will power the world's future fastest supercomputer, 10x faster than...

William Gayde

Something to look forward to: The 'El Capitan' supercomputer at Lawrence Livermore National Laboratory (LLNL) will be built with both AMD CPUs and AMD GPUs. It is expected to pack more than 2 exaflops of performance and will come online in early 2023. El Capitan will be a huge leap forward in supercomputing performance, with more compute power than the current top 200 fastest systems combined and roughly 10 times the performance of today's fastest system.

The new system will be maintained by the US Department of Energy's (DOE) National Nuclear Security Administration (NNSA). Its main purpose will be to help model, through simulation and artificial intelligence, how America's existing nuclear weapons stockpile is aging.

In addition to national security workloads, El Capitan will also target some other key areas. These include a partnership with the National Cancer Institute and additional DOE labs to accelerate research into cancer drugs and how certain proteins mutate. El Capitan will also be used in research to help fight climate change.

This system is a big win for both AMD and Hewlett Packard Enterprise (HPE), which designed the system. Supercomputers used to be dominated by Intel CPUs and Nvidia GPUs, but AMD's improvements in both sectors are starting to eat away at that dominance.

El Capitan will use 4th generation EPYC CPUs, codenamed "Genoa," based on the Zen 4 architecture. On the GPU side, it will use Radeon Instinct cards with the 3rd generation Infinity architecture.

The compute hardware will be implemented using Cray's Shasta system and Slingshot interconnect.

This features a 4:1 GPU-to-CPU ratio with local flash storage for improved access speed. To help manage the massive heat generated by such a system, the blades are all individually water cooled. In addition to El Capitan, HPE and the DOE are also working on two other exascale systems, Aurora and Frontier.
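For a rough sense of scale, here is a back-of-the-envelope sizing sketch in Python; the per-GPU throughput and the one-CPU-per-node layout are illustrative assumptions, since the actual specifications have not been published:

```python
# Rough sizing for a 2-exaflop system built from 4-GPU nodes.
# GPU_FLOPS is a hypothetical per-GPU peak; LLNL has not published these figures.

TARGET_FLOPS = 2e18        # > 2 exaflops peak, per the announcement
GPU_FLOPS = 25e12          # assumed 25 TFLOPS peak per GPU (illustrative)
GPUS_PER_NODE = 4          # the 4:1 GPU-to-CPU ratio, one CPU per node

node_flops = GPUS_PER_NODE * GPU_FLOPS   # ignoring the CPU's small contribution
nodes = TARGET_FLOPS / node_flops

print(f"~{nodes:,.0f} nodes, ~{nodes * GPUS_PER_NODE:,.0f} GPUs")
# -> ~20,000 nodes, ~80,000 GPUs under these assumptions
```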


 
Impressive!
I wouldn't consider a Radeon GPU, but there's no doubt AMD is coming around after 15 years of being the little guy; nothing lasts forever. Nvidia still has the best GPUs and there is no close second, but hopefully AMD improves here because Nvidia's pricing is just asinine.
 
nVidia currently has the best GPUs for gaming. However, if you look at BOINC projects that have GPU versions for both AMD and nVidia cards, AMD GPUs process work units up to 10 times as fast as nVidia - especially among consumer cards. I bet that holds true with the professional cards, too.

GPU compute has been AMD's world for a long time, and my bet is that is part of what drove the decision to use AMD GPUs. The compute power of those GPUs is probably a big factor in the total compute power of the system.

I don't hold out much hope for price reductions on GPUs. If AMD should take the lead - however unlikely that may be - the bar that nVidia has set on price will certainly either be met or exceeded by AMD, IMO. Take the price of the TR 3990 for example; IMO, we have Intel to thank for that.
 
AMD GPUs have been far ahead of nVidia GPUs in compute for years.

I wonder what new and exciting developments we will see from AMD with the revenue from such wins.

EDIT: I almost forgot - will it play Crysis?

“GPUs now power five out of the world’s seven fastest systems as well as 17 of the 20 most energy efficient systems on the new Green500 list,” the company remarked, adding that the “majority of computing performance added to the Top500 list comes from Nvidia GPUs.”

The latest Top500 report includes 110 systems with some manner of accelerator and/or co-processor technology, up from 101 six months ago. 98 are equipped with Nvidia chips, seven systems utilize Intel Xeon Phi (coprocessor) technology and four are using PEZY technology. Two systems (ranked 52 and 252) employ a combination of Nvidia and Intel Xeon Phi accelerators/coprocessors. The newly upgraded Tianhe-2A (now in fourth position with 61.44 petaflops, up from 33.86 petaflops), installed at the National Super Computer Center in Guangzhou, employs custom-built Matrix-2000 accelerators. 19 systems now use Xeon Phi as the main processing unit.
 
Interesting.

I speak from experience: I run nVidia GPUs, mainly for GPU Grid. I've surveyed other BOINC projects and the result times were in that range there as well. Anyway, I'm just some joe internet guy, but this speaks volumes - https://www.anandtech.com/show/15422/the-amd-radeon-rx-5600-xt-review/14 - though those are just consumer GPUs.

Your post speaks to the fact that systems have chosen nVidia GPUs in the majority of instances. There had to have been a very good reason for choosing AMD for this machine, and I doubt they would have chosen an inferior GPU for pricing reasons alone.
 

As others mentioned, AMD's compute capabilities far exceed Nvidia's, and therefore AMD IS the better choice here.

Also, NV doesn't have the best GPUs across the board... they have a lead in gaming performance at the 2070 Super tier and above; up to that tier, AMD is holding its ground.
RTX is not really an 'advantage'... it's a proverbial gimmick that one barely notices anyway, given how fast-paced the action usually is in games that support RTX... and it reduces performance by a large amount (for a minimal difference in visuals).

Compute and gaming are two different things.
Larger/faster compute hardware is one of the reasons GCN is more power hungry as a consumer card (that, along with AMD's aggressive factory overvolting to improve the number of functional dies).

I think you may be wondering about AMD improving on the gaming side... but their GPUs are more than adequate there too (even at 2K) if you don't need a top-end GPU (which most people don't).
RDNA 2 will be out later this year, so it will be interesting to see what AMD does with it.

 

Supercomputers have specific workloads, yes. In this case, AMD hardware is required. But as you can see, NVIDIA and Intel are dominating.
 
I love the fact that Nvidia has no competition at the very top.
It means our enemies have to pay through the nose for a tiny 20% performance increase.
They all do the hating while we do the laughing.
 
Absolutely - on systems commissioned a few years ago. No sane person would have built a supercomputer based on an AMD CPU prior to Ryzen (excluding the original Opteron here).

Since Epyc established itself, on the other hand, there don't seem to be (m)any Xeon-based supercomputer announcements.
 
For some raw compute workloads AMD is definitely ahead, but Nvidia is no slouch either. There are plenty of workloads where Nvidia leads (machine learning, for example).
 

But there is still demand for 14nm and 10nm Xeon orders if you look. Also, Xeon orders are likely to keep coming because they are the best fit for the desired job. AMD doesn't have the one chip that rules them all. Intel and Nvidia are still the heavyweights in supercomputing and HPC. Even Xe is getting a mention. AMD has many roadblocks ahead.

Your first paragraph - that's obvious.
Your second paragraph - with new players, of course others will lose out.

Neither paragraph contained anything that added to the conversation. Your comments are basic, but if you have some relevant links you think I should look at, I'd be glad to see them.
 
I agree about the workloads. It would be interesting to hear why they chose AMD hardware.
Agreed. No supercomputer designer in their right mind would have specified any AMD CPU between Opteron and Epyc. The one thing I can think of where AMD is ahead of Intel is that Epyc has 128 PCIe lanes available for use. That means perhaps as many as six GPUs running 16 PCIe lanes each, with some lanes left over for other peripherals, per Epyc CPU.
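A trivial sketch of that lane arithmetic (the six-GPU layout is the hypothetical above, not El Capitan's actual topology):

```python
# Quick check of the PCIe lane budget suggested above.
TOTAL_LANES = 128       # lanes per Epyc socket
LANES_PER_GPU = 16      # a full x16 link per GPU
GPUS = 6                # the count floated above

used = GPUS * LANES_PER_GPU
leftover = TOTAL_LANES - used
print(f"{GPUS} GPUs use {used} lanes; {leftover} left for NICs/NVMe")
# -> 6 GPUs use 96 lanes; 32 left for NICs/NVMe
```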
 
AMD GPUs have been far ahead of nVidia GPUs in compute for years.
Except for the Radeon VII, which matched a $10-20k card in compute for just $700.

edit: I thought he meant far behind, lol - my bad.
Still, the Radeon VII was a misunderstood and mis-advertised card at the time; it could've been marketed for content creation and professional workloads instead of being marketed as a gaming card.
 
Hmmm. I'll have to look for one of those. :)
Yeah, sadly it reached EOL because of low sales and low supply
 

Surprisingly to me, price/performance was a deciding factor on this computer and on another whose name I forget.
 
This article on next platform has good information:
Lawrence Livermore To Surpass 2 Exaflops With AMD Compute

To quote from the article:

“Our workloads are primarily not deep learning models, although we are exploring something we call cognitive simulation, which brings deep learning and other AI models to bear on our workloads by evaluating how they can accelerate our simulations and how they can also improve their accuracy and find where they actually work,” explained de Supinski. “And so for that, we see this system as providing some significant benefits because of those operations. But I think it’s important to understand that the primary goal of this system is large scale physics simulation and not deep learning.”


Current Epyc CPUs already offer those 128 lanes, and at PCIe 4.0 (i.e. twice the bandwidth) to boot. Whether this advantage will still hold in the time frame this system is planned for is questionable.

One advantage is certainly higher density in single-socket systems. Intel is nowhere near that, and by the time they have their own "glue", AMD will already have had several years of experience with it.

Power consumption is next - it can be argued that AMD may still hold this advantage in the future.

Another quote from the article to put the cost of power consumption in perspective:

It costs roughly $1 per watt per year to power a supercomputer in the urban areas where they tend to be installed. So that is $50 million over five years for that incremental 10 megawatts of juice

While gamers may not care about power consumption (it does not really matter financially for a single PC), this is quite different for supercomputers.

So if you need half the power for the same performance, you can save a lot; i.e. a 30 MW system will save $150 million over five years vs. a 60 MW system.
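To put numbers on that, a minimal sketch of the article's $1-per-watt-per-year arithmetic; the wattages are the example figures quoted above, not El Capitan's actual draw:

```python
# The article's rule of thumb: ~$1 per watt per year in urban areas.
COST_PER_WATT_YEAR = 1.0   # USD
YEARS = 5

def five_year_power_cost(megawatts: float) -> float:
    """Electricity cost over five years for a given sustained draw."""
    return megawatts * 1e6 * COST_PER_WATT_YEAR * YEARS

print(f"Extra 10 MW:     ${five_year_power_cost(10):,.0f}")   # $50,000,000
print(f"60 MW vs. 30 MW: ${five_year_power_cost(60) - five_year_power_cost(30):,.0f} saved")
# -> $150,000,000 saved, matching the estimate above
```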

Another advantage may be NUMA between the CPU and GPU. AMD explored this in the past, but this time it may work thanks to the better interconnect.
 
All very interesting.

This article seems to think that AMD is hinting that Genoa will have PCI-e 5.0 https://www.truecosmos.com/amd-epyc-genoa-ddr5-memory-pcie-5-0-protocol/
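For reference, a small sketch of the per-generation doubling being discussed here, for an x16 link (per-lane rates are the standard spec figures after encoding overhead):

```python
# Usable bandwidth of an x16 link per PCIe generation.
PER_LANE_GBS = {
    "PCIe 3.0": 0.985,   # 8 GT/s, 128b/130b encoding
    "PCIe 4.0": 1.969,   # 16 GT/s
    "PCIe 5.0": 3.938,   # 32 GT/s
}

for gen, rate in PER_LANE_GBS.items():
    print(f"{gen} x16: ~{rate * 16:.1f} GB/s per direction")
# -> ~15.8, ~31.5, ~63.0 GB/s: each generation doubles the previous one
```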
 


The Radeon VII didn't do everything well, but it quite excelled in certain workloads, and AMD's Instinct cards do better yet at some of these... So yeah, if one is building a supercomputer and needs a lot of very high-precision compute, well, the last two graphs paint an interesting picture...

[Charts: Sandra scientific GPU performance, medical viewport performance, and LuxMark performance - Radeon VII]
Just look at the FP64 compute compared to the 2080 Ti... it makes the Ti look like a little kid's toy... eh, quantum physics ;) And there's the $5,000 Quadro P6000 down there near the bottom...
[Charts: Sandra 2018 SP3a scientific and financial FP64 (double precision) GPU performance - Radeon VII]
I wonder what Arcturus will be like at compute?
 
Thanks for the backup.
 
Take the price of the TR 3990 for example; IMO, we have Intel to thank for that.

By that logic we also have Intel to thank for $30 AMD processors as well as the AMD Vega Toaster. Absolutely nothing makes better or cheaper GPU Cheesy Toast than AMD.
 
Just look at the FP64 compute compared to the 2080 Ti... it makes the Ti look like a little kid's toy... eh, quantum physics ;) And there's the $5,000 Quadro P6000 down there near the bottom...
A slightly fairer comparison would be the Radeon VII vs the Titan V (3.36 TFLOPS @ 1.75GHz vs 7.45 TFLOPS @ 1.46 GHz), as both products are targeted at FP64 throughput. Not on price, of course, as the AMD card is roughly a third the cost of the Nvidia product!
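For anyone who wants to check those figures: peak FP64 is just shader count × 2 FMA ops per clock × clock speed × the card's FP64 rate. A quick sketch using the commonly cited shader counts and boost clocks:

```python
# Peak FP64 = shaders x 2 ops/clock (FMA) x clock x FP64 rate.
def peak_fp64_tflops(shaders: int, clock_ghz: float, fp64_rate: float) -> float:
    return shaders * 2 * clock_ghz * fp64_rate / 1000

print(f"Radeon VII: {peak_fp64_tflops(3840, 1.750, 1/4):.2f} TFLOPS")  # ~3.36 (1:4 rate)
print(f"Titan V:    {peak_fp64_tflops(5120, 1.455, 1/2):.2f} TFLOPS")  # ~7.45 (1:2 rate)
```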
 