Nvidia's first Ampere GPU is a silicon monster for AI and servers

nanoguy

Posts: 1,355   +27
Staff member
Forward-looking: Nvidia's Ampere architecture has finally arrived, even though we're still months away from consumer cards we can use in our gaming PCs. The company says this is a significant leap over its previous architectures: the new A100 chip is built on a 7nm process and is, in some AI workloads, as much as 20 times faster than the Volta-based Tesla V100.

Ampere hype has been at its highest levels of late, with many getting excited about Nvidia's next-gen GPU architecture and the improvements it may bring. But when CEO Jensen Huang was shown in a video pulling a very heavy piece of kit out of his oven, it didn't look like consumer-grade hardware.

Today, Nvidia officially unveiled its next-generation Ampere GPU architecture, which is coming to servers and supercomputers first in the form of the A100, a GPU designed for cloud computing, AI, and scientific number crunching. For those of you expecting the GeForce RTX 3080 to make an appearance, that's still months away.

The company says the A100 is the biggest generational leap for its GPUs, with 20 times the speed of the previous Volta-based solution and third-generation Tensor cores. This is a silicon beast that has 54 billion transistors and offers 6,912 CUDA cores. And, as expected, Nvidia's new Ampere GPU is built on a 7nm process.

One of the biggest advantages of the new chip is that it can reduce costs for big data centers. Nvidia says that a system that costs $11 million today, spans 25 racks of servers, and draws 630 kW of power can be replaced by an Ampere system that fits in a single rack, costs $1 million, and draws 28 kW.
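Taken at face value, Nvidia's example works out to roughly an 11x cost and 22x power reduction. A quick sanity check of the arithmetic:

```python
# Ratios implied by Nvidia's data-center example above.
old = {"cost_usd": 11e6, "racks": 25, "power_kw": 630}
new = {"cost_usd": 1e6, "racks": 1, "power_kw": 28}

for key in old:
    print(f"{key}: {old[key] / new[key]:.1f}x reduction")
# cost_usd: 11.0x, racks: 25.0x, power_kw: 22.5x
```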

Such a system is based on what Huang pulled out of his home oven: a monstrous HGX motherboard that packs in eight A100 GPUs along with 30,000 discrete components and a kilometer of wire traces, and weighs around 50 pounds. This makes it one of the most complex motherboards out there, and Nvidia uses it in the DGX A100 system, which delivers 5 petaflops of AI compute performance and 320 GB of GPU memory with 12.4 TB per second of bandwidth in a relatively small package.
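Dividing those totals across the eight A100s gives the per-GPU figures; a trivial sketch:

```python
# Per-GPU share of the DGX A100 totals quoted above (8x A100).
gpus = 8
print(320 / gpus)   # 40 GB of HBM2 memory per GPU
print(12.4 / gpus)  # ~1.55 TB/s of memory bandwidth per GPU
```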

As for the consumer-grade hardware based on Ampere, Huang explained that Nvidia will configure the chip a bit differently. For instance, the A100 was designed for heavy double-precision floating-point compute, with most of the 54 billion transistors going towards Tensor cores and FP64 units, which deliver 19.5 teraflops and 9.7 teraflops of FP64 performance, respectively. Consumer-oriented Ampere GPUs will be biased towards graphics and less towards compute.
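Those throughput figures line up with the published core counts. A minimal sanity check, assuming the publicly listed boost clock of roughly 1.41 GHz (the keynote itself didn't confirm clocks) and one fused multiply-add (2 FLOPs) per core per cycle:

```python
# Back-of-the-envelope check of the quoted A100 throughput numbers.
def tflops(cores: int, clock_ghz: float, flops_per_cycle: int = 2) -> float:
    return cores * clock_ghz * flops_per_cycle / 1e3

fp32_cores = 6912             # CUDA core count quoted in the article
fp64_cores = fp32_cores // 2  # A100 pairs one FP64 unit with every two FP32 units

print(f"FP32: {tflops(fp32_cores, 1.41):.1f} TFLOPS")  # ~19.5
print(f"FP64: {tflops(fp64_cores, 1.41):.1f} TFLOPS")  # ~9.7
```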

The GTC 2020 keynote and press releases reveal little about clock speeds, but we do know that Ampere cards will support PCIe 4.0, and the move to 7nm means Nvidia can pack in a lot more RT cores for improved ray tracing performance. Judging from the DLSS 2.0 presentation, there are significant improvements on the software front as well.


 
I'm not really interested in the numbers and math.

All I need to see, as the average consumer, is the capabilities and the price tag.

I just want to know how much a 3080 Ti will cost, and when I can buy one.

I've been playing DCS lately and I love being able to crank all my graphics settings up to maximum. And I don't obsess over the fps counter as long as it looks good to my eyes while playing.
 
And yes, I'm going to say it again: I'm looking forward to the full desktop variants hitting laptops. They'll probably arrive right around the time I'm ready to replace my current one (July 2021). Maybe the first ones will hit as soon as the holidays, and then later we'll start to see the "Desktop Replacements" drop. 200+ watt goodness. Next gen, I would like to see the full-wattage Ti versions (or the AMD equivalent) hit laptops, but the cooling will have to be exquisite. (Looking at you, MSI.)

Well. There is my list.
 
How is it that Nvidia can't show off their consumer products? We all know at least one of them has to be done or close to it. They could have shown something off, but they'll likely hold their bs conference in June or July like the rest are doing. Sad that Nvidia didn't do more or show more when they could have.
 
And yes, I'm going to say it again: I'm looking forward to the full desktop variants hitting laptops. They'll probably arrive right around the time I'm ready to replace my current one (July 2021). Maybe the first ones will hit as soon as the holidays, and then later we'll start to see the "Desktop Replacements" drop. 200+ watt goodness. Next gen, I would like to see the full-wattage Ti versions (or the AMD equivalent) hit laptops, but the cooling will have to be exquisite. (Looking at you, MSI.)

Well. There is my list.
You're going to be waiting a while on that list. The closest thing to a desktop graphics card will be the Max-Q versions, and I think they're like 20% less performance than the desktop part. Laptops don't want to be big and bulky anymore. You won't truly ever get a desktop-performance card in a laptop, mainly due to cooling. At least not one that will be lightweight.
 
How is it that Nvidia can't show off their consumer products? We all know at least one of them has to be done or close to it. They could have shown something off, but they'll likely hold their bs conference in June or July like the rest are doing. Sad that Nvidia didn't do more or show more when they could have.

I don't know why so many people seemed to think consumer cards were going to be shown off here. It's been clear for months now they only intended to show off professional products at GTC.
 
Personally I applaud them for not showing off consumer cards they are not close to being ready to sell. I'm hoping for a real launch at which I can immediately buy a card from the vendor of my choice, vs. spending weeks trying to find someone with one in stock.
 
I don't know why so many people seemed to think consumer cards were going to be shown off here. It's been clear for months now they only intended to show off professional products at GTC.
I don't know, maybe because sites like these kept saying there was a good chance of that happening. While I had hope, I also knew Nvidia would likely show them off with details in June or July, with a release in Aug. or Sept.
 
Personally I applaud them for not showing off consumer cards they are not close to being ready to sell. I'm hoping for a real launch at which I can immediately buy a card from the vendor of my choice, vs. spending weeks trying to find someone with one in stock.
They will show them off sooner than you think. They are also a lot closer to being done, too, at least one or two of the cards. I believe they will launch in Aug. and/or Sept.
 
I don't know, maybe because sites like these kept saying there was a good chance of that happening. While I had hope, I also knew Nvidia would likely show them off with details in June or July, with a release in Aug. or Sept.

I don't believe it was Steve or Tim writing those articles. Follow the Hardware Unboxed channel on YouTube; their predictions have been pretty solid.
 
You're going to be waiting a while on that list. The closest thing to a desktop graphics card will be the Max-Q versions, and I think they're like 20% less performance than the desktop part. Laptops don't want to be big and bulky anymore. You won't truly ever get a desktop-performance card in a laptop, mainly due to cooling. At least not one that will be lightweight.
And you'll be paying $4,000+ for the privilege of owning a microwave lap warmer disguised as a laptop. The 2080 did finally get a full-fat desktop part, and it is a nightmare to cool in any capacity, still slower than the full desktop version, and there's no 2080 Ti mobile at all, for obvious reasons.
 
You're going to be waiting a while on that list. The closest thing to a desktop graphics card will be the Max-Q versions, and I think they're like 20% less performance than the desktop part. Laptops don't want to be big and bulky anymore. You won't truly ever get a desktop-performance card in a laptop, mainly due to cooling. At least not one that will be lightweight.
History predicts that there will, in fact, be desktop-replacement 3000-series GPUs.
I have an MSI GT76 Titan with a full desktop 9900K @ 5 GHz and a 205 watt 2080. The wattage allowance goes up slightly when I connect my external monitor because it bypasses Optimus.
 
And you'll be paying $4,000+ for the privilege of owning a microwave lap warmer disguised as a laptop. The 2080 did finally get a full-fat desktop part, and it is a nightmare to cool in any capacity, still slower than the full desktop version, and there's no 2080 Ti mobile at all, for obvious reasons.
Note that my score is slightly above the desktop average with the same CPU/GPU combo. And you can usually add 5 to 10% once Optimus is bypassed with an external monitor.

I also ran the 3DMark stress test (40 passes) and the 2080's temps peaked at 64C.
 
I like those numbers. This is basically a nerd-porn article. It's high noon on _my_ sundial!

Yup, sounds like a highly interesting and impressive piece of tech. Nothing for normal users but exciting nonetheless. And eventually these advances will trickle down to consumer products.
 
Now the most interesting thing is how Nvidia was able to cram 2.5x more transistors/mm2 going from 12nm FFN to 7nm N7:
Volta: 21.1 billion transistors / 815mm2
Ampere: 54.2 billion transistors / 826mm2
This is an even bigger density increase than TSMC's 28nm to 16nm jump (slightly less than 2x).

Going by Ampere's transistor density, the current 2080 Ti die could shrink to just 283mm2 on 7nm N7, which is the same size as the 1660 Ti die.
And with the rumored specs of the next 3080 Ti, its die could be smaller than TU106, which is 445mm2 (RTX 2070).
So yeah, Nvidia could sell the 3080 Ti at $1,200, make a crap ton of profit, and everyone would be thanking them for it, even AMD...
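A quick sketch of that arithmetic (figures from the post above, plus TU102's published 18.6 billion transistors and 754mm2 die):

```python
# Density math behind the shrink estimate above.
volta_density  = 21.1e9 / 815  # ~25.9M transistors/mm^2 (12nm FFN)
ampere_density = 54.2e9 / 826  # ~65.6M transistors/mm^2 (7nm N7)
print(f"Density gain: {ampere_density / volta_density:.2f}x")  # ~2.53x

# Hypothetical TU102 (2080 Ti) shrink at Ampere's transistor density:
tu102_transistors = 18.6e9
print(f"{tu102_transistors / ampere_density:.0f} mm^2")  # ~283 mm^2
# Scaling by area instead (754 / 2.53) gives ~298 mm^2, which is the
# "slightly over 300" figure that comes up later in the thread.
```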
 
...but can it run Crysis Remastered at 4K/144Hz with DXR and HDR?

Jokes aside, my RTX 2070 is just not powerful enough for 4K gaming at high settings, and I can't wait for the next generation.
 
More details, including the specification of the full GA100 chip (the shipping A100 has some SMs disabled), can be read here:

 
More details, including the specification of the full GA100 chip (the shipping A100 has some SMs disabled), can be read here:


I did some Tera RTX-OPS calculations the way Nvidia does, and it looks like Ampere in its current form has 3.5x the RTX-OPS performance of the 2080 Ti, without the need for any RT cores.
https://www.techarp.com/computer/nvidia-rtx-ops-calculation/
So yeah, bye bye RT cores...
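For reference, here's a hedged sketch of the weighting scheme the linked techarp article describes, with the RTX 2080 Ti Founders Edition figures it uses; treat it as illustrative rather than Nvidia's exact method:

```python
# Nvidia's RTX-OPS metric (as described by techarp): each unit type is
# weighted by the fraction of a frame it is estimated to be busy.
def rtx_ops(fp32_tflops, int32_tips, rt_gigarays, tensor_tflops):
    rt_tops = rt_gigarays * 10  # Nvidia counts ~10 TOPS per gigaray
    return (fp32_tflops * 0.80 + int32_tips * 0.28
            + rt_tops * 0.40 + tensor_tflops * 0.20)

# 2080 Ti FE numbers reproduce Nvidia's quoted ~78 RTX-OPS:
print(round(rtx_ops(14.2, 14.2, 10, 113.8)))  # 78
```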
 
Volta: 21.1 billion transistors / 815mm2
Ampere: 54.2 billion transistors / 826mm2
This is an even bigger density increase than TSMC's 28nm to 16nm jump (slightly less than 2x).
It's even more impressive when you realise that the stated die areas include the HBM/HBM2.

The bulk of the changes lies within the tensor cores, of course, but the rest of the chip is essentially 'big Volta', i.e. more SMs, more cache (a lot more L2, over 6 times more), more memory controllers, more NVLinks:

[Diagrams: GV100 SM structure, GA100 SM structure, GV100 full-chip block diagram, GA100 full-chip block diagram (128 SMs)]
How much of this will transfer to the consumer version of Ampere? The FP64 units will be dropped right down (the TU102 has 2 per SM), and given that the new tensor cores are twice as capable as Volta's, we might see the GeForce Ampere chips only sport 4 per SM rather than the current 8, freeing up space for more/larger RT cores.

The biggest question for me is how this is all going to be scaled down to keep the power consumption sensible: the A100's TDP is 400W, which is 100W more than the Tesla V100 SXM2 32GB. Obviously dropping the HBM2 will help quite a bit, as will lopping off a few SMs, but it's still going to be high. 300W maybe?
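For scale, a quick comparison of the full dies using public spec-sheet figures (a sketch; the shipping A100 disables some SMs and one HBM stack):

```python
# Rough GV100 -> GA100 scaling for the changes mentioned above.
gv100 = {"sms": 84,  "l2_mb": 6,  "nvlinks": 6,  "hbm_stacks": 4}
ga100 = {"sms": 128, "l2_mb": 40, "nvlinks": 12, "hbm_stacks": 6}

for key in gv100:
    print(f"{key}: {ga100[key] / gv100[key]:.1f}x")
# l2_mb comes out at ~6.7x, the "over 6 times more" L2 noted above.
```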
 
Now the most interesting thing is how Nvidia was able to cram 2.5x more transistors/mm2 going from 12nm FFN to 7nm N7:
Volta: 21.1 billion transistors / 815mm2
Ampere: 54.2 billion transistors / 826mm2
This is an even bigger density increase than TSMC's 28nm to 16nm jump (slightly less than 2x).

Going by Ampere's transistor density, the current 2080 Ti die could shrink to just 283mm2 on 7nm N7, which is the same size as the 1660 Ti die.
And with the rumored specs of the next 3080 Ti, its die could be smaller than TU106, which is 445mm2 (RTX 2070).
So yeah, Nvidia could sell the 3080 Ti at $1,200, make a crap ton of profit, and everyone would be thanking them for it, even AMD...

Yeah, it is great to see those densities. It also means there is a lot that can go wrong.

I think you are off on your calculations a bit, though. The 2080 Ti would be slightly over 300mm^2 using the same node as Ampere.



 
I did some Tera RTX-OPS calculations the way Nvidia does, and it looks like Ampere in its current form has 3.5x the RTX-OPS performance of the 2080 Ti, without the need for any RT cores.
https://www.techarp.com/computer/nvidia-rtx-ops-calculation/
So yeah, bye bye RT cores...
Given that tensor cores only do matrix FMAs, this is unlikely. The RT cores are specialised ASICs, two in fact: one for handling ray-triangle intersection calculations and the other for accelerating BVH traversal. The tensor cores are used for denoising the images, but given that Ampere's are far more capable than Turing's, we're more likely to see fewer TCs per SM to allow for more RTCs. Or, now that DLSS is TC-based, we could see it being pushed far more to offset the RT performance hit.
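For context on what that ray-triangle intersection ASIC is accelerating, here's a minimal pure-Python version of the standard Möller-Trumbore test, the per-ray arithmetic an RT core performs in fixed-function hardware (illustrative only):

```python
def ray_triangle(orig, d, v0, v1, v2, eps=1e-8):
    """Möller-Trumbore: distance t to the hit point, or None for a miss."""
    sub   = lambda a, b: [a[i] - b[i] for i in range(3)]
    dot   = lambda a, b: sum(a[i] * b[i] for i in range(3))
    cross = lambda a, b: [a[1]*b[2] - a[2]*b[1],
                          a[2]*b[0] - a[0]*b[2],
                          a[0]*b[1] - a[1]*b[0]]

    e1, e2 = sub(v1, v0), sub(v2, v0)
    pvec = cross(d, e2)
    det = dot(e1, pvec)
    if abs(det) < eps:          # ray is parallel to the triangle plane
        return None
    inv = 1.0 / det
    tvec = sub(orig, v0)
    u = dot(tvec, pvec) * inv   # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    qvec = cross(tvec, e1)
    v = dot(d, qvec) * inv      # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    return dot(e2, qvec) * inv  # distance t along the ray

# A ray fired down the z-axis hits this triangle at t = 1.0:
print(ray_triangle([0, 0, -1], [0, 0, 1], [-1, -1, 0], [1, -1, 0], [0, 1, 0]))
```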
 
You're going to be waiting a while on that list. The closest thing to a desktop graphics card will be the Max-Q versions, and I think they're like 20% less performance than the desktop part. Laptops don't want to be big and bulky anymore. You won't truly ever get a desktop-performance card in a laptop, mainly due to cooling. At least not one that will be lightweight.

No, you're wrong: non-Max-Q versions are faster. There are desktop GPUs, then there are standard laptop discrete parts, and then there are even slower Max-Q parts. For example, a laptop with a 2080 Max-Q will be closer to a laptop 2070 than to a desktop 2080.
I think Max-Q parts are limited in wattage and height to fit in thinner laptops?
 