Intel tweets a photo of the biggest and weirdest GPU ever made

mongeese

Posts: 516   +110
Staff member

This graphics processor is strange. As you can see from the image, from the outside it’s more like a CPU than a GPU. It sits in a socket using almost three thousand pins. It has an integrated heat spreader.

Inside, it’s like nothing we've seen before. Using the AA battery for scale, the chip is about 4000 mm². If the active area is half that size, that's roughly three times the size of an RTX 2080 Ti. It has “tens of billions” of transistors, while the RTX 2080 Ti has just under twenty billion.
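
The size comparison above can be checked with a few lines of arithmetic. This is only a rough sketch: the 4000 mm² and "half active" figures are the article's own estimates, while the 754 mm² die size for the RTX 2080 Ti (TU102) is an assumption brought in from Nvidia's published specifications, not something stated here.

```python
# Rough die-size comparison using the figures quoted in the article.
package_area_mm2 = 4000   # article's estimate from the AA-battery photo
active_fraction = 0.5     # article's "if the active area is half that size"
tu102_area_mm2 = 754      # assumption: published TU102 (RTX 2080 Ti) die size

active_area_mm2 = package_area_mm2 * active_fraction
ratio = active_area_mm2 / tu102_area_mm2

print(f"Estimated active area: {active_area_mm2:.0f} mm2")
print(f"Relative to RTX 2080 Ti: {ratio:.2f}x")
```

Under those assumptions the ratio comes out a little under three, so "three times" is a rounded-up figure.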

But what sort of GPU is it? For the time being, this processor eludes classification as the information Intel has provided is contradictory. Koduri previously said that the “father of all” is a Xe HP (high performance) product. Later on, he would also mention this chip supports BF16, a very niche AI acceleration compute format. BF16 is a confirmed, exclusive feature of Xe HPC (high performance compute). This chip could therefore be a Xe HPC product...

The difference between the two is substantial. Xe HP is oriented towards gaming and workstations. Leaks pin the core counts at between 1024 and 4096. Xe HPC is for servers and optimized for AI and scientific work. The single HPC product under development is codenamed Ponte Vecchio. It is said to have 1024 cores, but it is by far the most powerful accelerator Intel is developing by virtue of its complex memory subsystem and the variety of cores and operations it supports. You can read more about this in our in-depth breakdown on Xe.

While neither HP nor HPC can be ruled out definitively, the nature of the latter seems more suitable to the new mystery processor. It would be quite strange for a gaming/workstation processor to be prototyped in a socket-based format, whereas Ponte Vecchio has already been depicted in what could be a socket.

Follow up story: Intel's Raja Koduri confirms that massive 'father of all' GPU is aimed at the data center

Further, from early schematics and presentations, Ponte Vecchio could be expected to be unprecedentedly large due to the inclusion of the high-footprint RAMBO cache and onboard HBM2 (or HBM2E). There is no reason to expect an HP product to be so large.

Regardless of what exactly this GPU is, the “father of all” will undoubtedly have the mother of all price tags, so you’d better start saving. Xe HP will release sometime this year and Ponte Vecchio will be ready in late 2021.


 

Vulcanproject

Posts: 1,389   +2,457
The four 'quadrants' of resistors look interesting and probably hint at a large MCM with multiple interconnects.

Everything these days is designed as scalable chiplets, it seems. Still, an LGA socket-mount design is unusual to say the least; you would usually conclude that it's just for ease of testing and development.

However, seeing as it is a supercomputing or workstation product, it isn't beyond the realms of possibility that this is the release packaging. Interesting.

A modular graphics card is a possibility, though cost would probably restrict it to this kind of level.
 

neeyik

Posts: 1,881   +2,199
Staff member
Normally, the complexity of the memory interconnects precludes graphics cards from being modular, but since this will almost certainly have onboard HBM2, that problem is removed.

If you look at a Vega 64 board, you can see that all it carries are VRMs, the PCIe interface, and outputs:

[Image: Vega 64 PCB, front]


[Source]

The chip seems very similar to this slide from Intel:

[Images: two slides from Intel's DEVCON 2019 presentation]


[Source]
 

QuantumPhysics

Posts: 5,226   +5,927
I'll reserve judgement till I can use a finished product. I'd really love to see Intel take on Nvidia and knock both them and AMD down a peg. I know they can do it.

Hopefully they figure out how to make it faster, cheaper and more powerful.
 

Evernessince

Posts: 5,464   +6,149
The big problem is, this is multi-GPU on a single package. As we all know, there are significant drawbacks to that in games. Intel would have to somehow fix the issue on a hardware/driver level. It would be pretty amazing if they did but I'm thinking this doesn't really feel like a consumer product.
 

EEatGDL

Posts: 765   +490
We are talking about the same Intel advising support for up to 500 W power draw at the CPU socket for Z490 motherboards, right?
The right title for this article is: "Intel is delusional". Nothing to see here, folks, just old Intel HD Graphics with HBM. Move on.
Just wait for the surprise, and it won't be a positive one. If NVIDIA launches the top RTX Quadro at 40 nm node, I'm pretty sure they can go bigger in area, if that's what we are now going for.
 

QuantumPhysics

Posts: 5,226   +5,927
I will definitely take your word for it over a multi-billion dollar multi-national corporation like Intel, which has been making CPUs, motherboards and other hardware for as long as I've been alive to own a PC.

Thank you for setting me straight.
 

Evernessince

Posts: 5,464   +6,149
None of which is a reason to accept Intel's marketing. This is the same company that tried to hide the fact that it was using a chiller to cool its new high-core-count CPU, and that hired Principled Technologies, which botched basic benchmark methodology.

FYI you really shouldn't take the word of any large corporation, regardless of their expertise. Their goal is to make money, not give you an unbiased opinion.
 

Reehahs

Posts: 1,262   +929
This looks like Intel's idea of making a competitive GPU by shoving lots of CPU cores together.
 

Lounds

Posts: 896   +796
If this takes off it'll be in the data center only. I can't see consumers buying these. They'll cost more than a high end consumer CPU.
 

Uncle Al

Posts: 8,167   +6,925
My best guess is that it might be part of a government project, for DOD or possibly DOE, but I simply cannot imagine how or why. Maybe those hidden "greys" at WPAFB were asking for an upgrade to their cell phones ...... LOL
 

Danny101

Posts: 1,837   +790
It would be something if we could drop in GPUs like we do processors. Including HBM2 memory slots.
 

DaveBG

Posts: 570   +252
I really hope they bring something to the table. The GPU market has been stagnating for years with no competition and Nvidia's insane pricing.
 

neeyik

Posts: 1,881   +2,199
Staff member
For GPU compute work, it would be unwise to dismiss Intel. Here are 2 OpenCL Geekbench 5 results:

Intel UHD Graphics 630
Nvidia GeForce RTX 2080 Super

The latter rightly trounces the former, but it has 16 times more shader units, and a higher clock speed.

[Image: Geekbench 5 OpenCL results for the two GPUs]

Scaling the Intel results for the difference in shader count and clock speed puts the Gen 9.5 architecture in the same ballpark as Turing. Of course, it's not really possible to tell how well it would scale in reality, because Geekbench results don't compare across architectures particularly well.

Within the same architecture, though, the results do scale with cores and clocks reasonably linearly. The Geekbench database for the OpenCL results has a 2080 Ti at 129064 and a 2080 Super at 108958, an 18.5% increase for the Ti. With reference clocks, the Ti has about a 21% FP32 throughput advantage.
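
For anyone who wants to reproduce those two percentages, here is a minimal sketch. The Geekbench scores are the ones quoted above; the shader counts and boost clocks (4352 at 1545 MHz for the 2080 Ti, 3072 at 1815 MHz for the 2080 Super) are assumptions taken from Nvidia's reference specifications, and the helper `fp32_tflops` is just an illustrative name for the usual "2 operations per shader per clock" estimate.

```python
def fp32_tflops(shader_count: int, boost_mhz: float) -> float:
    """Peak FP32 throughput, assuming one fused multiply-add (2 ops) per shader per clock."""
    return 2 * shader_count * boost_mhz * 1e6 / 1e12

# Geekbench 5 OpenCL scores quoted in the post
score_ti, score_super = 129064, 108958
score_gain = score_ti / score_super - 1

# Assumed reference specs (not stated in the post)
tflops_ti = fp32_tflops(4352, 1545)
tflops_super = fp32_tflops(3072, 1815)
fp32_gain = tflops_ti / tflops_super - 1

print(f"Geekbench score gain: {score_gain:.1%}")
print(f"FP32 throughput gain: {fp32_gain:.1%}")
```

The measured gain lands a couple of points below the theoretical one, which is what "reasonably linearly" amounts to here.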

Intel's IGPs are 'rubbish' because they're tiny:

[Image: annotated die shot of the octa-core Coffee Lake CPU]

[Source]

The whole CPU is about 175 mm², and the GPU takes up roughly 44 mm² (and one of the shader blocks is disabled in the 630). The GPU in the GT 1030 is 74 mm² and that thing is hideously slow.
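
To put those areas in proportion, here is a quick sketch using only the rough figures from this post (about 175 mm² for the whole die, roughly 44 mm² for the integrated GPU, 74 mm² for the GT 1030). The variable names are just for illustration.

```python
# Die areas in mm2, as estimated in the post
cpu_die_mm2 = 175   # whole octa-core Coffee Lake die
igpu_mm2 = 44       # integrated GPU portion
gt1030_mm2 = 74     # GPU in the GT 1030

igpu_share = igpu_mm2 / cpu_die_mm2       # fraction of the die spent on graphics
gt1030_vs_igpu = gt1030_mm2 / igpu_mm2    # how much larger the GT 1030 GPU is

print(f"iGPU share of die: {igpu_share:.0%}")
print(f"GT 1030 GPU vs iGPU area: {gt1030_vs_igpu:.2f}x")
```

So the integrated GPU is about a quarter of the die, and even a slow entry-level discrete GPU has well over one and a half times the silicon to work with.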
 

Red999

Posts: 100   +37
The TU106 in the RTX 2060 has around 10.8 billion transistors; a Ryzen 7 3700X has approximately 9.9 billion (3.9 per CCX, 2.1 for the I/O chip). So larger, yes, but absolutely not twice the amount.
The 3700X only has one 8-core chiplet, which contains just 3.9B transistors. I didn't count the I/O chiplet.
Intel's 8-core Coffee Lake even has fewer than 3B.
 

yeeeeman

Posts: 421   +371
Yeah. That happens because every chip has an "uncore" part that is needed no matter whether the chip is as small as the HD 630 or as big as the RTX 2080.
The HD 630 is a nice design and has potential, but it all depends on whether they can make it scale well. From the looks of it, they did. And tbh, the real money will be made from GPU chiplets, not CPU chiplets, so even if Intel didn't go the chiplet route with CPUs, I think they did the right thing with GPU chiplets.
 

Red999

Posts: 100   +37
Intel must target the Quadro and Tesla segment. That is the market where the big margins are.
Gaming GPUs may generate a lot of news, but the profit margin is not as high as on professional and datacenter GPUs.