Groq launches the first AI accelerator card capable of 1 PetaOPS

mongeese · Jan 25, 2020

Why it matters: Groq is the hundredth startup to take a shot at making an AI accelerator card, the second to market, and the first to have a product reach the 1 quadrillion operations per second threshold. That’s quadruple the performance of Nvidia’s most powerful card.

The Groq Tensor Streaming Processor (TSP) demands 300W per core, so luckily, it’s only got one. Even luckier, Groq has turned that from a disadvantage into the TSP’s greatest strength.

You should probably throw everything you know about GPUs or AI processing out the window, because the TSP is just plain weird. It’s a giant piece of silicon with almost nothing but Vector and Matrix processing units and cache, so no controllers or backend whatsoever. The compiler has direct control.

The TSP is divided into 20 superlanes. Superlanes are built from, in order of left to right: a Matrix Unit (320 MACs), Switch Unit, Memory Unit (5.5 MB), Vector Unit (16 ALUs), Memory Unit (5.5 MB), Switch Unit, Matrix Unit (320 MACs). You’ll notice that the components are mirrored around the Vector Unit, this divides the superlane into two hemispheres that can act almost independently.

The instruction stream (there is only one) is fed into every component of superlane 0, with 6 instructions for the Matrix Units, 14 for the Switch Units, 44 for the Memory Units, and 16 for the Vector Unit. Every clock cycle, the units perform their operations and move the piece of data to where it’s going next within the superlane. Each component can send and receive 512B from its next-door neighbors.

Once the superlane’s operations are complete, it passes everything down to the next superlane and receives whatever the superlane above (or the instruction controller) has. Instructions are always passed down vertically between the superlanes, while data only transfers horizontally within a superlane.

	Groq TSP	Nvidia Tesla V100	Nvidia Tesla T4
Cores	1	5120	2560
Maximum Frequency	1250 MHz	1530 MHz	1590 MHz
FP16 TFLOPS	205 TFLOPS	125 TFLOPS	65 TFLOPS
INT8 TOPS	1000 TOPS	250 TOPS	130 TOPS
Chip Cache (L1)	220 MB	10 MB	2.6 MB
Board Memory	N/A	32 GB HBM2	16 GB GDDR6
Board Power (TDP)	300W	300W	70W
Process	14nm	12nm	12nm
Die Area	725 mm²	815 mm²	545 mm²

All that makes for a processor that is extremely good at neural network training and inferencing, and incapable of anything else. To put some benchmarks to it, in ResNet-50 it can perform 20,400 Inferences per Second (I/S) at any batch size, with an inference latency of 0.05 ms.

Nvidia’s Tesla V100 can perform 7,907 I/S at a batch size of 128, or 1,156 I/S at a batch size of one (batch sizes generally aren’t this low, but it demonstrates TSP’s versatility). Its latency at batch 128 is 16 ms and 0.87 ms at batch one. Obviously, the TSP outperforms Nvidia’s most equivalent card in this workload.

One of TSP’s strengths is that it has so much L1 cache, but it also doesn’t have anything else. If a neural network expands beyond that volume or if it is dealing with very large inputs, it will seriously suffer. Nvidia’s cards have gigabytes of memory that can handle that scenario.

This sums up the TSP really well. In specific workloads it’s more than twice as powerful as the Tesla V100, but if your workload varies, or if heaven-forbid you want to do something with more than half-precision, you can’t. The TSP definitely has a future in areas like self-driving cars, where the volume of input is predictable, and the neural network can be guaranteed to fit. In this case its spectacular latency, 320x better than Nvidia’s, means the car can respond faster.

The TSP is presently available to select customers as an accelerator within the Nimbix Cloud.

Permalink to story.

https://www.techspot.com/news/83719-groq-launches-first-ai-accelerator-card-capable-1.html

QuantumPhysics · Jan 25, 2020

Can it run games in 4K 120fps?

Bullwinkle M · Jan 25, 2020

AI will never work!

I reject Ads, spam and pop ups all day long and the Ai never gets my drift

I just get served more of the same

Get a Clue Artificial DumbA$$!

Adi6293 · Jan 26, 2020

QuantumPhysics said:
Can it run games in 4K 120fps?

Well that's a stupid question don't you think?

QuantumPhysics · Jan 26, 2020

Adi6293 said:
Well that's a stupid question don't you think?

No - not at all.

seeprime · Jan 26, 2020

Bullwinkle M said:
AI will never work!

I reject Ads, spam and pop ups all day long and the Ai never gets my drift

I just get served more of the same

Get a Clue Artificial DumbA$$!

The AI's monitoring you are all jerks because you forgot to ask nicely, and put blocks on your firewall.

seeprime · Jan 26, 2020

QuantumPhysics said:
Can it run games in 4K 120fps?

Games are written to utilize GPUs. This add-in card would need to have games with the code written into them to utilize it. Whether it works well or not is something we'll possibly learn in the future.

CharmsD · Jan 26, 2020

This $thing can sort the stream of uplikes/downvotes on YouTube in real time!

#badjoke

BigBoomBoom · Jan 26, 2020

QuantumPhysics said:
No - not at all.

I think you're asking the wrong question.

Can it runs Crysis?

QuantumPhysics · Jan 26, 2020

BigBoomBoom said:
I think you're asking the wrong question.

Can it runs Crysis?

In 4K 120fps...?

BigBoomBoom · Jan 26, 2020

QuantumPhysics said:
In 4K 120fps...?

It may not run Crysis at 4K 120fps but maybe it can PLAY Crysis... like an AI.

Imagine these card running bots in Counter Strike.

QuantumPhysics · Jan 26, 2020

BigBoomBoom said:
It may not run Crysis at 4K 120fps but maybe it can PLAY Crysis... like an AI.

Imagine these card running bots in Counter Strike.

So what you're saying is, I could be facing a enemy team filled with bots that can headshot me from a mile away using their pistol?

Luay · Jan 26, 2020

I saw 11 comments on the title of a scientific post in Techspot and clicked right in. I knew what I was getting into.

Adi6293 · Jan 27, 2020

QuantumPhysics said:
No - not at all.

This card is build specifically for AI I would be surprised if it could run games at all never mind 4K at 120fps

BigBoomBoom · Jan 27, 2020

QuantumPhysics said:
So what you're saying is, I could be facing a enemy team filled with bots that can headshot me from a mile away using their pistol?

No, but they will be swatting you after losing. I bet they will be real toxic as well, like human.

harm9963 · Jan 28, 2020

BFG Tech AGEIA PhysX card was cool , I had one , with my 7600GS Sli, the year after that ,I had a 8800GT , and my PhysX card was history, hello CUDAcores.

RealNeil · Jan 28, 2020

BigBoomBoom said:
I think you're asking the wrong question.

Can it runs Crysis?

Crysis on full? Lol!

RealNeil · Jan 28, 2020

QuantumPhysics said:
So what you're saying is, I could be facing a enemy team filled with bots that can headshot me from a mile away using their pistol?

Sounds like a real drag.

kmo911 · Feb 5, 2020

I this was a gamin g pu it would beat them all ?
we just hawe to wait for nxt gen intel matrox amd nvidia s3 games in 4k 8k. how much would a 5 min 8k raw take in streaming . if 1pb pr 5 th min it would take enourmous lines too just run 1 min of a 8k raw. if you using a 5g modem that can hanndle real time 8k raw or comprimissed movies and so on. streaming even in 4k raw would tak all bandwidth and a 1 gb - 1gb bit net to run hot. if this helps on to get that quality in and out I would be wounderfull. like linius he tryes rendering 4k 8k in real time. would this gpu + a hll of a cpu ram and so on t render nicely ? 8if I missunderstand please correct me. this is a GPU ?

linked from

gpdr

Groq launches the first AI accelerator card capable of 1 PetaOPS

Posts: 643 +123

Why it matters: Groq is the hundredth startup to take a shot at making an AI accelerator card, the second to market, and the first to have a product reach the 1 quadrillion operations per second threshold. That’s quadruple the performance of Nvidia’s most powerful card.

Posts: 6,306 +7,258

Posts: 911 +817

Posts: 931 +1,309

Posts: 6,306 +7,258

Posts: 702 +924

Posts: 702 +924

Posts: 411 +257

Posts: 100 +78

Posts: 6,306 +7,258

Posts: 100 +78

Posts: 6,306 +7,258

Posts: 125 +67

Posts: 931 +1,309

Posts: 100 +78

Posts: 128 +69

Posts: 44 +24

Posts: 44 +24

Posts: 352 +44

Similar threads