About three years ago we got our first look at Nvidia's Kepler architecture that powered the GTX 680, a $500 card packing 3.54 billion transistors and 192GB/s bandwidth. It was the end of the line for the GeForce 600 series, but it wasn't the end for the Kepler architecture.

While we eagerly waited for next-gen GeForce 700 cards, Nvidia dropped the GeForce GTX Titan, wielding 7.08 billion transistors for an unwieldy price of $1,000. The Titan instantly claimed the king-of-the-hill spot, and even though the Radeon R9 290X brought similar performance for half the price six months later, Nvidia refused to budge on the Titan's MSRP.

This was every gamer's dream GPU for half a year, but its fate was sealed when the GTX 780 Ti shipped many months later (Nov/13), offering more CUDA cores at a more affordable $700.

Although the GTX Titan was great for gaming, that wasn't the sole purpose of the GPU, which was equipped with 64 double-precision cores for 1.3 teraflops of double-precision performance. Previously only found in Tesla workstations and supercomputers, this feature made the Titan ideal for students, researchers and engineers after consumer-level supercomputing performance.

A year after the original Titan's release, Nvidia followed up with a full 2880-core version known as the Titan Black, which boosted the card's double-precision performance to 1.7 teraflops. A month later, the GTX Titan Z put two Titan Blacks on one PCB for 2.7 teraflops of compute power, though this card never made sense at $3,000 – triple the Titan Black's price.

Since then, the Maxwell-based GeForce 900 series arrived with the GTX 980's unbeatable performance vs. power ratio leading the charge as today's undisputed single-GPU king. With a modest 2048 cores and 5.2 billion transistors in a small 398mm² die, the GTX 980's GPU manages to be 29% smaller with 26% fewer transistors than the flagship Kepler parts.

We knew there would be more ahead for Maxwell and so here it comes. Six months after the GTX 980, Nvidia is back with the GeForce GTX Titan X, a card that's bigger and more complex than any other. However, unlike previous Titan GPUs, the new Titan X is designed exclusively for high-end gaming and as such offers compute performance similar to the GTX 980.

The card was announced at GDC and there's plenty to be psyched about: headline features include 3072 CUDA cores, 12GB of GDDR5 memory running at 7Gbps, and a whopping 8 billion transistors. At its peak, the GTX Titan X will deliver 6600 GFLOPS single precision and 206 GFLOPS double precision processing power.
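For readers who want to see where those peak numbers come from, here is a rough sketch of the usual back-of-the-envelope calculation. It uses the 1075MHz typical Boost Clock and the 1/32 double-precision rate covered in the GM200 section below. The factor of two FLOPs per core per clock (one fused multiply-add) is our assumption rather than something Nvidia spells out here, and the function name is purely illustrative.

```python
def peak_gflops(cuda_cores: int, clock_mhz: float,
                flops_per_core_per_clock: int = 2) -> float:
    """Theoretical peak throughput in GFLOPS.

    Assumes each CUDA core retires one fused multiply-add per clock,
    counted as 2 floating-point operations (our assumption).
    """
    return cuda_cores * flops_per_core_per_clock * clock_mhz / 1000.0


# 3072 cores at the 1075MHz typical Boost Clock
single_precision = peak_gflops(cuda_cores=3072, clock_mhz=1075)

# Double precision runs at 1/32 the single-precision rate on GM200
double_precision = single_precision / 32

print(f"Single precision: {single_precision:.0f} GFLOPS")
print(f"Double precision: {double_precision:.0f} GFLOPS")
```

The results land within rounding of the quoted 6600 and 206 GFLOPS, which suggests the headline figures are based on the Boost Clock rather than the base clock.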

Nvidia held back pricing information until the last minute, revealing it during the opening keynote at its GPU Technology Conference – unsurprisingly the Titan X will be $999. But without getting bogged down in how stupid that was – let's focus on the fact that we get to show you how the GTX Titan X performs and that it's a hard launch with availability expected today.

Titan X's GM200 GPU in Detail

The GeForce Titan X is a processing powerhouse. The GM200 chip carries six graphics processing clusters and 24 streaming multiprocessors with 3072 CUDA cores (single precision).

As noted earlier, the Titan features a core configuration that consists of 3072 SPUs which take care of pixel/vertex/geometry shading duties, while texture filtering is performed by 192 texture units. With a base clock frequency of 1000MHz, texture filtering rate is 192 Gigatexels/sec, which is over 33% higher than the GTX 980. The Titan X also ships with 3MB of L2 cache and 96 ROPs.
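The texture filtering figure is simply texture units multiplied by clock speed, as the quick sketch below shows. The GTX 980 numbers used for the comparison (128 texture units, 1126MHz base clock) are not stated in this article; we're assuming them from Nvidia's published specifications for that card.

```python
def texel_rate_gtexels(texture_units: int, clock_mhz: float) -> float:
    """Peak texture filtering rate in Gigatexels/sec (one texel per unit per clock)."""
    return texture_units * clock_mhz / 1000.0


# Titan X: 192 texture units at the 1000MHz base clock
titan_x_rate = texel_rate_gtexels(texture_units=192, clock_mhz=1000)

# GTX 980: assumed 128 texture units at a 1126MHz base clock (not from this article)
gtx_980_rate = texel_rate_gtexels(texture_units=128, clock_mhz=1126)

uplift_pct = (titan_x_rate / gtx_980_rate - 1) * 100

print(f"Titan X: {titan_x_rate:.0f} Gigatexels/sec")
print(f"GTX 980: {gtx_980_rate:.1f} Gigatexels/sec")
print(f"Uplift:  {uplift_pct:.1f}%")
```

Under those assumptions the uplift works out to a shade over 33%, in line with the figure above.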

The memory subsystem of the GTX Titan X consists of six 64-bit memory controllers (384-bit) with 12GB of GDDR5 memory. The 384-bit wide memory interface and 7GHz memory clock deliver a peak memory bandwidth of 336.5GB/sec, which is 50% higher than the GTX 980.

And with its massive 12GB of GDDR5 memory, gamers can play the latest DX12 games on the Titan X at 4K resolutions without worrying about running short on graphics memory.
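The bandwidth figure can be sanity-checked the same way: bus width in bytes multiplied by the per-pin data rate. This is only a sketch. The GTX 980's 256-bit bus is our assumption from Nvidia's published specs, not something stated in this article, and the helper name is made up for illustration.

```python
def peak_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/sec: bytes moved per transfer times transfers per second."""
    return bus_width_bits / 8 * data_rate_gbps


# Titan X: six 64-bit memory controllers, GDDR5 at 7Gbps per pin
titan_x_bus_bits = 6 * 64
titan_x_bw = peak_bandwidth_gbs(titan_x_bus_bits, 7.0)

# GTX 980: assumed 256-bit bus at the same 7Gbps (not from this article)
gtx_980_bw = peak_bandwidth_gbs(256, 7.0)

print(f"Titan X: {titan_x_bus_bits}-bit bus, {titan_x_bw:.0f} GB/sec")
print(f"GTX 980: {gtx_980_bw:.0f} GB/sec")
print(f"Ratio:   {titan_x_bw / gtx_980_bw:.2f}x")
```

A flat 7Gbps gives 336GB/sec, so the quoted 336.5GB/sec presumably reflects an effective data rate a hair above 7Gbps. Either way, the 50% advantage over the GTX 980 follows directly from the wider bus.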

Nvidia says that the Titan X is built using the full implementation of GM200. The display/video engines are unchanged from the GM204 GPU used in the GTX 980. Also like the GTX 980, overall double-precision instruction throughput is 1/32 the rate of single-precision instruction throughput.

As mentioned, the base clock speed of the GTX Titan X is 1000MHz, though it does feature a typical Boost Clock speed of 1075MHz. The Boost Clock speed is based on the average Titan X card running a wide variety of games and applications. Note that the actual Boost Clock will vary from game to game depending on actual system conditions.

Setting performance aside for a moment, one of the Titan X's other noteworthy features is its stunning board design. As was the case with previous Titan cards, the Titan X has an aluminum cover. The metal casing gives the board a premium look and feel, while the card's unique black cover sets it apart from predecessors – this is the Darth Vader of Titans.

A copper vapor chamber is used to cool the Titan X's GM200 GPU. This vapor chamber is combined with a large, dual-slot aluminum heatsink to dissipate heat off the chip. A blower style fan then exhausts this hot air through the back of the graphics card and outside the PC's chassis. The fan is designed to run very quietly, even while under load when the card is overclocked.

If you recall, the GTX 980 reference board design included a backplate on the underside of the card with a section that could be removed in order to improve airflow when multiple GTX 980 cards are placed directly adjacent to each other (as with 3- and 4-way SLI, for example). In order to provide maximum airflow to the Titan X's cooler in these situations, Nvidia does not include a backplate on the Titan X reference.

The Titan X reference board measures 10.5" long. Display outputs include one dual-link DVI output, one HDMI 2.0 output and three DisplayPort connectors. One 8-pin PCIe power connector and one 6-pin PCIe power connector are required for operation.

Speaking of power connectors, the Titan X has a TDP rating of 250 watts and Nvidia calls for a 600W power supply when running just a single card. That TDP is a little over 50% higher than the GTX 980's rating, though it is still 14% lower than that of the Radeon R9 290X.

Nvidia says that being a gaming enthusiast's graphics card, the Titan X has been designed for overclocking and implements a six-phase power supply with overvoltaging capability. An additional two-phase power supply is dedicated for the board's GDDR5 memory.

This 6+2 phase design supplies Titan X with more than enough power, even when the board is overclocked. The Titan X reference board design supplies the GPU with 275 watts of power at the maximum power target setting of 110%.
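The power numbers are easy to check with a few lines of arithmetic. The sketch below applies the 110% power target to the 250W TDP and reproduces the TDP comparisons made earlier. The 165W (GTX 980) and 290W (Radeon R9 290X) ratings are our assumptions, chosen because they are the commonly cited figures and match the percentages quoted; this article does not state them.

```python
TITAN_X_TDP_W = 250
GTX_980_TDP_W = 165    # assumed, not stated in this article
R9_290X_TDP_W = 290    # assumed, commonly cited board power figure


def power_at_target(tdp_watts: float, target_percent: float) -> float:
    """Board power limit in watts at a given power target percentage."""
    return tdp_watts * target_percent / 100


max_power_w = power_at_target(TITAN_X_TDP_W, 110)
above_gtx_980_pct = (TITAN_X_TDP_W / GTX_980_TDP_W - 1) * 100
below_r9_290x_pct = (1 - TITAN_X_TDP_W / R9_290X_TDP_W) * 100

print(f"Max power target: {max_power_w:.0f}W")
print(f"vs. GTX 980:      {above_gtx_980_pct:.1f}% higher")
print(f"vs. R9 290X:      {below_r9_290x_pct:.1f}% lower")
```

With those inputs the maximum power target comes to 275W, and the Titan X's TDP sits roughly 52% above the GTX 980 and 14% below the R9 290X.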

Nvidia has used polarized capacitors (POSCAPS) and molded inductors to minimize unwanted board noise. To further improve the Titan X's overclocking potential, Nvidia has improved airflow to these board components so they run cooler compared to previous high-end GK110 products, including the original GTX Titan.

Moreover, Nvidia says it pushed the Titan X to speeds of 1.4GHz using nothing more than the supplied air-cooler during its own testing, so we're obviously interested in testing that.