What's the difference between DDR3 memory and GDDR5 memory?

Hi to all at Techspot. I'm not a techie, but I noticed that there are two kinds of memory, DDR3 and GDDR5. My question is:
What's the difference between DDR3 memory and graphics GDDR5 memory, and if GDDR5 is in any way better, why is it not used for desktop computer memory instead of DDR3?
I know some apps are taking advantage of, or utilizing, graphics memory. Thanks for any explanation.
 
As far as I know, the memory used on video cards has different characteristics from the memory used in a PC. Since this is beyond my knowledge, I'm sorry I can't explain in much more detail. Just wait for other members with deeper knowledge who can give some information regarding this matter :)
 
What's the difference between DDR3 memory and graphics GDDR5 memory, and if GDDR5 is in any way better, why is it not used for desktop computer memory instead of DDR3?
The principal differences are:
•DDR3 runs at a higher voltage than GDDR5 (typically 1.25-1.65V versus ~1V)
•DDR3 uses a 64-bit memory controller per channel (so, a 128-bit bus for dual channel, 256-bit for quad channel), whereas GDDR5 is paired with controllers of a nominal 32 bits (16 bits each for input and output). But whereas the CPU's memory controller is 64-bit per channel, a GPU can utilise any number of 32-bit I/Os (at the cost of die size) depending upon the application (2 for a 64-bit bus, 4 for 128-bit, 6 for 192-bit, 8 for 256-bit, 12 for 384-bit, etc.). The GDDR5 setup also allows for doubled or asymmetric memory configurations. Normally (using this generation of cards as an example) GDDR5 memory uses a 2Gbit memory chip for each 32-bit I/O (i.e. for a 256-bit bus/2GB card: 8 x 32-bit I/Os, each connected by a circuit to a 2Gbit IC = 8 x 2Gbit = 16Gbit = 2GB), but GDDR5 can also operate in what is known as clamshell mode, where the 32-bit I/O, instead of being connected to one IC, is split between two (one on each side of the PCB), allowing the memory capacity to be doubled. Mixing the arrangement of 32-bit memory controllers, memory IC density, and memory circuit splitting allows for asymmetric configurations (192-bit, 2GB VRAM for example); see the sketch after this list.
•Physically, a GDDR5 controller/IC doubles the I/O of DDR3: with DDR, an I/O handles an input (a write to memory) or an output (a read from memory), but not both on the same cycle, whereas GDDR handles input and output on the same cycle.
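To make the bus-width and capacity arithmetic from the second bullet concrete, here is a minimal sketch in Python. The 2Gbit chip density and the controller counts are just the examples from the bullet above, and clamshell mode is modelled simply as two ICs per 32-bit I/O:

# Sketch of the GDDR5 bus-width/capacity arithmetic described above.
GBIT_PER_CHIP = 2  # 2Gbit ICs, as in the example above

def vram_config(num_32bit_controllers, clamshell=False):
    """Return (bus width in bits, capacity in GB) for a GDDR5 layout."""
    bus_width = num_32bit_controllers * 32
    # Clamshell mode splits each 32-bit I/O across two ICs, doubling capacity
    chips = num_32bit_controllers * (2 if clamshell else 1)
    return bus_width, chips * GBIT_PER_CHIP / 8  # 8Gbit = 1GB

print(vram_config(8))                  # (256, 2.0) -> the 256-bit / 2GB card above
print(vram_config(8, clamshell=True))  # (256, 4.0) -> same bus, doubled capacity
print(vram_config(6))                  # (192, 1.5) -> a plain 192-bit card; mixing IC densities is how you get 192-bit / 2GB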

The memory is also fundamentally set up for the application it serves:
System memory (DDR3) benefits from low latency (tight timings) at the expense of bandwidth; for GDDR5 the opposite is true. Timings for GDDR5 would seem unbelievably slow in relation to DDR3, but the speed of VRAM is blazing fast in comparison with desktop RAM. This has resulted from the relative workloads that a CPU and a GPU undertake. Latency isn't much of an issue for GPUs, since their parallel nature allows them to move on to other calculations when latency cycles cause a stall in the current workload/thread. The performance of a graphics card, for instance, is greatly affected (as a percentage) by altering the internal bandwidth, yet altering the external bandwidth (the PCI-Express bus, say lowering from x16 to x8 or x4 lanes) has a minimal effect. This is because a great deal of data (textures, for example) gets swapped in and out of VRAM continuously; the nature of a GPU is many parallel computations, whereas a CPU computes in a basically linear way.
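To put rough numbers on that trade-off, here is a quick peak-bandwidth sketch in Python. Dual-channel DDR3-1600 and 5 Gbps GDDR5 on a 256-bit bus are assumed as typical figures for this generation, not taken from any specific product:

# Peak theoretical bandwidth = bus width (bits) x transfer rate per pin / 8
def peak_bw_gb_s(bus_width_bits, transfers_per_sec_per_pin):
    return bus_width_bits * transfers_per_sec_per_pin / 8 / 1e9

print(peak_bw_gb_s(128, 1.6e9))  # dual-channel DDR3-1600 ->  25.6 GB/s
print(peak_bw_gb_s(256, 5e9))    # 256-bit GDDR5 @ 5 Gbps -> 160.0 GB/s

The timings (latency) go the other way, which is the trade-off described above.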


Sorry for the wall of text, but you did ask.
 

Well said lol. You need faster memory for transferring larger amounts of data, which is why GDDR5 is used, but DDR3 is better for transferring smaller amounts of data. If you notice, DDR3 video cards mainly have 64-bit or 128-bit (64x2) memory buses, while GDDR5 cards have 256-bit or 384-bit buses.
 
What do you think about the fact that the PS4 is going to use 8GB of GDDR5? Could you tell what are the pros and cons? I would really like to hear your opinion about that. Thanks!
 
It is much faster than DDR3... it is going to be used for playing games and videos... it is more expensive and just isn't a feasible option for PCs at the moment. PCs are everyday machines... not always for gaming.
 
What do you think about the fact that the PS4 is going to use 8GB of GDDR5? Could you tell what are the pros and cons? I would really like to hear your opinion about that. Thanks!
The use of GDDR5 is probably mandatory if you note the likelihood of increased complexity in next-generation console games (higher polygon counts, more complex post-process image quality). The PS4 will use an AMD APU, which has already demonstrated that it is very sensitive to memory bandwidth, and given the long life cycle of a console, it needs a degree of future-proofing by adding as much bandwidth as possible.
 
The use of GDDR5 is probably mandatory if you note the likelihood of increased complexity in next-generation console games (higher polygon counts, more complex post-process image quality).
Can you clarify what makes GDDR5 a better candidate for higher polygon counts? Forgive the questioning; I'm really ignorant when it comes to these memory/graphical aspects of computing.
 
Yup. No problem
It goes back to the nature of graphics rendering and how the polygons are drawn. Sorry if I'm teaching my grandmother to suck eggs, but it might be a little easier if I outline the graphics pipeline, marking the video memory (vRAM) transactions along the way (this would probably be better as a flow chart, but never mind).
On the software side you have your game (or app) → API (DirectX/OpenGL) → User Mode Driver / ICD → Kernel Mode Driver (KMD) + CPU command buffer → loading textures to vRAM → GPU Front End (Input Assembler).
Up until this point you're basically dealing with the CPU and system RAM: executing and monitoring game code, creating resources, compiling shaders, issuing draw calls, and allocating access to the graphics hardware (since you likely have more than just the game needing resources). From here, the workload becomes hugely more parallel and moves to the graphics card. The video memory now holds the textures and the compiled shaders that the game + API + drivers have loaded; these are fed to the following shader stages as and where needed, as the code is transformed from points (co-ordinates) and lines into polygons and their lighting:

Input Assembler (vRAM input) → Vertex Shader (vRAM input) → Hull Shader (vRAM input) → Tessellator (if tessellation is used) → Domain Shader (vRAM input) → Geometry Shader (vRAM input)

At this point, the stream output can move all or part of the render back into memory to be re-worked. Depending on what is called for, the output can be fed back to any part of the previous shader pipeline (basically a loop) or held in memory buffers. Once the computations are completed, they move on to rasterization (turning the 3D scene into pixels):
Rasterizer → Pixel Shader* (vRAM input and output) → Output Merger (tasked with producing the final screen image, and requiring vRAM input and output)

* The compute shaders (if they exist on the card) are tasked with post-processing (ambient occlusion, film grain, global illumination, motion blur, depth of field, etc.), A.I. routines, physics, and a lot of custom algorithms depending on the app. These also run via the pixel shader stage and can use that shader's access to vRAM input and output.

So basically, the parallel nature of graphics calls for input and output from vRAM at many points, covering many concurrent streams of data. Some of that vRAM is also subdivided into memory buffers and caches to save data that would otherwise have to be recomputed for following frames. All this swapping of data calls for high bandwidth, but latency can be lax (saving power), as any stall in one thread is generally lost in the sheer number of threads queued at any given time.
As I noted previously, GDDR5 allows a write and a read to/from memory every clock cycle, whereas DDR3 is limited to a read or a write, which reduces bandwidth. Graphics DDR also allows for multiple memory controllers to cope with the I/O functions.
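If it helps, the latency-hiding argument boils down to a toy Little's-law calculation. This is only a sketch: the 400-cycle miss latency and the one-request-per-cycle issue rate are illustrative assumptions, not figures from any real GPU:

# Little's law: requests in flight needed = memory latency (cycles) x issue rate (requests/cycle)
def requests_in_flight_needed(latency_cycles, requests_per_cycle):
    return latency_cycles * requests_per_cycle

print(requests_in_flight_needed(400, 1))  # ~400 outstanding requests keep the bus busy through a 400-cycle stall

A GPU with thousands of threads queued can reach that level of concurrency easily; a CPU core running a handful of threads cannot, which is why it leans on low latency and caches instead.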
 
Yikes, I think you've given me an outline for a two-week study. If only my train of thought wasn't elsewhere at the moment, it would be easier for me to understand. :(

I re-read your comment a few times and think I vaguely understand the answer to my question.

P.S.
To be honest, I couldn't bring myself to suck an egg. lol
 
If you just wanted a comparison between DDR and GDDR, then the multiple memory controllers and the read+write per cycle (as opposed to the read or write per cycle of DDR), both of which allow for higher aggregate bandwidth, are probably the main differentiators. I just highlighted why the internal bandwidth is necessary, whereas latency (which increases with bandwidth) is relatively unimportant in graphics memory.

EDIT: I knew I should have hunted for a diagram! This looks a little easier to understand (courtesy of Microsoft)
[Attached image: graphics_pipeline.jpg]
 
dividebyzero, what do you think of the rumour that the next Xbox will use 8GB of DDR3 unified memory (as opposed to the PS4's 8GB of GDDR5) with 32MB of ESRAM to compensate for the bandwidth? Do you think this is purely a cost-reduction move, or does it make sense for a console that will be more of an entertainment hub than a dedicated gaming device? To this end, would the console benefit from lower-latency memory such as DDR3 for the OS + apps, while boosting bandwidth with ESRAM for the GPU for gaming?

thanks
 
Samurai99
tbh I couldn't say definitively what the reasoning behind using ESRAM is for the Xbox, other than the fact that the Xbox 720 doesn't seem that great an evolution from the Xbox 360 (it seems derivative, from my understanding), and that consoles have a much more linear programming model than PC graphics.
Without the need for various driver overheads, a standard API, or much resource management, and with only a moderate graphics horsepower requirement, I can see why that path was taken, but it doesn't look as interesting as the PS4 solution (although my knowledge is largely based on only a few articles, such as this one).
 

Thanks for your reply; yes, it's indeed quite interesting. I wonder how Sony will deal with implementing 8GB of GDDR5 in their console: right now we only have 512MB modules, but in theory modules of up to 1GB are possible. Also, how will they deal with the latency of GDDR5? Perhaps with some caching, or does it not matter? Maybe a version of GDDR5 with tighter timings, if it exists. At the moment the only SoC I can think of similar to this is the Intel Xeon Phi, which uses GDDR5 for its 50 cores, but that's HPC, so cost is no issue. The Durango GPU is also fascinating, as it has a higher level of customisation with the ESRAM and the data move engines. It's only on paper and an unknown quantity at the moment. All of this is of course based on a nine-month-old devkit, so things may change by the time of the reveal.
 
It's unified memory; there is a difference between shared memory and unified memory, if I'm not mistaken. With shared memory, the GPU can access the system memory. With unified memory, both the CPU and GPU access a unified memory address space.
 
macfred said:
What do you think about the fact that the PS4 is going to use 8GB of GDDR5? Could you tell what are the pros and cons? I would really like to hear your opinion about that. Thanks!
The use of GDDR5 is probably mandatory if you note the likelihood of increased complexity in next-generation console games (higher polygon counts, more complex post-process image quality). The PS4 will use an AMD APU, which has already demonstrated that it is very sensitive to memory bandwidth, and given the long life cycle of a console, it needs a degree of future-proofing by adding as much bandwidth as possible.

I think he meant: what will happen to the CPU now that the PS4 is using GDDR5 instead of DDR3? Since you said the following as well:

...The memory is also fundamentally set up for the application it serves:
System memory (DDR3) benefits from low latency (tight timings) at the expense of bandwidth; for GDDR5 the opposite is true. Timings for GDDR5 would seem unbelievably slow in relation to DDR3, but the speed of VRAM is blazing fast in comparison with desktop RAM. This has resulted from the relative workloads that a CPU and a GPU undertake...

You also said above that the APU is "sensitive to memory bandwidth", but that is only on the GPU side. Since the APU has a CPU side too, what will happen now that the PS4 is using GDDR5 for both? Since the two memories are opposites, wouldn't it affect CPU performance? What are your thoughts? Is there any workaround for that as well? Do you think Sony found a way around it?
 
I think he meant: what will happen to the CPU now that the PS4 is using GDDR5 instead of DDR3?
Nothing of any importance.
A console processor requires very limited functionality compared with a PC processor. Console processors aren't required to be optimized for a vast range of software, drivers, hardware changes, OS bloat, and concurrent processes. Even a PC CPU can make do with a very small amount of RAM if the workload is streamlined, as it is in a console.
You also said above that the APU is "sensitive to memory bandwidth", but that is only on the GPU side. Since the APU has a CPU side too, what will happen now that the PS4 is using GDDR5 for both? Since the two memories are opposites, wouldn't it affect CPU performance?
Nope, not in the slightest. What applications would a console be running that are CPU-intensive and require minimal latency? CPUs require minimal latency because multiple applications fight for resources from the available compute threads/cores, and multiple concurrent applications aren't likely to come into play on a console.
Most, if not all, applications running on a console APU would be hardware (GPU) accelerated. At this point I'm not even sure if PhysX wouldn't be HW accelerated on an AMD APU.
 
"The GDDR5 SGRAM uses a 8n prefetch architecture and DDR interface to achieve high-speed operation. The GDDR5 interface transfers two 32 bitwide data words per WCK clock cycle to/from the I/O pins. Corresponding to the 8n prefetch a single write or read access consists ofa 256 bit wide, two CK clock cycle data transfer at the internal memory core and eight corresponding 32 bit wide one-half WCK clock cycle data transfers at the I/O pin" Now for 5 Gbit/s data rate per pin (CK clock runs with 1.25 GHz and WCK with 2.5 GHz). Can someone help calculeta the effective latency in nanoseconds?
 
I think the Xbox One is going with 8GB of DDR3 just for its CPU, and the GPU will have its own memory, like 1-2GB of GDDR5. If so, this could be interesting: it could out-power the PS4 with its shared 8GB of GDDR5. Not sure, but this would also make it more like a PC, and it might again make it easier to develop for. Hopefully at E3 we'll find out what GPU they are both using. I really don't think Microsoft is going to let the CPU and GPU share that 8GB of DDR3, and we'll find out how much of that 8GB of GDDR5 Sony is dedicating to the GPU.
 
The Xbox One has a unified memory architecture, same as the PS4. The Xbox One does have 32MB of cache (I believe on the GPU side, but it could be for the overall APU, since an "APU" is a CPU and GPU smooshed together on the same die). 3GB of the 8GB is reserved for the Xbox operating system, so the question really is "how much of the 5GB is reserved for the GPU?" I don't really know, but given the unified architecture, I'm wondering if the allocation would be dynamic (or at least determined by each developer's choice). Of course there would be a minimum amount needed just to run.
 
What applications would a console be running that are CPU-intensive and require minimal latency?

Games spring to mind. Game and GPU performance is extremely CPU-dependent; just look at the effect of CPU performance on Futuremark scores, for instance.

Sony has provided a powerful GPU architecture, but at the cost of crippling its CPU performance. Microsoft has provided a similar level of GPU memory performance without crippling the CPU, by using a very fast on-GPU SRAM cache.

Plus, Microsoft has provided three times the on-console compute power, available on demand in the cloud... Xbox Live is upgrading from 15,000 physical servers to 300,000 physical servers to support this...

So the final GPU memory performance will likely be similar between the consoles; Sony will have a 50% advantage in shaders, but Microsoft will have an on-console CPU performance advantage, plus three times more compute power available in the cloud...
 