Yup. No problem
It goes back to the nature of graphics rendering and how the polygons are drawn. Sorry if I'm teaching my grandmother to suck eggs, but it might be easier if I outline the graphics pipeline and flag where video memory (vRAM) and system RAM come into play at each stage (it would probably work better as a flow chart, but never mind).
On the software side you have:

Game (or app) → API (DirectX/OpenGL) → User Mode Driver / ICD → Kernel Mode Driver (KMD) + CPU command buffer → loading textures to vRAM → GPU Front End (Input Assembler).
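The CPU-side part of that flow can be sketched in a few lines of toy Python. This is purely illustrative, not a real graphics API: the app records commands into a buffer instead of executing them, and the (pretend) kernel-mode driver later submits the whole batch to the GPU front end in one go.

```python
# Toy sketch of the CPU-side flow: the app/UMD records draw commands into
# a command buffer; the KMD later submits the batch to the GPU front end.
# All names here are made up for illustration, not a real API.

command_buffer = []

def record(cmd, **args):
    """App/UMD side: append a command instead of executing it immediately."""
    command_buffer.append((cmd, args))

def submit(buffer):
    """KMD side: hand the whole recorded batch over in one go."""
    executed = [cmd for cmd, _ in buffer]
    buffer.clear()
    return executed

record("bind_texture", slot=0, texture="brick.dds")
record("set_shader", stage="vertex", shader="basic_vs")
record("draw", vertex_count=36)

print(submit(command_buffer))  # → ['bind_texture', 'set_shader', 'draw']
```

The point of the buffer is batching: the CPU keeps recording while the GPU drains earlier submissions, which is what keeps the two sides from stalling each other.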
Up until this point you're basically dealing with the CPU and system RAM: executing and monitoring game code, creating resources, compiling shaders, issuing draw calls, and arbitrating access to the graphics card (since you likely have more than just the game needing its resources). From here, the workload becomes hugely more parallel and moves to the graphics card. The video memory now holds the textures and compiled shaders that the game + API + drivers have loaded. These are fed into the first few stages of the pipeline as and where needed by each of the following shaders, as the code is transformed from points (co-ordinates) and lines into polygons and their lighting:
Input Assembler (vRAM input) → Vertex Shader (vRAM input) → Hull Shader (vRAM input) → Tessellator (fixed function) → Domain Shader (vRAM input) — these three tessellation stages only if tessellation is used → Geometry Shader (vRAM input)
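To give a feel for what the vertex stage is doing, here's a toy Python version: the same small program runs independently on every vertex, transforming object-space positions toward screen space. A real GPU runs thousands of these in parallel; the list comprehension below is just the serial equivalent, and the scale/offset "shader" is an invented stand-in for a real transform matrix.

```python
# Minimal sketch of the vertex shader stage: run one small function per
# vertex, independently. The scale-and-translate "shader" is a toy
# stand-in for a real model-view-projection transform.

def vertex_shader(pos, scale, offset):
    """Toy 'shader': scale and translate one (x, y, z) vertex."""
    x, y, z = pos
    return (x * scale + offset[0], y * scale + offset[1], z * scale + offset[2])

# One triangle's worth of vertices pulled from (what would be) vRAM.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]

transformed = [vertex_shader(v, scale=2.0, offset=(0.5, 0.5, 0.0)) for v in vertices]
print(transformed)  # → [(0.5, 0.5, 0.0), (2.5, 0.5, 0.0), (0.5, 2.5, 0.0)]
```

Because each vertex is independent, the hardware is free to process them in any order across however many shader cores it has — which is exactly why the workload "becomes hugely more parallel" at this point.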
At this point, the Stream Output can move all or part of the render back into memory to be re-worked. Depending on what is called for, the output can be fed back into any part of the previous shader pipeline (basically a loop) or held in memory buffers. Once the computations are complete, they move on to rasterization (turning the 3D scene into pixels):
Rasterizer → Pixel Shader* (vRAM input and output) → Output Merger (tasked with producing the final screen image; vRAM input and output)
* The Compute Shaders (if the card supports them) are tasked with post-processing (ambient occlusion, film grain, global illumination, motion blur, depth of field, etc.), A.I. routines, physics, and a lot of custom algorithms depending on the app. They also run via the pixel shader hardware and can use that stage's access to vRAM for input and output.
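To make the rasterization step concrete, here's a toy software rasterizer: an edge-function (signed area) test decides which pixel centres fall inside a triangle — the point at which the pixel shader would then run for each covered pixel. The triangle and grid size are made up for illustration.

```python
# Rough sketch of rasterization: per pixel, test whether the pixel centre
# lies inside the triangle (edge-function test). Each covered pixel is
# where the pixel shader would be invoked. Triangle/grid are arbitrary.

def edge(a, b, p):
    # Signed area: > 0 if p is to the left of the directed edge a -> b.
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def rasterize(tri, width, height):
    a, b, c = tri
    covered = []
    for y in range(height):
        for x in range(width):
            p = (x + 0.5, y + 0.5)  # sample at the pixel centre
            if edge(a, b, p) >= 0 and edge(b, c, p) >= 0 and edge(c, a, p) >= 0:
                covered.append((x, y))  # pixel shader would run here
    return covered

pixels = rasterize([(0, 0), (4, 0), (0, 4)], 4, 4)
print(len(pixels))  # → 10 covered pixels
```

Again, every pixel test is independent of every other, so the hardware fans them out across many shader cores — and each invocation may fetch texture data from vRAM, which is where the many concurrent memory streams come from.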
So basically, the parallel nature of graphics calls for input to and output from vRAM at many points, covering many concurrent streams of data. Some of that vRAM is also subdivided into memory buffers and caches to hold data that would otherwise have to be re-computed for subsequent frames. All this swapping of data calls for high bandwidth, but latency can be lax (saving power) since any stall in one thread is generally lost in the sheer number of threads queued at any given time.
As I noted previously, GDDR5 allows both a write and a read to/from memory every clock cycle, whereas DDR3 is limited to a read or a write, which reduces bandwidth. Graphics DDR also allows for multiple memory controllers to cope with the I/O load.
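For a sense of scale, the bandwidth arithmetic is simple: per-pin data rate times bus width. The figures below (7 Gb/s per pin, 256-bit bus) are just plausible GDDR5-era example values, not any particular card's spec.

```python
# Back-of-the-envelope vRAM bandwidth. Example figures only — not a
# specific card: 7 Gb/s effective per pin, 256-bit bus across all
# memory controllers combined.

data_rate_gbps_per_pin = 7   # effective transfers per pin, in Gb/s
bus_width_bits = 256         # total bus width across the controllers

bandwidth_gbs = data_rate_gbps_per_pin * bus_width_bits / 8  # bits -> bytes
print(bandwidth_gbs)  # → 224.0 (GB/s)
```

Splitting that 256-bit bus across several independent controllers is what lets many of those concurrent shader memory streams be serviced at once instead of queuing behind one controller.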