We’ve been collecting data on memory bandwidth for some time now – of course we have – but one of the big questions hanging over Skylake is what the DDR4 support really brings to the table. It’s also worth comparing four generations of memory controllers – two dual-channel and two quad-channel – and seeing what the weaknesses and strengths of each one are.
Editor's note: Guest author Dustin Sklavos is a Technical Marketing Specialist at Corsair and has been writing in the industry since 2005. This article was originally published on the Corsair blog.
With all that in mind, we compared Intel’s Ivy Bridge-E (quad-channel DDR3), Haswell (dual-channel DDR3), Haswell-E (quad-channel DDR4), and Skylake (dual-channel DDR4) at a variety of speed grades, using AIDA64’s synthetic tests to isolate raw memory bandwidth. You may have heard by now that Skylake has a very robust memory controller, and as you’ll see, that’s turned out to be true.
The following CAS latencies were used for each speed grade:
| Memory clock | DDR3 CAS latency | DDR4 CAS latency |
One crucial thing to point out with DDR4 is that it has an oddball “CAS latency hole.” You’ll notice we jumped directly from C16 to C18; C17 isn’t officially supported. The result is that there is a substantial jump in CAS latency moving up to 3466MHz that needs to be ameliorated, amusingly enough, by driving the memory at even higher clocks.
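To see why a higher clock can offset the C16-to-C18 jump, it helps to convert CAS latency from cycles to nanoseconds. The sketch below is a rough illustration, not measured data: the pairings of speed grade and CAS latency are assumed for the example, and the function name `cas_latency_ns` is made up here. It treats the speed grade as the data rate, so one memory clock cycle lasts 2000 divided by that figure, in nanoseconds.

```python
def cas_latency_ns(data_rate: float, cas_cycles: int) -> float:
    """Convert a CAS latency in clock cycles to nanoseconds.

    DDR memory transfers data twice per clock, so the clock in MHz is
    half the advertised speed grade and one cycle lasts
    2000 / data_rate nanoseconds.
    """
    return cas_cycles * 2000.0 / data_rate


# Assumed (hypothetical) speed grade / CAS latency pairings, chosen only
# to illustrate the jump from C16 to C18 described in the text.
grades = [(3200, 16), (3466, 18), (3600, 18)]

for rate, cl in grades:
    print(f"DDR4-{rate} C{cl}: {cas_latency_ns(rate, cl):.2f} ns")
```

With these assumed timings, the C18 kit at 3466 comes out around 10.4 ns against 10 ns for C16 at 3200, and stepping up to 3600 at the same C18 brings it back to 10 ns.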
The blue bars represent our DDR3 configurations, while the red bars represent our DDR4 configurations. This should hopefully lay to rest some concerns about DDR4’s higher latencies negatively impacting performance when compared to DDR3. There were situations where DDR3 could be faster than DDR2 during that transition, but DDR4 is a different animal. It offers consistently higher read bandwidth at the same clock.
Note also that Haswell’s memory controller has a hard time going past 2400MHz, which really has been the performance sweet spot in DDR3. Yet there’s no point where the wheels start to shake on Skylake’s controller; it continues scaling, even up to and beyond 3600MHz.
Finally, one more trend you’ll see: DDR4-3000 on Skylake produces more raw memory bandwidth than Ivy Bridge-E’s default DDR3-1600. We now have a mainstream, dual-channel platform capable of generating nearly as much memory bandwidth as last generation’s quad-channel.
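For context, the theoretical ceiling of each configuration is easy to work out: each channel is 64 bits (8 bytes) wide, so peak bandwidth is the data rate times 8 bytes times the number of channels. This is a back-of-the-envelope sketch, not AIDA64 output; `peak_bandwidth_mbs` is a name invented for the example, and measured results land below these ceilings by an amount that depends on the memory controller.

```python
def peak_bandwidth_mbs(data_rate: int, channels: int, bus_bytes: int = 8) -> int:
    """Theoretical peak bandwidth in MB/s.

    data_rate is the speed grade (millions of transfers per second),
    channels is the number of memory channels, and bus_bytes is the
    width of one channel in bytes (64 bits = 8 bytes).
    """
    return data_rate * channels * bus_bytes


# Dual-channel DDR4-3000 (Skylake) vs. quad-channel DDR3-1600 (Ivy Bridge-E)
skylake_ddr4_3000 = peak_bandwidth_mbs(3000, channels=2)
ivy_bridge_e_ddr3_1600 = peak_bandwidth_mbs(1600, channels=4)

print(f"Dual-channel DDR4-3000: {skylake_ddr4_3000} MB/s")
print(f"Quad-channel DDR3-1600: {ivy_bridge_e_ddr3_1600} MB/s")
```

On paper, quad-channel DDR3-1600 (51,200 MB/s) still has a slightly higher ceiling than dual-channel DDR4-3000 (48,000 MB/s), so how the two compare in practice comes down to how much of that ceiling each controller actually delivers.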
Interestingly, memory write operations seem to have consistently been a minor sore spot. Haswell-E’s memory write performance capped at ~48000 MB/s and basically stayed there regardless of speed. That’s mighty fast, but Skylake actually exceeds it at 3200MHz and beyond. Skylake also easily eclipses Haswell and Ivy Bridge-E.
The memory copy operations look basically the same as the read operations. Haswell has the same drop at 2666MHz, and the DDR4-equipped platforms are consistently faster even at the same speed. Skylake’s exceptional ability to scale up in clock speed lets it make up bandwidth and, at a high enough speed, come within striking distance of Haswell-E.
Latency is arguably what DDR4 skeptics are going to gravitate toward despite the immense raw bandwidth of the technology. DDR4 latency is a bit higher than DDR3, but not catastrophically so.
What you need to focus on is essentially mapping the curve of DDR3 against the curve of DDR4. DDR3 more or less starts at 1600MHz for mainstream platforms, while DDR4 doesn’t go below 2133MHz. So at the entry level for each platform, latency is more or less the same, while bandwidth is significantly better on DDR4.
First, while Skylake’s instructions-per-clock gains are a little underwhelming, its memory controller is something else entirely. We’ll need to see how it handles DDR3L – and we’ll be testing that in greater detail soon enough – but it has none of the scaling hiccups its predecessors have. Skylake’s memory controller is incredibly robust, and Skylake seems to be more efficient with memory in general.
Second, DDR4 just doesn’t have the latency issues DDR3 had during the transition from DDR2. In fact, it’s only when you’re making the C16 to C18 jump that overall latency starts to creep up, but that’s solved almost immediately by just going to the next speed grade.
Ultimately, DDR4 draws less power, runs cooler, and delivers more bandwidth-per-clock than the venerable DDR3, and it has the scaling headroom that DDR3 lacked in both capacity and raw bandwidth. In other words, it’s a worthy successor.