3200MHz timings - 14-14-14-34 vs 16-18-18-38 - What is the "effective" bandwidth difference?

TheBigFatClown

Posts: 968   +393
I've been trying to understand memory timings for years, and while I sometimes fool myself into thinking I do, I really still need help understanding the practical, real-world differences.

So, I'll state a few things I believe to be true. A kit labelled for sale @ 3200MHz is really stating the "effective" transfer rate: DDR memory moves data on both edges of the clock, so the chips actually run @ 1600MHz. The more truthful representation of the bandwidth for a single module is the PC4-25600 description: 3200 MT/s x 8 bytes per transfer = 25.6 GB/sec. With that stated, the total theoretical bandwidth for a 3200MHz dual-channel memory kit is 1600MHz x 2 transfers per clock x 8 bytes per channel x 2 channels = 51.2 GB/sec.
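That arithmetic can be sanity-checked in a few lines of Python (the constants are just the DDR4-3200 dual-channel figures from above):

```python
# Theoretical peak bandwidth for a dual-channel DDR4-3200 setup.
mem_clock_hz = 1600e6     # actual memory clock: 1600 MHz
transfers_per_clock = 2   # DDR: data moves on both clock edges
bytes_per_transfer = 8    # each channel has a 64-bit data bus
channels = 2

peak_bytes_per_sec = (mem_clock_hz * transfers_per_clock
                      * bytes_per_transfer * channels)
print(peak_bytes_per_sec / 1e9, "GB/s")  # -> 51.2 GB/s
```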

So, let's start with that number which is 51.2 GB/sec for a set of memory running at an "effective" speed of 3200MHz in dual-channel mode.

That's the theoretical maximum for all memory kits running at a rated 3200MHz "effective" speed regardless of timings.

But if the memory can only do useful work during a fraction of those clock cycles and has to wait the rest of the time, then we won't get 51.2 GB/sec in reality. It's like a Lamborghini that can do 120 MPH on a highway: if there are stoplights and the car has to stop, the maximum potential is lost. So, my question is: what would be the "effective" bandwidth difference between two memory kits rated to run @ 3200MHz but with the different timings I listed in the subject? Again, they are 14-14-14-34 vs 16-18-18-38. I've asked this question in chats, and I know the difference may be negligible in real life (or maybe it isn't), and people say it's no big deal. But how can I calculate the real numbers myself, just so I know for sure the difference in value between the two kits?

Thanks for reading!

EDIT: One last point to be made. I care about this because my CPU (right now) is an AMD Ryzen 2200G with Vega 8 graphics where these memory speeds matter more than on an Intel system for gaming.

EDIT #2: So, if you lose 18 clock cycles per transfer out of 3.2 billion versus 14 clock cycles out of 3.2 billion, that does seem significant. I guess the best way to solve this puzzle is with benchmarks. Is AIDA64 still the best tool for this task? I can post my results for 3200MHz with 16-18-18-38 timings. I don't have a faster kit.

EDIT #3: I got 45,442 MB/s using the AIDA64 benchmarking option on my 16-18-18-38 memory kit. Seems pretty good. On the other hand, 45.4 / 51.2 works out to about 89% usable bandwidth, with the other 11% spent waiting.
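For what it's worth, that percentage checks out (assuming AIDA64's MB/s figure is decimal, i.e. 10^6 bytes per second):

```python
measured_mb_s = 45_442   # AIDA64 memory read result, MB/s
peak_gb_s = 51.2         # theoretical dual-channel DDR4-3200 maximum

# Fraction of the theoretical maximum actually achieved.
efficiency = (measured_mb_s / 1000) / peak_gb_s
print(f"{efficiency:.1%}")  # -> 88.8%
```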
 
Last edited:

neeyik

Posts: 1,881   +2,199
Staff member
To be honest, you can't. There are so many variables behind even just the reading of data that, unless you have all the figures for the various timings and delays, it would be too difficult to work out.

However, one can generalise the situation based on the timings stated, and it would go something like this: a row address is activated, then tRCD cycles pass before the column can be addressed. Then CL cycles pass before the data burst occurs. The row needs to be precharged before it can be activated again, and the earliest this can occur is tRAS cycles after row activation. After the precharge, tRP cycles take place before the next round of data requests can begin.
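That sequence can be sketched as a simplified model. Note this is only a lower bound for one isolated access - real controllers pipeline and reorder commands across many banks, so it is not a bandwidth prediction:

```python
# A simplified model of the command sequence described above.

def cycles_to_first_data(cl, trcd, trp):
    """Row miss: PRECHARGE the old row (tRP), ACTIVATE the new row (tRCD),
    then issue READ and wait CL cycles for the first data beat."""
    return trp + trcd + cl

def min_act_to_act(tras, trp):
    """Earliest same-bank ACT-to-ACT gap: the row must stay open for at
    least tRAS cycles before the precharge, which itself takes tRP."""
    return tras + trp

print(cycles_to_first_data(14, 14, 14))  # 42 cycles for the CL14 kit
print(min_act_to_act(34, 14))            # 48 cycles between row activations
```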

14-14-14-34 refers to CL-tRCD-tRP-tRAS. We can ignore tRAS because it's only significant if it's more than the sum of the other three (which it isn't for either memory configuration). Commands and data burst lengths will be the same in terms of time, so all we need to do is compare the sum of CL+tRCD+tRP:

14+14+14 = 42 cycles
16+18+18 = 52 cycles

The latter is 24% slower than the former. Does this mean the bandwidth is 24% lower? Potentially yes, but because of the multitude of other factors and latencies behind the scenes, calculating this without all the relevant information isn't possible.
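Converted to real time at DDR4-3200's 1600 MHz memory clock, those cycle counts give (as a back-of-envelope figure, ignoring everything else in the pipeline):

```python
clock_mhz = 1600  # DDR4-3200's actual memory clock

# Sum of CL + tRCD + tRP for each kit, converted from cycles to nanoseconds.
for label, cycles in [("14-14-14", 14 + 14 + 14), ("16-18-18", 16 + 18 + 18)]:
    print(label, cycles / clock_mhz * 1000, "ns")  # 26.25 ns vs 32.5 ns

print(round(52 / 42 - 1, 3))  # 0.238: the second kit needs ~24% more cycles
```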

Obviously, memory settings do affect performance:

[chart: AIDA64 bandwidth results across various DDR4 speeds and timings]

But notice how there's little difference between the AIDA64 bandwidth results for CL14 and CL16 at the same memory speed? It's only as the clock is raised that we see significant changes. DDR4-3800 CL16 had roughly 25% better bandwidth than DDR4-3000 CL16, whereas going from CL16 to CL14 at DDR4-3000 made only a marginal difference.

For the situation you're interested in, think of the timings as being a stabilising factor and focus more on raising the clock speed to gain better performance for your APU.
 

TheBigFatClown

Posts: 968   +393
Thank you for the insights on this issue. So, I should increase my memory speed instead of worrying about getting a new 3200MHz set with lower timings. It's bizarre how much of a price gap there is between the two sets when the value isn't there. Oh well, their loss. I won't buy the 14-14-14-34 set then.

On another note, I went back and read a page of the article that actually prompted me to purchase my first AMD CPU/APU in years. The 3200MHz memory benchmarks in that review are horrible compared to my current kit, a G.Skill FlareX 3200MHz 16-18-18-38 set. So that kinda makes me feel good.


On page 9 of the same review, the reviewer states that the G.Skill 3200MHz FlareX memory kit was used but I couldn't seem to find what timings were on the kit. Regardless, I'm getting approximately 10 GB/sec more bandwidth as reported by AIDA 64 than what was shown in the review. That's crazy baby!
 
Last edited:

neeyik

Posts: 1,881   +2,199
Staff member
So, I should increase my memory speeds vs worrying about getting a new set of 3200MHz with lower timings. It's bizarre how much of a price gap there is between the 2 sets when the value is not there.
DRAM is manufactured and then binned in a similar manner to CPUs and GPUs - a large wafer containing all the DRAM dies gets fabricated, cut up, and tested at certain clock speeds and voltages.

During the testing, numerous timings are cycled through, and the dies get binned into various categories based on the test parameters. So from the same wafer, one could have dies that happily run at 14-14-14-38, for example, whereas others needed longer latencies to run at the same clock and voltage.

Due to the nature of fabrication defects and the huge number of chips made, the pattern of bins will follow a normal distribution. That means the bulk of them will sit in the 'middle', with a decreasing number being better or worse. The very best will be the smallest in quantity, so the supply of them will be restricted - hence the big differences in price.

Memory with lower rated timings is going to be more stable than those rated higher, for the same clock speed. Personally I would always get memory with the lowest possible latencies, for a reasonable clock speed (e.g. 3000 or 3200). That way, I could increase the clocks and still have maneuvering room to adjust the timings to ensure it ran stable. Well, I used to: now I just get the best I can afford and switch on the XMP profile.

It's also worth noting that reported memory bandwidth, regardless of the benchmark, is affected by a number of variables and not just clocks+timings. The motherboard design, memory topology, trace layout, and BIOS revision can all make a difference.

I'd also worry less about bandwidth tests and more about actual application performance. The former is a 'best case' workload (typically simple sequential reads/writes), whereas a game is going to involve a lot of random reads/writes across all kinds of data patterns and sizes. Here, latency can be more significant than outright clock speed, although it does depend on how CPU-constrained the game is.

So with your DDR4-3200 16-18-18-38 set, you've got plenty of scope to fiddle about with everything, to get the best performance for the application that matters the most for you. For example, running it as DDR4-3000 but with tighter timings would probably be more beneficial in games that trying to run it at DDR4-3400.
 

TheBigFatClown

Posts: 968   +393
Memory with lower rated timings is going to be more stable than those rated higher, for the same clock speed. Personally I would always get memory with the lowest possible latencies, for a reasonable clock speed (e.g. 3000 or 3200). That way, I could increase the clocks and still have maneuvering room to adjust the timings to ensure it ran stable. Well, I used to: now I just get the best I can afford and switch on the XMP profile.

The paragraph I quoted in bold is very interesting. I don't think I really understand what you're saying there, but I'm guessing it's 'overclocking territory', which I don't generally venture into. However, I may be guilty of doing so without ever having given it much thought. The kit I'm using right now is a G.Skill FlareX 3200MHz 16-18-18-38 16GB (2x8GB) set. I did have to go into my ASRock AB350M Pro4 BIOS and enable XMP 2.0 to get these chips to run at 3200MHz. Technically, that's an overclock, raising the voltage from 1.2V to 1.35V, which CPU-Z confirms in Windows. Do I have any reason to think these memory chips won't be stable as long as I don't go into hobbyist (if that's the right word) overclocking territory? They've been running @ 3200MHz for about a week now without any Windows 10 freeze-ups or BSODs of any kind, so I'm happy with the stability.

I like getting bang for my buck. I've never been really comfortable with experimental overclocking. In fact, your post makes it pretty clear that the memory manufacturers have done all this "binning" for us already, so they rate the chips in advance and that's the performance one should expect.
 

neeyik

Posts: 1,881   +2,199
Staff member
The paragraph I quoted in bold is very interesting. I don't think I really understand what you're saying there. But I'm guessing it's 'overclocking territory' which I don't generally venture into.
Yes, that's correct. Let's say you have some memory that runs at DDR4-2666 with timings of 15-15-15, which are fairly low. Overclocking that to DDR4-3000 with the same timings might be unstable, but not if the timings are raised to 17-17-17. Because they were already low to begin with, they only had to be increased a little to achieve the required stability.

This is exactly how the RAM in the system I'm using right now is configured. By default, it's DDR4-2666 15-15-15, but the XMP profile is DDR4-3000 17-17-17. Of course, one can do the same with, say, DDR4-2666 20-20-20, but as those timings are quite 'slow', raising them to achieve stability might offset any gains from increasing the clock.
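To put numbers on that example, convert the CAS latency to nanoseconds using the memory clock (half the transfer rate):

```python
def cas_ns(transfer_rate_mts, cl):
    """Absolute CAS latency in nanoseconds for a given DDR transfer
    rate (in MT/s) and CAS latency (in cycles)."""
    clock_mhz = transfer_rate_mts / 2   # DDR: clock is half the transfer rate
    return cl / clock_mhz * 1000        # cycles -> nanoseconds

print(round(cas_ns(2666, 15), 2))  # 11.25 ns at stock DDR4-2666 CL15
print(round(cas_ns(3000, 17), 2))  # 11.33 ns with the XMP DDR4-3000 CL17
```

In other words, the XMP profile delivers ~12.5% more transfer rate at essentially the same absolute latency.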

Do I have any reason to think these memory chips won't be stable as long as I don't go into hobbyist (if that's the right word) overclocking territory?
The idea behind XMP is that it provides a standard for DRAM manufacturers to adhere to, with regards to what clock+timings+voltage combination is stable. And as you've pointed out, the chips have already been binned and appropriately tested, even after being mounted on the DIMM package.

However, what they can't be certain of is that it will be stable in your system, which is why XMP carries no guarantee. This is because the CPU and motherboard, along with the BIOS, play a significant role in memory stability. My own system was somewhat glitchy when using XMP, but later BIOS revisions solved that problem.
 

TheBigFatClown

Posts: 968   +393
It's very interesting how all 3 parts play a significant role. I'm very happy with my current setup, as it seems very stable. My specific G.Skill memory kit isn't even listed on my ASRock motherboard's QVL, but it works @ 3200MHz.
 

neeyik

Posts: 1,881   +2,199
Staff member
And there's a lot more than those three - for example, CL (column address strobe latency) applies just to data reads; there's a separate CWL for data writes. Most timings have a pretty narrow range of settings, though, and the three stated are among the widest-ranging ones (plus they are amongst the most critical timings affecting stability).

Motherboard manufacturers have a daunting task when it comes to testing RAM for their QVL - there are over 40 memory vendors (although only five or so actual manufacturers of the chips themselves), and each one will offer 2 GB, 4 GB, 8 GB, etc. DIMMs of varying clock speeds and timings. And if the motherboard offers multiple DIMM slots, then various combinations have to be tested as well. To be honest, it's pot luck whether your memory actually appears on the list or not!