DirectStorage benchmark shows massive transfer speed improvements

Daniel Sims

Why it matters: Microsoft's DirectStorage API promises to bring PCs ultra-fast load times akin to what Xbox Series and PlayStation 5 users have experienced for two years. As the first game supporting DirectStorage prepares to launch, a benchmark shows real performance gains on retail hardware.

Tests from PC Games Hardware show that Microsoft's DirectStorage API can help NVMe SSDs load assets significantly faster than SATA SSDs. They also show the enormous advantage of GPU-based decompression over CPU decompression.

The site ran Microsoft's publicly available avocado-loading DirectStorage demo on a SATA SSD, a PCIe 3.0 NVMe SSD, and a PCIe 4.0 NVMe SSD. It also compared decompression speeds across three GPUs and a CPU: an AMD Radeon RX 7900 XT, an Intel Arc A770, an Nvidia GeForce RTX 4080, and a 5.2GHz Intel Core i9-12900K.

The chart below displays the transfer rate of each hardware configuration in GB/s, showing the mean of five test runs. The NVMe SSDs ran several times faster than the SATA SSD here, and PCIe 4.0 had a slight advantage over PCIe 3.0. Perhaps the strangest result is that the A770 outperforms the RX 7900 XT and RTX 4080 in GPU decompression despite being in a lower weight class for game performance.
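For readers who want to reproduce this kind of figure from their own runs, here is a minimal sketch of the arithmetic: transfer rate is bytes loaded divided by elapsed time, averaged over the runs. The function names and the sample timings are hypothetical and are not taken from the PC Games Hardware data.

```python
from statistics import mean


def transfer_rate_gbps(bytes_loaded: int, seconds: float) -> float:
    """Return throughput in GB/s, using 1 GB = 10**9 bytes."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return bytes_loaded / seconds / 1e9


def mean_transfer_rate_gbps(bytes_loaded: int, run_times: list[float]) -> float:
    """Average the per-run throughput over several test runs."""
    return mean(transfer_rate_gbps(bytes_loaded, t) for t in run_times)


if __name__ == "__main__":
    # Hypothetical example: five runs loading 4 GB of assets each.
    hypothetical_bytes = 4 * 10**9
    hypothetical_run_times = [0.50, 0.52, 0.48, 0.50, 0.50]
    rate = mean_transfer_rate_gbps(hypothetical_bytes, hypothetical_run_times)
    print(f"mean transfer rate: {rate:.2f} GB/s")
```

Averaging the per-run rates, rather than dividing total bytes by total time, is one of two reasonable choices here; the article does not say which one the chart uses.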

Screenshots from the demo demonstrate the difference between CPU and GPU decompression. Some show a few gigabytes of assets taking between one and a half seconds and five seconds to load, with between 30 percent and 100 percent CPU utilization. Others show the same assets loading in around half a second with less than five percent CPU utilization, indicating the GPU has taken over the job.

The demo shows promising early results for hardware outside of Microsoft's labs. Furthermore, the results mark a huge turnaround from 2020 tests, which showed that recent games don't fully utilize the bandwidth advantage of NVMe drives over SATA drives.

However, loading a bunch of avocados isn't the same as loading a 3D game environment. Those interested won't need to wait long to test DirectStorage's real-world performance. The feature will debut in Square Enix's Forspoken, which launches on January 24.

At GDC in March, Square Enix claimed that DirectStorage lets Forspoken load new scenes and environments in less than two seconds on an NVMe SSD compared to several seconds on a SATA SSD and almost half a minute on an HDD. Microsoft designed the new API for use on Windows 11. Due to its legacy storage stack, Windows 10 systems will only see limited benefits.


 
Is there any way to make this available on older games and other OS, or is it just going to be limited to Win11 and newer games only? It would be a literal game changer if it was available for everything, and it will still be nice going forward I'm sure, but I tend to play older games that I'd like to get this for. Especially open world like Skyrim and TW3. I find for other genres things are already fine. Funny thing, people used to insist there is no difference between SATA SSD and PCIE, and my own tests proved otherwise back on FO3 and FNV when PCIE SSDs started coming to the consumer space. Now people claim a huge difference between PCIE 3 and PCIE 5, but all the testing I've seen shows for games there is none. Random 4K reads don't seem to have changed at all really.
 
There's always a way to make something happen in computing. The issue is never the lack of a way, but the lack of financial incentive. Microsoft obviously doesn't have much motivation to backport new features to EOL systems. Those systems are already sold. And what would make you upgrade then, too? There have to be some new features to make you upgrade; only the very tech-savvy upgrade just for the sake of being up to date.
 
It's ridiculous it's taken this long, and also ridiculous that it's not available on Windows 10. This reminds me of the DX10 fiasco, where you needed to upgrade to Vista to play Crysis in DX10.
 
I don't know what to think of this. Does this improve performance of the game (I don't see how), or is it only improving loading times? Most games already have decent loading times (with some exceptions), so I don't see the excitement around it. Great, my game now loads in 5 seconds instead of 10, but my fps is still in the gutter.
 
Hello Techspot, not all NVMe drives support Direct Storage (natively). The controller itself can have the protocol native and if so performance leaps. If not, it still will show big improvements, but to take full advantage you should have a drive with native direct storage protocol baked in.

As of right now, I have only noticed two drives in retail that have it. That said, we know new drives are coming out, and I would guess the new controllers will have NATIVE Direct Storage protocols. Time will tell.
 
I don't know what to think of this. Does this improve performance of the game (I don't see how), or is it only improving loading times?
Games that constantly stream assets from the system memory and/or storage would potentially get a performance benefit from using DS -- it depends on whether the title in question is CPU-limited on the PC running it.

As things currently stand, all PC games load assets via the CPU, including any decompression if it's required, and if this involves a big pile of small files, the CPU overhead of reading and writing them can easily bog down the processor. DS greatly reduces this by offering a much more efficient data I/O path: instead of a completely serial, one-file-at-a-time system, it lets many requests be issued in parallel.

The API also provides GPU decompression of files, so they can stay compressed on the drive to reduce the storage footprint and bandwidth requirements, and free up the CPU from handling the decompression duties.
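As a rough analogy for the serial versus parallel point above, here is a minimal Python sketch that loads a pile of small files one at a time and then as a batch of concurrent requests. This is not the DirectStorage API (which is a C++ API); the helper names and the demo file layout are hypothetical, and a thread pool only stands in for the idea of keeping many requests in flight.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def read_file(path: str) -> bytes:
    """Read one asset file fully into memory."""
    with open(path, "rb") as f:
        return f.read()


def load_serial(paths: list[str]) -> list[bytes]:
    """One-file-at-a-time loading: each read waits for the previous one."""
    return [read_file(p) for p in paths]


def load_batched(paths: list[str], workers: int = 8) -> list[bytes]:
    """Issue many reads at once so several requests are in flight together."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with paths.
        return list(pool.map(read_file, paths))


def make_demo_files(directory: str, count: int = 64, size: int = 4096) -> list[str]:
    """Create small placeholder 'asset' files (hypothetical demo data)."""
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"asset_{i:03d}.bin")
        with open(path, "wb") as f:
            f.write(bytes([i % 256]) * size)
        paths.append(path)
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo_paths = make_demo_files(tmp)
        # Both strategies must return identical data; only the I/O pattern differs.
        assert load_serial(demo_paths) == load_batched(demo_paths)
        print(f"loaded {len(demo_paths)} files both ways")
```

Whether the batched version is actually faster depends on the drive, the OS cache, and the file sizes, so treat it as an illustration of the request pattern rather than a benchmark.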
 
Hello Techspot, not all NVMe drives support Direct Storage (natively). The controller itself can have the protocol native and if so performance leaps. If not, it still will show big improvements, but to take full advantage you should have a drive with native direct storage protocol baked in.
The vast majority of NVMe drives on the market do support the use of DirectStorage and it's nothing to do with the controller -- it's entirely down to whether the NVM Express driver is compatible. For example, Samsung's driver is somewhat picky about it, although this may improve in time, but the standard Microsoft driver works with pretty much any NVMe SSD and is fully compatible with the DirectStorage API.
 
I don't know what to think of this. Does this improve performance of the game (I don't see how), or is it only improving loading times? Most games already have decent loading times (with some exceptions), so I don't see the excitement around it. Great, my game now loads in 5 seconds instead of 10, but my fps is still in the gutter.

Depends on the game. The best example I can think of so far is Ratchet & Clank: Rift Apart on PS5.

The game mechanics are designed around fast storage where you can open a ‘rift’ and load an entirely new game world in a blink of an eye.

This is very much a core element of the game though, it’s designed around every PS5 having NVMe storage response times.
 
Anyone know how to download and run the SLN file? Can't see the option on git. I've downloaded VS 2022 as well
Download the entire DirectStorage sample set from Git. You'll also need Windows SDK. Right-click on the Bulk Load demo solution in Visual Studio, and retarget it to the Windows SDK you have installed.

Edit: Probably better to change the Project properties to the installed SDK instead.
 
Depends on the game. The best example I can think of so far is Ratchet & Clank: Rift Apart on PS5.

The game mechanics are designed around fast storage where you can open a ‘rift’ and load an entirely new game world in a blink of an eye.

This is very much a core element of the game though, it’s designed around every PS5 having NVMe storage response times.
The game definitely benefited from the fast storage reads and would be hard to replicate on PC without the appropriate hardware and the DirectStorage API, but PCIe Gen 4 and fast enough NVMe drives should be able to do it. It's really sad to see expensive PC hardware lagging behind consoles in so many ways these days. Even shader caching: it's just annoying to see consoles have basically zero issues with that while PC games stutter along building a shader cache. I think this is more of a game engine issue, but it's an issue all the same. I don't know if DirectStorage can help with the cache problem or not.
 
I am not sure it is a game engine issue. Essentially the same game loads a lot faster on, say, an Xbox Series S/X than on a PC with even more powerful hardware. The game engine, as far as I am aware, is the same. The hardware is also essentially a Ryzen Zen 2 CPU with an RDNA 2 GPU, so it is very PC-like. The key difference, however, is the OS/software. The Xbox, as far as I am aware, runs a cut-down, custom version of Windows. I am not sure if this is true, but it is what I recall reading.
In any case, I am quite interested to know if Direct Storage will actually result in lower FPS in games since the load has moved from CPU to GPU. In games, GPU is generally at 100% utilization while CPU utilization rate is quite low, assuming one is using an 8 core CPU.
 
It's more than just load time. They should put more emphasis on asset grabbing than constantly focusing on load times. If something can grab assets faster, that means more can be loaded at any given moment, which is a HUGE DEAL. This is why you see worlds filled with more; see the demo with Ratchet and Clank. Faster load times aren't the real focus; they're more of a byproduct of the real goal.
 
"Square Enix claimed that DirectStorage lets Forspoken load new scenes and environments in less than two seconds on an NVMe SSD compared to several seconds on a SATA SSD and almost half a minute on an HDD."

OK, but that was a well known fact for many years now!! Games do load and play faster on NVMes....

Please do explain to us peasants, what will this new technology really help: Games? applications?? Data backup? Calculating ballistic trajectories? Decompressing huge zipped files? Bragging rights? What??

PS: The article keeps harping on "decompression", so decompressing what exactly and to what end??
 
Eh, these articles are always written for people who have already been following this for years.

First, if someone wants to run the demo, here is a post explaining where to download and what to run WITHOUT needing to compile SLN in Visual Studio:
https://forums.tomshardware.com/thr...-amd-intel-nvidia-ready.3784310/post-22857085

As for general comments, loading times as such are of little value, as already mentioned. Two seconds or 15 seconds doesn't matter much when loading a game or a save. But when you are running (or better yet, driving or flying) through an open-world game and suddenly jump from the jungle into a big city, or "rift" and teleport to a whole new world, that can be loaded without an actual load screen (or any masking technique such as cutscenes, elevators, and dumb animations), because by the time your character takes three steps, the game can load a kilometer of space. Better yet, on systems with less memory (or GPUs with less memory), you can just stream assets (models, textures, etc.) without needing to load the whole level in advance. So THAT may show up as better performance on such hardware, because you can use the NVMe drive as a sort of extended memory.
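To make the "NVMe as extended memory" idea a bit more concrete, here is a minimal sketch of a streaming asset cache with a fixed memory budget that evicts the least recently used assets when a new one needs room. The class name, the `loader` callback, and the budget are all hypothetical; real engines use far more elaborate schemes (priorities, prefetching by camera position, and so on).

```python
from collections import OrderedDict
from typing import Callable


class StreamingAssetCache:
    """Keep recently used assets resident within a fixed byte budget (LRU eviction)."""

    def __init__(self, budget_bytes: int, loader: Callable[[str], bytes]):
        self.budget_bytes = budget_bytes
        self._loader = loader  # hypothetical callback that reads an asset from storage
        self._resident: "OrderedDict[str, bytes]" = OrderedDict()
        self._used = 0

    def get(self, name: str) -> bytes:
        """Return an asset, streaming it in from storage on a cache miss."""
        if name in self._resident:
            self._resident.move_to_end(name)  # mark as most recently used
            return self._resident[name]

        data = self._loader(name)
        if len(data) > self.budget_bytes:
            raise ValueError(f"asset {name!r} is larger than the whole budget")

        # Evict least recently used assets until the new one fits.
        while self._used + len(data) > self.budget_bytes:
            _, evicted = self._resident.popitem(last=False)
            self._used -= len(evicted)

        self._resident[name] = data
        self._used += len(data)
        return data

    def resident_names(self) -> list[str]:
        """Names of resident assets, least recently used first."""
        return list(self._resident)


if __name__ == "__main__":
    # Hypothetical loader: every asset is 100 bytes; budget fits two of them.
    cache = StreamingAssetCache(250, lambda name: b"x" * 100)
    for zone in ["jungle", "city", "rift_world"]:
        cache.get(zone)
    print(cache.resident_names())
```

The faster the storage path, the smaller the budget can be before cache misses turn into visible pop-in or stalls, which is the trade-off the comment above is describing.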

Now, as for why this will take ages to take root on PC: not everyone has an NVMe drive (or even an SSD) yet, and devs are holding off on designing games around it because it would alienate people with lesser hardware. Console devs don't have that issue. If a game isn't planned for release on PS4, just PS5, they can make a game that relies 100% on asset streaming. And that brings better performance, smoother gameplay, more detailed worlds, better textures, and so on, without really sacrificing anything.
 
And once again MS leaves it up to game devs to incorporate it into their games. Why can't MS just make it work with everything, regardless of the game?
 
In real life only the access speed is relevant. And that value is basically the same whichever SSD you use. When you switch from HDD to SSD the change is dramatic. Because their 4K file transfer rate, mostly affected by access time, is beyond compare. After that, switching from SATA SSD to M2 SSD, or from one SSD manufacturer to another, is very hard to notice. Even though one may be enormously faster than the other in sequential data transfer, the file access times will be similar. And that's the primary bottleneck when loading Windows and most of Windows apps/games.
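The access-time argument can be put into a rough back-of-the-envelope model: if files are requested one at a time, load time is roughly the number of files times the per-request access latency, plus total bytes divided by sequential bandwidth. The sketch below is only that simple model; the function name and every drive figure in the demo are hypothetical round numbers, not measurements.

```python
def estimated_load_seconds(
    file_count: int,
    total_bytes: int,
    access_latency_s: float,
    sequential_bytes_per_s: float,
) -> float:
    """Estimate load time for one-request-at-a-time I/O.

    Assumes each file costs one access latency and that data then
    streams at the drive's sequential rate (a deliberate simplification).
    """
    latency_part = file_count * access_latency_s
    transfer_part = total_bytes / sequential_bytes_per_s
    return latency_part + transfer_part


if __name__ == "__main__":
    # Hypothetical workload: 20,000 small files totalling 200 MB.
    files, size = 20_000, 200 * 10**6
    # Hypothetical drives: (access latency in seconds, sequential bytes/s).
    drives = {
        "HDD": (10e-3, 0.15e9),
        "SATA SSD": (100e-6, 0.5e9),
        "NVMe SSD": (80e-6, 5e9),
    }
    for name, (latency, bandwidth) in drives.items():
        seconds = estimated_load_seconds(files, size, latency, bandwidth)
        print(f"{name}: {seconds:.2f} s")
```

Under these assumptions the latency term dominates for small files, so a tenfold gain in sequential bandwidth barely shows. The model deliberately ignores parallel requests; keeping many requests in flight, as DirectStorage is described as doing elsewhere in this thread, is exactly what would change the picture.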
 
And once again MS leaves it up to game devs to incorporate it into their games. Why can't MS just make it work with everything, regardless of the game?
DirectStorage is an API for input/output transfers, and requires the use of specific instructions and data structures to make use of it. No API can magically force that into a game.
Please do explain to us peasants, what will this new technology really help: Games? applications?? Data backup? Calculating ballistic trajectories? Decompressing huge zipped files? Bragging rights? What??
DirectStorage offers two things. The first is the removal of the need to copy assets into system memory and then transfer them into GPU global memory via the CPU. This reduces the memory footprint taken up by the game and lowers the CPU overhead involved in the data transfers, which can be significant, depending on the game. The second is letting the GPU decompress assets. At the moment, compressed files are loaded into system memory, decompressed by the CPU back into system memory, and then copied across to GPU memory. By moving the files in compressed form first and then using the GPU to decompress them, you get lower CPU usage, fewer memory transfers, a smaller system memory footprint, and lower bus usage.

For games that have traditionally had to carefully manage data transfers due to the CPU impact, DirectStorage offers a way past all of this.
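The difference between the two data paths can be tallied with a small bookkeeping model. This is a sketch under stated assumptions, not a description of DirectStorage internals: the `PathCost` fields and both functions are hypothetical, and any staging buffer a real implementation keeps in system memory is treated as a parameter that defaults to zero.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathCost:
    system_ram_bytes: int        # bytes held in system memory during the load
    bus_bytes: int               # bytes sent across the bus to GPU memory
    cpu_decompressed_bytes: int  # bytes the CPU has to produce by decompressing


def traditional_path(compressed: int, uncompressed: int) -> PathCost:
    """Load compressed data to RAM, decompress on the CPU, copy the result to the GPU."""
    return PathCost(
        system_ram_bytes=compressed + uncompressed,
        bus_bytes=uncompressed,
        cpu_decompressed_bytes=uncompressed,
    )


def gpu_decompression_path(
    compressed: int, uncompressed: int, staging_bytes: int = 0
) -> PathCost:
    """Send compressed data to the GPU and decompress it there.

    `uncompressed` is unused here: the expanded data only ever exists in GPU memory.
    `staging_bytes` is an assumed, optional system-memory staging buffer.
    """
    return PathCost(
        system_ram_bytes=staging_bytes,
        bus_bytes=compressed,
        cpu_decompressed_bytes=0,
    )


if __name__ == "__main__":
    # Hypothetical asset: 400 MB compressed, 1,000 MB uncompressed.
    c, u = 400 * 10**6, 1000 * 10**6
    print("traditional:      ", traditional_path(c, u))
    print("GPU decompression:", gpu_decompression_path(c, u))
```

In this model the savings on the bus scale with the compression ratio, while the CPU decompression work drops to zero regardless of it.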
 