In context: Shader Execution Reordering significantly reduces the performance cost of ray tracing and path tracing in games such as Alan Wake 2 and Indiana Jones and the Great Circle on recent Nvidia graphics cards. The latest version of DirectX makes these efficiency gains accessible to all developers, facilitating the implementation of ray tracing in future titles.

The recently released Shader Model 6.9 brings Microsoft's official take on Shader Execution Reordering (SER) out of preview. Nvidia and a few game developers have already benefited from the feature, and a recent test shows it nearly doubling frame rates on Intel Arc Battlemage GPUs.

One of the biggest performance bottlenecks in ray tracing is divergence, which occurs when rays bounce off different kinds of objects, so that neighboring shader threads must follow different code branches and fetch different data. Because GPUs execute threads in lockstep groups, divergence often forces a group to run its branches one after another rather than in parallel, significantly reducing performance.

Because this shading work runs on the GPU's shader cores rather than its RT cores, simply adding or enhancing RT cores does not address the issue. SER reduces divergence by regrouping threads so that those running similar shader code execute together in parallel.
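To make the idea concrete, here is a minimal toy model in Python. It is not how any GPU or the DirectX API actually implements SER; it only assumes that threads run in lockstep groups of 32 and that a group needs one serialized pass per distinct shader it contains. The function names, the group size, and the eight "material shaders" are all hypothetical, chosen purely for illustration.

```python
import random

WARP_SIZE = 32  # assumed lockstep group size; real hardware varies


def divergent_passes(shader_ids, warp_size=WARP_SIZE):
    """Count serialized passes: each group runs once per distinct shader in it."""
    passes = 0
    for start in range(0, len(shader_ids), warp_size):
        warp = shader_ids[start:start + warp_size]
        passes += len(set(warp))
    return passes


def reorder(shader_ids):
    """Toy stand-in for SER: sort threads so equal shader IDs sit side by side."""
    return sorted(shader_ids)


if __name__ == "__main__":
    rng = random.Random(0)
    # 1024 rays, each hitting one of 8 hypothetical material shaders at random.
    hits = [rng.randrange(8) for _ in range(1024)]
    print("passes without reordering:", divergent_passes(hits))
    print("passes with reordering:   ", divergent_passes(reorder(hits)))
```

In a small hand-checkable case, 64 threads that alternate between two shaders need four passes as submitted (two groups, two shaders each) but only two once reordered, because each group then contains a single shader. Real hardware also has to weigh the cost of the reordering itself, which this sketch ignores.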

Microsoft claims that the feature can increase performance in ray tracing, and especially path tracing, by up to 100%, but the impact varies across games depending on the amount of detail in each scene.

For example, according to Khronos, SER improved path tracing performance by approximately 24% in Indiana Jones and the Great Circle, 39% in Alan Wake 2 (with help from opacity micromaps), and a staggering 370% in Black Myth: Wukong. A freely available sample program illustrates the improvement, which can reach 40% on an Nvidia RTX 4090 and 90% on Battlemage graphics cards.
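For readers who want to translate these percentages into more familiar terms, the short Python sketch below converts a performance gain into a speedup factor and the matching cut in frame time. It assumes each figure is a frame-rate increase relative to running with SER off, which the sources quoted here do not spell out; the function names are illustrative only.

```python
def speedup_factor(gain_percent):
    """A gain of X% means the frame rate is multiplied by 1 + X/100."""
    return 1.0 + gain_percent / 100.0


def frame_time_reduction_percent(gain_percent):
    """Frame time is the inverse of frame rate, so it shrinks by 1 - 1/speedup."""
    return 100.0 * (1.0 - 1.0 / speedup_factor(gain_percent))


if __name__ == "__main__":
    for gain in (24, 39, 90, 100, 370):
        print(f"+{gain}% -> {speedup_factor(gain):.2f}x frame rate, "
              f"{frame_time_reduction_percent(gain):.1f}% less time per frame")
```

Under that assumption, a 100% gain is exactly a doubling of frame rate and a halving of frame time, while a 370% gain would correspond to roughly 4.7 times the original frame rate.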

Nvidia introduced SER with the RTX 4000 series GPUs in 2022 and doubled the feature's efficiency with RTX 5000 GPUs. Microsoft previewed SER's standardized implementation in DirectX last year and has now made it a mandatory component of Shader Model 6.9.

Standardization will make the feature easier for developers to implement and extend its benefits to Intel Arc Battlemage. AMD Radeon RX GPUs currently do not include hardware SER support, but future generations likely will.

Shader Model 6.9 also formalizes support for opacity micromaps, a feature that reduces the overhead from ray tracing when rendering transparent objects. All RTX GPUs support the feature, and standardization across DirectX increases the likelihood that future Intel and AMD hardware will as well.