Rework geometry shaders

Alyssa Rosenzweig requested to merge alyssa/mesa:agx/gs-better into main

Currently we implement vertex + geometry shaders as:

(VS + GS merged)/compute    ---------RAM------> (copy)/vertex

This is bad for several reasons:

  • merging VS/GS makes shader objects very challenging to implement
  • heap memory usage is unpredictable
  • excessive memory usage/bandwidth for amplifying GS
  • poor parallelism in the massive merged shader -- it can easily become latency-bound

The new approach looks like:

VS/compute  -----RAM-------> (GS index buffer + XFB only shader)/compute
                  |
                  |-------------------------------------------------------> (GS output)/vertex

The GS gets split up into a compute prepass that generates an index buffer but does not shade outputs, and a vertex shader that shades a single output vertex. This is better because:

  • no more merging, easier shader objects
  • heap usage proportional to VS outputs, not GS outputs: much more predictable and (for an amplifying GS) much smaller
  • reduced memory bandwidth/usage for amplifying GS
  • better parallelism with the smaller shaders
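
To make the split concrete, here is a toy CPU model of the three stages. It is only a sketch under stated assumptions, not the driver code: every function name is hypothetical, the example GS simply expands each input point into a 4-vertex strip, and the `prim * MAX_VERTS + k` index encoding and the restart value are assumptions chosen for illustration. The point it shows is that only VS outputs are stored in memory, the prepass writes indices without shading anything, and each output vertex is shaded on demand.

```python
# Toy CPU model of the split pipeline. All names are hypothetical and the
# index encoding is an assumption; this is not the Mesa/AGX implementation.

RESTART = 0xFFFFFFFF  # assumed primitive-restart index
MAX_VERTS = 4         # assumed max output vertices per input primitive


def vertex_shader(vertex_id):
    """Stand-in VS: returns a position that gets stored in memory."""
    return (float(vertex_id), 0.0)


def gs_emit_count(vs_out):
    """Number of vertices the toy GS emits per input point (always a quad)."""
    return 4


def gs_output_vertex(vs_out, k):
    """Shade only the k-th vertex the toy GS would emit for this input."""
    x, y = vs_out
    dx, dy = [(0, 0), (1, 0), (0, 1), (1, 1)][k]
    return (x + 0.5 * dx, y + 0.5 * dy)


def run_vs(num_points):
    """Stage 1 (compute): run the VS and keep its outputs in memory."""
    return [vertex_shader(i) for i in range(num_points)]


def gs_prepass(vs_outputs):
    """Stage 2 (compute): build the index buffer without shading outputs."""
    indices = []
    for prim, vs_out in enumerate(vs_outputs):
        count = gs_emit_count(vs_out)
        indices.extend(prim * MAX_VERTS + k for k in range(count))
        indices.append(RESTART)  # terminate this strip
    return indices


def gs_vertex_stage(vs_outputs, index):
    """Stage 3 (vertex): shade a single output vertex from its index."""
    prim, k = divmod(index, MAX_VERTS)
    return gs_output_vertex(vs_outputs[prim], k)


def draw(num_points):
    """Run all three stages and return what each one produced."""
    vs_outputs = run_vs(num_points)
    indices = gs_prepass(vs_outputs)
    shaded = [gs_vertex_stage(vs_outputs, i) for i in indices if i != RESTART]
    return vs_outputs, indices, shaded
```

In this model the stored buffer holds one entry per input point, while the amplified output (four vertices per point) never lands in memory. The cost is that `gs_vertex_stage` re-reads the same VS output for every vertex of a primitive, which is the kind of redundant work the output programs may do.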

The output programs can do some redundant work, but in practice this ends up massively faster in Citra (my most GS-heavy apitrace). More importantly, it gets us much closer to shader objects. Along the way, we pick up disk caching and shader-db support for GS, and whittle down our GS key to almost nothing.

Frametime in a Citra trace decreased 27% across the series (fps increased 37%).
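
The two figures describe the same measurement, since fps is the reciprocal of frametime. A quick check, using only the 27% figure quoted above (the variable names are illustrative):

```python
# fps is 1 / frametime, so a 27% frametime drop scales fps by 1 / (1 - 0.27).
frametime_drop = 0.27
fps_gain = 1.0 / (1.0 - frametime_drop) - 1.0
print(f"fps gain: {fps_gain:.0%}")  # prints "fps gain: 37%"
```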

Edited by Alyssa Rosenzweig
