Mesa / mesa / Merge requests / !1844

Merged
Created Sep 02, 2019 by Connor Abbott (@cwabbott0, Developer)

ac/nir: Use nir_opt_constants to create PC-relative loads

Shaders often use lookup tables, meaning a large array of constants that is indexed indirectly:

const vec4 foo[4] = vec4[4](vec4(0.0, 1.0, 2.0, 3.0), ...);

uint i = ...;
... = foo[i];

radeonsi used a GLSL-level optimization to turn these into uniforms, but because it only looked at array initializers, it missed cases where the shader fills in the array contents with explicit stores, which is common in shaders translated from some low-level IR:

vec4 foo[4];
foo[0] = vec4(0.0, 1.0, 2.0, 3.0);
foo[1] = vec4(...);
foo[2] = vec4(...);
foo[3] = vec4(...);

uint i = ...;
... = foo[i];

radv had no such optimization at all. It turns out NIR already has an optimization pass that can recognize both cases. Combining that pass with LLVM's support for PC-relative loads from a read-only data section placed after the code gives us the optimal code sequence for these lookups without having to do much inside the driver itself.
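As a rough CPU-side analogy of the transformation (not the NIR pass itself, and not driver code), the sketch below contrasts rebuilding the table in local storage on every call with reading it from a read-only table. The names `lookup_scratch`, `lookup_rodata`, and `foo_rodata` are hypothetical, and the values in entries 1 through 3 are made up for illustration, since the examples above elide them.

```c
#include <stddef.h>

/* Stand-in for a GLSL vec4; hypothetical helper type for this sketch. */
typedef struct { float x, y, z, w; } vec4;

/*
 * "Before": the array is filled with constant stores and then indexed
 * indirectly. Without the optimization, every call (or loop iteration)
 * repeats all four stores to temporary storage before the load.
 * Entries 1..3 use made-up values.
 */
float lookup_scratch(unsigned i)
{
    vec4 foo[4];
    foo[0] = (vec4){ 0.0f,  1.0f,  2.0f,  3.0f};
    foo[1] = (vec4){ 4.0f,  5.0f,  6.0f,  7.0f};
    foo[2] = (vec4){ 8.0f,  9.0f, 10.0f, 11.0f};
    foo[3] = (vec4){12.0f, 13.0f, 14.0f, 15.0f};
    return foo[i & 3].x;   /* mask keeps the index in bounds */
}

/*
 * "After": the same contents live in a read-only table, so the lookup
 * is a single indexed load and no stores are needed.
 */
static const vec4 foo_rodata[4] = {
    { 0.0f,  1.0f,  2.0f,  3.0f},
    { 4.0f,  5.0f,  6.0f,  7.0f},
    { 8.0f,  9.0f, 10.0f, 11.0f},
    {12.0f, 13.0f, 14.0f, 15.0f},
};

float lookup_rodata(unsigned i)
{
    return foo_rodata[i & 3].x;
}
```

Both functions return the same value for every index; the difference is only in how much work is done to get there, which matters most when the lookup sits inside a loop.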

In addition to being a worthwhile optimization on its own, this should fix a regression from enabling scratch support: the relevant shader will load from a constant pool instead of storing a bunch of constants to scratch on every iteration of a loop.

We can also shut off the GLSL lowering since ours is better, but that will take a bit more work.

Source branch: review/ac-nir-large-constants