
Faith Ekstrand / mesa / Issues / #26

Implement write-lock-read

The idea behind write-lock-read (WLR) is that it's a generalization of SSA which allows for multiple and partial writes to values while still maintaining most of the nice properties of SSA when it comes to register allocation, CSE, and copy-prop. A value X is a valid WLR value if the following hold:

  1. All writes to X occur in the same block
  2. All reads from X either belong to an instruction which writes X as its only output or are dominated by the final write to X.

In other words, things like x |= y are OK, but you can't do arbitrary reads and writes. When a WLR value is optimized, the sequence of instructions generating that value is effectively considered to be one meta-instruction. No optimization is possible within the WLR instruction sequence, so code which generates WLR values is expected to generate an optimal sequence.

When you consider all of the writes to a WLR value as a single meta-instruction, WLR values can be treated as SSA values during optimization. In particular, they have the property that they have exactly one (multi-instruction) definition which dominates all the uses.
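The two validity conditions can be sketched as a small checker. This is only an illustration, not code from the compiler: the `Instr` record, the `blocks`/`preds` dictionaries, and the function names are all hypothetical, and dominance is computed with a plain iterative algorithm over a toy CFG.

```python
from dataclasses import dataclass

@dataclass
class Instr:
    dsts: list  # names of the values this instruction writes (hypothetical model)
    srcs: list  # names of the values this instruction reads

def dominators(preds, entry):
    """Return dom, where dom[b] is the set of blocks that dominate block b."""
    blocks = set(preds)
    dom = {b: set(blocks) for b in blocks}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for b in blocks - {entry}:
            incoming = [dom[p] for p in preds[b]]
            new = {b} | (set.intersection(*incoming) if incoming else set())
            if new != dom[b]:
                dom[b], changed = new, True
    return dom

def is_valid_wlr(x, blocks, preds, entry):
    """Check the two WLR conditions for value x."""
    dom = dominators(preds, entry)
    writes = [(b, i) for b, instrs in blocks.items()
              for i, ins in enumerate(instrs) if x in ins.dsts]
    if not writes:
        return False
    write_block = writes[0][0]
    # Condition 1: all writes to x occur in the same block.
    if any(b != write_block for b, _ in writes):
        return False
    last = max(i for _, i in writes)
    # Condition 2: every read is either part of an instruction whose only
    # output is x, or is dominated by the final write to x.
    for b, instrs in blocks.items():
        for i, ins in enumerate(instrs):
            if x not in ins.srcs or ins.dsts == [x]:
                continue
            if b == write_block:
                dominated = i > last
            else:
                dominated = write_block in dom[b]
            if not dominated:
                return False
    return True

# Toy diamond CFG: entry -> {then, else} -> merge.
preds = {"entry": [], "then": ["entry"], "else": ["entry"], "merge": ["then", "else"]}
good = {
    "entry": [Instr(["y"], []), Instr(["x"], []), Instr(["x"], ["x", "y"])],  # x = ...; x |= y
    "then": [Instr(["z"], ["x"])],
    "else": [],
    "merge": [Instr(["w"], ["x"])],
}
# Same program, but another value reads x between the two writes.
bad = dict(good, entry=[Instr(["x"], []), Instr(["t"], ["x"]), Instr(["x"], ["x", "y"])])
```

Here `is_valid_wlr("x", good, preds, "entry")` accepts the `x |= y` pattern, while the `bad` variant is rejected because `t` reads `x` before the final write.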

There are many places where we need something like this:

  1. Building payloads and headers involves multiple MOV instructions which generate a single value. This can also be handled with intrinsics and an allocator that can coalesce on-the-fly.
  2. Predication in the IR requires either something like this or psi-SSA
  3. Generating gl_SubgroupInvocation requires writing 0x76543210:v into a HW_GRF and then emitting 0-2 (depending on SIMD width) adds to generate the other 8 or 24 channels of indices.
  4. Subgroup ops like SPIR-V's OpGroupNonUniformFAdd require piles of scratching around on a HW_GRF and we want to be able to copy-prop the result
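To make the gl_SubgroupInvocation case concrete, here is a rough simulation of the write sequence described in item 3. It assumes the `:v` immediate packs eight 4-bit values with the lowest nibble in channel 0, and that each add fills the upper channels from the lower ones; the function names are hypothetical and nothing here is real compiler or hardware API.

```python
def unpack_v_immediate(imm):
    """Unpack eight 4-bit values, lowest nibble first (assumed channel order).

    Sign handling is ignored because every nibble of 0x76543210 is <= 7.
    """
    return [(imm >> (4 * ch)) & 0xF for ch in range(8)]

def subgroup_invocation(simd_width):
    """Return (channel indices, number of adds) for a given SIMD width."""
    reg = unpack_v_immediate(0x76543210)  # first partial write: channels 0..7
    filled = 8
    adds = 0
    while filled < simd_width:
        # One add: the next `filled` channels are the lower ones plus `filled`.
        reg += [v + filled for v in reg[:filled]]
        filled *= 2
        adds += 1
    return reg, adds
```

Every step writes part of the same register in the same block, and nothing outside the sequence reads it until the last add, which is the shape WLR is meant to describe. SIMD8 needs no adds, SIMD16 needs one (8 more channels), and SIMD32 needs two (8 + 16 = 24 more channels).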

We had some discussions in person and came to the conclusion that the first couple of cases were ones in which we could probably just use regular SSA. For payloads and headers, we probably want to, because classic SSA combined with register coalescing in RA will likely yield the best results when it comes to coalescing and range splitting. However, the other cases are a bit harder to handle without WLR.
