  • !11431

Merged
Created Jun 16, 2021 by Danylo Piliaiev (@Danil 🇺🇦, Developer)

ir3: LDG, STG: allow immediate offset with register offset and with variable shift


The full form of the ldg/stg offset is:

 g[reg_address + (reg_offset << (imm_shift + 2)) + (imm_offset << 2)]

where imm_shift and imm_offset are both in the range [0, 3].
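
As a rough illustration of the formula above, here is a minimal Python sketch that evaluates it. The function name `ldg_stg_address` is hypothetical, and the sketch assumes each shift binds tighter than the additions (as the `r1.x<<4+3<<2` spelling further down suggests). It is not taken from the ir3 code.

```python
def ldg_stg_address(reg_address, reg_offset, imm_shift, imm_offset):
    """Evaluate reg_address + (reg_offset << (imm_shift + 2)) + (imm_offset << 2).

    Hypothetical helper for illustration only; both immediates are
    restricted to the [0, 3] range stated in the description.
    """
    if not 0 <= imm_shift <= 3:
        raise ValueError("imm_shift must be in [0, 3]")
    if not 0 <= imm_offset <= 3:
        raise ValueError("imm_offset must be in [0, 3]")
    return reg_address + (reg_offset << (imm_shift + 2)) + (imm_offset << 2)


# imm_shift = 0 scales the register offset by 4, imm_shift = 3 by 32.
print(ldg_stg_address(0, 1, 0, 0))      # 4
print(ldg_stg_address(0, 1, 3, 0))      # 32
print(ldg_stg_address(0x100, 2, 1, 3))  # 256 + 16 + 12 = 284
```

So the total shift applied to the register offset ranges from 2 to 5, and the immediate offset contributes 0, 4, 8 or 12.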

The a6xx blob was found to produce slightly simpler offset calculations for TES/TCS shaders in GTA V:

 [c002000a_03c14215] ldg.a.f32 r2.z, g[r1.y+((r2.z+1)<<2)], 3;
 [c0020004_01c14609] ldg.a.f32 r1.x, g[r1.y+((r1.x+3)<<2)], 1;

However, I wasn't able to find a shift other than 2 anywhere.
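
The blob's `(reg + imm) << 2` spelling appears to be the full form with imm_shift = 0. The sketch below checks that equivalence over a small range. Both function names are hypothetical, and register wrap-around is ignored.

```python
def full_form(base, reg, imm_shift, imm_offset):
    # base + (reg << (imm_shift + 2)) + (imm_offset << 2)
    return base + (reg << (imm_shift + 2)) + (imm_offset << 2)


def blob_form(base, reg, imm):
    # The shape seen in the blob disassembly: base + ((reg + imm) << 2)
    return base + ((reg + imm) << 2)


# Collect every (reg, imm) pair where the two forms disagree for imm_shift = 0.
mismatches = [
    (reg, imm)
    for reg in range(64)
    for imm in range(4)
    if blob_form(0, reg, imm) != full_form(0, reg, 0, imm)
]
print(mismatches)  # []
```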

Our new syntax is:

 stg.u32 g[r2.x+(r1.x+1)<<2], r5.x, 1
 stg.u32 g[r2.x+r1.x<<4+3<<2], r5.x, 1
 ldg.f32 r1.w, g[r1.y+(r1.w+1)<<2], 3
 ldg.f32 r1.w, g[r1.y+r1.w<<5+2<<2], 3
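
To make the mapping between these spellings and the (imm_shift, imm_offset) fields concrete, here is a small Python sketch that decodes the four address operands above. The regular expressions and the `decode` helper are hypothetical and cover only the two shapes shown here. They are not the ir3 assembler's grammar.

```python
import re

# Hypothetical patterns for the two operand spellings shown above.
SHORT = re.compile(r"g\[(\w+\.\w)\+\((\w+\.\w)\+(\d+)\)<<2\]")
LONG = re.compile(r"g\[(\w+\.\w)\+(\w+\.\w)<<(\d+)\+(\d+)<<2\]")


def decode(operand):
    """Return (reg_address, reg_offset, imm_shift, imm_offset)."""
    m = SHORT.fullmatch(operand)
    if m:
        base, off, imm = m.groups()
        # (reg + imm) << 2 corresponds to imm_shift = 0.
        return base, off, 0, int(imm)
    m = LONG.fullmatch(operand)
    if m:
        base, off, shift, imm = m.groups()
        # The written shift is imm_shift + 2.
        return base, off, int(shift) - 2, int(imm)
    raise ValueError(f"unrecognized operand: {operand}")


for text in ("g[r2.x+(r1.x+1)<<2]", "g[r2.x+r1.x<<4+3<<2]",
             "g[r1.y+(r1.w+1)<<2]", "g[r1.y+r1.w<<5+2<<2]"):
    print(text, decode(text))
```

Under this reading, `<<4` and `<<5` correspond to imm_shift values of 2 and 3.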

Also refactored the stg register order.


Now stg/ldg calls are rather ugly...


There are also computerator changes to quickly test the new offset calculation.
For example, the following assembly could be used for testing:

@localsize 32, 1, 1
@buf 32(c2.x)  ; g[0]
@const(c0.x)  0.0, 0.0, 0.0, 0.0
@wgid(r48.x)        ; r48.xyz
@invocationid(r0.x) ; r0.xyz
mov.u32u32 r0.y, r0.x
mov.u32u32 r1.x, c2.x
mov.u32u32 r1.y, c2.y
mov.u32u32 r1.z, 3
mov.u32u32 r2.x, 66
(rpt5)nop
stg.u32 g[r1.x+r1.z<<5+3<<2], r2.x, 1
nop(ss)(sy)
ldg.u32 r4.x, g[r1.x+r1.z<<5+3<<2], 1
nop(ss)(sy)
stg.u32 g[r1.x], r4.x, 1
end
nop
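
For reference, the offset arithmetic in the test program above can be traced by hand. The Python sketch below models memory as a plain dictionary and uses 0 as a stand-in for the buffer address loaded from c2.x. Both are assumptions for illustration and do not reflect how computerator works.

```python
def address(base, reg_offset, imm_shift, imm_offset):
    # base + (reg_offset << (imm_shift + 2)) + (imm_offset << 2)
    return base + (reg_offset << (imm_shift + 2)) + (imm_offset << 2)


memory = {}          # hypothetical address -> value store
base = 0             # stand-in for the buffer address held in r1.x
r1_z, r2_x = 3, 66   # mov.u32u32 r1.z, 3 / mov.u32u32 r2.x, 66

# r1.z<<5 + 3<<2 means imm_shift = 3 and imm_offset = 3.
target = address(base, r1_z, imm_shift=3, imm_offset=3)
memory[target] = r2_x    # stg.u32 g[r1.x+r1.z<<5+3<<2], r2.x, 1
r4_x = memory[target]    # ldg.u32 r4.x, g[r1.x+r1.z<<5+3<<2], 1
memory[base] = r4_x      # stg.u32 g[r1.x], r4.x, 1

print(target - base)     # 108
print(memory[base])      # 66
```

If stg and ldg compute the same address, the first element of the buffer should read back as 66.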
Edited Jun 16, 2021 by Danylo Piliaiev
Source branch: ir3/ldg-stg-offset-calculation