    aco: increase accuracy of SGPR limits · 08d51001
    Rhys Perry authored
    
    
    SGPRs are allocated in groups of 16 on GFX8/GFX9. GFX10 allocates a fixed
    number of SGPRs and has 106 addressable SGPRs.
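
As a rough illustration of why allocation granularity matters for the limits, the sketch below rounds a shader's SGPR demand up to the hardware allocation size. It is a minimal sketch and not the code from this commit: the function names are hypothetical, and the fixed GFX10 allocation of 128 is an assumed placeholder, since the message only says the number is fixed.

```python
def align_up(value: int, granule: int) -> int:
    """Round value up to the next multiple of granule."""
    return (value + granule - 1) // granule * granule


def allocated_sgprs(needed: int, gfx_level: int, gfx10_fixed: int = 128) -> int:
    """Return how many SGPRs a wave would occupy (hypothetical helper).

    gfx10_fixed is an assumed placeholder value, not taken from the commit.
    """
    if gfx_level >= 10:
        # GFX10 allocates a fixed number of SGPRs regardless of demand.
        return gfx10_fixed
    if gfx_level in (8, 9):
        # GFX8/GFX9 allocate SGPRs in groups of 16.
        return align_up(needed, 16)
    raise ValueError("only GFX8 and newer are covered by this sketch")


if __name__ == "__main__":
    for needed in (16, 33, 35):
        print(needed, "->", allocated_sgprs(needed, 9))
```

Under these assumptions a GFX9 shader that needs 33 SGPRs occupies 48, so a limit derived from the granule can let the register allocator use the whole group at no occupancy cost.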
    
    pipeline-db (Vega):
    SGPRS: 5912 -> 6232 (5.41 %)
    VGPRS: 1772 -> 1780 (0.45 %)
    Spilled SGPRs: 0 -> 0 (0.00 %)
    Spilled VGPRs: 0 -> 0 (0.00 %)
    Private memory VGPRs: 0 -> 0 (0.00 %)
    Scratch size: 0 -> 0 (0.00 %) dwords per thread
    Code Size: 88228 -> 87904 (-0.37 %) bytes
    LDS: 0 -> 0 (0.00 %) blocks
    Max Waves: 559 -> 571 (2.15 %)
    
    pipeline-db (Navi):
    SGPRS: 341256 -> 363384 (6.48 %)
    VGPRS: 171536 -> 170960 (-0.34 %)
    Spilled SGPRs: 832 -> 581 (-30.17 %)
    Spilled VGPRs: 0 -> 0 (0.00 %)
    Private memory VGPRs: 0 -> 0 (0.00 %)
    Scratch size: 0 -> 0 (0.00 %) dwords per thread
    Code Size: 14207332 -> 14190872 (-0.12 %) bytes
    LDS: 33 -> 33 (0.00 %) blocks
    Max Waves: 18072 -> 18251 (0.99 %)
    
    v2: unconditionally count vcc as an extra sgpr on GFX10+
    v3: pass SGPRs rounded to 8
    
    Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
    Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>