    nir/find_array_copies: Handle wildcards and overlapping copies · 156306e5
    Connor Abbott authored
    
    
    This commit rewrites opt_find_array_copies so that it can recognize an
    array copy sequence even when other operations are interleaved with it. In
    particular, this handles the case where we OpLoad an array of structs
    and then OpStore it, which generates code like:
    
    foo[0].a = bar[0].a
    foo[0].b = bar[0].b
    foo[1].a = bar[1].a
    foo[1].b = bar[1].b
    ...
    
    that wasn't recognized by the previous pass.
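
As an illustration only, the kind of sequence described above can be modeled in a few lines of Python. This is a minimal sketch, not the NIR pass: the function name find_array_copy, the tuple encoding of the operations, and the coarse whole-variable invalidation are all assumptions made for the example.

```python
def find_array_copy(ops, dst, src, length, fields):
    """Return True if ops contains dst[i].f = src[i].f for every index i and
    field f, with no write to dst or src interrupting the sequence.

    Hypothetical encoding: ("copy", (var, index, field), (var, index, field))
    for an element copy, ("write", var) for any other store to a variable.
    """
    needed = {(i, f) for i in range(length) for f in fields}
    seen = set()
    for op in ops:
        if op[0] == "copy":
            _, (d_var, d_idx, d_field), (s_var, s_idx, s_field) = op
            if (d_var, s_var) == (dst, src) and d_idx == s_idx and d_field == s_field:
                seen.add((d_idx, d_field))
                if needed <= seen:
                    return True
                continue
            written = d_var  # a non-matching copy still writes its destination
        else:
            written = op[1]
        # Coarse assumption: any other write to either variable discards
        # the progress made so far.
        if written in (dst, src):
            seen.clear()
    return False


copies = [("copy", ("foo", i, f), ("bar", i, f)) for i in range(2) for f in "ab"]
interleaved = copies[:2] + [("write", "tmp")] + copies[2:]
clobbered = copies[:2] + [("write", "bar")] + copies[2:]
```

Here find_array_copy(interleaved, "foo", "bar", 2, "ab") succeeds because the unrelated write to tmp does not disturb the sequence, while the clobbered variant fails because bar is overwritten partway through.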
    
    In order to correctly handle copying arrays of arrays, and in particular
    to correctly handle copies involving wildcards, we need to use a tree
    structure similar to lower_vars_to_ssa so that we can walk all the
    partial array copies invalidated by a particular write, including
    ones where one of the common indices is a wildcard. I actually think
    that when factoring in the needed hashing/comparing code, a hash table
    based approach wouldn't be a lot smaller anyways.
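
The tree walk described above can be sketched roughly as follows. This is a simplified Python model under stated assumptions, not the data structure used in the commit: Node, insert, invalidated, and the "*" wildcard marker are hypothetical names, and paths are plain tuples of array indices and field names.

```python
WILDCARD = "*"  # hypothetical marker standing in for a wildcard array index


class Node:
    """One step of an access path; children are keyed by index, field, or WILDCARD."""

    def __init__(self):
        self.children = {}
        self.partial_copies = set()

    def insert(self, path, copy_id):
        # Record a partial copy at the node reached by following path.
        node = self
        for step in path:
            node = node.children.setdefault(step, Node())
        node.partial_copies.add(copy_id)


def invalidated(node, path):
    """Collect every partial copy that a write to path may overlap."""
    out = set(node.partial_copies)
    if not path:
        # The write covers this whole subtree.
        for child in node.children.values():
            out |= invalidated(child, ())
        return out
    step, rest = path[0], path[1:]
    for key, child in node.children.items():
        # A wildcard on either side matches any index at this level.
        # Assumes well-typed paths, so a wildcard never lines up with a field.
        if step == WILDCARD or key == WILDCARD or key == step:
            out |= invalidated(child, rest)
    return out


root = Node()
root.insert((0, "a"), "c1")
root.insert((WILDCARD, "b"), "c2")
root.insert((1, "a"), "c3")
```

With this setup, a write to path (1,) reaches both the node for index 1 and the wildcard node, so invalidated(root, (1,)) collects c2 and c3. That is the case a flat lookup keyed on the exact path would miss.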
    
    All of the changes come from tessellation control shaders in Strange
    Brigade, where we're able to remove the DXVK-inserted copy at the
    beginning of the shader. These are the results for radv:
    
    Totals from affected shaders:
    SGPRS: 4576 -> 4576 (0.00 %)
    VGPRS: 13784 -> 5560 (-59.66 %)
    Spilled SGPRs: 0 -> 0 (0.00 %)
    Spilled VGPRs: 0 -> 0 (0.00 %)
    Private memory VGPRs: 0 -> 0 (0.00 %)
    Scratch size: 8696 -> 6876 (-20.93 %) dwords per thread
    Code Size: 329940 -> 263268 (-20.21 %) bytes
    LDS: 0 -> 0 (0.00 %) blocks
    Max Waves: 330 -> 898 (172.12 %)
    Wait states: 0 -> 0 (0.00 %)
    
    Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>