    freedreno/ir3: Add new ir3 pass to fold out fp16 conversions · 6ee375f6
    Hyunjun Ko authored
    This pass tries to fold f2f16 conversions into ALU instructions.
    This should help reduce the instruction count once Mesa starts
    supporting precision lowering.  For example:
    
      add.f r0.w, r0.w, c0.x
      cov.f32f16 hr2.x, r0.w
    
    to
    
      add.f hr2.x, r0.w, c0.x
    
    This pass also tries to fold f2f16 conversions into load_input
    instructions:
    
      bary.f r0.x, 3, r0.w
      cov.f32f16 hr0.x, r0.x
    
    to
    
      bary.f hr1.x, 3, r0.x
    
    v2: Don't fold into OPC_MAX_F and OPC_MIN_F, since that's not valid.
    
    v3: Add OPC_ABSNEG_F to the blacklist as well.
    
    v4: Don't remove dead cov instructions, DCE will do that later; don't
    iterate through sources when a cov only has one; remove special
    handling of IR3_REG_ARRAY and IR3_REG_RELATIV.
    
    v5: Handle folding into u32.u32 movs of floats correctly, don't bail
    out on IR3_REG_RELATIV or IR3_REG_ARRAY movs.
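
    The folding rule described above can be illustrated with a small
    stand-alone model. This is only a sketch, not the ir3 code: the
    Instr class, the helper names, and the "opcode ends in .f" test are
    assumptions made for illustration, registers are plain strings, and
    the u32.u32 mov case from v5 is not modeled. Unlike the real pass
    (see v4), the sketch drops the folded cov itself instead of leaving
    it for DCE, because this toy model has no separate DCE step.

```python
from dataclasses import dataclass
from typing import List

COV = "cov.f32f16"
# Opcodes the changelog says must not have a conversion folded into them
# (OPC_MAX_F, OPC_MIN_F, OPC_ABSNEG_F), spelled here as toy mnemonics.
NO_FOLD = {"max.f", "min.f", "absneg.f"}


@dataclass
class Instr:
    """Hypothetical instruction record: opcode, destination, sources."""
    opcode: str
    dst: str
    srcs: List[str]


def can_fold_into(producer: Instr) -> bool:
    # Assumption: any mnemonic ending in ".f" is a float producer
    # (covers add.f and bary.f from the examples above).
    return producer.opcode.endswith(".f") and producer.opcode not in NO_FOLD


def fold_fp16_conversions(instrs: List[Instr]) -> List[Instr]:
    """Fold each cov.f32f16 into the instruction that produced its source."""
    work = [Instr(i.opcode, i.dst, list(i.srcs)) for i in instrs]
    dead = set()
    for i, conv in enumerate(work):
        if conv.opcode != COV:
            continue
        src = conv.srcs[0]  # a cov has exactly one source
        # Most recent earlier instruction that wrote the cov's source.
        def_idx = next(
            (j for j in range(i - 1, -1, -1) if work[j].dst == src), None
        )
        if def_idx is None or not can_fold_into(work[def_idx]):
            continue
        # Collect readers of that value until the register is overwritten.
        readers = []
        for k in range(def_idx + 1, len(work)):
            if src in work[k].srcs:
                readers.append(k)
            if work[k].dst == src:
                break
        # Only fold when the cov is the sole reader of the full-precision value.
        if readers != [i]:
            continue
        work[def_idx].dst = conv.dst  # producer now writes the half register
        dead.add(i)
    return [ins for idx, ins in enumerate(work) if idx not in dead]


if __name__ == "__main__":
    program = [
        Instr("add.f", "r0.w", ["r0.w", "c0.x"]),
        Instr(COV, "hr2.x", ["r0.w"]),
    ]
    for ins in fold_fp16_conversions(program):
        print(ins.opcode, ins.dst + ",", ", ".join(ins.srcs))
```

    In this model a conversion is folded only when it is the sole
    reader of the value it converts; otherwise both instructions are
    left untouched.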
    
    Part-of: <mesa!3822>