    pan/bi: Handle fsqrt like the DDK · 26e69b73
    Alyssa Rosenzweig authored
    
    
    There are 4 distinct cases of fsqrt:
    
    1. FP16 on Bifrost
    
    Here we may lower to `x * rsqrt(x)` with a .left modifier on the
    FMA_RSCALE.v2f16 used to carry out the multiplication, ensuring correct
    handling of NaN and Inf.
    
    2. FP32 on G71
    
    G71 lacks the FRSQ.f32 instruction, so do something simple, since we
    don't even probe the driver on G71...
    
    3. FP32 on G72 and newer
    
    We can do the same lowering as FP16 in theory. However, this may have
    precision issues. The DDK uses extra FREXPM/FREXPE instructions in a
    .sqrt mode for a range reduction. It's unknown if this is necessary for
    OpenGL (ES), Vulkan, OpenCL, or some combination thereof.
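
For illustration only, here is a generic mantissa/exponent range reduction for sqrt in Python. The exact semantics of FREXPM/FREXPE in .sqrt mode are not described above, so this is a textbook decomposition and not the hardware behaviour; `sqrt_range_reduced` is a hypothetical name, and `1.0 / math.sqrt(m)` stands in for an rsqrt instruction.

```python
import math

def sqrt_range_reduced(x: float) -> float:
    """sqrt(x) for finite x > 0 via a mantissa/exponent split (sketch)."""
    m, e = math.frexp(x)          # x == m * 2**e with 0.5 <= m < 1
    if e % 2:                     # force an even exponent so it halves exactly
        m *= 0.5                  # exact scaling, now 0.25 <= m < 1
        e += 1
    # sqrt(m) computed as m * rsqrt(m); the operand stays in a narrow range
    r = m * (1.0 / math.sqrt(m))
    return math.ldexp(r, e // 2)  # sqrt(x) == sqrt(m) * 2**(e/2)
```

The point of the split is that the multiply and the reciprocal square root only ever see a mantissa in [0.25, 1), regardless of how large or small `x` is.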
    
    4. FP16 on Valhall
    
    We want to use the same strategy as on Bifrost, but Valhall removed the
    FMA_RSCALE.v2f16 instruction. Instead we use an ordinary FMA.v2f16 for
    the multiply and check the special case explicitly with CSEL.v2f32 ...
    I'm not sure if this is right for infinity but the DDK does it so ¯\_(ツ)_/¯
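
To make the NaN/Inf concern concrete, here is a small Python sketch of why a bare `x * rsqrt(x)` needs help at the edges, and what an explicit select can look like. The `rsqrt` helper is an IEEE-style stand-in for a hardware reciprocal square root, and the function names are hypothetical. Which inputs the real lowering tests is not stated above; checking both 0 and +Inf here is an assumption.

```python
import math

def rsqrt(x: float) -> float:
    """IEEE-style reciprocal square root, standing in for a hardware op."""
    if math.isnan(x) or x < 0.0:
        return math.nan
    if x == 0.0:
        return math.copysign(math.inf, x)  # rsqrt(+-0) = +-inf
    if math.isinf(x):
        return 0.0                         # rsqrt(+inf) = 0
    return 1.0 / math.sqrt(x)

def sqrt_naive(x: float) -> float:
    # 0 * inf and inf * 0 are both NaN, so 0 and +inf come out wrong
    return x * rsqrt(x)

def sqrt_select(x: float) -> float:
    # Explicit special-case check, the role a select instruction would play
    # (assumption: both 0 and +inf are passed through unchanged)
    if x == 0.0 or x == math.inf:
        return x
    return x * rsqrt(x)
```

With these definitions, `sqrt_naive(0.0)` and `sqrt_naive(math.inf)` are NaN, while `sqrt_select` returns 0 and +Inf respectively and still propagates NaN for negative or NaN inputs.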
    
    Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>