1. 19 Feb, 2022 4 commits
    • spirv: Rewrite determinant calculation · 7e8d8859
      Connor Abbott authored and Marge Bot committed

      The old calculation for mat3 was clever, but it turns out that a
      straightforward application of subdeterminants, similar to how mat4 is
      handled, is more efficient: on a scalar architecture with some sort of
      combined multiply+add instruction with a negate modifier (both fairly
      common), the new determinant is 9 instructions vs. 15 for the old one,
      and without the multiply-add it's 14 instructions vs. 18 for the old
      one. When used as a routine for inverse() the savings are compounded,
      because we now use the same method that computes the adjugate matrix,
      so CSE can combine most of the determinant calculations with the
      adjugate matrix ones.
      
      Once mat3 and mat4 use the same method for computing determinants, we
      can combine them into a single recursive function. I also pulled up the
      mat_subdet() function because it was doing basically what we need, so
      it's now shared between determinant and inverse. This shrinks the
      implementation significantly, as can be seen from the diffstat.
      
      The real reason I want to change this, though, is that it fixes
      dEQP-VK.glsl.builtin.precision_fp16_storage16b.inverse.compute.mat3 with
      turnip. Qualcomm uses round-to-zero for 16-bit frcp, which, combined
      with some inaccuracy in the old method of calculating the determinant,
      caused the test to fail. Qualcomm's driver uses something like the new
      method to calculate the determinant in the inverse. We could argue that
      Mesa's method should be allowed, because round-to-zero for
      floating-point division is within spec and there are no precision
      guarantees given for determinant() or inverse(). However, we might as
      well use the more efficient method.
      Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
      Part-of: <mesa/mesa!14652>
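The subdeterminant approach described in the commit above can be sketched in a few lines. This is only an illustration of the math, not Mesa's code (which emits shader IR rather than computing values): `subdet` loosely mirrors the role the message gives to mat_subdet(), while `det` and `inverse` are hypothetical names chosen for this sketch. Exact rational arithmetic is used so that rounding does not obscure the structure.

```python
from fractions import Fraction


def subdet(m, row, col):
    """Determinant of m with the given row and column removed."""
    n = len(m)
    minor = [[m[i][j] for j in range(n) if j != col]
             for i in range(n) if i != row]
    return det(minor)


def det(m):
    """Cofactor expansion along the first row; recursion ends at 1x1."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for col in range(len(m)):
        sign = -1 if col % 2 else 1
        total += sign * m[0][col] * subdet(m, 0, col)
    return total


def inverse(m):
    """Adjugate divided by determinant, reusing subdet for each cofactor."""
    n = len(m)
    d = Fraction(det(m))
    # Entry (i, j) of the adjugate is the cofactor of element (j, i),
    # hence the swapped indices in the subdet call.
    return [[(-1) ** (i + j) * subdet(m, j, i) / d for j in range(n)]
            for i in range(n)]
```

For a 3x3 matrix, `det` expands into three 2x2 subdeterminants (one multiply and one multiply-add with negate each) followed by one multiply and two multiply-adds, which is consistent with the 9-instruction figure quoted in the message. Because `inverse` calls the same `subdet` for the cofactors in the first row, a compiler's common-subexpression elimination can share that work with the determinant.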
    • util/blob: Clarify rules on blob::data · c21065c8
      Connor Abbott authored and Marge Bot committed
      
      Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
      Part-of: <mesa/mesa!15028>
    • nir/serialize: Don't access blob->data directly · 67615503
      Connor Abbott authored and Marge Bot committed

      Direct access won't work if the blob is fixed-size and we overrun the
      size, which will be the case with the Vulkan pipeline cache.

      This gets a bit tricky for the repeated-header optimization, because we
      can't read the header back from the blob. Instead, we have to keep our
      own copy of the header.
      Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
      Part-of: <mesa/mesa!15028>
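The idea in the commit above, keeping a private copy of the last header instead of reading it back from a buffer that may have overrun, can be illustrated with a toy model. Everything here is an assumption made for the sketch: `FixedBlob`, `Writer`, and the header layout (tag in the low 24 bits, repeat count in the high 8 bits) are hypothetical and do not reproduce Mesa's C blob API or the NIR serialization format.

```python
import struct


class FixedBlob:
    """Toy fixed-capacity byte buffer (hypothetical, not Mesa's blob)."""

    def __init__(self, capacity):
        self._data = bytearray(capacity)
        self.size = 0
        self.out_of_memory = False

    def write_u32(self, value):
        """Append a 32-bit value; return its offset, or None on overrun."""
        if self.out_of_memory or self.size + 4 > len(self._data):
            self.out_of_memory = True
            return None
        offset = self.size
        struct.pack_into("<I", self._data, offset, value)
        self.size += 4
        return offset

    def overwrite_u32(self, offset, value):
        """Patch an earlier value; a no-op once the blob has overrun."""
        if self.out_of_memory or offset is None:
            return False
        struct.pack_into("<I", self._data, offset, value)
        return True

    def contents(self):
        return bytes(self._data[:self.size])


class Writer:
    """Merges runs of equal tags by bumping a count in the first header."""

    def __init__(self, blob):
        self.blob = blob
        self.last_tag = None     # private copy of the last header fields,
        self.last_count = 0      # so nothing is ever read back from the blob
        self.last_offset = None

    def emit(self, tag, payload):
        if tag == self.last_tag and self.last_count < 255:
            self.last_count += 1
            header = tag | (self.last_count << 24)
            self.blob.overwrite_u32(self.last_offset, header)
        else:
            self.last_tag, self.last_count = tag, 1
            self.last_offset = self.blob.write_u32(tag | (1 << 24))
        self.blob.write_u32(payload)
```

The writer only ever goes through the checked `write_u32` and `overwrite_u32` calls, so an undersized blob simply ends up flagged as `out_of_memory` instead of being read or written out of bounds.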
    • pan/bi: Disambiguate IDVS variants in shader-db · 9168dcbb
      Alyssa Rosenzweig authored and Marge Bot committed

      Label IDVS variants as being MESA_SHADER_{POSITION, VARYING} stages;
      reserve the MESA_SHADER_VERTEX label for non-IDVS shaders. This reduces
      confusion where a single shader compiles to two MESA_SHADER_VERTEX
      shaders with different stats.
      
      While we're at it, de-vendor the blend shader stage name; these stats
      are internal anyway.
      Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
      Part-of: <mesa/mesa!15086>
  2. 18 Feb, 2022 36 commits