  1. Jun 07, 2019
    • egl/x11: calloc dri2_surf so it's properly zeroed · 4e3297f7
      Kenneth Graunke authored
      
      Commit 2282ec0a refactored drawable creation across various platforms
      into a new dri2_create_drawable helper function.
      
      The GBM code in platform_drm.c passed in dri2_surf->gbm_surf as the
      loaderPrivate, while most other backends passed in dri2_surf directly.
      
      To try and handle this, the patch checked if dri2_surf->gbm_surf was
      non-NULL, and if so, presumed that the caller is the DRM platform and
      we should use the dri2_surf->gbm_surf pointer.
      
      This worked for most platforms, which calloc their dri2_surf structure,
      zeroing the data.  Unfortunately, platform_x11.c used malloc, leaving
      most of the dri2_surf as garbage.  In particular, dri2_surf->gbm_surf
      was often non-NULL, causing dri2_create_drawable to try and use it,
      passing a garbage pointer to the createNewDrawable hook, usually leading
      to a SIGBUS or SIGSEGV when trying to dereference that bad pointer.
      
      Since most callers calloc the data, make platform_x11.c follow suit.
      
      Fixes crashes with i915_dri.so when running dEQP-GLES2.
      
      Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
      Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
      4e3297f7
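The malloc/calloc difference described in this commit can be sketched as follows. This is a minimal illustration, not the Mesa code: `struct example_surface`, `example_surface_create` and `example_loader_private` are hypothetical names, and only the `gbm_surf` field from the message is modelled.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the real surface struct; only the field
 * discussed in the commit message is modelled. */
struct example_surface {
   int width, height;
   void *gbm_surf;   /* non-NULL is taken to mean "DRM platform" */
};

/* calloc zero-fills the allocation, so gbm_surf starts out NULL (on
 * platforms where all-bits-zero is a null pointer).  With malloc the
 * field would hold whatever bytes were left in that memory. */
struct example_surface *
example_surface_create(void)
{
   return calloc(1, sizeof(struct example_surface));
}

/* Mirrors the selection described above: use gbm_surf when it is set,
 * otherwise fall back to the surface itself. */
void *
example_loader_private(struct example_surface *surf)
{
   return surf->gbm_surf ? surf->gbm_surf : (void *)surf;
}
```

With a zeroed surface the helper returns the surface pointer itself; with an uninitialized one it could return an arbitrary pointer, which matches the crash described in the message.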
  2. Jun 06, 2019
    • tests/graw: use C99 print conversion specifier for 32 bit builds · 04dac697
      Mark Janes authored
      
      Fixes formatting errors for 32-bit compilations, e.g.:
      
        error: format specifies type 'unsigned long' but the argument has
        type 'uint64_t' (aka 'unsigned long long') [-Werror,-Wformat]
        printf("result1 = %lu result2 = %lu\n", res1.u64, res2.u64);
      
      Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
      Reviewed-by: Eric Anholt <eric@anholt.net>
      04dac697
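The C99 conversion specifier mentioned in this commit comes from `<inttypes.h>`. A small sketch of the portable form follows; the helper name `format_results` is made up for illustration, and only the format string mirrors the one quoted in the error above.

```c
#include <inttypes.h>
#include <stdio.h>

/* Formats two 64-bit results into buf.  PRIu64 expands to the correct
 * length modifier for uint64_t on both 32-bit and 64-bit builds, so
 * -Wformat stays quiet either way.  Returns snprintf's result. */
int
format_results(char *buf, size_t size, uint64_t res1, uint64_t res2)
{
   return snprintf(buf, size,
                   "result1 = %" PRIu64 " result2 = %" PRIu64 "\n",
                   res1, res2);
}
```

The macro works by string-literal concatenation, which is why the format string is split around it.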
    • panfrost/midgard: Fix crash with unused SSA values · 30adeb7a
      Alyssa Rosenzweig authored
      
      Crash introduced in "b38dab10" but not
      adding a Fixes tag since it's our bug anyway.
      
      Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
      30adeb7a
    • panfrost: Report sRGB colorspace as not supported · 3d661a4e
      Boris Brezillon authored
      
      The driver does not support sRGB yet, so let's report it as unsupported.
      
      Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
      Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
      3d661a4e
    • docs: do not use div for line-breaking · c0dfe8c6
      Erik Faye-Lund authored
      
      HTML has the <p>-tag for this purpose. It adds some margins, but that
      just makes this read better, IMO.
      
      Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
      Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
      c0dfe8c6
    • docs: fixup code-tag positioning · f3235cfa
      Erik Faye-Lund authored
      
      This reads better if we include the asterisk in the code-block, as it's
      part of the function-reference, even though it's not technically
      speaking code. But as the <code>-tag isn't purely for code, this should
      be fine.
      
      Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
      Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
      f3235cfa
    • docs: add missing code-tags · 205f960e
      Erik Faye-Lund authored
      
      Looks like I missed a few cases when I recently added more code-tags
      here. So let's add these cases as well.
      
      Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
      Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
      205f960e
    • docs: add accidentally dropped "at" · 54b7a1f1
      Erik Faye-Lund authored
      
      When rewriting 20c56e18 after review, I accidentally dropped the "at"
      here. Sorry for that, and let's fix it up!
      
      Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
      Fixes: 20c56e18 ("docs: use proper links instead of code-tags")
      Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
      54b7a1f1
    • anv: allow NV12 <--> AHARDWAREBUFFER_FORMAT_Y8Cb8Cr8_420 inter-op · 110f139f
      Gurchetan Singh authored
      
      AHARDWAREBUFFER_FORMAT_Y8Cb8Cr8_420 is an implementation-defined
      flexible YUV format.  Most of the time, it's NV12 or YV12.
      On Intel, NV12 is preferred since it can be used by the display
      engine.  
      
      This API adds a dependency between gralloc and buffer consumers,
      unfortunately.  Right now, the code seems to work for i915 gralloc,
      but not cros_gralloc.  Add a preprocessor flag to fix this.
      
      TEST=android.graphics.cts.MediaVulkanGpuTest#testMediaImportAndRendering
      
      Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
      110f139f
    • ac/nir: Remove stale TODO · 9d93d2a4
      Connor Abbott authored
      
      While we're here, copy the comment explaining this from radeonsi.
      
      Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
      9d93d2a4
    • radeonsi: Don't force dcc disable for loads · 1d55b0da
      Connor Abbott authored
      
      When e9d935ed added force_dcc_off(), we forced it off for any
      preloaded image descriptor which had stores associated with them, since
      the same preloaded descriptors were used for loads and stores. However,
      when the preloading was removed in 16be87c9, the existing logic was
      kept despite it not being necessary anymore. The comment above
      force_dcc_off() only mentions stores, so only force DCC off for stores.
      
      Cc: Nicolai Hähnle <nicolai.haehnle@amd.com>
      Cc: Marek Olšák <marek.olsak@amd.com>
      Reviewed-by: Marek Olšák <marek.olsak@amd.com>
      1d55b0da
    • mapi/glapi/registry: Update gl.xml to latest upstream version · f1f6228a
      Gert Wollny authored
      
      The old copy didn't include EXT_clip_control, so update it.
      
      Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
      Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
      Acked-by: Marek Olšák <marek.olsak@amd.com>
      Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
      f1f6228a
    • virgl: Enable CAP_CLIP_HALFZ if host supports it · 8657257a
      Gert Wollny authored
      
      On hosts that support it, this makes the following piglit tests pass:
        arb_clip_control-*
      
      v2: sync flag with host
      
      Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
      Reviewed-by: Chia-I Wu <olvaffe@gmail.com> (v1)
      Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
      8657257a
    • svga: Remove unnecessary check for the pre flush bit for setting vertex buffers · f29b8fde
      Charmaine Lee authored
      
      This fixes the missing rebind when the can_pre_flush bit is not set
      and the vertex buffers are the same as those that have already been sent.
      
      Cc: mesa-stable@lists.freedesktop.org
      Reviewed-by: Neha Bhende <bhenden@vmware.com>
      Signed-off-by: Charmaine Lee <charmainel@vmware.com>
      Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
      f29b8fde
    • winsys/svga/drm: Fix 32-bit RPCI send message · 72fc8868
      Deepak Rawat authored
      
      Depending on whether the code is compiled with a frame pointer or not,
      the temporary memory location used for the bp parameter in these macros
      is referenced relative to the stack pointer or the frame pointer.
      Hence we can never reference that parameter when we've modified either
      the stack pointer or the frame pointer, because then the compiler would
      generate an incorrect stack reference.
      
      Fix this by pushing the temporary memory parameter on a known location on
      the stack before modifying the stack- and frame pointers.
      
      Also, in case of failure the RPCI channel is not closed, which leads to
      vmx running out of channels.
      
      Cc: mesa-stable@lists.freedesktop.org
      Signed-off-by: Deepak Rawat <drawat@vmware.com>
      Reviewed-by: Sinclair Yeh <syeh@vmware.com>
      Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
      Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
      72fc8868
    • radv: set the subpass before any initial subpass transitions · b9d3a6b6
      Samuel Pitoiset authored
      
      This might fix initial subpass transitions when multiview is used.
      Noticed while implementing sample locations during layout transitions.
      
      Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
      Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
      b9d3a6b6
    • anv: Fix check for isl_fmt in assert · d6724471
      Nataraj Deshpande authored and Tapani Pälli committed
      
      The assert should check the returned isl_fmt value rather than the
      format variable.
      
      Fixes: f1654fa7 "anv/android: support creating images from external format"
      Signed-off-by: Nataraj Deshpande <nataraj.deshpande@intel.com>
      Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
      Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
      d6724471
    • v3d: fix scheduling dependency tracking for ALU with small immediates · 09d230c6
      Iago Toral authored
      
      We were not accounting for small immediates in the B mux, so the scheduler
      was interpreting these as regular register file accesses, which could
      lead to additional (incorrect) write-read dependencies.
      
      Shader-db changes:
      
      total instructions in shared programs: 9163664 -> 9137263 (-0.29%)
      instructions in affected programs: 3931035 -> 3904634 (-0.67%)
      helped: 12457
      HURT: 2563
      
      total max-temps in shared programs: 1325787 -> 1325597 (-0.01%)
      max-temps in affected programs: 5746 -> 5556 (-3.31%)
      helped: 186
      HURT: 16
      helped stats (abs) min: 1 max: 4 x̄: 1.12 x̃: 1
      helped stats (rel) min: 1.45% max: 22.22% x̄: 4.42% x̃: 3.28%
      HURT stats (abs)   min: 1 max: 3 x̄: 1.12 x̃: 1
      HURT stats (rel)   min: 2.86% max: 10.00% x̄: 5.76% x̃: 5.88%
      95% mean confidence interval for max-temps value: -1.04 -0.84
      95% mean confidence interval for max-temps %-change: -4.16% -3.07%
      Max-temps are helped.
      
      Reviewed-by: Eric Anholt <eric@anholt.net>
      09d230c6
    • lima/ppir: fix crash when program uses no registers at all · 5980565a
      Vasily Khoruzhick authored
      
      A program may need no regalloc at all, e.g. when it consists of a
      single discard op.
      
      Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
      Reviewed-by: Qiang Yu <yuq825@gmail.com>
      5980565a
    • util/hash_table: Assert that keys are not reserved pointers · b38dab10
      Faith Ekstrand authored
      
      If we insert a NULL key, it will appear to succeed but will mess up
      entry counting.  Similar errors can occur if someone accidentally
      inserts the deleted key.  The latter is highly unlikely but technically
      possible, so we should guard against it too.
      
      Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
      Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
      Reviewed-by: Eric Anholt <eric@anholt.net>
      b38dab10
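The reserved-key guard described in this commit can be illustrated with a toy open-addressing set. This is a sketch under stated assumptions, not Mesa's util/hash_table: every `toy_*` name is hypothetical, the capacity is fixed, and the pointer hash is deliberately naive. NULL marks a never-used slot and a private sentinel address marks a deleted slot, so neither may be used as a key.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_SIZE 16  /* fixed capacity; a real table would grow */

/* The address of this object is the reserved "deleted" key. */
static const char toy_deleted_marker = 0;
#define TOY_DELETED ((const void *)&toy_deleted_marker)

struct toy_set {
   const void *slots[TOY_SIZE];  /* NULL means "never used" */
   unsigned entries;
};

static bool
toy_key_is_reserved(const void *key)
{
   return key == NULL || key == TOY_DELETED;
}

static unsigned
toy_hash(const void *key)
{
   return (unsigned)(((uintptr_t)key >> 4) % TOY_SIZE);
}

/* Returns true if the key was newly added, false if present or full. */
bool
toy_set_add(struct toy_set *set, const void *key)
{
   assert(!toy_key_is_reserved(key));  /* the guard this commit adds */
   unsigned start = toy_hash(key);
   int free_slot = -1;
   for (unsigned i = 0; i < TOY_SIZE; i++) {
      unsigned idx = (start + i) % TOY_SIZE;
      const void *cur = set->slots[idx];
      if (cur == key)
         return false;
      if ((cur == NULL || cur == TOY_DELETED) && free_slot < 0)
         free_slot = (int)idx;
      if (cur == NULL)
         break;  /* key cannot be stored past a never-used slot */
   }
   if (free_slot < 0)
      return false;
   set->slots[free_slot] = key;
   set->entries++;
   return true;
}

/* Returns true if the key was present and has been removed. */
bool
toy_set_remove(struct toy_set *set, const void *key)
{
   assert(!toy_key_is_reserved(key));
   unsigned start = toy_hash(key);
   for (unsigned i = 0; i < TOY_SIZE; i++) {
      unsigned idx = (start + i) % TOY_SIZE;
      if (set->slots[idx] == NULL)
         return false;
      if (set->slots[idx] == key) {
         set->slots[idx] = TOY_DELETED;
         set->entries--;
         return true;
      }
   }
   return false;
}
```

Without the assert, adding NULL would store a value indistinguishable from an empty slot while still incrementing `entries`, which is the counting error the message describes.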
    • util/set: Assert that keys are not reserved pointers · 8306dabc
      Faith Ekstrand authored
      
      If we insert a NULL key, it will appear to succeed but will mess up
      entry counting.  Similar errors can occur if someone accidentally
      inserts the deleted key.  The latter is highly unlikely but technically
      possible, so we should guard against it too.
      
      Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
      Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
      Reviewed-by: Eric Anholt <eric@anholt.net>
      8306dabc
    • nir/propagate_invariant: Don't add NULL vars to the hash table · d96878a6
      Faith Ekstrand authored
      
      Fixes: 8410cf66 "nir/propagate_invariant: Skip unknown vars"
      Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
      Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
      Reviewed-by: Eric Anholt <eric@anholt.net>
      d96878a6
    • intel/compiler: Treat b32csel as potentially producing a Boolean result for resolve analysis · 1c30d26d
      Ian Romanick authored
      
      If the 2nd and 3rd source are both Boolean values, we can potentially
      avoid a resolve by only resolving the result of the b32csel.
      
      No changes on any Gen6+ Intel platform.
      
      v2: Use ?: instead of cast from bool to unsigned.  Suggested by Caio.
      
      Iron Lake
      total instructions in shared programs: 8142729 -> 8142677 (<.01%)
      instructions in affected programs: 12890 -> 12838 (-0.40%)
      helped: 26
      HURT: 0
      helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
      helped stats (rel) min: 0.25% max: 0.74% x̄: 0.45% x̃: 0.38%
      95% mean confidence interval for instructions value: -2.00 -2.00
      95% mean confidence interval for instructions %-change: -0.52% -0.39%
      Instructions are helped.
      
      total cycles in shared programs: 188549632 -> 188549394 (<.01%)
      cycles in affected programs: 60754 -> 60516 (-0.39%)
      helped: 25
      HURT: 1
      helped stats (abs) min: 2 max: 26 x̄: 9.92 x̃: 8
      helped stats (rel) min: 0.07% max: 2.23% x̄: 0.59% x̃: 0.27%
      HURT stats (abs)   min: 10 max: 10 x̄: 10.00 x̃: 10
      HURT stats (rel)   min: 0.70% max: 0.70% x̄: 0.70% x̃: 0.70%
      95% mean confidence interval for cycles value: -12.91 -5.40
      95% mean confidence interval for cycles %-change: -0.84% -0.23%
      Cycles are helped.
      
      GM45
      total instructions in shared programs: 5013119 -> 5013093 (<.01%)
      instructions in affected programs: 6764 -> 6738 (-0.38%)
      helped: 13
      HURT: 0
      helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
      helped stats (rel) min: 0.24% max: 0.68% x̄: 0.43% x̃: 0.36%
      95% mean confidence interval for instructions value: -2.00 -2.00
      95% mean confidence interval for instructions %-change: -0.52% -0.34%
      Instructions are helped.
      
      total cycles in shared programs: 128977804 -> 128977700 (<.01%)
      cycles in affected programs: 37738 -> 37634 (-0.28%)
      helped: 13
      HURT: 0
      helped stats (abs) min: 8 max: 8 x̄: 8.00 x̃: 8
      helped stats (rel) min: 0.18% max: 0.46% x̄: 0.30% x̃: 0.26%
      95% mean confidence interval for cycles value: -8.00 -8.00
      95% mean confidence interval for cycles %-change: -0.36% -0.24%
      Cycles are helped.
      
      Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
      Reviewed-by: Matt Turner <mattst88@gmail.com>
      1c30d26d
    • intel/fs: Improve discard_if code generation · 0ba9497e
      Ian Romanick authored
      
      Previously we would blindly emit a sequence like:
      
              mov(1)          f0.1<1>UW       g1.14<0,1,0>UW
              ...
              cmp.l.f0(16)    g7<1>F          g5<8,8,1>F      0x41700000F  /* 15F */
      (+f0.1) cmp.z.f0.1(16)  null<1>D        g7<8,8,1>D      0D
      
      The first move sets the flags based on the initial execution mask.
      Later discard sequences contain a predicated compare that can only
      remove more SIMD channels.  Often the only user of the result from
      the first compare is the second compare.  Instead, generate a sequence
      like
      
              mov(1)          f0.1<1>UW       g1.14<0,1,0>UW
              ...
              cmp.l.f0(16)    g7<1>F          g5<8,8,1>F      0x41700000F  /* 15F */
      (+f0.1) cmp.ge.f0.1(8)  null<1>F        g5<8,8,1>F      0x41700000F  /* 15F */
      
      If the results stored in g7 and f0.0 are not used, the comparison will
      be eliminated.  This removes an instruction and potentially reduces
      register pressure.
      
      v2: Major re-write of the commit message (including fixing the assembly
      code).  Suggested by Matt.
      
      All Gen8+ platforms had similar results. (Ice Lake shown)
      total instructions in shared programs: 17224434 -> 17198659 (-0.15%)
      instructions in affected programs: 2908125 -> 2882350 (-0.89%)
      helped: 18891
      HURT: 5
      helped stats (abs) min: 1 max: 12 x̄: 1.38 x̃: 1
      helped stats (rel) min: 0.03% max: 25.00% x̄: 1.76% x̃: 1.02%
      HURT stats (abs)   min: 9 max: 105 x̄: 51.40 x̃: 35
      HURT stats (rel)   min: 0.43% max: 4.92% x̄: 2.34% x̃: 1.56%
      95% mean confidence interval for instructions value: -1.39 -1.34
      95% mean confidence interval for instructions %-change: -1.79% -1.73%
      Instructions are helped.
      
      total cycles in shared programs: 361468458 -> 361170679 (-0.08%)
      cycles in affected programs: 38470116 -> 38172337 (-0.77%)
      helped: 16202
      HURT: 1456
      helped stats (abs) min: 1 max: 4473 x̄: 26.24 x̃: 18
      helped stats (rel) min: <.01% max: 28.44% x̄: 2.90% x̃: 2.18%
      HURT stats (abs)   min: 1 max: 5982 x̄: 87.51 x̃: 28
      HURT stats (rel)   min: <.01% max: 51.29% x̄: 5.48% x̃: 1.64%
      95% mean confidence interval for cycles value: -18.24 -15.49
      95% mean confidence interval for cycles %-change: -2.26% -2.14%
      Cycles are helped.
      
      total spills in shared programs: 12147 -> 12176 (0.24%)
      spills in affected programs: 175 -> 204 (16.57%)
      helped: 8
      HURT: 5
      
      total fills in shared programs: 25262 -> 25292 (0.12%)
      fills in affected programs: 269 -> 299 (11.15%)
      helped: 8
      HURT: 5
      
      Haswell
      total instructions in shared programs: 13530316 -> 13502647 (-0.20%)
      instructions in affected programs: 2507824 -> 2480155 (-1.10%)
      helped: 18859
      HURT: 10
      helped stats (abs) min: 1 max: 12 x̄: 1.48 x̃: 1
      helped stats (rel) min: 0.03% max: 27.78% x̄: 2.38% x̃: 1.41%
      HURT stats (abs)   min: 5 max: 39 x̄: 25.70 x̃: 31
      HURT stats (rel)   min: 0.22% max: 1.66% x̄: 1.09% x̃: 1.31%
      95% mean confidence interval for instructions value: -1.49 -1.44
      95% mean confidence interval for instructions %-change: -2.42% -2.34%
      Instructions are helped.
      
      total cycles in shared programs: 377865412 -> 377639034 (-0.06%)
      cycles in affected programs: 40169572 -> 39943194 (-0.56%)
      helped: 15550
      HURT: 1938
      helped stats (abs) min: 1 max: 2482 x̄: 25.67 x̃: 18
      helped stats (rel) min: <.01% max: 37.77% x̄: 3.00% x̃: 2.25%
      HURT stats (abs)   min: 1 max: 4862 x̄: 89.17 x̃: 35
      HURT stats (rel)   min: <.01% max: 67.67% x̄: 6.16% x̃: 2.75%
      95% mean confidence interval for cycles value: -14.42 -11.47
      95% mean confidence interval for cycles %-change: -2.05% -1.91%
      Cycles are helped.
      
      total spills in shared programs: 26769 -> 26814 (0.17%)
      spills in affected programs: 826 -> 871 (5.45%)
      helped: 9
      HURT: 10
      
      total fills in shared programs: 38383 -> 38425 (0.11%)
      fills in affected programs: 834 -> 876 (5.04%)
      helped: 9
      HURT: 10
      
      LOST:   5
      GAINED: 10
      
      Ivy Bridge
      total instructions in shared programs: 12079250 -> 12044139 (-0.29%)
      instructions in affected programs: 2409680 -> 2374569 (-1.46%)
      helped: 16135
      HURT: 0
      helped stats (abs) min: 1 max: 23 x̄: 2.18 x̃: 2
      helped stats (rel) min: 0.07% max: 37.50% x̄: 2.72% x̃: 1.68%
      95% mean confidence interval for instructions value: -2.21 -2.14
      95% mean confidence interval for instructions %-change: -2.76% -2.67%
      Instructions are helped.
      
      total cycles in shared programs: 180116747 -> 179900405 (-0.12%)
      cycles in affected programs: 25439823 -> 25223481 (-0.85%)
      helped: 13817
      HURT: 1499
      helped stats (abs) min: 1 max: 1886 x̄: 26.40 x̃: 18
      helped stats (rel) min: <.01% max: 38.84% x̄: 2.57% x̃: 1.97%
      HURT stats (abs)   min: 1 max: 3684 x̄: 98.99 x̃: 52
      HURT stats (rel)   min: <.01% max: 97.01% x̄: 6.37% x̃: 3.42%
      95% mean confidence interval for cycles value: -15.68 -12.57
      95% mean confidence interval for cycles %-change: -1.77% -1.63%
      Cycles are helped.
      
      LOST:   8
      GAINED: 10
      
      Sandy Bridge
      total instructions in shared programs: 10878990 -> 10863659 (-0.14%)
      instructions in affected programs: 1806702 -> 1791371 (-0.85%)
      helped: 13023
      HURT: 0
      helped stats (abs) min: 1 max: 5 x̄: 1.18 x̃: 1
      helped stats (rel) min: 0.07% max: 13.79% x̄: 1.65% x̃: 1.10%
      95% mean confidence interval for instructions value: -1.18 -1.17
      95% mean confidence interval for instructions %-change: -1.68% -1.62%
      Instructions are helped.
      
      total cycles in shared programs: 154082878 -> 153862810 (-0.14%)
      cycles in affected programs: 20199374 -> 19979306 (-1.09%)
      helped: 12048
      HURT: 510
      helped stats (abs) min: 1 max: 323 x̄: 20.57 x̃: 18
      helped stats (rel) min: 0.03% max: 17.78% x̄: 2.05% x̃: 1.52%
      HURT stats (abs)   min: 1 max: 448 x̄: 54.39 x̃: 16
      HURT stats (rel)   min: 0.02% max: 37.98% x̄: 4.13% x̃: 1.17%
      95% mean confidence interval for cycles value: -17.97 -17.08
      95% mean confidence interval for cycles %-change: -1.84% -1.75%
      Cycles are helped.
      
      LOST:   1
      GAINED: 0
      
      Iron Lake
      total instructions in shared programs: 8155075 -> 8142729 (-0.15%)
      instructions in affected programs: 949495 -> 937149 (-1.30%)
      helped: 5810
      HURT: 0
      helped stats (abs) min: 1 max: 8 x̄: 2.12 x̃: 2
      helped stats (rel) min: 0.10% max: 16.67% x̄: 2.53% x̃: 1.85%
      95% mean confidence interval for instructions value: -2.14 -2.11
      95% mean confidence interval for instructions %-change: -2.59% -2.48%
      Instructions are helped.
      
      total cycles in shared programs: 188584610 -> 188549632 (-0.02%)
      cycles in affected programs: 17274446 -> 17239468 (-0.20%)
      helped: 3881
      HURT: 90
      helped stats (abs) min: 2 max: 168 x̄: 9.08 x̃: 6
      helped stats (rel) min: <.01% max: 23.53% x̄: 0.83% x̃: 0.30%
      HURT stats (abs)   min: 2 max: 10 x̄: 2.80 x̃: 2
      HURT stats (rel)   min: <.01% max: 0.60% x̄: 0.10% x̃: 0.07%
      95% mean confidence interval for cycles value: -9.35 -8.27
      95% mean confidence interval for cycles %-change: -0.85% -0.77%
      Cycles are helped.
      
      GM45
      total instructions in shared programs: 5019308 -> 5013119 (-0.12%)
      instructions in affected programs: 489028 -> 482839 (-1.27%)
      helped: 2912
      HURT: 0
      helped stats (abs) min: 1 max: 8 x̄: 2.13 x̃: 2
      helped stats (rel) min: 0.10% max: 16.67% x̄: 2.46% x̃: 1.81%
      95% mean confidence interval for instructions value: -2.14 -2.11
      95% mean confidence interval for instructions %-change: -2.54% -2.39%
      Instructions are helped.
      
      total cycles in shared programs: 129002592 -> 128977804 (-0.02%)
      cycles in affected programs: 12669152 -> 12644364 (-0.20%)
      helped: 2759
      HURT: 37
      helped stats (abs) min: 2 max: 168 x̄: 9.03 x̃: 4
      helped stats (rel) min: <.01% max: 21.43% x̄: 0.75% x̃: 0.31%
      HURT stats (abs)   min: 2 max: 10 x̄: 3.62 x̃: 4
      HURT stats (rel)   min: <.01% max: 0.41% x̄: 0.10% x̃: 0.04%
      95% mean confidence interval for cycles value: -9.53 -8.20
      95% mean confidence interval for cycles %-change: -0.79% -0.70%
      Cycles are helped.
      
      Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
      Reviewed-by: Matt Turner <mattst88@gmail.com>
      0ba9497e
    • intel/fs: Add need_dest parameter to fs_visitor::nir_emit_alu · a2887085
      Ian Romanick authored
      
      This is the same as the need_dest parameter to
      prepare_alu_destination_and_sources.  This allows us to not change the
      register that is expected to hold a result if an instruction is
      re-emitted.  This is particularly a problem if the re-emitted
      instruction is a partial write.  A later patch will use this feature.
      
      No shader-db changes on any Intel platform.
      
      v2: Don't do the Boolean resolve when there is no destination.  If the
      ALU instruction didn't write a register, there's nothing to resolve.
      This replaces an earlier patch "intel/fs: Allocate dummy destination
      register when need_dest is false".
      
      Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
      Reviewed-by: Matt Turner <mattst88@gmail.com>
      a2887085
    • intel/fs: Allow cmod propagation across reads and writes of different flags · e13a5c7d
      Ian Romanick authored
      
      This also helps a later patch (intel/fs: Improve discard_if code
      generation) on about 200 shaders.
      
      v2: Document that other instruction sequences are also valid in
      subtract_merge_with_compare_intervening_mismatch_flag_write.  Suggested
      by Caio.
      
      All Intel platforms had similar results. (Ice Lake shown)
      total instructions in shared programs: 17224438 -> 17224434 (<.01%)
      instructions in affected programs: 296 -> 292 (-1.35%)
      helped: 4
      HURT: 0
      helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
      helped stats (rel) min: 0.99% max: 1.92% x̄: 1.43% x̃: 1.40%
      95% mean confidence interval for instructions value: -1.00 -1.00
      95% mean confidence interval for instructions %-change: -2.04% -0.81%
      Instructions are helped.
      
      total cycles in shared programs: 361468455 -> 361468458 (<.01%)
      cycles in affected programs: 2862 -> 2865 (0.10%)
      helped: 2
      HURT: 2
      helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
      helped stats (rel) min: 0.24% max: 0.39% x̄: 0.31% x̃: 0.31%
      HURT stats (abs)   min: 3 max: 4 x̄: 3.50 x̃: 3
      HURT stats (rel)   min: 0.32% max: 0.70% x̄: 0.51% x̃: 0.51%
      95% mean confidence interval for cycles value: -4.34 5.84
      95% mean confidence interval for cycles %-change: -0.70% 0.90%
      Inconclusive result (value mean confidence interval includes 0).
      
      Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
      Reviewed-by: Matt Turner <mattst88@gmail.com>
      e13a5c7d
    • intel/fs: Fix flag_subreg handling in cmod propagation · 8030cb75
      Ian Romanick authored
      
      There were two errors.  First, the pass could propagate conditional
      modifiers from an instruction that writes one flag register to an
      instruction that writes a different flag register.  For example,
      
          cmp.nz.f0.0(16) null:F, vgrf6:F, vgrf5:F
          cmp.nz.f0.1(16) null:F, vgrf6:F, vgrf5:F
      
      could become
      
          cmp.nz.f0.0(16) null:F, vgrf6:F, vgrf5:F
      
      Second, if an instruction that writes f0.1 has its condition propagated,
      the modified instruction will incorrectly write flag f0.0.  For example,
      
          linterp(16) vgrf6:F, g2:F, attr0:F
          cmp.z.f0.1(16) null:F, vgrf6:F, vgrf5:F
          (-f0.1) discard_jump(16) (null):UD
      
      could become
      
          linterp.z.f0.0(16) vgrf6:F, g2:F, attr0:F
          (-f0.1) discard_jump(16) (null):UD
      
      Neither of these cases will occur currently.  The only time we use f0.1
      is for generating discard intrinsics.  In all those cases, we generate a
      sequence like:
      
          cmp.nz.f0.0(16) vgrf7:F, vgrf6:F, vgrf5:F
          (+f0.1) cmp.z(16) null:D, vgrf7:D, 0d
          (-f0.1) discard_jump(16) (null):UD
      
      Due to the mixed types and incompatible conditions, this sequence would
      never see any cmod propagation.  The next patch will change this.
      
      No shader-db changes on any Intel platform.
      
      v2: Fix typo in comment in test case subtract_delete_compare_other_flag.
      Noticed by Caio.
      
      Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
      Reviewed-by: Matt Turner <mattst88@gmail.com>
      8030cb75
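The two flag-subregister rules from this commit can be modelled in a few lines. This is a simplified sketch, not the actual pass: `struct toy_inst` and both functions are hypothetical, and the real pass checks many more conditions (types, condition compatibility, intervening flag accesses) that are omitted here.

```c
#include <stdbool.h>

/* Hypothetical, heavily reduced instruction model. */
struct toy_inst {
   bool writes_flag;      /* has a conditional modifier */
   unsigned flag_subreg;  /* 0 for f0.0, 1 for f0.1 */
};

/* First rule: if the earlier instruction already writes a flag, the
 * compare may only be dropped when both write the same flag
 * subregister. */
bool
toy_can_merge_cmod(const struct toy_inst *earlier,
                   const struct toy_inst *cmp)
{
   if (earlier->writes_flag && earlier->flag_subreg != cmp->flag_subreg)
      return false;
   return true;
}

/* Second rule: when the condition is moved onto the earlier
 * instruction, the compare's flag subregister must move with it,
 * rather than defaulting to f0.0. */
void
toy_propagate_cmod(struct toy_inst *earlier, const struct toy_inst *cmp)
{
   earlier->writes_flag = true;
   earlier->flag_subreg = cmp->flag_subreg;
}
```

In the linterp example from the message, the second rule is what makes the rewritten instruction write f0.1 so that the following predicated discard_jump still reads the right flag.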
    • intel/fs: Add missing tests for cmod_propagate_not · 2dd60139
      Ian Romanick authored
      
      Tests like this should have been added in 4467040c ("i965/fs:
      Propagate conditional modifiers from not instructions").
      
      Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
      Reviewed-by: Matt Turner <mattst88@gmail.com>
      2dd60139
  3. Jun 05, 2019
    • i965: Allow signed/unsigned integer conversions in miptree up/download · 6a7d3873
      Kenneth Graunke authored
      
      BLORP now handles this so there's no reason to fall back.
      
      Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
      6a7d3873
    • intel/blorp: Handle SINT/UINT clamping on blits. · f06c8635
      Kenneth Graunke authored
      
      This patch makes blorp_blit handle SINT<->UINT blit value clamping.
      After reading the source's integer data (which is expanded to 32-bit),
      we either IMAX with 0 (for SINT -> UINT, to clamp negative numbers) or
      UMIN with (1 << 31) - 1 (for UINT -> SINT, to clamp positive numbers
      outside of the representable range).
      
      Such blits are not allowed by the OpenGL or Vulkan APIs directly:
      
         The Vulkan 1.1 spec for vkCmdBlitImage says:
      
         "Integer formats can only be converted to other integer formats with
          the same signedness."
      
         The GL 4.5 spec for glBlitFramebuffer says:
      
         "An INVALID_OPERATION error is generated if format conversions are
          not supported, which occurs under any of the following conditions:
          [...]
          * The read buffer contains unsigned integer values and any draw
            buffer does not contain unsigned integer values.
          * The read buffer contains signed integer values and any draw buffer
            does not contain signed integer values."
      
      However, they are useful for other operations, such as texture upload
      and download, which typically are implemented via blorp_blit().  i965
      has code to fall back in this case (which the next commit will delete),
      and Gallium expects blit() to handle this case for texture upload.
      
      Fixes the following tests on iris:
      - GTF-GL46.gtf32.GL3Tests.packed_pixels.packed_pixels
      - GTF-GL46.gtf32.GL3Tests.packed_pixels.packed_pixels_pbo
      - GTF-GL46.gtf32.GL3Tests.packed_pixels.packed_pixels_pixelstore
      
      Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
      f06c8635
    • Caio Oliveira's avatar
      anv/pipeline: Move lowering of nir_var_mem_global later · 1aea4cd0
      Caio Oliveira authored
      
      This lets deref optimizations apply to globals before lowering them.
      
      Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
      1aea4cd0
    • Kenneth Graunke's avatar
      st/nir: Don't use GLSL IR's MOD_TO_FLOOR lowering when using NIR. · 4f3c82c7
      Kenneth Graunke authored
      
      Both GLSL IR and NIR perform the same mod -> floor lowering for 32-bit
      types.  But nir_lower_double_ops is slightly more defensive against
      lowered drcp precision loss, and handles mod(x, x) = 0 directly.  This
      works well...assuming nir_lower_double_ops actually gets an fmod op to
      lower in the first place.
      
      The previous patches enabled NIR-based lowering for the remaining
      drivers, so we can stop using the GLSL IR lowering when using NIR.
      
      Fixes KHR-GL45.gpu_shader_fp64.builtin.mod_dvec[234] on iris.
      
      Reviewed-by: Marek Olšák <marek.olsak@amd.com>
      4f3c82c7
    • Kenneth Graunke's avatar
      radeonsi: Enable NIR's lower_fmod option. · f4d4c426
      Kenneth Graunke authored
      
      Currently, st/mesa is always calling the GLSL IR lower_instructions()
      pass with MOD_TO_FLOOR set, so mod operations will be lowered before
      ever reaching NIR.  This enables the same lowering at the NIR level,
      which will let me shut off the GLSL IR path for NIR-based drivers.
      
      The AMD NIR backend also has code to handle fmod, so we could
      potentially skip this and still be fine.  I don't have an opinion
      on that.
      
      Reviewed-by: Marek Olšák <marek.olsak@amd.com>
      f4d4c426
    • Kenneth Graunke's avatar
      vc4: Enable NIR's lower_fmod option. · e0641e07
      Kenneth Graunke authored
      
      Currently, st/mesa is always calling the GLSL IR lower_instructions()
      pass with MOD_TO_FLOOR set, so mod operations will be lowered before
      ever reaching NIR.  This enables the same lowering at the NIR level,
      which will let me shut off the GLSL IR path for NIR-based drivers.
      
      Reviewed-by: Marek Olšák <marek.olsak@amd.com>
      Acked-by: Eric Anholt <eric@anholt.net>
      e0641e07
    • Kenneth Graunke's avatar
      v3d: Enable NIR's lower_fmod option. · b0e3bd79
      Kenneth Graunke authored
      
      Currently, st/mesa is always calling the GLSL IR lower_instructions()
      pass with MOD_TO_FLOOR set, so mod operations will be lowered before
      ever reaching NIR.  This enables the same lowering at the NIR level,
      which will let me shut off the GLSL IR path for NIR-based drivers.
      
      Reviewed-by: Marek Olšák <marek.olsak@amd.com>
      Acked-by: Eric Anholt <eric@anholt.net>
      b0e3bd79
    • Kenneth Graunke's avatar
      nir: Combine lower_fmod16/32 back into a single lower_fmod. · c7d1b52a
      Kenneth Graunke authored
      
      We originally had a single lower_fmod option.  In commit 2ab2d2e5, Sam
      split 32 and 64-bit lowering into separate flags, with the rationale
      that some drivers might want different options there.  This left 16-bit
      unhandled, so Iago added a lower_fmod16 option in commit ca31df6f.
      
      Now that lower_fmod64 is gone (in favor of nir_lower_doubles and
      nir_lower_dmod), we re-combine lower_fmod16 and lower_fmod32 into a
      single lower_fmod flag again.  I'm not aware of any hardware which
      needs lowering for one bitsize and not the other.
      
      Reviewed-by: Marek Olšák <marek.olsak@amd.com>
      c7d1b52a
    • Kenneth Graunke's avatar
      nir: Drop lower_fmod64 option. · edd45af9
      Kenneth Graunke authored
      
      nir_lower_doubles offers a wide variety of fp64 lowering, including
      lowering fmod@64.  The version there also better handles imprecisions
      due to lowered frcp@64.  Let's consolidate on one version.
      
      Reviewed-by: Marek Olšák <marek.olsak@amd.com>
      edd45af9