  1. Jul 09, 2019
  2. Jul 05, 2019
  3. Jul 04, 2019
  4. Jul 03, 2019
    • spirv: Ignore ArrayStride in OpPtrAccessChain for Workgroup · 95cfcc3b
      Caio Oliveira authored and Juan A. Suárez committed
      
      From OpPtrAccessChain description in the SPIR-V spec (1.4 rev 1):
      
          For objects in the Uniform, StorageBuffer, or PushConstant storage
          classes, the element’s address or location is calculated using a
          stride, which will be the Base-type’s Array Stride when the Base
          type is decorated with ArrayStride. For all other objects, the
          implementation will calculate the element’s address or location.
      
      For non-CL shaders the driver should lay out the Workgroup storage
      class itself, so override any ArrayStride explicitly set in the shader.
      This currently fixes only the lower_workgroup_access_to_offsets case,
      which is used by anv.
      
      Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
      (cherry picked from commit 050eb638)
      95cfcc3b
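The stride rule quoted from the SPIR-V spec in the commit above can be sketched roughly as follows. This is a minimal illustration, not Mesa's code: the enum, the function names, and the `driver_stride` parameter are all hypothetical, and the element index is treated as unsigned for simplicity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical storage-class enum for illustration only; these are not
 * the names used by Mesa or by the SPIR-V headers. */
enum storage_class {
    SC_UNIFORM,
    SC_STORAGE_BUFFER,
    SC_PUSH_CONSTANT,
    SC_WORKGROUP,
};

/* Storage classes for which the spec says the ArrayStride decoration
 * determines the element address. */
static bool uses_explicit_stride(enum storage_class sc)
{
    return sc == SC_UNIFORM || sc == SC_STORAGE_BUFFER ||
           sc == SC_PUSH_CONSTANT;
}

/* Byte offset of `element` relative to `base_offset`.
 * decorated_stride: ArrayStride from the shader, 0 if not decorated.
 * driver_stride:    stride chosen by the implementation's own layout. */
static uint32_t ptr_access_chain_offset(enum storage_class sc,
                                        uint32_t base_offset,
                                        uint32_t element,
                                        uint32_t decorated_stride,
                                        uint32_t driver_stride)
{
    uint32_t stride = (uses_explicit_stride(sc) && decorated_stride != 0)
                          ? decorated_stride
                          : driver_stride;
    return base_offset + element * stride;
}
```

With these assumptions, a Workgroup access with a decorated stride of 16 and a driver stride of 4 steps by 4 bytes per element, while the same access in StorageBuffer steps by 16.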
  5. Jul 02, 2019
  6. Jul 01, 2019
  7. Jun 28, 2019
    • radv: only enable VK_AMD_gpu_shader_{half_float,int16} on GFX9+ · adbf808e
      Samuel Pitoiset authored and Juan A. Suárez committed
      
      These two extensions are supported on GFX8, but the throughput
      of 16-bit floats/integers is the same as for 32-bit. Also, shaderInt16
      is only enabled on GFX9+ for the same reason, so this is more consistent.
      
      This fixes a crash with Wolfenstein II because it expects
      shaderInt16 to be enabled when VK_AMD_gpu_shader_half_float is
      exposed. Note that AMDVLK only enables these extensions on GFX9+.
      
      Cc: 19.1 <mesa-stable@lists.freedesktop.org>
      Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
      Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
      (cherry picked from commit ef1787db)
      [Juan A. Suarez: resolve trivial conflicts]
      Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
      
      Conflicts:
      	src/amd/vulkan/radv_extensions.py
      adbf808e
    • gallium: Make util_copy_image_view handle shader_access · d6b1b915
      Kenneth Graunke authored and Juan A. Suárez committed
      
      A while back, we added a new field, but failed to update the copier.
      I believe iris is the only current user of the new field, and it hasn't
      used the copier, so no one noticed.
      
      Fixes: 8b626a22 st/mesa: Record shader access qualifiers for images
      Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
      (cherry picked from commit 255c71ec)
      d6b1b915
    • isl: Don't align phys_level0_sa by block dimension · 211bedcf
      Nanley Chery authored and Juan A. Suárez committed
      
      Aligning phys_level0_sa by the compression block dimension prior to
      mipmap layout causes the layout of compressed surfaces to differ from
      the sampler's expectations in certain cases. The hardware docs agree:
      
      From the BDW PRM, Vol. 5, Compressed Mipmap Layout,
      
         The compressed mipmaps are stored in a similar fashion to
         uncompressed mipmaps [...]
      
         The following exceptions apply to the layout of compressed (vs.
         uncompressed) mipmaps:
            * [...]
            * The dimensions of the mip maps are first determined by applying
              the sizing algorithm presented in Non-Power-of-Two Mipmaps
              above. Then, if necessary, they are padded out to compression
              block boundaries.
      
      The last bullet indicates that alignment should not be done for
      calculating a miplevel's dimensions, but rather for determining miplevel
      placement/padding. Comply with this text by removing the extra
      alignment.
      
      Fixes some fbo-generatemipmap-formats piglit failures on all tested
      platforms (SNB-KBL).
      
      v2:
      - Note fixed platforms.
      - Update some consumers via a helper function.
      
      Cc: <mesa-stable@lists.freedesktop.org>
      Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
      (cherry picked from commit 02f6995d)
      211bedcf
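The ordering issue described in the commit above (size each miplevel first, then pad to the block boundary) can be illustrated with a small sketch. The helper names are hypothetical and are not isl's API. The sizing rule is assumed to be the usual halve-and-clamp-to-1 minification, and `level` is assumed to be less than 32.

```c
#include <stdint.h>

/* Assumed sizing rule: halve per level, never below 1. */
static uint32_t minify(uint32_t size, uint32_t level)
{
    uint32_t s = size >> level;
    return s > 0 ? s : 1;
}

/* Round value up to the next multiple of alignment (alignment > 0). */
static uint32_t align_up(uint32_t value, uint32_t alignment)
{
    return (value + alignment - 1) / alignment * alignment;
}

/* Order described by the PRM text: size the miplevel from the
 * unaligned level-0 width, then pad to the compression block. */
static uint32_t mip_width_padded(uint32_t level0_width, uint32_t level,
                                 uint32_t block_width)
{
    return align_up(minify(level0_width, level), block_width);
}

/* Order the commit removes: align level 0 to the block first, then
 * minify, then pad. This can yield a larger miplevel. */
static uint32_t mip_width_prealigned(uint32_t level0_width, uint32_t level,
                                     uint32_t block_width)
{
    return align_up(minify(align_up(level0_width, block_width), level),
                    block_width);
}
```

For a 9-wide surface with 4-wide blocks, level 1 comes out as 4 when sized first (9 >> 1 = 4, already block-aligned) but as 8 when level 0 is aligned first (12 >> 1 = 6, padded to 8). That kind of mismatch is what the commit says differs from the sampler's expectations.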
    • intel: Add and use helpers for level0 extent · eef57b81
      Nanley Chery authored and Juan A. Suárez committed
      
      Prepare for a bug fix by adding and using helpers which convert
      isl_surf::logical_level0_px and isl_surf::phys_level0_sa to units of
      surface elements.
      
      v2:
      - Update iris (Ken).
      - Update anv.
      
      Cc: <mesa-stable@lists.freedesktop.org>
      Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
      (cherry picked from commit fb1350c7)
      eef57b81
    • iris: Enable PIPE_CAP_SURFACE_REINTERPRET_BLOCKS · 97b43a81
      Kenneth Graunke authored and Juan A. Suárez committed
      
      This makes CompressedTexSubImage from a PBO source do proper GPU
      rendering to upload instead of stalling to map the PBO source on
      the CPU (then copying it on the CPU).
      
      Thanks to Bas Nieuwenhuizen for pointing out that Vulkan includes this
      functionality, and to Jason Ekstrand for writing the code I adapted.
      Vulkan only supports a single layer, however, and this code tries to
      support multiple layers as long as it's miplevel 0.
      
      Improves performance in Sid Meier's Civilization VI:
      
         Average frame time (ms):         -3.67423% +/- 1.46201% (n=5)
         99th percentile frame time (ms): -5.09910% +/- 3.87874% (n=5)
      
      (cherry picked from commit a032a966)
      97b43a81
    • meson: Add support for using cmake for finding LLVM · 421aa4d1
      Dylan Baker authored and Juan A. Suárez committed
      
      Meson has support for using cmake as a finder for some dependencies,
      including LLVM. Using cmake has a lot of advantages: it needs less meson
      maintenance to keep working (even for llvm updates); it works more
      sanely for cross compiles (as llvm-config is a compiled binary not a
      shell script). Meson 0.51.0 also has a new generic variable getter that
      can be used to get information from either cmake, pkg-config, or
      config-tools dependencies, which is needed for cmake. We continue to
      support using llvm-config if you don't have cmake installed, or if cmake
      cannot find a suitable version.
      
      Fixes: 0d594594
             ("meson: Force the use of config-tool for llvm")
      Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
      (cherry picked from commit 5157a427)
      421aa4d1
    • intel/compiler: fix derivative on y axis implementation · a0a6df95
      Lionel Landwerlin authored and Juan A. Suárez committed
      
      This rewrites ddy in EXECUTE_4 mode as a loop to make it more
      obvious what is going on, and also sets the group that each set of
      4 threads is supposed to execute.
      
      Fixes the following CTS tests:
      
         dEQP-VK.glsl.derivate.dfdyfine.dynamic_*
      
      Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
      Co-Authored-by: Jason Ekstrand <jason@jlekstrand.net>
      Reviewed-by: Matt Turner <mattst88@gmail.com>
      Fixes: 2134ea38 ("intel/compiler/fs: Implement ddy without using align16 for Gen11+")
      (cherry picked from commit 83622584)
      a0a6df95
  8. Jun 26, 2019
    • glsl: Fix round64 conversion function · 6fbe0eea
      Sagar Ghuge authored and Juan A. Suárez committed
      
      Fix the round64 function to handle round-to-nearest-even cases,
      especially for positive and negative numbers with a fractional part
      of 0.5.
      
      v2: 1) Simplify unused bits (Elie Tournier)
      
      Fixes:
         KHR-GL45.gpu_shader_fp64.builtin.round_dvec2
         KHR-GL45.gpu_shader_fp64.builtin.round_dvec3
         KHR-GL45.gpu_shader_fp64.builtin.round_dvec4
         KHR-GL45.gpu_shader_fp64.builtin.roundeven_double
         KHR-GL45.gpu_shader_fp64.builtin.roundeven_dvec2
         KHR-GL45.gpu_shader_fp64.builtin.roundeven_dvec3
         KHR-GL45.gpu_shader_fp64.builtin.roundeven_dvec4
      
      Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
      Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
      Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
      (cherry picked from commit 06807e19)
      6fbe0eea
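For reference, the round-to-nearest-even behaviour that the commit above targets can be sketched on native doubles. This is only an illustration of the rounding rule, not the soft-fp64 bit manipulation the commit actually changes. The function name is hypothetical, and the sketch does not preserve the sign of a zero result.

```c
/* Round to the nearest integer, with exact halfway cases going to the
 * even neighbour (2.5 -> 2, 3.5 -> 4, -2.5 -> -2). */
static double round_half_even(double x)
{
    /* 2^52: at or beyond this magnitude every double is already an
     * integer. The negated test also returns NaN and infinities as-is. */
    const double two52 = 4503599627370496.0;
    if (!(x > -two52 && x < two52))
        return x;

    long long fl = (long long)x;   /* truncates toward zero */
    if ((double)fl > x)
        fl -= 1;                   /* turn truncation into floor */

    double diff = x - (double)fl;  /* fractional part, in [0, 1) */
    if (diff < 0.5)
        return (double)fl;
    if (diff > 0.5)
        return (double)(fl + 1);

    /* Exactly halfway: pick whichever neighbour is even. */
    return (fl % 2 == 0) ? (double)fl : (double)(fl + 1);
}
```

The halfway branch is the one the commit message singles out: both positive and negative inputs with a fractional part of exactly 0.5 must land on the even neighbour rather than always rounding away from zero.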
    • i965: leaking of upload-BO with push constants · 3e1c46f2
      Sergii Romantsov authored and Juan A. Suárez committed
      
      If any of the VS members uses_firstvertex, uses_baseinstance,
      uses_drawid or uses_is_indexed_draw is enabled, a leak may happen.
      The call to gen6_upload_push_constants allocates
      stage_state->push_const_bo. The pointer is then copied from
      push_const_bo to draw_params_bo (in
      brw_prepare_shader_draw_parameters, via brw_upload_data),
      which takes a reference that is never released.
      
      Fixes leak:
       136 bytes in 1 blocks are definitely lost in loss record 6 of 13
          at 0x4C31B25: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
          by 0xC2B64B7: bo_alloc_internal (brw_bufmgr.c:596)
          by 0xC2B6748: brw_bo_alloc (brw_bufmgr.c:672)
          by 0xC314BB3: brw_upload_space (intel_upload.c:88)
          by 0xC2EBBC5: gen6_upload_push_constants (gen6_constant_state.c:155)
          by 0xC9E4FA6: gen9_upload_vs_push_constants (genX_state_upload.c:3300)
          by 0xC2E0EDA: check_and_emit_atom (brw_state_upload.c:540)
          by 0xC2E0EDA: brw_upload_pipeline_state (brw_state_upload.c:659)
          by 0xC2E0FF1: brw_upload_render_state (brw_state_upload.c:681)
          by 0xC2C5D2D: brw_draw_single_prim (brw_draw.c:1052)
          by 0xC2C62CB: brw_draw_prims (brw_draw.c:1175)
          by 0xC488AD1: vbo_exec_vtx_flush (vbo_exec_draw.c:386)
          by 0xC485270: vbo_exec_FlushVertices_internal (vbo_exec_api.c:652)
      
      Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
      Reported-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
      Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
      (cherry picked from commit 1931c97a)
      3e1c46f2
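The leak pattern described in the commit above (a second pointer takes a reference on a buffer object and that reference is never dropped) can be sketched with a toy reference-counted object. Everything here is hypothetical: `struct bo`, the helper names, and the `live_bos` counter are stand-ins for illustration. They are neither the i965 buffer manager API nor the actual patch.

```c
#include <stdlib.h>

/* Toy reference-counted buffer object (hypothetical). */
struct bo {
    int refcount;
};

/* Number of allocated, not-yet-freed objects, for leak checking. */
static int live_bos;

static struct bo *bo_alloc(void)
{
    struct bo *b = calloc(1, sizeof(*b));
    if (b) {
        b->refcount = 1;
        live_bos++;
    }
    return b;
}

static void bo_reference(struct bo *b)
{
    if (b)
        b->refcount++;
}

static void bo_unreference(struct bo *b)
{
    if (b && --b->refcount == 0) {
        free(b);
        live_bos--;
    }
}

/* Point *slot at new_bo and drop the reference *slot held before.
 * Skipping the unreference step here is the kind of omission that
 * leaves the old object alive forever. */
static void bo_slot_set(struct bo **slot, struct bo *new_bo)
{
    bo_reference(new_bo);   /* reference first so slot == new_bo is safe */
    bo_unreference(*slot);
    *slot = new_bo;
}
```

In this sketch, every pointer that holds a reference must release it, either when it is repointed or at teardown by calling `bo_slot_set(&slot, NULL)`. Otherwise the allocation shows up as "definitely lost", as in the valgrind report above.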
    • anv/descriptor_set: Only write texture swizzles if we have an image view · 77962816
      Faith Ekstrand authored and Juan A. Suárez committed
      
      When immutable samplers are set we call write_image_view with a NULL
      image view.  This causes issues on IVB where we have to fake texture
      swizzling.
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110999
      Fixes: d2aa65eb "anv: Emulate texture swizzle in the shader when..."
      (cherry picked from commit 0a364a4a)
      77962816
  9. Jun 25, 2019