- 25 Jan, 2020 20 commits
-
-
Erico Nunes authored
The src mask can't be calculated from the dest write_mask. Instead, it must be calculated from the swizzled operators of the src. Otherwise, liveness calculation may report incorrect live components for non-ssa registers. Signed-off-by:
Erico Nunes <nunes.erico@gmail.com> Reviewed-by:
Vasily Khoruzhick <anarsoul@gmail.com> Tested-by: Marge Bot <mesa/mesa!3502> Part-of: <mesa/mesa!3502>
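
The rule described above can be illustrated with a minimal sketch. It is not the ppir implementation: the function name and the 4-entry swizzle layout are assumptions made for illustration. For every destination component that is written, the source component actually read is the one selected by the swizzle, so the source mask is built from the swizzle values rather than copied from the write mask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not taken from ppir): returns the mask of source
 * components that are read, given the destination write mask and the
 * per-component swizzle of that source. */
unsigned src_read_mask(unsigned dest_write_mask, const uint8_t swizzle[4])
{
    unsigned mask = 0;

    for (int i = 0; i < 4; i++) {
        /* Only components that are written pull a value from the source. */
        if (dest_write_mask & (1u << i)) {
            assert(swizzle[i] < 4);
            mask |= 1u << swizzle[i];
        }
    }
    return mask;
}
```

With a write mask of x only (0x1) and a source swizzled to .z, this sketch reports z (0x4) as read, whereas reusing the write mask would have reported x (0x1).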
-
Erico Nunes authored
Those were renamed/merged some time ago, but it turns out that ppir_op_undef can't be shared. It was being used both for undefined ssa operations and for read-before-write operations that may happen to, e.g., uninitialized (non-ssa) registers inside a loop. We really don't want to reserve a register for the undef ssa case, but we must reserve and allocate a register for the uninitialized register case, because when it happens inside a loop the register may need to hold its value across iterations. This dummy node might be eliminated with a code refactor in ppir, if we are able to emit the write and allocate the ppir_reg before we emit the read. But until such a major refactor, we need to keep this code to avoid apparent regressions with the new liveness analysis implementation. Signed-off-by:
Erico Nunes <nunes.erico@gmail.com> Reviewed-by:
Vasily Khoruzhick <anarsoul@gmail.com> Part-of: <mesa/mesa!3502>
-
Erico Nunes authored
The ssa doesn't need to be manually added to block->comp->reg_list. Doing so actually causes other registers to be marked as undef=true later. This patch alone fixes a few deqp tests that have undefs. Signed-off-by:
Erico Nunes <nunes.erico@gmail.com> Reviewed-by:
Vasily Khoruzhick <anarsoul@gmail.com> Part-of: <mesa/mesa!3502>
-
Erico Nunes authored
nir can output writes to dead registers when expanding vec4 operations to non-ssa registers. In that case, some components of the vec4 may be assigned but never read. These are also not currently removed by a nir dead code elimination pass as they are not ssa. In order to prevent regalloc from allocating a live register for this operation, an interference must be assigned to it during liveness analysis. This workaround may be removed in the future if the assignments to dead components can be removed earlier in ppir or nir. Signed-off-by:
Erico Nunes <nunes.erico@gmail.com> Reviewed-by:
Vasily Khoruzhick <anarsoul@gmail.com> Part-of: <mesa/mesa!3502>
-
Marek Olšák authored
Fixes: cd5b99c5 - radeonsi: move VS shader code into si_shader_llvm_vs.c Closes: #2416 Tested-by: Marge Bot <mesa/mesa!3561> Part-of: <mesa/mesa!3561>
-
Marek Olšák authored
to fix a crash in is_multi_part_shader. Fixes: 1a0890dc - radeonsi: change prototypes of si_is_multi_part_shader & si_is_merged_shader Part-of: <mesa/mesa!3561>
-
Faith Ekstrand authored
The previous way we were attempting to handle AUX tables on TGL-LP was very GL-like. We used the same aux table management code that's shared with iris and we updated the table on image create/destroy. The problem with this is that Vulkan allows multiple VkImage objects to be bound to the same memory location simultaneously and the app can ping-pong back and forth between them in the same command buffer. Because the AUX table contains format-specific data, we cannot support this ping-pong behavior with only CPU updates of the AUX table. The new mechanism switches things around a bit and instead makes the aux data part of the BO. At BO creation time, a bit of space is appended to the end of the BO for AUX data and the AUX table is updated in bulk for the entire BO. The problem here, of course, is that we can't insert the format-specific data into the AUX table at BO create time. Fortunately, Vulkan has a requirement that every TILING_OPTIMAL image must be initialized prior to use by transitioning the image from VK_IMAGE_LAYOUT_UNDEFINED to something else. When doing the above described ping-pong behavior, the app has to do such an initialization transition every time it corrupts the underlying memory of the VkImage by using it as something else. We can hook into this initialization and use it to update the AUX-TT entries from the command streamer. This way the AUX table gets its format information, apps get aliasing support, and everyone is happy. One side-effect of this is that we disallow CCS on shared buffers. We'll need to fix this for modifiers on the scanout path but that's a task for another patch. We should be able to do it with dedicated allocations. Reviewed-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by:
Jordan Justen <jordan.l.justen@intel.com> Tested-by: Marge Bot <mesa/mesa!3519> Part-of: <mesa/mesa!3519>
-
Faith Ekstrand authored
All they do now is take a size, align, and flags and figure out which heap to allocate in. All of the actual code to deal with the BO is in anv_allocator.c. We want to leave anv_vma_alloc/free in anv_device.c because it deals with API-exposed heaps so it still makes sense to have it there. Reviewed-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by:
Jordan Justen <jordan.l.justen@intel.com> Part-of: <mesa/mesa!3519>
-
Faith Ekstrand authored
This commit moves it in with all the other cache invalidation operations as if it were done by PIPE_CONTROL even though it's a pair of register writes. This means we only have to write the GFX_AUX_TABLE_BASE_ADDR register once at device initialization instead of every invalidate. Invalidates are now a single LRI instead of two. Reviewed-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by:
Jordan Justen <jordan.l.justen@intel.com> Part-of: <mesa/mesa!3519>
-
Faith Ekstrand authored
Reviewed-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by:
Jordan Justen <jordan.l.justen@intel.com> Part-of: <mesa/mesa!3519>
-
Faith Ekstrand authored
Reviewed-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by:
Jordan Justen <jordan.l.justen@intel.com> Part-of: <mesa/mesa!3519>
-
Faith Ekstrand authored
We compute the same thing with the same variable name at the top of the function. Reviewed-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by:
Jordan Justen <jordan.l.justen@intel.com> Part-of: <mesa/mesa!3519>
-
Faith Ekstrand authored
This breaks add_mapping() into three pieces:
1. get_aux_entry() adds AUX-TT pages as needed and returns the L1 entry index, L1 entry address, and L1 entry map.
2. gen_aux_map_format_bits_for_isl_surf() computes the format-specific information that goes in the AUX-TT entry.
3. add_mapping() is now a much dumber function that just adds the requested mapping with the requested format bits.
This lets us break out some additional helpers in the API which we want to use for more direct AUX-TT management in ANV. Reviewed-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by:
Jordan Justen <jordan.l.justen@intel.com> Part-of: <mesa/mesa!3519>
-
Faith Ekstrand authored
Reviewed-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by:
Jordan Justen <jordan.l.justen@intel.com> Part-of: <mesa/mesa!3519>
-
Marek Olšák authored
Reviewed-by:
Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Tested-by: Marge Bot <mesa/mesa!2929> Part-of: <mesa/mesa!2929>
-
Marek Olšák authored
Reviewed-by:
Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <mesa/mesa!2929>
-
Marek Olšák authored
Reviewed-by:
Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <mesa/mesa!2929>
-
Marek Olšák authored
Reviewed-by:
Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <mesa/mesa!2929>
-
Marek Olšák authored
Reviewed-by:
Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <mesa/mesa!2929>
-
Marek Olšák authored
Reviewed-by:
Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <mesa/mesa!2929>
-
- 24 Jan, 2020 20 commits
-
-
Caio Oliveira authored
Fixes: 58907568 ("intel/fs: Add SHADER_OPCODE_[IU]SUB_SAT pseudo-ops") Reviewed-by:
Eric Anholt <eric@anholt.net> Reviewed-by:
Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <mesa/mesa!3558> Part-of: <mesa/mesa!3558>
-
Caio Oliveira authored
Pass down stencil data from the subpass attachment like we do elsewhere. Only stencil attachments will make use of it. Fixes warnings like ../src/intel/vulkan/genX_cmd_buffer.c: In function ‘cmd_buffer_begin_subpass’: ../src/intel/vulkan/genX_cmd_buffer.c:4656:41: warning: ‘target_stencil_layout’ may be used uninitialized in this function [-Wmaybe-uninitialized] 4656 | att_state->current_stencil_layout = target_stencil_layout; | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~ Reviewed-by:
Jason Ekstrand <jason@jlekstrand.net> Reviewed-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Marge Bot <mesa/mesa!3557> Part-of: <mesa/mesa!3557>
-
Faith Ekstrand authored
Now that aux_usage has a unified meaning, aux_usage != NONE if and only if aux_surface.isl.size_B > 0. In most of these cases, the question we're asking is "does it have compression?" and not "have we allocated an aux surface for compression?". Reviewed-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Marge Bot <mesa/mesa!3556> Part-of: <mesa/mesa!3556>
-
Faith Ekstrand authored
Previously, we set aux_usage=ISL_AUX_USAGE_NONE when we really meant CCS_D. This sort-of made sense before we had anv_layout_to_aux_usage. However, now that we have that helper and, in our more modern aux tracking model, all aux usage goes through anv_layout_to_*, we're better off making the meaning of anv_image::planes[]::aux_usage be AUX_USAGE_NONE if and only if there is no compression. Reviewed-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <mesa/mesa!3556>
-
Samuel Pitoiset authored
This is confusing otherwise. Signed-off-by:
Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by:
Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <mesa/mesa!3553> Part-of: <mesa/mesa!3553>
-
Faith Ekstrand authored
Reviewed-by:
Kenneth Graunke <kenneth@whitecape.org> Tested-by: Marge Bot <mesa/mesa!3547> Part-of: <mesa/mesa!3547>
-
Faith Ekstrand authored
Reviewed-by:
Kenneth Graunke <kenneth@whitecape.org> Part-of: <mesa/mesa!3547>
-
Faith Ekstrand authored
Instead of emitting g[a0]UD for the indirect descriptor, emit a0<0>UD. This is more correct because there is no GRF involved. Reviewed-by:
Kenneth Graunke <kenneth@whitecape.org> Part-of: <mesa/mesa!3547>
-
Faith Ekstrand authored
The instruction encoding for SENDS changed on Gen12 and it now supports embedding the entire extended message descriptor in the instruction if it's an immediate. Stop falling back to doing an indirect SEND just because we had something in [15:12] of ex_desc.ud. Reviewed-by:
Kenneth Graunke <kenneth@whitecape.org> Part-of: <mesa/mesa!3547>
-
Faith Ekstrand authored
This commit makes two changes: 1. We set pending_pipe_bits instead of emitting PIPE_CONTROL directly for the flush at the end of cmd_buffer_begin_subpass. 2. Because BLORP ops such as vkCmdClearAttachments may come in the middle of a render pass, we have to also flag the need for a cache flush after the blorp op. Fixes: 185630c6 "anv/blorp: Do the gen11 BTI flush" Reviewed-by:
Kenneth Graunke <kenneth@whitecape.org> Part-of: <mesa/mesa!3547>
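
The deferred-flush pattern from the first change can be sketched as follows. This is a simplified model, not ANV code: the struct, the flag names, and the `emitted` counter are hypothetical stand-ins. Callers only record which cache operations they need, and a single flush is emitted later when the pending bits are applied, so back-to-back requests collapse into one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag names for illustration only. */
enum {
    FLUSH_RENDER_TARGET = 1u << 0,
    INVALIDATE_TEXTURE  = 1u << 1,
};

struct cmd_state {
    uint32_t pending_pipe_bits; /* cache operations requested so far */
    unsigned emitted;           /* stand-in for emitted PIPE_CONTROLs */
};

/* Record the request instead of emitting a flush immediately. */
void request_flush(struct cmd_state *s, uint32_t bits)
{
    assert(s != NULL);
    s->pending_pipe_bits |= bits;
}

/* Emit one combined flush for everything requested, if anything is pending. */
void apply_pending(struct cmd_state *s)
{
    assert(s != NULL);
    if (s->pending_pipe_bits == 0)
        return;
    s->emitted++;
    s->pending_pipe_bits = 0;
}
```

In this model, a flush requested at the end of a subpass setup and another requested after a clear in the middle of the render pass are both satisfied by the next `apply_pending()` call.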
-
Alyssa Rosenzweig authored
../src/gallium/drivers/panfrost/pan_context.c: In function ‘panfrost_draw_vbo’: ../src/gallium/drivers/panfrost/pan_context.c:1551:70: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] ctx->payloads[PIPE_SHADER_FRAGMENT].prefix.indices = (u64) NULL; ^ Signed-off-by:
Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reported-by:
Icecream95 <ixn@keemail.me> Tested-by: Marge Bot <mesa/mesa!3543> Part-of: <mesa/mesa!3543>
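
For context, this warning appears on 32-bit builds, where a pointer is narrower than `u64`. One common way to silence it, shown here as a sketch and not necessarily the change made in this commit, is to cast through `uintptr_t` first. The helper name `ptr_to_u64` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t u64;

/* The conversion is only lossless if a pointer fits in the 64-bit field. */
static_assert(sizeof(uintptr_t) <= sizeof(u64), "pointer wider than u64");

/* Hypothetical helper: convert a CPU pointer to a 64-bit integer.
 * Going through uintptr_t makes the integer conversion explicit, so the
 * widening to u64 no longer triggers -Wpointer-to-int-cast. */
u64 ptr_to_u64(const void *ptr)
{
    return (u64)(uintptr_t)ptr;
}
```

When the value is simply a null address, assigning the integer constant 0 avoids the cast altogether.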
-
Alyssa Rosenzweig authored
../src/panfrost/pandecode/decode.c: In function ‘pandecode_compute_fbd’: ../src/panfrost/pandecode/decode.c:789:35: warning: taking address of packed member of ‘struct mali_compute_fbd’ may result in an unaligned pointer value [-Waddress-of-packed-member] 789 | pandecode_u32_slide(num, s->unknown ## num, ARRAY_SIZE(s->unknown ## num)) | ~^~~~~~~~~ ../src/panfrost/pandecode/decode.c:800:9: note: in expansion of macro ‘SHORT_SLIDE’ 800 | SHORT_SLIDE(1); | ^~~~~~~~~~~ Signed-off-by:
Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <mesa/mesa!3543>
-
Alyssa Rosenzweig authored
Unused at the moment. ../src/panfrost/midgard/midgard_compile.c:124:29: warning: ‘m_pack_colour’ defined but not used [-Wunused-function] 124 | static midgard_instruction m_##name(unsigned ssa, unsigned address) { \ | ^~ ../src/panfrost/midgard/midgard_compile.c:145:22: note: in expansion of macro ‘M_LOAD_STORE’ 145 | #define M_LOAD(name) M_LOAD_STORE(name, false) | ^~~~~~~~~~~~ ../src/panfrost/midgard/midgard_compile.c:213:1: note: in expansion of macro ‘M_LOAD’ 213 | M_LOAD(pack_colour); | ^~~~~~ Signed-off-by:
Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <mesa/mesa!3543>
-
Alyssa Rosenzweig authored
Fixes ../src/panfrost/pandecode/decode.c: In function ‘pandecode_jc’: ../src/panfrost/pandecode/decode.c:2859:14: warning: variable ‘last_size’ set but not used [-Wunused-but-set-variable] 2859 | bool last_size; Signed-off-by:
Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <mesa/mesa!3543>
-
Alyssa Rosenzweig authored
Fixes ../src/panfrost/pandecode/public.h:53:33: warning: ‘enum mali_exception_access’ declared inside parameter list will not be visible outside of this definition or declaration Signed-off-by:
Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <mesa/mesa!3543>
-
Samuel Pitoiset authored
CTS should pass, as well as Crucible and the small number of Piglit tests. List of game benchmarks tested:
- Dawn of War 3
- Serious Sam 2017
- Shadow of The Tomb Raider
- The Talos Principle
- Thrones of Britannia
- Total Warhammer 2
- Total War: Three Kingdoms
Note that F1 2017 hangs with or without ACO on GFX6 at the moment. My whole pipelinedb (~30 games) doesn't trigger any compiler crashes. Closes: mesa/mesa#2401 Signed-off-by:
Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by:
Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <mesa/mesa!3533> Part-of: <mesa/mesa!3533>
-
Samuel Pitoiset authored
GFX6 only supports literal offsets of up to 8 bits, so make sure the offset is copied to a temporary SGPR before emitting a SMEM instruction. The optimizer will propagate the literal offset if possible anyway. Signed-off-by:
Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by:
Daniel Schürmann <daniel@schuermann.dev> Part-of: <mesa/mesa!3533>
-
Samuel Pitoiset authored
It's required to insert 1 wait state if the dst VGPR of any v_interp_* is followed by a read with v_readfirstlane or v_readlane to fix GPU hangs on GFX6. Note that v_writelane_* is apparently not affected. This hazard isn't documented anywhere but AMD confirmed it. This fixes a GPU hang with the texturemipmapgen Sascha demo on GFX6. Signed-off-by:
Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by:
Daniel Schürmann <daniel@schuermann.dev> Part-of: <mesa/mesa!3533>
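
A hazard workaround of this kind can be sketched as a pass over the instruction stream. The types and names below are hypothetical and much simpler than ACO's real IR, and the sketch assumes "followed by" means the read is the very next instruction. Whenever a v_interp write to a VGPR is directly followed by a v_readlane or v_readfirstlane of that same VGPR, one no-op is inserted between them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified instruction model for illustration only. */
enum op { OP_V_INTERP, OP_V_READLANE, OP_V_READFIRSTLANE, OP_S_NOP, OP_OTHER };

struct instr {
    enum op op;
    int dst_vgpr; /* VGPR written, or -1 if none */
    int src_vgpr; /* VGPR read, or -1 if none */
};

/* Copies in[0..n) to out, inserting one OP_S_NOP wherever a v_interp result
 * is read by the directly following readlane/readfirstlane.
 * out must have room for 2 * n entries. Returns the number written. */
size_t insert_interp_readlane_nops(const struct instr *in, size_t n,
                                   struct instr *out)
{
    size_t written = 0;

    assert(n == 0 || (in != NULL && out != NULL));

    for (size_t i = 0; i < n; i++) {
        int is_lane_read = in[i].op == OP_V_READLANE ||
                           in[i].op == OP_V_READFIRSTLANE;

        if (i > 0 && is_lane_read && in[i - 1].op == OP_V_INTERP &&
            in[i].src_vgpr >= 0 && in[i].src_vgpr == in[i - 1].dst_vgpr) {
            /* One wait state between the write and the lane read. */
            out[written++] = (struct instr){ OP_S_NOP, -1, -1 };
        }
        out[written++] = in[i];
    }
    return written;
}
```

A lane read of a different VGPR, or any unrelated instruction after the v_interp, is copied through unchanged.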
-
Samuel Pitoiset authored
GFX6 (except OLAND and HAINAN) has a bug where it only looks at the X writemask component. Signed-off-by:
Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by:
Rhys Perry <pendingchaos02@gmail.com> Part-of: <mesa/mesa!3533>
-
Brian Ho authored
Use CP_COND_EXEC and CP_COND_WRITE to conditionally copy the results of a query to a buffer based off the query's availability. Fixes: #2238 Tested-by: Marge Bot <mesa/mesa!3279> Part-of: <mesa/mesa!3279>
-