- Feb 02, 2020
-
-
Keith Packard authored
This allows applications to specify the minimum time that a presented image must be shown. Times may be specified as frames or ns. Signed-off-by: Keith Packard <keithp@keithp.com>
-
- Feb 01, 2020
-
-
Keith Packard authored
This function will be useful in implementing VK_KHR_present_wait Signed-off-by: Keith Packard <keithp@keithp.com>
-
Keith Packard authored
This adds support for the VK_GOOGLE_display_timing extension, which provides two things:

1) Detailed information about when frames are displayed, including slack time between GPU execution and display frame.

2) Absolute time control over swapchain queue processing. This allows the application to request frames be displayed at specific absolute times, using the same timebase as that provided in vblank events.

Support for this extension has been implemented for the x11 and display backends; adding support to other backends should be reasonably straightforward for one familiar with those systems and should not require any additional device-specific code.

v2: Adjust GOOGLE_display_timing earliest value. The earliestPresentTime for an image cannot be before the previous image was displayed, or even a frame later (in FIFO mode). Make GOOGLE_display_timing use render completed time. Switch from VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT to VK_PIPELINE_STAGE_ALL_COMMANDS_BIT so that the time reported to applications as the end of rendering reflects the latest possible value to ensure that applications don't underestimate the amount of work done in the frame.

v3: Adopt Jason Ekstrand's coding conventions. Declare variables at first use, eliminate extra whitespace between types and names. Wrap lines to 80 columns. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>

v4: Adapt to changes in MESA_query_timestamp extension

v5: Squash core bits and anv/radv wrappers into a single patch Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>

v6: Switch from MESA_query_timestamp to EXT_calibrated_timestamps

v7: Ensure we target a frame no earlier than desired. This means rounding the target frame up, rather than selecting the nearest one. Suggested-by: Michel Dänzer <michel@daenzer.net>

v8: Re-order display_timing in anv_extensions.py. That code now requires extensions in alphabetical order. Rename wsi_mark_time to wsi_present_complete to make the functionality clearer.
Signed-off-by: Keith Packard <keithp@keithp.com>
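
The v7 note above (round the target frame up instead of picking the nearest one) can be illustrated with a small sketch. This is not the Mesa code: the function names, the parameters, and the choice to express the target as a count of refresh intervals after the last known vblank are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from Mesa): number of whole refresh intervals
 * after last_vblank_ns at which the first vblank at or after desired_ns
 * occurs. All times are in nanoseconds on the same timebase. */
uint64_t
frames_until_target_round_up(uint64_t last_vblank_ns, uint64_t desired_ns,
                             uint64_t refresh_ns)
{
   assert(refresh_ns > 0);

   /* A desired time at or before the last vblank needs no extra frames. */
   if (desired_ns <= last_vblank_ns)
      return 0;

   uint64_t delta_ns = desired_ns - last_vblank_ns;

   /* Ceiling division written without the usual "+ refresh_ns - 1" so that
    * it cannot overflow: any remainder pushes the target one frame later. */
   return delta_ns / refresh_ns + (delta_ns % refresh_ns != 0);
}

/* For contrast only: nearest-frame rounding, which may select a frame that
 * is displayed before the desired time. */
uint64_t
frames_until_target_nearest(uint64_t last_vblank_ns, uint64_t desired_ns,
                            uint64_t refresh_ns)
{
   assert(refresh_ns > 0);

   if (desired_ns <= last_vblank_ns)
      return 0;

   uint64_t delta_ns = desired_ns - last_vblank_ns;

   /* Round half up to the closest frame boundary. */
   return delta_ns / refresh_ns + (delta_ns % refresh_ns >= (refresh_ns + 1) / 2);
}
```

With a refresh interval of 10 ns, a last vblank at 100 ns and a desired time of 121 ns, nearest rounding selects frame 2 (shown at 120 ns, which is too early), while rounding up selects frame 3 (shown at 130 ns).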
-
- Jan 29, 2020
-
-
Rhys Perry authored
There seem to be more; these are just the ones found in Detroit: Become Human shaders. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <mesa/mesa!3257> Part-of: <mesa/mesa!3257>
-
Rhys Perry authored
It can be used later and we want any uses to not be fixed to exec, so its definition can't be fixed to exec because of how exec masks interact with register demand calculation. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <mesa/mesa!3257>
-
Rhys Perry authored
process_block() will use this to determine the register demand before the current instruction. Previously, it was filled with zeroes, which could result in process_block() only using the register demand after the current instruction. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <mesa/mesa!3257>
-
Rhys Perry authored
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <mesa/mesa!3257>
-
Rhys Perry authored
Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler') Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <mesa/mesa!3257>
-
Rhys Perry authored
This would have caught the liveness error fixed in the previous commit. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <mesa/mesa!3257>
-
Rhys Perry authored
Otherwise, code like this will be broken:

   loop {
      if (...) {
         break;
      } else {
         break;
      }
   }

The continue_or_break block doesn't have any logical predecessors but it's a logical predecessor of the header block. This liveness error breaks the spiller in init_live_in_vars() (under "keep variables spilled on all incoming paths") and eventually creates garbage reloads. Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler') Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <mesa/mesa!3257>
-
Rhys Perry authored
The operand isn't fixed to exec, which can mess up the spiller. This also adds a new situation where a phi is needed. Fixes dEQP-VK.ssbo.layout.random.descriptor_indexing.2 and an assertion when compiling a Detroit: Become Human shader. Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler') Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <mesa/mesa!3257>
-
Rhys Perry authored
We don't need to update it since it won't be used later. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <mesa/mesa!3257>
-
Rhys Perry authored
Loops without continues create header blocks with only 1 predecessor. CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <mesa/mesa!3257>
-
Rhys Perry authored
A shader might require vgpr spilling but not require sgpr spilling. In that case, the spiller lowers the sgpr target by 5 which could mean sgpr spilling is then required. Then the vgpr target has to be lowered to make space for the linear vgprs. Previously, space wasn't made for the linear vgprs. Found while testing the spiller on the pipeline-db with a lowered limit. Fixes: a7ff1bb5 ('aco: simplify calculation of target register pressure when spilling') Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <mesa/mesa!3257>
-
Samuel Pitoiset authored
Now that NGG GS queries are implemented, it should be safe enough to enable NGG GS by default. It can be disabled with RADV_DEBUG=nongg if necessary. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <mesa/mesa!3380> Part-of: <mesa/mesa!3380>
-
Samuel Pitoiset authored
The number of generated primitives is only counted by the hardware if GS uses the legacy path. For NGG GS, we need to accumulate that value in the NGG GS itself. To achieve that, we use a plain GDS atomic operation. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <mesa/mesa!3380>
-
Samuel Pitoiset authored
For implementing NGG GS queries, we decided to use GDS but GDS OA is only required for NGG streamout. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <mesa/mesa!3380>
-
Michel Dänzer authored
When a BO or amdgpu_screen_winsys is destroyed. Should fix leaking such BOs in other DRM file descriptions. v2: * Pass the correct file descriptor to drmIoctl (Pierre-Eric Pelloux-Prayer) * Use _mesa_hash_table_remove v3: * Close handles in amdgpu_winsys_unref as well v4: * Adapt to amdgpu_winsys::sws_list_lock. Closes: mesa/mesa#2270 Fixes: 11a3679e "winsys/amdgpu: Make KMS handles valid for original DRM file descriptor" Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Marge Bot <mesa/mesa!3582> Part-of: <mesa/mesa!3582>
-
Michel Dänzer authored
Namely, if os_same_file_description determined that the DRM file descriptor references the same file description. v2: * Adapt to amdgpu_winsys::sws_list_lock. v3: * Fix comparison of amdgpu_screen_winsys file descriptions, see #2413 . * Lock amdgpu_winsys::sws_list_lock for traversing the sws_list in amdgpu_winsys_create. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <!3582>
-
Faith Ekstrand authored
The name "desc" shadows another variable. Name it "desc_data" like all of the other descriptor data variables in this file.
-
Faith Ekstrand authored
Because softpin block pools are made up of a set of BOs with different maps, it was possible for a single state to end up straddling blocks. To fix this, we pass a contiguous size to anv_block_pool_grow and it ensures that the next allocation in the pool will have at least that size. We also add an assert in anv_block_pool_map to ensure we always get contiguous maps. Prior to the changes to anv_block_pool_grow, the unit tests failed with this assert. With this patch, the tests pass. This was causing problems on Gen12 where we allocate the pages for the AUX table from the dynamic state pool. The first chunk, which gets allocated very early in the pool's history, is 1MB, which was large enough that it ended up spanning multiple BOs. This caused the gen_aux_map code to write outside of the map and overwrite the instruction state pool buffer, which led to GPU hangs. Fixes: 731c4adc "anv/allocator: Add support for non-userptr" Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
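
The straddling condition described above can be sketched as a simple range check. This is a hypothetical illustration, not the anv allocator: struct pool_chunk and range_is_contiguous are invented names, and the pool is assumed to be modelled as an array of chunks, each standing in for one BO with its own map.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one BO backing part of a pool. */
struct pool_chunk {
   uint64_t start; /* pool offset of the first byte in this chunk */
   uint64_t size;  /* number of bytes covered by this chunk's map */
};

/* Returns true if [offset, offset + size) lies entirely inside a single
 * chunk, i.e. the range can be accessed through one contiguous map.
 * Returns false if it straddles chunks or starts outside the pool. */
bool
range_is_contiguous(const struct pool_chunk *chunks, size_t count,
                    uint64_t offset, uint64_t size)
{
   for (size_t i = 0; i < count; i++) {
      const struct pool_chunk *c = &chunks[i];

      /* Find the chunk that contains the first byte of the range. */
      if (offset >= c->start && offset - c->start < c->size) {
         /* Bytes left in this chunk from offset to its end. */
         uint64_t remaining = c->size - (offset - c->start);
         return size <= remaining;
      }
   }

   return false;
}
```

A check of this shape is the kind of condition an assert in a map function could enforce: a 200-byte state at offset 4000 in a pool whose first chunk is 4096 bytes long would fail it, because the state continues into the next chunk.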
-
Faith Ekstrand authored
We intentionally throw away all but one BT block but then we set cmd_buffer->bt_block to ANV_STATE_NULL instead of the one we hung on to. This causes the command buffer to immediately re-emit STATE_BASE_ADDRESS the first time a BT is needed for no good reason. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
-
Faith Ekstrand authored
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
-
Rhys Perry authored
pipeline-db (ACO):
Totals from affected shaders:
SGPRS: 29200 -> 29200 (0.00 %)
VGPRS: 17372 -> 17372 (0.00 %)
Spilled SGPRs: 105 -> 105 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 1406576 -> 1389256 (-1.23 %) bytes
LDS: 83 -> 83 (0.00 %) blocks
Max Waves: 3976 -> 3976 (0.00 %)

pipeline-db (LLVM):
Totals from affected shaders:
SGPRS: 21320 -> 21320 (0.00 %)
VGPRS: 17056 -> 17036 (-0.12 %)
Spilled SGPRs: 22 -> 22 (0.00 %)
Spilled VGPRs: 503 -> 487 (-3.18 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 396 -> 396 (0.00 %) dwords per thread
Code Size: 1441244 -> 1423292 (-1.25 %) bytes
LDS: 463 -> 463 (0.00 %) blocks
Max Waves: 3609 -> 3611 (0.06 %)

v2: add pattern for ishr

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Tested-by: Marge Bot <!2271> Part-of: <!2271>
-
Rhys Perry authored
Fixes compilation of a Battlefront 2 shader with ACO by removing VGPR spilling. The reassociation makes it worse on LLVM though.

pipeline-db (ACO):
Totals from affected shaders:
SGPRS: 10704 -> 10688 (-0.15 %)
VGPRS: 18736 -> 18528 (-1.11 %)
Spilled SGPRs: 70 -> 70 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 909696 -> 885796 (-2.63 %) bytes
LDS: 225 -> 225 (0.00 %) blocks
Max Waves: 1115 -> 1129 (1.26 %)

pipeline-db (LLVM):
Totals from affected shaders:
SGPRS: 8472 -> 8424 (-0.57 %)
VGPRS: 14284 -> 14368 (0.59 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 442 -> 503 (13.80 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 268 -> 396 (47.76 %) dwords per thread
Code Size: 862568 -> 853028 (-1.11 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 971 -> 964 (-0.72 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <mesa/mesa!2271>
-
Samuel Pitoiset authored
Only MTBUF supports vec3. Fixes: 03a0d393 ("aco: use MUBUF in some situations instead of splitting vertex fetches") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <mesa/mesa!3615> Part-of: <mesa/mesa!3615>
-
Rhys Perry authored
If the p_wqm ends up creating copies, these need to be in WQM. Helps (but doesn't completely fix) artifacts in Strange Brigade. The actual issue still exists and is harder to fix. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler') Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <mesa/mesa!3273> Part-of: <mesa/mesa!3273>
-
Rhys Perry authored
We want any copies to be in WQM. I don't know if this fixes any real application, but I can create a vkrunner test that reproduces the issue. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler') Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <mesa/mesa!3273>
-
Icecream95 authored
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <mesa/mesa!3566> Part-of: <mesa/mesa!3566>
-
Jonathan Marek authored
At the same time, use the address register for indirect uniform loads on pre-HALTI2 hardware, since integers and the LOAD instruction aren't always available. Passes all dEQP-GLES3.functional.ubo.* on GC7000L. GC3000 with an extra flush hack passes most of them, but still fails on some of the cases with many loads. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Tested-by: Marge Bot <mesa/mesa!3389> Part-of: <mesa/mesa!3389>
-
Rob Clark authored
And move to new register builders while we are at it. Signed-off-by: Rob Clark <robdclark@chromium.org> Tested-by: Marge Bot <mesa/mesa!3565> Part-of: <mesa/mesa!3565>
-
Rob Clark authored
Logicop in particular is supposed to work for integer formats, but maybe this situation doesn't happen in GLES. The only thing that isn't required for integer formats is blending. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <mesa/mesa!3565>
-
Rob Clark authored
Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <mesa/mesa!3565>
-
Rob Clark authored
This lets us drop a bunch of special handling for xRGB blend. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <mesa/mesa!3565>
-
Thomas Hellstrom authored
Some piglit tests trigger a map depth assert when debug_flush is active. Fix this by increasing the map depth from 16 to 32. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Tested-by: Marge Bot <mesa/mesa!3614> Part-of: <mesa/mesa!3614>
-
Thomas Hellstrom authored
Newer versions of the device code will make discard DMA uploads sub-optimal. Disable them for guest-backed aware code, where we previously had them conditionally enabled. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <mesa/mesa!3614>
-
Thomas Hellstrom authored
If the kernel supports it, enable transhuge pages for graphics buffer objects. Except for the syscall itself, this is never expected to cause any negative performance implications. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <mesa/mesa!3614>
-
Roland Scheidegger authored
Use the new ioctl for logging (rather than duplicating what the kernel is doing). This way it's also independent from the actual guest/host mechanism to do the logging. Signed-off-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Part-of: <mesa/mesa!3614>
-
Samuel Pitoiset authored
It's no longer true. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <mesa/mesa!3597> Part-of: <mesa/mesa!3597>
-
Samuel Pitoiset authored
https://www.khronos.org/conformance/adopters/conformant-products#submission_472 https://www.khronos.org/conformance/adopters/conformant-products#submission_473 https://www.khronos.org/conformance/adopters/conformant-products#submission_474 Fixes dEQP-VK.api.driver_properties.conformance_version. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <mesa/mesa!3597>
-