- 13 Dec, 2021 20 commits
-
-
Dave Airlie authored
-
Dave Airlie authored
-
Dave Airlie authored
-
Dave Airlie authored
-
Dave Airlie authored
-
Dave Airlie authored
-
Dave Airlie authored
I've been informed that the reset command must be sent in CmdControlVideoKHR. I've patched nv_video_decoder in my github to handle this. Also drop the decode.
-
Dave Airlie authored
bit more h265 now.
-
Dave Airlie authored
This has some workarounds for some fw interaction issues:
1. It emits some padding after the create message to align things, otherwise bad things happen.
2. It does some un-Vulkan-like things: the first command buffer recorded on a session is magic and sets the context up on the hw encoder. This would need a fw interface change to avoid.
3. The VK_QUERY_TYPE_STATUS_ONLY_KHR is pure junk here. Not sure how best to do that in this case. The vid dec ring isn't exactly flush with queries; maybe we could hack feedback to do it.
RADV_VIDEO_DECODE=1 is required to enable this.
-
Dave Airlie authored
-
Dave Airlie authored
-
Dave Airlie authored
This doesn't have user fences and can only take the sysmem submit path.
-
Dave Airlie authored
-
Dave Airlie authored
-
Dave Airlie authored
This just moves the main regs + fw interface structs to a new shared file.
-
Dave Airlie authored
If we introduce another queue type (video decode) we can have a disconnect between the RADV_QUEUE_ enum and the API queue_family_index. Currently the driver has
GENERAL, COMPUTE, TRANSFER
which would end up at QFI
0, 1, <nothing>
since we don't create transfer. Now if I add VDEC we get
GENERAL, COMPUTE, TRANSFER, VDEC
at QFI
0, 1, <nothing>, 2
or, if you do nocompute,
GENERAL, COMPUTE, TRANSFER, VDEC
at QFI
0, <nothing>, <nothing>, 1
This means we have to add a remapping table between the API qfi and the internal qf. This patch tries to do that. In theory right now it just adds overhead, but I'd like to exercise these paths.
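The remapping described above can be sketched roughly as follows. This is a minimal illustration, not the driver's code: the `example_` names, the enum, and the `has_compute`/`has_vdec` flags are all hypothetical and only mirror the queue names used in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical internal queue families, named after the commit message. */
enum example_queue_family {
   EXAMPLE_QUEUE_GENERAL,
   EXAMPLE_QUEUE_COMPUTE,
   EXAMPLE_QUEUE_TRANSFER,
   EXAMPLE_QUEUE_VDEC,
   EXAMPLE_MAX_QUEUE_FAMILIES,
};

/* Table from API queue family index (qfi) to internal queue family (qf). */
struct example_qf_map {
   enum example_queue_family qfi_to_qf[EXAMPLE_MAX_QUEUE_FAMILIES];
   uint32_t num_qfi;
};

/* Assign dense API indices only to the families that are actually exposed. */
static void
example_qf_map_init(struct example_qf_map *map, bool has_compute, bool has_vdec)
{
   map->num_qfi = 0;
   map->qfi_to_qf[map->num_qfi++] = EXAMPLE_QUEUE_GENERAL;
   if (has_compute)
      map->qfi_to_qf[map->num_qfi++] = EXAMPLE_QUEUE_COMPUTE;
   /* TRANSFER is never exposed in this sketch, so it gets no API index. */
   if (has_vdec)
      map->qfi_to_qf[map->num_qfi++] = EXAMPLE_QUEUE_VDEC;
}

/* Translate an API queue family index into the internal queue family. */
static enum example_queue_family
example_qfi_to_qf(const struct example_qf_map *map, uint32_t qfi)
{
   assert(qfi < map->num_qfi);
   return map->qfi_to_qf[qfi];
}
```

With compute enabled, VDEC lands at API index 2; with nocompute it lands at index 1, matching the two layouts listed in the message.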
-
Dave Airlie authored
The single P reference frame list has to be sorted by frame number.
The ref pic list modification is applied for P frames.
The 0 B list is sorted from curr_poc, descending to it, then ascending to it.
The 1 B list is sorted from curr_poc, ascending to it, then descending.
Currently the ref pic list mods are not applied for B frames.
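A rough sketch of these orderings, under stated assumptions: the B-list ordering below follows my reading of the H.264 initialisation rules (list 0 puts pictures before curr_poc first, closest first, then later pictures; list 1 is the mirror image), and the P list is assumed to be sorted by descending frame number, since the message does not give a direction. All `example_` names are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reference picture record. */
struct example_ref {
   int frame_num;
   int poc;
};

typedef bool (*example_before_fn)(const struct example_ref *a,
                                  const struct example_ref *b, int curr_poc);

/* P list: assumed descending frame number. */
static bool
example_p_before(const struct example_ref *a, const struct example_ref *b, int curr_poc)
{
   (void)curr_poc;
   return a->frame_num > b->frame_num;
}

/* B list 0: past pictures (descending POC), then future ones (ascending POC). */
static bool
example_l0_before(const struct example_ref *a, const struct example_ref *b, int curr_poc)
{
   bool a_past = a->poc < curr_poc, b_past = b->poc < curr_poc;
   if (a_past != b_past)
      return a_past;
   return a_past ? a->poc > b->poc : a->poc < b->poc;
}

/* B list 1: future pictures (ascending POC), then past ones (descending POC). */
static bool
example_l1_before(const struct example_ref *a, const struct example_ref *b, int curr_poc)
{
   bool a_future = a->poc > curr_poc, b_future = b->poc > curr_poc;
   if (a_future != b_future)
      return a_future;
   return a_future ? a->poc < b->poc : a->poc > b->poc;
}

/* Insertion sort driven by one of the ordering predicates above. */
static void
example_sort_refs(struct example_ref *refs, size_t count, int curr_poc,
                  example_before_fn before)
{
   for (size_t i = 1; i < count; i++) {
      struct example_ref key = refs[i];
      size_t j = i;
      while (j > 0 && before(&key, &refs[j - 1], curr_poc)) {
         refs[j] = refs[j - 1];
         j--;
      }
      refs[j] = key;
   }
}
```

An insertion sort is used only because reference lists are short and the ordering depends on curr_poc, which plain `qsort` cannot pass to its comparator.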
-
Dave Airlie authored
Dealing with reference frames meant peeking inside pNext a lot in the driver; just parse stuff out to a temporary stack array from the driver.
-
Dave Airlie authored
This uses the util code to parse rbsp, and is modelled on the code in the omx frontend, but also from reading the spec. Some sps/pps params aren't supported in vulkan video, so aren't parsed. The slice header details are used by the Intel driver to program the hw.
-
Dave Airlie authored
The video session and video session parameters objects can have a common base class the drivers can inherit from if needed. This creates code to parse the h264/h265 parameter sets into common structs. Updated to add more spec compliance around templated updates.
-
- 11 Dec, 2021 20 commits
-
-
It's not exactly 128 because longer loop bodies scale the number down.

This improves perf for VP13/Creo and Piano. Most other tests either didn't show any difference or are CPU-bound.

v2:
- The lowering passes had to be moved to the optimization loop because unrolling creates lowerable variables.
- Piano has some pattern that looks like corruption and the pattern changed with loop unrolling. The pattern is present on other drivers as well.

v3:
- I removed the Piano test from CI traces because the image is random. The output was wrong even before this MR, and now it's randomly wrong.

| PERCENTAGE DELTAS     | Shaders | SGPRs   | VGPRs   | SpillSGPR | SpillVGPR | PrivVGPR | Scratch   | CodeSize | MaxWaves |
|-----------------------|---------|---------|---------|-----------|-----------|----------|-----------|----------|----------|
| alien_isolation       |    2936 | .       |  0.02 % | .         | .         | .        | .         |   0.83 % | .        |
| deadcore              |      76 | 18.47 % | .       | .         | .         | .        | .         | 167.69 % | .        |
| deus_ex_mankind_div.. |    1410 |  0.10 % |  0.15 % | .         | .         | .        | .         |   1.70 % | .        |
| f1-2015               |     775 |  0.37 % |  0.16 % | .         | .         | .        | .         |   3.25 % |  -0.07 % |
| hitman                |    1413 |  0.10 % | -0.03 % |    6.45 % | .         | .        | .         |   0.61 % |   0.03 % |
| metro_2033_redux      |    2670 | .       | .       | .         | .         | .        | .         |   0.13 % |   0.01 % |
| pixmark-piano-0.7.0   |       2 | .       | 14.29 % | -100.00 % | .         | .        | .         |  78.07 % |  -4.76 % |
| reflections_subway    |      98 | -0.53 % | .       | .         | .         | .        | .         |   7.64 % | .        |
| thea                  |     172 |  0.12 % | -0.81 % | .         | .         | .        | .         |   0.65 % |   0.15 % |
| ubershaders           |      54 | .       | .       | .         | .         | .        | .         |  61.13 % | .        |
| ue4_effects_cave      |     290 |  0.05 % | .       | .         | .         | .        | .         |   2.62 % | .        |
| vp13-creo             |      26 | -3.38 % | -4.20 % | .         | .         | .        | .         |  88.56 % |   2.62 % |
| vp13-sw               |     100 | -0.36 % | -9.14 % | .         | -100.00 % | .        | -100.00 % | -17.97 % |   0.39 % |
| vp20-creo             |      22 | -0.82 % | -3.33 % | .         | .         | .        | .         |  81.59 % |   1.51 % |
| vp20-sw               |     296 | -4.51 % | -0.63 % | .         | .         | .        | .         |  58.93 % |   0.20 % |
|-----------------------|---------|---------|---------|-----------|-----------|----------|-----------|----------|----------|
| All affected          |     189 |  3.05 % | -2.87 % |  500.00 % | -100.00 % | .        | -100.00 % | 135.61 % |   1.32 % |
|-----------------------|---------|---------|---------|-----------|-----------|----------|-----------|----------|----------|
| Total                 |   57794 |  0.01 % | -0.02 % |    0.27 % |   -3.13 % | .        |   -2.89 % |   1.73 % | .        |

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> (v1)
Part-of: <mesa/mesa!13966>
-
Reviewed-by:
Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <mesa/mesa!13966>
-
Reviewed-by:
Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <mesa/mesa!13966>
-
This generally works well. There are new cases that select Wave32, and there are shader profiles which adjust that. Reviewed-by:
Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <mesa/mesa!13966>
-
We need to set it even if Cache == NULL. Reviewed-by:
Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <mesa/mesa!13966>
-
Acked-by:
Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <mesa/mesa!13966>
-
Acked-by:
Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <mesa/mesa!13966>
-
debian-vulkan but not any other CI pipeline consistently fails with: FileNotFoundError: [Errno 2] No such file or directory: 'nir_tests.xml' I have to assume that either debian-vulkan is broken, or the NIR test infrastructure is broken. That's not all. I got the same failure when I wanted to add a new test, which means the CI is preventing us from adding new NIR tests, which is a very serious problem with the CI or NIR tests. The python error doesn't imply that it's a test failure, so something else is broken. If you don't want such commits to happen again, print better error messages. See also the discussion in the MR. Part-of: <mesa/mesa!13966>
-
Acked-by:
Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <mesa/mesa!13966>
-
When calling glXCopySubBuffer, we must resolve the backbuffer before copying it to the frontbuffer. Fixes piglit's glx/glx-copy-sub-buffer on virgl. Signed-off-by:
Italo Nicola <italonicola@collabora.com> Reviewed-by:
Gert Wollny <gert.wollny@collabora.com> Part-of: <mesa/mesa!11714>
-
When a resource is multisampled, we usually submit a multisampling resolving blit before we present it or use it in some other way, but currently we don't always flush the cmd buffer before flushing the frontbuffer. This commit fixes that. Fixes piglit's glx/glx-copy-sub-buffer MSAA cases on vtest, in conjunction with other commits of this series. Signed-off-by:
Italo Nicola <italonicola@collabora.com> Reviewed-by:
Gert Wollny <gert.wollny@collabora.com> Part-of: <mesa/mesa!11714>
-
This is required for glXCopySubBufferMESA to work. Signed-off-by:
Italo Nicola <italonicola@collabora.com> Reviewed-by:
Gert Wollny <gert.wollny@collabora.com> Part-of: <mesa/mesa!11714>
-
The displaytarget's resource stride alignment is currently 64 bytes, whereas the shared resource stride is unaligned. Signed-off-by:
Italo Nicola <italonicola@collabora.com> Reviewed-by:
Gert Wollny <gert.wollny@collabora.com> Part-of: <mesa/mesa!11714>
-
Also avoid warnings about asprintf result not being checked. Reviewed-by:
Dylan Baker <dylan@pnwbakers.com> Part-of: <mesa/mesa!14054>
-
Replace a bunch of helper functions for checking results with ones from GTest. Reviewed-by:
Dylan Baker <dylan@pnwbakers.com> Part-of: <mesa/mesa!14054>
-
Reviewed-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <mesa/mesa!13647>
-
We were setting anv_pipeline::sample_shading_enable based on sampleShadingEnable without looking at minSampleShading. We would then pass this value into nir_lower_wpos_center which would add sample_pos to frag_coord. Then the back-end compiler picks up on the existence of sample_pos and forces persample dispatch. This leads to doing per-sample dispatch whenever sampleShadingEnable = VK_TRUE regardless of the value of minSampleShading. This is almost certainly costing us perf somewhere. Cc: mesa-stable@lists.freedesktop.org Reviewed-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <mesa/mesa!14022>
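A minimal sketch of the kind of check the message implies, assuming the Vulkan rule that sample shading requests at least ceil(minSampleShading × rasterizationSamples) unique invocations per fragment, so per-sample dispatch only matters when that product exceeds 1. The struct and function names are hypothetical stand-ins, not the anv code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant multisample state fields. */
struct example_ms_state {
   bool sample_shading_enable;     /* sampleShadingEnable */
   float min_sample_shading;       /* minSampleShading, in [0, 1] */
   uint32_t rasterization_samples; /* rasterizationSamples as a count */
};

/* Per-sample dispatch is only needed when more than one invocation per
 * fragment is requested, not merely when sample shading is enabled. */
static bool
example_needs_per_sample_dispatch(const struct example_ms_state *ms)
{
   assert(ms->rasterization_samples >= 1);
   return ms->sample_shading_enable &&
          ms->min_sample_shading * (float)ms->rasterization_samples > 1.0f;
}
```

For example, with 4 samples and minSampleShading = 0.25 the product is 1, so this sketch would keep per-pixel dispatch even though sampleShadingEnable is VK_TRUE.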
-
We can't map the CCS on this platform to initialize it into the PASS_THROUGH state. This can cause issues with optimizations in the driver that rely on this state. For example, after rendering to a surface with AUX_NONE, we can then render to it with AUX_CCS_E without an ambiguate in between (if the CCS is in the PASS_THROUGH state). If that state was incorrect and the aux was actually compressed, there can be rendering corruption because the contents may be misinterpreted on the second render. Use a more accurate initial aux state to avoid these issues. One notable change in behavior here is that aux surfaces can be created with fast-cleared blocks even though the caller may specify a modifier that doesn't support fast clears. This should be fine, so long as all HW units that can access these surfaces can handle that bit-pattern. We haven't seen an applicable restriction yet. Reviewed-by:
Jordan Justen <jordan.l.justen@intel.com> Part-of: <mesa/mesa!13555>
-
Among other changes, we highlight the fact that we'll map the CCS - something we can't do on XeHP. Reviewed-by:
Jordan Justen <jordan.l.justen@intel.com> Part-of: <mesa/mesa!13555>
-
The assert was introduced in a function that allocated an auxiliary surface BO, iris_resource_alloc_aux. After refactors, the function it's in now, iris_resource_configure_aux, no longer does this allocation. Drop the assert because its purpose is unclear and it's no longer relevant for CCS on XeHP. Reviewed-by:
Jordan Justen <jordan.l.justen@intel.com> Part-of: <mesa/mesa!13555>
-