- Nov 14, 2016
-
-
Emil Velikov authored
Signed-off-by:
Emil Velikov <emil.velikov@collabora.com>
-
Emil Velikov authored
Signed-off-by:
Emil Velikov <emil.velikov@collabora.com>
-
This is a port of commit a4a59172: Add guards to prevent dereferencing NULL dynamic pipeline state. Asserts of pCreateInfo members are moved to the earliest points at which they should not be NULL. This fixes a segfault, related to pColorBlendState, seen in Talos Principle which I've observed after startup is completed and when exiting the menus, depending on when Vulkan rendering is selected. v2: moved the NULL check in radv_pipeline_init_blend_state to after the declarations. Acked-by:
Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by:
Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit 9b121512)
-
In the event that multiple threads attempt to install a graph concurrently, protect the shared list. Signed-off-by:
Steven Toth <stoth@kernellabs.com> Reviewed-by:
Brian Paul <brianp@vmware.com> Reviewed-by:
Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit 381edca8)
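As a rough illustration of the locking described above, a shared list can be guarded by a single mutex. This is a minimal sketch, not the actual HUD code: the node type and the names `example_graph_install` and `example_graph_count` are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical list node; the real HUD graph structures are not shown in the log. */
struct example_graph {
   int id;
   struct example_graph *next;
};

static struct example_graph *example_graph_list = NULL;
static pthread_mutex_t example_graph_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Insert a graph at the head of the shared list while holding the lock.
 * Returns 0 on success, -1 if the allocation fails. */
static int example_graph_install(int id)
{
   struct example_graph *node = malloc(sizeof(*node));
   if (!node)
      return -1;
   node->id = id;

   pthread_mutex_lock(&example_graph_mutex);
   node->next = example_graph_list;
   example_graph_list = node;
   pthread_mutex_unlock(&example_graph_mutex);
   return 0;
}

/* Walk the list under the same lock so readers never see a half-linked node. */
static int example_graph_count(void)
{
   int count = 0;
   pthread_mutex_lock(&example_graph_mutex);
   for (struct example_graph *it = example_graph_list; it; it = it->next)
      count++;
   pthread_mutex_unlock(&example_graph_mutex);
   return count;
}
```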
-
We're missing the closedir() to the matching opendir(). Signed-off-by:
Steven Toth <stoth@kernellabs.com> Reviewed-by:
Brian Paul <brianp@vmware.com> Reviewed-by:
Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit 5a583230)
-
Instead of trying to maintain a reference counted list of valid HUD objects, and freeing them accordingly, creating race conditions between unanticipated multiple threads, simply accept they're allocated once and never released until the process terminates. They're a shared resource between multiple threads, so accept they're always available for use. Signed-off-by:
Steven Toth <stoth@kernellabs.com> Reviewed-by:
Brian Paul <brianp@vmware.com> Reviewed-by:
Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit 6ffed086)
-
- Nov 11, 2016
-
-
We had missed a bit of errata - PS scratch needs to be computed as if there were 4 subslices per slice, rather than 3.

                      |        Skylake        |  Broxton  |           Kabylake
                      | GT1 | GT2 | GT3 | GT4 | 2x6 | 3x6 | GT1 | GT1.5 | GT2 | GT3 | GT4
Actual Slices         |  1  |  1  |  2  |  3  |  1  |  1  |  1  |   1   |  1  |  2  |  3
Total Subslices       |  3  |  3  |  6  |  9  |  2  |  3  |  2  |   3   |  3  |  6  |  9
Subsl. for PS Scratch |  4  |  4  |  8  | 12  |  4  |  4  |  4  |   4   |  4  |  8  | 12

Note that Skylake GT1-3 already worked because we allocated 64 * 9 (trying to use a value that would work on GT4, with 9 subslices), and the actual required values were 64 * 4 or 64 * 8. However, all others (Skylake GT4, Broxton, and Kabylake GT1-4) underallocated, which can lead to scratch writes trashing random process memory, and rendering corruption or GPU hangs.

Fixes GPU hangs and rendering corruption on Skylake GT4 in shaders that spill. Particularly, dEQP-GLES31.functional.ubo.all_per_block_buffers.* now runs successfully with no hangs and renders correctly. This may fix problems on Broxton and Kabylake as well. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by:
Kenneth Graunke <kenneth@whitecape.org> Reviewed-by:
Ben Widawsky <ben@bwidawsk.net> (cherry picked from commit aaee3daa)
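The arithmetic behind the fix can be sketched in a few lines. The helper names are hypothetical, and reading the factor 64 from the message's "64 * 9" figure as a per-subslice count is an assumption. Only the rule of 4 subslices per slice comes from the commit message itself.

```c
#include <assert.h>

/* Factor taken from the "64 * 9" figure in the commit message; treating it as
 * a per-subslice count is an assumption made for this sketch. */
#define EXAMPLE_PER_SUBSLICE 64u

/* Subslices to assume when sizing PS scratch: 4 per slice, regardless of how
 * many subslices the part really has. */
static unsigned example_ps_scratch_subslices(unsigned slices)
{
   return 4u * slices;
}

/* Number of scratch slots to allocate for the pixel shader stage. */
static unsigned example_ps_scratch_slots(unsigned slices)
{
   return EXAMPLE_PER_SUBSLICE * example_ps_scratch_subslices(slices);
}
```

With three slices this gives 64 * 12, which is larger than the old 64 * 9 allocation and matches the underallocation described for Skylake GT4.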
-
This fixes hangs in Dota2 Reviewed-by:
Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit a6c3d0f9)
-
Reviewed-by:
Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit 1e3e347f)
-
- Nov 10, 2016
-
-
Emil Velikov authored
Port of the anv commit d96345de ("anv: Suffix the intel_icd file with the host CPU"). v2: s/intel_icd/radeon_icd/ in commit summary (Gražvydas) Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by:
Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com> (IRC) (cherry picked from commit 0f434a68) Squashed with commit: radv: automake: list correct file in the EXTRA_DIST Earlier commit renamed the file radeon_icd.json{,.in} but missed one reference of the file - in EXTRA_DIST. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Fixes: 0f434a68 ("radv: Suffix the radeon_icd file with the host CPU") Signed-off-by:
Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit b359f624)
-
Emil Velikov authored
Analogous to previous commit. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by:
Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com> (IRC) (cherry picked from commit abe110df)
-
Emil Velikov authored
Vulkan has introduced the concept of .specVersion, which can be used to attribute changes of the said extension. The current loader does not check the value, thus it has gone unnoticed that the driver exposes an old version of the following extensions:
VK_KHR_xcb_surface (Rev 6)
VK_KHR_xlib_surface (Rev 6)
VK_KHR_wayland_surface (Rev 5) - Updated the surface create function to take a pCreateInfo structure
VK_KHR_swapchain (Rev 68) - Moved the "validity" include for vkAcquireNextImage to be in its proper place, after the prototype and list of parameters.
...
According to the documentation: * pname:specVersion is the version of this extension. It is an integer, incremented with backward compatible changes. Based on the history of vk.xml the above (latest) revision has been available since Vulkan 1.0, so even if there were any backwards incompatible change(s) [as hinted by the revision log] those should be safe. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by:
Emil Velikov <emil.velikov@collabora.com> Reviewed-by:
Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit f373a91a)
-
Emil Velikov authored
The use of regparm causes an error on arm/arm64 builds with clang. fastcall is allowed, but still throws a warning. As both options only have effect on 32-bit x86 builds, limit them to that case. v2: keep the __i386__ within GCC (Nicolai) Cc: 13.0 <mesa-stable@lists.freedesktop.org> Cc: Rob Herring <robh@kernel.org> Cc: Nicolai Hähnle <nhaehnle@gmail.com> Signed-off-by:
Emil Velikov <emil.velikov@collabora.com> Reviewed-by:
Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by:
Marek Olšák <marek.olsak@amd.com> Reviewed-by:
Rob Herring <robh@kernel.org> (cherry picked from commit 190bae76)
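A guard of the kind described might look like the sketch below. The macro names `EXAMPLE_FASTCALL` and `EXAMPLE_REGPARM` are hypothetical and the real header is not reproduced here. The attributes are only emitted for GCC-compatible compilers targeting 32-bit x86, and expand to nothing everywhere else.

```c
#include <assert.h>

/* Only 32-bit x86 benefits from these calling-convention attributes, so keep
 * them away from every other target (e.g. arm/arm64 with clang). */
#if defined(__GNUC__) && defined(__i386__)
#define EXAMPLE_FASTCALL   __attribute__((fastcall))
#define EXAMPLE_REGPARM(n) __attribute__((regparm(n)))
#else
#define EXAMPLE_FASTCALL
#define EXAMPLE_REGPARM(n)
#endif

/* The functions behave identically with or without the attributes. */
static int EXAMPLE_FASTCALL example_add(int a, int b)
{
   return a + b;
}

static int EXAMPLE_REGPARM(1) example_negate(int a)
{
   return -a;
}
```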
-
- Nov 09, 2016
-
-
If a fence is created pre-signaled, we should return that in GetFenceStatus even if it hasn't been submitted. Reviewed-by:
Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by:
Gustaw Smolarczyk <wielkiegie@gmail.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by:
Dave Airlie <airlied@redhat.com> (cherry picked from commit fb50245a)
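The intended ordering of checks can be sketched as below. This is not the radv implementation: the struct, the enum and the function name are hypothetical, and the GPU query that a real driver would perform for submitted fences is reduced to a placeholder.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's fence object and status codes. */
enum example_fence_status { EXAMPLE_FENCE_SUCCESS, EXAMPLE_FENCE_NOT_READY };

struct example_fence {
   bool signalled;  /* set at creation time for pre-signaled fences */
   bool submitted;  /* set once the fence has been handed to a queue */
};

static enum example_fence_status
example_get_fence_status(const struct example_fence *fence)
{
   /* Check the signalled flag first, so a pre-signaled fence reports
    * success even though it was never submitted. */
   if (fence->signalled)
      return EXAMPLE_FENCE_SUCCESS;

   if (!fence->submitted)
      return EXAMPLE_FENCE_NOT_READY;

   /* A real driver would query the GPU here; the sketch just says "not yet". */
   return EXAMPLE_FENCE_NOT_READY;
}
```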
-
This fixes a bunch of GPU hangs introduced in some CTS tests like dEQP-VK.memory.pipeline_barrier.host_write_uniform_buffer.65536. It works around an issue seen in the LLVM backend, but also makes the radv code work more like the radeonsi stack. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by:
Dave Airlie <airlied@redhat.com> (cherry picked from commit 3c9af757)
-
This is ported from GLSL and converts if (cond) discard; into discard_if(cond); This removes a block, but is also needed by radv to work around a bug in the LLVM backend.
v2: handle if (a) discard_if(b) (nha) cleanup and drop pointless loop (Matt) make sure there are no dependent phis (Eric)
v3: make sure only one instruction in the then block.
v4: remove sneaky tabs, add cursor init (Eric) Reviewed-by:
Eric Anholt <eric@anholt.net> Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by:
Dave Airlie <airlied@redhat.com> (cherry picked from commit b16dff2d)
-
We are going to start lowering to this in NIR code, so prepare radv for it. v2: handle conversion to kilp properly (nha) Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by:
Dave Airlie <airlied@redhat.com> (cherry picked from commit dd77faec)
-
Since our surface state buffer is shared by all batches, the kernel does a full stall and sync with the CPU between batches every time we call execbuf2 because it refuses to do relocations on an active buffer. Doing them in userspace and passing the NO_RELOC flag to the kernel allows us to perform the relocations without stalling. This improves the performance of Dota 2 by around 30% on a Sky Lake GT2.
v2 (Jason Ekstrand): - Better comments (Chris Wilson) - Fixed write_reloc for correct canonical form (Chris Wilson)
v3 (Jason Ekstrand): - Skip relocations which aren't needed - Provide an environment variable to always use the kernel - More comments about correctness (Chris Wilson)
v4 (Jason Ekstrand): - More comments (Chris Wilson)
v5 (Jason Ekstrand): - Rebase on top of moving execbuf2 setup to QueueSubmit Signed-off-by:
Jason Ekstrand <jason@jlekstrand.net> Reviewed-by:
Kristian H. Kristensen <hoegsberg@google.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit b3a29f2e)
-
Ever since the early days of the Vulkan driver, we've been setting up the lists of relocations at EndCommandBuffer time. The idea behind this was to move some of the CPU load out of QueueSubmit which the client is required to lock around and into command buffer building which could be done in parallel. Then QueueSubmit basically just becomes a bunch of execbuf2 calls. Technically, this works. However, when you start to do more in QueueSubmit than just execbuf2, you start to run into problems. In particular, if a block pool is resized between EndCommandBuffer and QueueSubmit, the list of anv_bo's and the execbuf2 object list can get out of sync. This can cause problems if, for instance, you wanted to do relocations in userspace. Signed-off-by:
Jason Ekstrand <jason@jlekstrand.net> Reviewed-by:
Kristian H. Kristensen <hoegsberg@google.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit 8b61c570)
-
The original reason for putting it in the batch_bo was to allow primaries to share it across secondaries or something like that. However, the relocation lists in secondary command buffers are always left alone and copied into the primary command buffer's relocation list. This means that the offset really applies at the command buffer level and putting it in the batch_bo doesn't make sense. This fixes a couple of potential bugs around re-submission of command buffers that are not likely to be hit but are bugs nonetheless. Signed-off-by:
Jason Ekstrand <jason@jlekstrand.net> Reviewed-by:
Kristian H. Kristensen <hoegsberg@google.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit 595400d5)
-
This commit adds a little helper struct for storing everything we use to build an execbuf2 call. Since the add_bo function really has nothing to do with a command buffer, it makes sense to break it out a bit. This also reduces some of the churn in the next commit. Signed-off-by:
Jason Ekstrand <jason@jlekstrand.net> Reviewed-by:
Kristian H. Kristensen <hoegsberg@google.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit 0fe68294)
-
The old version wasn't properly handling large addresses where we have to sign-extend to get it into the "canonical form" expected by the hardware. Also, the new version is capable of doing a clflush of the newly written reloc if requested. Signed-off-by:
Jason Ekstrand <jason@jlekstrand.net> Reviewed-by:
Kristian H. Kristensen <hoegsberg@google.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit 095c48a4)
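The "canonical form" mentioned above means the upper bits of a 64-bit address replicate its highest valid bit. The sketch below assumes a 48-bit address space, so bit 47 is copied into bits 48 to 63. Both that width and the name `example_canonical_address` are assumptions for illustration, not a copy of the driver's helper.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend a 48-bit GPU address into canonical 64-bit form.
 * The 48-bit width is an assumption of this sketch. */
static uint64_t example_canonical_address(uint64_t address)
{
   const uint64_t high_bits = 0xFFFF000000000000ull;

   if (address & (1ull << 47))
      return address | high_bits;   /* bit 47 set: fill the top 16 bits */
   return address & ~high_bits;     /* bit 47 clear: zero the top 16 bits */
}
```

Small addresses pass through unchanged, which is why the problem only shows up once buffers land in the upper half of the address space.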
-
Since -1 is an invalid GPU address, this lets us know whether or not we have a valid address for a buffer. We don't get a valid address until the first time that buffer is used in an execbuf2 ioctl. Signed-off-by:
Jason Ekstrand <jason@jlekstrand.net> Reviewed-by:
Kristian H. Kristensen <hoegsberg@google.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit d46bfb62)
-
The previous implementation was being overly clever and using the anv_bo::size field as its mutex. Scratch pool allocations don't happen often, will happen at most a fixed number of times, and never happen in the critical path (they only happen in shader compilation). We can make this much simpler by just using the device mutex. This also means that we can start using anv_bo_init_new directly on the bo and avoid setting fields one-at-a-time. Signed-off-by:
Jason Ekstrand <jason@jlekstrand.net> Reviewed-by:
Kristian H. Kristensen <hoegsberg@google.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit bd0f8d50)
-
This ensures that we're always setting all of the fields in anv_bo Signed-off-by:
Jason Ekstrand <jason@jlekstrand.net> Reviewed-by:
Kristian H. Kristensen <hoegsberg@google.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit 6283b6d5)
-
Because our relocation processing happens at EndCommandBuffer time and because RENDER_SURFACE_STATE objects may be shared by batches, we really have no clue whatsoever what address is actually written to the relocation offset in the BO. We need to stop making such claims to the kernel and just let it relocate for us. Signed-off-by:
Jason Ekstrand <jason@jlekstrand.net> Reviewed-by:
Kristian H. Kristensen <hoegsberg@google.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit ba1eea4f)
-
This puts the actual execbuf2 call in anv_batch_chain.c along with the other relocation stuff. Signed-off-by:
Jason Ekstrand <jason@jlekstrand.net> Reviewed-by:
Kristian H. Kristensen <hoegsberg@google.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit db9f4b2a)
-
This wrapper ensures that we always update all anv_bo::offset fields based on the offsets returned by the kernel. Signed-off-by:
Jason Ekstrand <jason@jlekstrand.net> Reviewed-by:
Kristian H. Kristensen <hoegsberg@google.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit 07798c9c)
-
This patch should have been part of commit e592f7df. When there are multiple render targets with alpha testing enabled and the fragment shader doesn't write to draw buffer zero, the GPU hangs on SKL. No GPU hang is seen on HSW. The simulator gives a warning for all gen6+ h/w: "Illegal render target write message length 0xa expected 0xc". This patch fixes the GPU hang as well as the simulator warning with the new piglit test fbo-mrt-alphatest-no-buffer-zero-write: https://patchwork.freedesktop.org/patch/118212 No regressions in Jenkins CI system. Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by:
Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by:
Ben Widawsky <ben@bwidawsk.net> (cherry picked from commit b9df2251)
-
I was getting a random GPU hang in the renderpass simple tests, it turns out sometimes radv emitted the wrong thing "last". This fixes the logic to emit Z/stencil last if they occur, and not mark a color output as last. Also this relies on the Z/STENCIL being the first two fragment outputs, which they are so yay. Fixes: dEQP-VK.renderpass.simple.color_depth (random hangs) Cc: "13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by:
Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by:
Dave Airlie <airlied@redhat.com> (cherry picked from commit bafc75b4)
-
At least on Sky Lake, after emitting 3DSTATE_CONSTANT_*, you are required to re-emit the 3DSTATE_BINDING_TABLE_POINTERS packet for the corresponding stage. If you don't, double-buffering may fail and you may get the wrong constants. It turns out that you need to do this even if you have no push constants to speak of or else the next 3DSTATE_CONSTANT packet you emit for that stage may not work correctly. Signed-off-by:
Jason Ekstrand <jason@jlekstrand.net> Reviewed-by:
Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit 406cd9d1)
-
The 1/W was apparently not accurate enough, and we were getting sparklies in the distance. The closed driver also did a N-R step here. Cc: <mesa-stable@lists.freedesktop.org> (cherry picked from commit 283d4d18)
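An N-R (Newton-Raphson) step for a reciprocal refines an estimate x of 1/w as x' = x * (2 - w * x), which roughly doubles the number of correct bits per step. The sketch below shows only that formula in plain C. The function name is hypothetical, and the driver applies the step in generated shader code rather than on the CPU.

```c
#include <assert.h>

/* One Newton-Raphson refinement of an approximate reciprocal of w. */
static float example_refine_reciprocal(float w, float approx)
{
   return approx * (2.0f - w * approx);
}
```

Starting from the rough estimate 0.3 for 1/3, one step gives about 0.33 and a second step about 0.3333.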
-
A (latent) bug in VDPAU interop was exposed by commit e5cc84dd. Before that commit, the st_vdpau code created samplers with first_layer == last_layer == 1 that the general texture handling code would immediately delete and re-create, because the layer does not match the information in the GL texture object. This was correct behavior at least in the DMABUF case, because the imported resource is supposed to have the correct offset already applied. In the non-DMABUF case, this was just plain wrong but apparently nobody noticed. After that commit, the state tracker assumes that an existing sampler is correct at all times. Existing samplers are supposed to be deleted when they may become invalid, and they will be created on-demand. This meant that the sampler with first_layer == last_layer == 1 stuck around, leading to rendering artefacts (on radeonsi), command stream failures (on r600), and assertions (in debug builds everywhere). This patch fixes the problem by simply not creating a sampler at all in st_vdpau_map_surface. We rely on the generic texture code to do the right thing, adding the layer_override to make the non-DMABUF case work. v2: add the layer_override Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98512 Cc: 13.0 <mesa-stable@lists.freedesktop.org> Cc: Christian König <deathsimple@vodafone.de> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1) Reviewed-by:
Christian König <christian.koenig@amd.com> (cherry picked from commit 322483f7)
-
This reverts commit d180de35. This is a radeon specific hack that causes problems on nouveau when combined with the SHARED flag later. If radeonsi needs a fix for this, please fix it in the driver. [chk] Using linear surfaces for this makes sense because tiling isn't beneficial and the surfaces can potentially be shared with other GPUs using the VDPAU OpenGL interop. [airlied] I think we need a flag that isn't SHARED/LINEAR that is more SHARED_OTHER_GPU. [mareko] Does radeonsi need PIPE_BIND_VIDEO_DECODE_OUTPUT that it would translate into linear? [mareko] My only concern is decoding performance. If the decoder works in 64x1 blocks, tiling will hurt. That's the theory. I don't know how the decoder works. Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org> Acked-by:
Christian König <christian.koenig@amd.com> Signed-off-by:
Dave Airlie <airlied@redhat.com> Tested-by:
Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> (I+A) (cherry picked from commit d0d5f760)
-
This fixes a crash in Deus Ex: Mankind Divided. Release builds were unaffected, so it's not too serious. Cc: 11.2 12.0 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by:
Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by:
Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit 00baaa47)
-
This was broken when the GLAPI use was removed from mesa_glinterop.h. Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org> Acked-by:
Alex Deucher <alexander.deucher@amd.com> Reviewed-by:
Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by:
Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit 64c2593a)
-
This was broken when the GLAPI use was removed from mesa_glinterop.h. Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org> Acked-by:
Alex Deucher <alexander.deucher@amd.com> Reviewed-by:
Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by:
Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit ee39d445)
-
I need the definition of PUBLIC. Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org> Acked-by:
Alex Deucher <alexander.deucher@amd.com> Reviewed-by:
Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by:
Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit bf51b453)
-
When splitting up loads, we have to add 16 bytes to the offset for the high components, just like already happens for stores. Fixes arb_gpu_shader_fp64@shader_storage@layout-std140-fp64-shader. Cc: 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by:
Marek Olšák <marek.olsak@amd.com> (cherry picked from commit e4b37880)
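The offset rule can be sketched as follows, assuming a single load covers at most two 64-bit components (16 bytes), so the third and fourth components of a wider vector start 16 bytes further on. The struct and function names are hypothetical, and the real code operates on shader IR rather than plain integers.

```c
#include <assert.h>

/* Describes how one fp64 vector load is split into a low and a high part. */
struct example_split_load {
   unsigned lo_offset, lo_components;
   unsigned hi_offset, hi_components;  /* hi_components == 0: no second load */
};

static struct example_split_load
example_split_fp64_load(unsigned offset, unsigned components)
{
   struct example_split_load split = { offset, components, 0, 0 };

   if (components > 2) {
      /* Two doubles fit in the first load; the rest live 16 bytes later. */
      split.lo_components = 2;
      split.hi_offset = offset + 16;
      split.hi_components = components - 2;
   }
   return split;
}
```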
-
Assuming the hardware is set up to use a screen coordinate system flipped vertically with respect to the GL's window coordinate system, the SYSTEM_VALUE_SAMPLE_POS vector will also be flipped vertically with respect to the value expected by the GL, so we need to give it the same treatment as gl_FragCoord. Fixes the following CTS tests on i965: ES31-CTS.functional.shaders.multisample_interpolation.interpolate_at_offset.at_sample_position.default_framebuffer ES31-CTS.functional.shaders.sample_variables.sample_pos.correctness.default_framebuffer when run with any multisample configuration, e.g. rgba8888d24s8ms4. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by:
Kenneth Graunke <kenneth@whitecape.org> Reviewed-by:
Anuj Phogat <anuj.phogat@gmail.com> (cherry picked from commit f3d38786)
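Because a sample position is a coordinate within the pixel in the range [0, 1], flipping it vertically amounts to y' = 1 - y, in the same spirit as the window-height flip applied to gl_FragCoord. The sketch below shows only that arithmetic. The type and function names are hypothetical, and the actual lowering pass may express the flip differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical two-component vector holding a sample position in [0, 1]. */
struct example_vec2 {
   float x, y;
};

/* Flip the sample position vertically when the framebuffer is y-flipped. */
static struct example_vec2
example_transform_sample_pos(struct example_vec2 pos, bool y_flipped)
{
   if (y_flipped)
      pos.y = 1.0f - pos.y;
   return pos;
}
```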
-