GPU hangs if a shader uses a barrier and a single-plane rep of a multiplane image
Submitted by ato..@..il.com
Assigned to Intel 3D Bugs Mailing List
Link to original bug (#105770)
Description
Hi,
If a compute shader calls barrier() after filling workgroup-shared memory from a single-plane representation of a multi-plane image, the GPU hangs. I'm not sure whether the multi-plane part is related.
[67542.848596] i915 0000:00:02.0: Resetting rcs0 after gpu hang
The shader looks like:
#version 460
layout (set = 0, binding = 0) uniform sampler2D input_img;
layout (set = 0, binding = 1, rgba8) uniform writeonly image2D output_img;

#define FILTER_RADIUS (ivec2(4, 4))
#define CACHE_SIZE (ivec2(gl_WorkGroupSize) + FILTER_RADIUS*2)
shared vec4 cache[AREA(CACHE_SIZE)];

void main()
{
    ivec2 d;
    const ivec2 pos = ivec2(gl_GlobalInvocationID.xy);
    const ivec2 w = ivec2(gl_WorkGroupSize);
    const ivec2 l = ivec2(gl_LocalInvocationID.xy);

    for (d.y = l.y; d.y < CACHE_SIZE.y; d.y += w.y) {
        for (d.x = l.x; d.x < CACHE_SIZE.x; d.x += w.x) {
            const ivec2 np = pos + d - l - FILTER_RADIUS;
            cache[d.y*CACHE_SIZE.x + d.x] = texture(input_img, np);
        }
    }

    barrier();

    vec4 avg = vec4(0.0f);
    ivec2 start = ivec2(0);
    ivec2 end = FILTER_RADIUS*2 + 1;

    for (d.y = start.y; d.y < end.y; d.y++)
        for (d.x = start.x; d.x < end.x; d.x++)
            avg += cache[(l.y + d.y)*CACHE_SIZE.x + l.x + d.x];

    avg /= (end - start).x * (end - start).y;
    imageStore(output_img, pos, avg);
}
Removing the barrier() call lets the shader run without hanging (with incorrect output, of course).
It makes no difference whether the input is bound as an image2D or sampled as above; the GPU hangs either way.
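To illustrate why the barrier() cannot simply be dropped, here is a rough host-side model of the shader's two loop nests, written in Python rather than GLSL so it can be run anywhere. It assumes a 16x16 workgroup, which this report does not state, and the helper names (cells_written, cells_read, coverage) are made up for the illustration. It only models the indexing and says nothing about the cause of the hang.

```python
# Host-side model of the shader's cooperative cache fill (not GPU code).
# ASSUMPTION: a 16x16 workgroup; the report does not state the local size.
WORKGROUP = (16, 16)        # hypothetical gl_WorkGroupSize.xy
FILTER_RADIUS = (4, 4)      # FILTER_RADIUS from the shader in the report
CACHE_SIZE = (WORKGROUP[0] + 2 * FILTER_RADIUS[0],
              WORKGROUP[1] + 2 * FILTER_RADIUS[1])


def cells_written(lx, ly):
    """Cache indices that invocation (lx, ly) fills in the strided loops."""
    return {y * CACHE_SIZE[0] + x
            for y in range(ly, CACHE_SIZE[1], WORKGROUP[1])
            for x in range(lx, CACHE_SIZE[0], WORKGROUP[0])}


def cells_read(lx, ly):
    """Cache indices that invocation (lx, ly) reads in the averaging loops."""
    return {(ly + dy) * CACHE_SIZE[0] + lx + dx
            for dy in range(2 * FILTER_RADIUS[1] + 1)
            for dx in range(2 * FILTER_RADIUS[0] + 1)}


def coverage():
    """Count how many invocations write each cache cell."""
    counts = [0] * (CACHE_SIZE[0] * CACHE_SIZE[1])
    for ly in range(WORKGROUP[1]):
        for lx in range(WORKGROUP[0]):
            for idx in cells_written(lx, ly):
                counts[idx] += 1
    return counts


if __name__ == "__main__":
    counts = coverage()
    print("cache cells:", len(counts))
    print("every cell written exactly once:", all(c == 1 for c in counts))
    foreign = cells_read(0, 0) - cells_written(0, 0)
    print("cells invocation (0,0) reads but did not write:", len(foreign))
```

Under the assumed 16x16 size, each cache cell is written by exactly one invocation, and almost every cell an invocation averages over was written by a different invocation. The reads therefore have to wait until the whole workgroup has finished writing, which is what the barrier() is there for.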
As a test case, compile https://github.com/atomnuker/FFmpeg/tree/exp_vulkan with --enable-vulkan and --enable-libshaderc, then run:
./ffmpeg_g -init_hw_device "vulkan=vk:0" -i <input> -filter_hw_device vk -vf format=yuv420p,hwupload,unsharp_vulkan -f null -
Version: git