ext_framebuffer_multisample-alpha-to-coverage-no-draw-buffer-zero-write causes a temporary freeze of my system
I have found the cause of a "single" freeze that recovers after 1-2 seconds.
$ bin/ext_framebuffer_multisample-alpha-to-coverage-no-draw-buffer-zero-write 2 -auto -fbo
PIGLIT: {"result": "pass" }
but while that test is running, the kernel log shows a GPU lockup and reset:
$ journalctl -f
[...]
kernel: radeon 0000:01:00.0: ring 0 stalled for more than 10258msec
kernel: radeon 0000:01:00.0: GPU lockup (current fence id 0x00000000001a0995 last fence id 0x00000000001a0997 on ring 0)
kernel: radeon 0000:01:00.0: Saved 54 dwords of commands on ring 0.
kernel: radeon 0000:01:00.0: GPU softreset: 0x00000019
kernel: radeon 0000:01:00.0: R_008010_GRBM_STATUS = 0xE57004A1
kernel: radeon 0000:01:00.0: R_008014_GRBM_STATUS2 = 0x00330302
kernel: radeon 0000:01:00.0: R_000E50_SRBM_STATUS = 0x200000C0
kernel: radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x04000000
kernel: radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00010002
kernel: radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00008404
kernel: radeon 0000:01:00.0: R_008680_CP_STAT = 0x80818647
kernel: radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
kernel: radeon 0000:01:00.0: R_008020_GRBM_SOFT_RESET=0x00007F6B
kernel: radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00000100
kernel: radeon 0000:01:00.0: R_008010_GRBM_STATUS = 0x00003028
kernel: radeon 0000:01:00.0: R_008014_GRBM_STATUS2 = 0x00000002
kernel: radeon 0000:01:00.0: R_000E50_SRBM_STATUS = 0x200000C0
kernel: radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000
kernel: radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00000000
kernel: radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00000000
kernel: radeon 0000:01:00.0: R_008680_CP_STAT = 0x00000000
kernel: radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
kernel: radeon 0000:01:00.0: GPU reset succeeded, trying to resume
kernel: [drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0
kernel: [drm] PCIE GART of 1024M enabled (table at 0x000000000014C000).
kernel: radeon 0000:01:00.0: WB enabled
kernel: radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000040000c00
kernel: radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000040000c0c
kernel: radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x000000000005c598
kernel: debugfs: File 'radeon_ring_gfx' in directory '0' already present!
kernel: debugfs: File 'radeon_ring_dma1' in directory '0' already present!
kernel: [drm] ring test on 0 succeeded in 1 usecs
kernel: [drm] ring test on 3 succeeded in 2 usecs
kernel: debugfs: File 'radeon_ring_uvd' in directory '0' already present!
kernel: [drm] ring test on 5 succeeded in 1 usecs
kernel: [drm] UVD initialized successfully.
kernel: [drm] ib test on ring 0 succeeded in 0 usecs
kernel: [drm] ib test on ring 3 succeeded in 0 usecs
kernel: [drm:uvd_v1_0_ib_test [radeon]] ERROR radeon: fence wait timed out.
kernel: [drm:radeon_ib_ring_tests [radeon]] ERROR radeon: failed testing IB on ring 5 (-110).
[...]
valgrind only reports "blocks are still reachable" and a few "blocks are definitely lost in loss record" entries.
The freeze is caused by this call in alpha-to-coverage-no-draw-buffer-zero-write.cpp:
draw_test_image(true /* sample_alpha_to_coverage */,
false /* sample_alpha_to_one */);\
This function is defined in draw-buffers-common.cpp at line 755,
and there the problem is caused by:\
draw_pattern(sample_alpha_to_coverage,\
sample_alpha_to_one,\
false /* is_reference_image */,\
color);\
It is called with the following parameters (the first two are passed by draw_test_image and color is set to 0.5):
draw_pattern(true,\
false,\
false /* is_reference_image */,\
0.5);
The freeze is then triggered by calling
glDrawElements(GL_TRIANGLES, 6, GL_UNSIGNED_INT,
(void *) indices);
GL_TRIANGLES -> 4
GL_UNSIGNED_INT -> 0x1405 -> 5125
unsigned indices[6] = {0, 1, 2, 0, 2, 3};\
I have tried other examples that use this call and they all work without problems.
So I think something must be wrong with this test, but I know almost nothing about OpenGL, so it is very hard for me to tell what is wrong and what is not :-(
I also noticed that the reference page https://docs.gl/gl3/glDrawElements
says in its example that you need to set texcoords, normals, and vertices, while the example code seems to set only the vertices.
It says that this call must be
glVertexAttribPointer(position_attrib_index, 3, GL_FLOAT, false, 0, vertex_data);
and not the one used in the test
glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, sizeof(vertices[0]), (void *) vertices);
It also says that in the call
glDrawElements(GL_TRIANGLES, num_vertices, GL_UNSIGNED_INT, index_data);
num_vertices is the number of verts in your vertex_data.
but
I think vertex_data corresponds to vertices here, so num_vertices would be 4 and not 6
???
...
but after that I found this:
#define glDrawElements piglit_dispatch_glDrawElements
and then this:
PFNGLDRAWELEMENTSPROC piglit_dispatch_glDrawElements = stub_glDrawElements;
typedef void (APIENTRY *PFNGLDRAWELEMENTSPROC)(GLenum mode, GLsizei count, GLenum type, const void * indices);
static void APIENTRY
stub_glDrawElements(GLenum mode, GLsizei count, GLenum type, const void * indices)
{
check_initialized();
piglit_dispatch_glDrawElements = resolve_glDrawElements();
piglit_dispatch_glDrawElements(mode, count, type, indices);
}
so the test does not call glDrawElements directly...
But this code, present in piglit-dispatch-gen.c, is generated at each build with
$ make -B ext_framebuffer_multisample-alpha-to-coverage-no-draw-buffer-zero-writ
I will try to understand this generated part better, but could someone more expert tell me whether this test is formally correct, and whether there is some test I can run to identify the cause of this problem?
For details about my system, please see #59
Ciao Davide