r300: Regression of several deqp-gles2@performance@compiler@ and deqp-gles2@performance@shader@ tests (23.2-rc1 vs. 23.1.0) on a Radeon X800 GTO [R480]
I spotted some regressions in a test run with RADEON_DEBUG=fp,vp on my Radeon X800 GTO.
On 23.1.0, several tests print
r300: Dynamic loops are not supported on R3xx/R4xx.
r300 FP: Cannot translate a shader. Using a dummy shader instead.
but pass anyway.
On 23.2-rc1 these tests no longer complain about missing hardware capabilities, but they fail. The generated vertex programs and fragment programs look similar between 23.2-rc1 and 23.1.0, but the generated code is much shorter on the failing 23.2-rc1.
Apart from this, I noticed another difference in many of the failing tests: even early on, before dynamic loops seem to play a role, the CONST[0] vector differs between 23.2-rc1 and 23.1.0, e.g.
23.2-rc1 deqp-gles2@performance@compiler@cache@loop@dynamic@10_iterations_1_levels_fragment
[...]
CONST[0] = { 0.9967 0.0000 0.0000 0.0000 }
Vertex Program: after 'dead constants'
# Radeon Compiler Program
0: MUL output[0], input[0], const[0].xxxx;
1: MOV output[1], input[1];
Final vertex program code:
0: op: 0x00f00202 dst: 0o op: VE_MULTIPLY
src0: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W
src1: 0x00000002 reg: 0c swiz: X/ X/ X/ X
src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0
1: op: 0x00f02203 dst: 1o op: VE_ADD
src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W
src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0
src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0
Flow Control Ops: 0x00000000
r300: Initial fragment program
FRAG
PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
DCL IN[0], GENERIC[0], PERSPECTIVE
DCL OUT[0], COLOR
IMM[0] FLT32 { 0.9347, 0.0000, 0.0000, 0.0000}
[...]
vs. 23.1.0 deqp-gles2@performance@compiler@cache@loop@dynamic@10_iterations_1_levels_fragment
[...]
Vertex Program: after 'register allocation'
# Radeon Compiler Program
0: MUL output[0], input[0], const[0].xxxx;
1: MOV output[1], input[1];
CONST[0] = { 0.9524 0.0000 0.0000 0.0000 }
Vertex Program: after 'dead constants'
# Radeon Compiler Program
0: MUL output[0], input[0], const[0].xxxx;
1: MOV output[1], input[1];
Final vertex program code:
0: op: 0x00f00202 dst: 0o op: VE_MULTIPLY
src0: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W
src1: 0x00000002 reg: 0c swiz: X/ X/ X/ X
src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0
1: op: 0x00f02203 dst: 1o op: VE_ADD
src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W
src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0
src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0
Flow Control Ops: 0x00000000
r300: Initial fragment program
FRAG
PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
DCL IN[0], GENERIC[9], PERSPECTIVE
DCL OUT[0], COLOR
IMM[0] FLT32 { 0.0482, 0.0000, 0.0000, 0.0000}
[...]
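A minimal sketch of how the CONST[n] vectors in two such debug dumps could be compared automatically. The helper names `extract_consts` and `diff_consts` are hypothetical, and the sample strings only reuse the CONST[0] lines quoted above. The script shows whether the values differ, not why.

```python
import re

# Matches definition lines such as: CONST[0] = { 0.9967 0.0000 0.0000 0.0000 }
# The match is case-sensitive, so instruction operands like const[0].xxxx are ignored.
CONST_RE = re.compile(r"CONST\[(\d+)\]\s*=\s*\{([^}]*)\}")


def extract_consts(log_text):
    """Return {index: tuple of floats}; a later definition overrides an earlier one."""
    consts = {}
    for match in CONST_RE.finditer(log_text):
        index = int(match.group(1))
        consts[index] = tuple(float(v) for v in match.group(2).split())
    return consts


def diff_consts(old_log, new_log):
    """Return {index: (old_value, new_value)} for every constant that differs."""
    old, new = extract_consts(old_log), extract_consts(new_log)
    return {
        i: (old.get(i), new.get(i))
        for i in sorted(set(old) | set(new))
        if old.get(i) != new.get(i)
    }


# Excerpts taken from the dumps quoted in this report.
LOG_23_1_0 = (
    "CONST[0] = { 0.9524 0.0000 0.0000 0.0000 }\n"
    "0: MUL output[0], input[0], const[0].xxxx;\n"
)
LOG_23_2_RC1 = (
    "CONST[0] = { 0.9967 0.0000 0.0000 0.0000 }\n"
    "0: MUL output[0], input[0], const[0].xxxx;\n"
)

if __name__ == "__main__":
    for index, (old, new) in diff_consts(LOG_23_1_0, LOG_23_2_RC1).items():
        print(f"CONST[{index}]: 23.1.0={old} 23.2-rc1={new}")
```

To use it on full logs, the two sample strings would be replaced by the captured RADEON_DEBUG=fp,vp output of the same test on each Mesa version.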
Regressed tests concerning r3xx/r4xx dynamic loop support are:
deqp-gles2@performance@compiler@cache@loop@dynamic@
deqp-gles2@performance@compiler@cache@loop@uniform@
deqp-gles2@performance@compiler@cache@loop@static@1000_iterations_1_levels_fragment
deqp-gles2@performance@compiler@cache@loop@static@1000_iterations_1_levels_vertex
deqp-gles2@performance@compiler@cache@loop@static@100_iterations_1_levels_vertex
deqp-gles2@performance@compiler@cache@loop@static@100_iterations_1_levels_fragment
deqp-gles2@performance@compiler@cache@loop@static@10_iterations_3_levels_vertex
deqp-gles2@performance@compiler@cache@mandelbrot@128_iterations
deqp-gles2@performance@compiler@cache_whitespace_comment@static_loop_100_iterations_vertex
deqp-gles2@performance@compiler@cache_whitespace_comment@static_loop_100_iterations_fragment
deqp-gles2@performance@compiler@optimization@loop_invariant_code_motion@32_iterations_fragment
deqp-gles2@performance@compiler@optimization@loop_invariant_code_motion@32_iterations_vertex
deqp-gles2@performance@compiler@valid_shader@loop@dynamic@10_iterations_1_levels_fragment
deqp-gles2@performance@compiler@valid_shader@loop@dynamic@
deqp-gles2@performance@compiler@valid_shader@loop@uniform@
deqp-gles2@performance@compiler@valid_shader@loop@static@1000_iterations_1_levels_fragment
deqp-gles2@performance@compiler@valid_shader@loop@static@1000_iterations_1_levels_vertex
deqp-gles2@performance@compiler@valid_shader@loop@static@100_iterations_1_levels_fragment
deqp-gles2@performance@compiler@valid_shader@loop@static@100_iterations_1_levels_vertex
deqp-gles2@performance@compiler@valid_shader@loop@static@10_iterations_3_levels_vertex
deqp-gles2@performance@compiler@valid_shader@mandelbrot@128_iterations
deqp-gles2@performance@shader@control_statement@do_while@fragment@uniform
deqp-gles2@performance@shader@control_statement@do_while@fragment@varying_stable
deqp-gles2@performance@shader@control_statement@do_while@fragment@varying_unstable
deqp-gles2@performance@shader@control_statement@do_while@vertex@attribute_stable
deqp-gles2@performance@shader@control_statement@do_while@vertex@attribute_unstable
deqp-gles2@performance@shader@control_statement@do_while@vertex@uniform
deqp-gles2@performance@shader@control_statement@for@fragment@varying_stable
deqp-gles2@performance@shader@control_statement@for@fragment@varying_unstable
deqp-gles2@performance@shader@control_statement@for@vertex@attribute_stable
deqp-gles2@performance@shader@control_statement@for@vertex@attribute_unstable
deqp-gles2@performance@shader@control_statement@for@vertex@uniform
deqp-gles2@performance@shader@control_statement@while@fragment@uniform
deqp-gles2@performance@shader@control_statement@while@fragment@varying_stable
deqp-gles2@performance@shader@control_statement@while@fragment@varying_unstable
deqp-gles2@performance@shader@control_statement@while@vertex@attribute_stable
deqp-gles2@performance@shader@control_statement@while@vertex@attribute_unstable
deqp-gles2@performance@shader@control_statement@while@vertex@uniform
deqp-gles2@performance@shader@operator@angle_and_trigonometry@cos@vertex@lowp_float
deqp-gles2@performance@shader@operator@angle_and_trigonometry@degrees@vertex@lowp_float
deqp-gles2@performance@shader@operator@binary_operator@div@vertex@highp_vec3
deqp-gles2@performance@shader@operator@binary_operator@div@vertex@mediump_int
deqp-gles2@performance@shader@operator@binary_operator@div@vertex@mediump_ivec3
deqp-gles2@performance@shader@operator@binary_operator@mul@vertex@highp_float
deqp-gles2@performance@shader@operator@binary_operator@mul@vertex@lowp_float
deqp-gles2@performance@shader@operator@binary_operator@mul@vertex@lowp_vec3
deqp-gles2@performance@shader@operator@common_functions@mod@vertex@highp_float
deqp-gles2@performance@shader@operator@common_functions@mod@vertex@lowp_vec4
deqp-gles2@performance@shader@operator@common_functions@mod@vertex@mediump_float
deqp-gles2@performance@shader@operator@binary_operator@sub@vertex@highp_vec3
deqp-gles2@performance@shader@operator@common_functions@step@vertex@highp_float
deqp-gles2@performance@shader@operator@common_functions@step@vertex@highp_vec4
deqp-gles2@performance@shader@operator@common_functions@smoothstep@vertex@mediump_vec4
deqp-gles2@performance@shader@operator@matrix@matrixcompmult@vertex@highp_mat2
If the tests above turn out to fail on r3xx/r4xx because of missing hardware support, perhaps they could be skipped.
Results, summary and comparison are attached: r480_23.1.0_deqp.tar.xz r480_23.2.0-rc1_deqp.tar.xz
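A minimal sketch of how a list of regressed tests could be derived from two result sets. The one-line-per-test `name,status` format and the helper names `parse_results` and `regressions` are assumptions for illustration only; the attached archives may use a different layout, in which case only the parsing step would need to change. The second test name in the sample data is made up.

```python
def parse_results(text):
    """Parse 'name,status' lines (assumed format) into {name: status}."""
    results = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments and lines without a status column.
        if not line or line.startswith("#") or "," not in line:
            continue
        name, _, status = line.rpartition(",")
        results[name.strip()] = status.strip().lower()
    return results


def regressions(old, new):
    """Return tests that passed in the old run but do not pass in the new run."""
    return sorted(
        name
        for name, status in old.items()
        if status == "pass" and new.get(name, "missing") != "pass"
    )


# Sample data: the first name is from the list in this report,
# the second one is a hypothetical test that passes in both runs.
OLD_RUN = """\
deqp-gles2@performance@compiler@cache@mandelbrot@128_iterations,pass
example@hypothetical_unchanged_test,pass
"""
NEW_RUN = """\
deqp-gles2@performance@compiler@cache@mandelbrot@128_iterations,fail
example@hypothetical_unchanged_test,pass
"""

if __name__ == "__main__":
    for name in regressions(parse_results(OLD_RUN), parse_results(NEW_RUN)):
        print(name)
```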