freedreno/a6xx: Clover support

Rob Clark requested to merge robclark/mesa:fd/clover into main

Some basic things work. Things still missing to get more of test_basic to pass:

  • hostptr support (i.e. CTS hostptr) would require kernel support (SVM) for userptr memory, or work on clover to emulate it better. Punting on this for now
  • scalarize vector load_global_ir3/store_global_ir3 (fpmath_floatN, int_mathN)
  • lower 64b load_global_ir3/store_global_ir3 to 32b (intmath_long*)
  • lower 64b phi (local_kernel_scope)
  • better 8b support (hiloeo). We might also need 8b/16b push constants (i.e. support for disabling SP_MODE_CONTROL.CONSTANT_DEMOTION_ENABLE)
  • add CI job(s). It looks like lavapipe already has some OpenCL CI that we could copy
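
To illustrate what the scalarize and 64b-to-32b lowering items amount to, here is a rough conceptual sketch in plain C. It is not the NIR pass and does not use any Mesa API: `load_global_32` and `store_global_32` are hypothetical stand-ins for a backend that can only do scalar 32b global access, and a little-endian dword layout (low dword first) is assumed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical primitives: the only global accesses the "backend" supports
 * are scalar 32b loads and stores, addressed in dwords. Not Mesa functions. */
static uint32_t load_global_32(const uint32_t *mem, size_t dword)
{
    return mem[dword];
}

static void store_global_32(uint32_t *mem, size_t dword, uint32_t value)
{
    mem[dword] = value;
}

/* 64b load lowered to two 32b loads, recombined as (hi << 32) | lo.
 * Assumes the low dword is stored first. */
static uint64_t load_global_64(const uint32_t *mem, size_t dword)
{
    uint32_t lo = load_global_32(mem, dword);
    uint32_t hi = load_global_32(mem, dword + 1);
    return ((uint64_t)hi << 32) | lo;
}

/* 64b store lowered to two 32b stores of the low and high halves. */
static void store_global_64(uint32_t *mem, size_t dword, uint64_t value)
{
    store_global_32(mem, dword, (uint32_t)value);
    store_global_32(mem, dword + 1, (uint32_t)(value >> 32));
}

/* Vector load scalarized into one 32b load per component. */
static void load_global_vec(const uint32_t *mem, size_t dword,
                            unsigned num_components, uint32_t *out)
{
    for (unsigned i = 0; i < num_components; i++)
        out[i] = load_global_32(mem, dword + i);
}
```

In the real compiler this would be done on the IR (rewriting the intrinsics and their sources/destinations) rather than at runtime, so treat the above only as a description of the intended semantics.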

There are probably other things I'm missing, since I haven't analyzed all the failures yet, but these look like they should account for the majority of them.

Current state of test_basic:

  • hostptr
  • fpmath_float
  • fpmath_float2
  • fpmath_float4
  • intmath_int
  • intmath_int2
  • intmath_int4
  • intmath_long
  • intmath_long2
  • intmath_long4
  • hiloeo
  • if
  • sizeof
  • loop
  • pointer_cast
  • local_arg_def
  • local_kernel_def
  • local_kernel_scope
  • constant
  • constant_source
  • readimage
  • readimage_int16
  • readimage_fp32
  • writeimage
  • writeimage_int16
  • writeimage_fp32
  • mri_one
  • mri_multiple
  • image_r8
  • barrier
  • int2float
  • float2int
  • imagereadwrite
  • imagereadwrite3d
  • readimage3d
  • readimage3d_int16
  • readimage3d_fp32
  • bufferreadwriterect
  • arrayreadwrite
  • arraycopy
  • imagearraycopy
  • imagearraycopy3d
  • imagecopy
  • imagecopy3d
  • imagerandomcopy
  • arrayimagecopy
  • arrayimagecopy3d
  • imagenpot
  • vload_global
  • vload_local
  • vload_constant
  • vload_private
  • vstore_global
  • vstore_local
  • vstore_private
  • createkernelsinprogram
  • imagedim_pow2
  • imagedim_non_pow2
  • image_param
  • image_multipass_integer_coord
  • image_multipass_float_coord
  • explicit_s2v_char
  • explicit_s2v_uchar
  • explicit_s2v_short
  • explicit_s2v_ushort
  • explicit_s2v_int
  • explicit_s2v_uint
  • explicit_s2v_long
  • explicit_s2v_ulong
  • explicit_s2v_float
  • explicit_s2v_double
  • enqueue_map_buffer
  • enqueue_map_image
  • work_item_functions
  • astype
  • async_copy_global_to_local
  • async_copy_local_to_global
  • async_strided_copy_global_to_local
  • async_strided_copy_local_to_global
  • async_copy_global_to_local2D
  • async_copy_local_to_global2D
  • async_copy_global_to_local3D
  • async_copy_local_to_global3D
  • async_work_group_copy_fence_import_after_export_aliased_local
  • async_work_group_copy_fence_import_after_export_aliased_global
  • async_work_group_copy_fence_import_after_export_aliased_global_and_local
  • async_work_group_copy_fence_export_after_import_aliased_local
  • async_work_group_copy_fence_export_after_import_aliased_global
  • async_work_group_copy_fence_export_after_import_aliased_global_and_local
  • prefetch
  • kernel_call_kernel_function
  • host_numeric_constants
  • kernel_numeric_constants
  • kernel_limit_constants
  • kernel_preprocessor_macros
  • parameter_types
  • vector_creation
  • vector_swizzle
  • vec_type_hint
  • kernel_memory_alignment_local
  • kernel_memory_alignment_global
  • kernel_memory_alignment_constant
  • kernel_memory_alignment_private
  • global_work_offsets
  • get_global_offset
  • simple_read_image_pitch
  • simple_write_image_pitch

Note that some OpenCL-CTS kernels trigger the hangcheck timeout because they run for a long time. To work around this, raise the hangcheck period:

echo 500000 >  /sys/kernel/debug/dri/0/hangcheck_period_ms
Edited by Rob Clark
