r600/OpenCL/TURKS: Wrong and presumably too many "cl_khr_fp64" features exposed with NIR path on non-native FP64 hardware
Summary
This is a new bug report for my old TURKS-based Radeon HD 6770M in conjunction with OpenCL. It seems that too many cl_khr_fp64 (double-precision floating-point) OpenCL features are exposed with the new NIR path. When I run clinfo with current Mesa devel, it displays:
Single-precision Floating-point support (core)
Denormals No
Infinity and NANs Yes
Round to nearest Yes
Round to zero No
Round to infinity No
IEEE754-2008 fused multiply-add No
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Double-precision Floating-point support (cl_khr_fp64)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
The last entry, Support is emulated in software, should have the value Yes, because double precision is effectively emulated in software on TeraScale 2 hardware without native FP64.
At the same time, Denormals, Round to zero, Round to infinity and IEEE754-2008 fused multiply-add should most likely be reported as No. Those are the values reported for single precision, and I doubt double precision is any different on TeraScale 2 hardware without native FP64.
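For reference, the Yes/No rows that clinfo prints correspond to bits of the cl_device_fp_config bitfield returned for CL_DEVICE_SINGLE_FP_CONFIG and CL_DEVICE_DOUBLE_FP_CONFIG. The following is only a minimal sketch that decodes such a bitfield. The bit values are the ones from the OpenCL headers; the helper name decode_fp_config and the two example values are illustrative, reconstructed from the clinfo output above and from the values this report argues for, not read from the driver.

```python
# Bit values of cl_device_fp_config as defined in the OpenCL headers (CL/cl.h).
CL_FP_FLAGS = {
    "CL_FP_DENORM": 1 << 0,            # clinfo: Denormals
    "CL_FP_INF_NAN": 1 << 1,           # clinfo: Infinity and NANs
    "CL_FP_ROUND_TO_NEAREST": 1 << 2,  # clinfo: Round to nearest
    "CL_FP_ROUND_TO_ZERO": 1 << 3,     # clinfo: Round to zero
    "CL_FP_ROUND_TO_INF": 1 << 4,      # clinfo: Round to infinity
    "CL_FP_FMA": 1 << 5,               # clinfo: IEEE754-2008 fused multiply-add
    "CL_FP_SOFT_FLOAT": 1 << 6,        # clinfo: Support is emulated in software
    "CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT": 1 << 7,
}


def decode_fp_config(bits):
    """Return the names of all flags set in a cl_device_fp_config value.

    Hypothetical helper for illustration only; it is not part of Mesa or clinfo.
    """
    return [name for name, bit in CL_FP_FLAGS.items() if bits & bit]


# Double-precision value implied by the clinfo output in this report:
# every capability bit set except CL_FP_SOFT_FLOAT.
reported_double = 0x3F

# Value this report argues for (an assumption, not verified against hardware):
# same capabilities as single precision, plus the software-emulation bit.
expected_double = (
    CL_FP_FLAGS["CL_FP_INF_NAN"]
    | CL_FP_FLAGS["CL_FP_ROUND_TO_NEAREST"]
    | CL_FP_FLAGS["CL_FP_SOFT_FLOAT"]
)

print("reported:", decode_fp_config(reported_double))
print("expected:", decode_fp_config(expected_double))
```

Comparing the two lists makes the discrepancy explicit: the reported value lacks CL_FP_SOFT_FLOAT and carries four bits that single precision does not report.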
This is not a problem with R600_DEBUG=use_tgsi, because that path has no FP64 emulation at all and therefore exposes no cl_khr_fp64 support.
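To compare the two paths side by side, something like the following sketch could be used. It assumes clinfo is installed and that its human-readable output lists the Yes/No rows directly under the "Double-precision Floating-point support" line, as in the output quoted in this report. The function names fp_section, run_clinfo and compare_paths are hypothetical, and only the first matching device section is parsed.

```python
import os
import subprocess

HEADER = "Double-precision Floating-point support"


def fp_section(clinfo_text, header=HEADER):
    """Return {label: "Yes"/"No"} for the rows following the first `header` line."""
    result = {}
    in_section = False
    for line in clinfo_text.splitlines():
        stripped = line.strip()
        if not in_section:
            if stripped.startswith(header):
                in_section = True
            continue
        # Split off the last whitespace-separated word as the value.
        label, _, value = stripped.rpartition(" ")
        if value not in ("Yes", "No"):
            break  # the first row that is not Yes/No ends the section
        result[label.strip()] = value
    return result


def run_clinfo(extra_env=None):
    """Run clinfo with optional extra environment variables and return its stdout."""
    env = dict(os.environ, **(extra_env or {}))
    proc = subprocess.run(
        ["clinfo"], env=env, capture_output=True, text=True, check=True
    )
    return proc.stdout


def compare_paths():
    """Print the double-precision rows for the default (NIR) path and the TGSI path."""
    nir = fp_section(run_clinfo())
    tgsi = fp_section(run_clinfo({"R600_DEBUG": "use_tgsi"}))
    print("NIR: ", nir)
    print("TGSI:", tgsi)  # expected to be empty if cl_khr_fp64 is not exposed
```

Calling compare_paths() on the affected machine should show the seven rows for the NIR path and, if the observation above holds, an empty dictionary for the TGSI path.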
Addition: According to the information available here, the TeraScale 2 architecture supports the IEEE754-2008 fused multiply-add (FMA) operation. So this feature is effectively missing from the single-precision FP support.
System information
Mesa 23.0.0-devel (git-d6d772d3 2022-11-21 jammy-oibaf-ppa)
test@iMac-test:~$ inxi -b
System:
Host: iMac-test Kernel: 5.15.0-53-generic x86_64 bits: 64
Desktop: KDE Plasma 5.24.6 Distro: Ubuntu 22.04.1 LTS (Jammy Jellyfish)
Machine:
Type: Desktop System: Apple product: iMac12,2 v: 1.0
serial: <superuser required>
Mobo: Apple model: Mac-942B59F58194171B v: iMac12,2
serial: <superuser required> UEFI: Apple v: IM121.88Z.004F.B00.1804101150
date: 04/10/18
CPU:
Info: quad core Intel Core i5-2400 [MCP] speed (MHz): avg: 1600
min/max: 1600/3400
Graphics:
Device-1: Intel 2nd Generation Core Processor Family Integrated Graphics
driver: i915 v: kernel
Device-2: AMD Whistler [Radeon HD 6730M/6770M/7690M XT] driver: radeon
v: kernel
Device-3: Apple FaceTime HD Camera (Built-in) type: USB driver: uvcvideo
Display: x11 server: X.Org v: 1.21.1.3 driver: X:
loaded: ati,modesetting,radeon unloaded: fbdev,vesa gpu: radeon
resolution: 2560x1440~60Hz
OpenGL: renderer: AMD TURKS (DRM 2.50.0 / 5.15.0-53-generic LLVM 15.0.4)
v: 4.5 Mesa 23.0.0-devel (git-d6d772d 2022-11-21 jammy-oibaf-ppa)
Network:
Device-1: Broadcom NetXtreme BCM57765 Gigabit Ethernet PCIe driver: tg3
Device-2: Qualcomm Atheros AR93xx Wireless Network Adapter driver: ath9k
Drives:
Local Storage: total: 961.01 GiB used: 99.8 GiB (10.4%)
Info:
Processes: 233 Uptime: 5d 13h 56m Memory: 15.6 GiB used: 4.69 GiB (30.1%)
Shell: Bash inxi: 3.3.13
Regression
Only the new NIR path is affected, not the old TGSI path.
Any extra information would be greatly appreciated
The "cl_khr_fp64" double-precision floating-point features could not be tested more thoroughly because clover has been broken on TeraScale 2 hardware for years. One example is bug #586, which shows a long-standing problem when running clpeak. More information with current logs can be found in that bug report.
This may also be relevant to the upcoming rusticl OpenCL implementation.