r600/OpenCL/TURKS: Wrong and presumably too many "cl_khr_fp64" features exposed with NIR path on non-native FP64 hardware

Summary

This is a new bug report for my old TURKS-based Radeon HD 6770M in conjunction with OpenCL. The new NIR path appears to expose too many cl_khr_fp64 (double-precision floating-point) OpenCL features. When I run clinfo with current Mesa devel, it shows:

  Single-precision Floating-point support         (core)
    Denormals                                     No
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No

The last entry, Support is emulated in software, should have the value Yes, because double precision is effectively emulated in software on TeraScale 2 hardware, which has no native FP64.

At the same time, Denormals, Round to zero, Round to infinity and IEEE754-2008 fused multiply-add should most likely be reported as No. Those are the values reported for single precision, and I doubt double precision is any different on TeraScale 2 hardware without native FP64.
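To make the difference concrete, here is a minimal sketch that rebuilds the double-precision bitfield from the clinfo rows above and compares it with the value argued for in this report. The bit positions are the cl_device_fp_config values from the Khronos CL/cl.h header; the helper names (`decode`, `differing_rows`) and the `expected` mask are only illustrative assumptions of mine and not anything taken from Mesa.

```python
# Bit values of cl_device_fp_config as defined in the Khronos CL/cl.h header.
CL_FP_DENORM = 1 << 0
CL_FP_INF_NAN = 1 << 1
CL_FP_ROUND_TO_NEAREST = 1 << 2
CL_FP_ROUND_TO_ZERO = 1 << 3
CL_FP_ROUND_TO_INF = 1 << 4
CL_FP_FMA = 1 << 5
CL_FP_SOFT_FLOAT = 1 << 6

# Row labels as printed by clinfo for the double-precision section.
LABELS = {
    CL_FP_DENORM: "Denormals",
    CL_FP_INF_NAN: "Infinity and NANs",
    CL_FP_ROUND_TO_NEAREST: "Round to nearest",
    CL_FP_ROUND_TO_ZERO: "Round to zero",
    CL_FP_ROUND_TO_INF: "Round to infinity",
    CL_FP_FMA: "IEEE754-2008 fused multiply-add",
    CL_FP_SOFT_FLOAT: "Support is emulated in software",
}


def decode(mask):
    """Hypothetical helper: map each clinfo row label to True/False."""
    return {label: bool(mask & bit) for bit, label in LABELS.items()}


def differing_rows(reported, expected):
    """Hypothetical helper: labels whose Yes/No value differs between masks."""
    changed = reported ^ expected
    return [label for bit, label in LABELS.items() if changed & bit]


# Double-precision value reconstructed from the clinfo output in this report.
reported = (CL_FP_DENORM | CL_FP_INF_NAN | CL_FP_ROUND_TO_NEAREST
            | CL_FP_ROUND_TO_ZERO | CL_FP_ROUND_TO_INF | CL_FP_FMA)

# Assumption of this report: same rows as single precision, plus SOFT_FLOAT.
expected = CL_FP_INF_NAN | CL_FP_ROUND_TO_NEAREST | CL_FP_SOFT_FLOAT

if __name__ == "__main__":
    print(hex(reported), hex(expected))
    for label in differing_rows(reported, expected):
        print("differs:", label)
```

On a real system the `reported` mask would come from querying CL_DEVICE_DOUBLE_FP_CONFIG; here it is hard-coded from the rows above so the sketch runs without an OpenCL installation.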

This is not a problem with R600_DEBUG=use_tgsi, because that path has no FP64 emulation at all and therefore does not expose cl_khr_fp64 support.
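The comparison between the two paths could be scripted roughly as follows. This is only a sketch: it assumes clinfo is installed and on PATH, the helper names (`env_for_path`, `exposes_fp64`, `run_clinfo`) are made up for illustration, and the check is a plain substring search on the clinfo output rather than a proper parser.

```python
import os
import subprocess


def env_for_path(use_tgsi, base_env=None):
    """Return an environment selecting the TGSI path or the default NIR path.

    Note: this replaces or drops R600_DEBUG entirely, so any other
    R600_DEBUG flags that were set are lost (a simplification).
    """
    env = dict(os.environ if base_env is None else base_env)
    if use_tgsi:
        env["R600_DEBUG"] = "use_tgsi"
    else:
        env.pop("R600_DEBUG", None)
    return env


def exposes_fp64(clinfo_text):
    """True if the clinfo output mentions the cl_khr_fp64 extension."""
    return "cl_khr_fp64" in clinfo_text


def run_clinfo(use_tgsi):
    """Run clinfo under the chosen path and return its text output."""
    result = subprocess.run(
        ["clinfo"],
        env=env_for_path(use_tgsi),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Going by the behaviour described in this report, `exposes_fp64(run_clinfo(False))` should be true on the NIR path and `exposes_fp64(run_clinfo(True))` should be false with use_tgsi.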

Addition: According to the information available here, the TeraScale 2 architecture supports the IEEE754-2008 fused multiply-add (FMA) operation. So this feature is effectively missing from the single-precision FP support.

System information

Mesa 23.0.0-devel (git-d6d772d3 2022-11-21 jammy-oibaf-ppa)

test@iMac-test:~$ inxi -b
System:
  Host: iMac-test Kernel: 5.15.0-53-generic x86_64 bits: 64
    Desktop: KDE Plasma 5.24.6 Distro: Ubuntu 22.04.1 LTS (Jammy Jellyfish)
Machine:
  Type: Desktop System: Apple product: iMac12,2 v: 1.0
    serial: <superuser required>
  Mobo: Apple model: Mac-942B59F58194171B v: iMac12,2
    serial: <superuser required> UEFI: Apple v: IM121.88Z.004F.B00.1804101150
    date: 04/10/18
CPU:
  Info: quad core Intel Core i5-2400 [MCP] speed (MHz): avg: 1600
    min/max: 1600/3400
Graphics:
  Device-1: Intel 2nd Generation Core Processor Family Integrated Graphics
    driver: i915 v: kernel
  Device-2: AMD Whistler [Radeon HD 6730M/6770M/7690M XT] driver: radeon
    v: kernel
  Device-3: Apple FaceTime HD Camera (Built-in) type: USB driver: uvcvideo
  Display: x11 server: X.Org v: 1.21.1.3 driver: X:
    loaded: ati,modesetting,radeon unloaded: fbdev,vesa gpu: radeon
    resolution: 2560x1440~60Hz
  OpenGL: renderer: AMD TURKS (DRM 2.50.0 / 5.15.0-53-generic LLVM 15.0.4)
    v: 4.5 Mesa 23.0.0-devel (git-d6d772d 2022-11-21 jammy-oibaf-ppa)
Network:
  Device-1: Broadcom NetXtreme BCM57765 Gigabit Ethernet PCIe driver: tg3
  Device-2: Qualcomm Atheros AR93xx Wireless Network Adapter driver: ath9k
Drives:
  Local Storage: total: 961.01 GiB used: 99.8 GiB (10.4%)
Info:
  Processes: 233 Uptime: 5d 13h 56m Memory: 15.6 GiB used: 4.69 GiB (30.1%)
  Shell: Bash inxi: 3.3.13

Regression

Only the new NIR path is affected by this; the old TGSI path is not.

Any extra information would be greatly appreciated

The "cl_khr_fp64" double-precision floating-point features could not be tested more thoroughly because clover has been broken on TeraScale 2 hardware for years. One example is bug #586, which shows a long-standing problem when running clpeak. More information, including current logs, can be found in that bug report.

This may also have some relevance for the upcoming rusticl OpenCL implementation.

Edited Nov 21, 2022 by lorn10
