[Regression][Bisected] White background rendered in OpenGL windows when DRI_PRIME=1 is used
Hi, I was sent here by the upstream IOMMU maintainer, who wrote:
This is likely a bug in the DRM code for your GPU. The recent changes in the AMD IOMMU driver might cause sg-list entries to be merged by the DMA-API. Some drivers probably don't handle this correctly. I've seen a similar problem with RDMA devices, caused by the same change.
(RDMA thread mentioned above: https://www.spinics.net/lists/linux-nfs/msg76396.html)
Bug report:
Windows of OpenGL applications run with the DRI_PRIME=1 environment variable (hybrid graphics) are filled with a white background when scaled instead of being rendered properly.
How reproducible:
Always
Steps to Reproduce:
- Boot kernel version 5.5-rc1 or later and log in to your Xorg session (I use xmonad without a compositor).
- Run any OpenGL application with DRI_PRIME=1 set. I used glxgears for this test; mpv with --vo=opengl and radium are known to be affected as well. For example:
DRI_PRIME=1 glxgears
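Before reproducing, it can help to confirm that DRI_PRIME=1 really moves rendering to the second GPU. The following is a minimal sketch, assuming glxinfo (shipped with the Mesa demo utilities) is installed; renderer_of is a hypothetical helper name used only for this illustration:

```shell
# Print the OpenGL renderer name from glxinfo output read on stdin.
# renderer_of is a hypothetical helper, not part of Mesa or glxinfo.
renderer_of() {
    sed -n 's/^OpenGL renderer string: //p'
}

# Usage (needs a running X session):
#   glxinfo | renderer_of                # default GPU
#   DRI_PRIME=1 glxinfo | renderer_of    # offload GPU
```

If both commands print the same renderer, the offload GPU is not being selected and the white-background symptom would not be expected to appear.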
Actual results:
Half of the window is filled with white. The white pixels do not go away when the window is resized.
Expected results:
Application is rendered correctly.
Bisect results:
commit be62dbf554c5b50718a54a359372c148cd9975c7
Author: Tom Murphy <murphyt7@tcd.ie>
Date:   Sun Sep 8 09:56:41 2019 -0700

    iommu/amd: Convert AMD iommu driver to the dma-iommu api

    Convert the AMD iommu driver to the dma-iommu api. Remove the iova
    handling and reserve region code from the AMD iommu driver.

    Signed-off-by: Tom Murphy <murphyt7@tcd.ie>
    Signed-off-by: Joerg Roedel <jroedel@suse.de>

 drivers/iommu/Kconfig     |   1 +
 drivers/iommu/amd_iommu.c | 692 +++++----------------------------------------
 2 files changed, 68 insertions(+), 625 deletions(-)
Extra info:
dmesg prints this from radeon/amd-vi once when glxgears is started:
AMD-Vi: Event logged [IO_PAGE_FAULT device=08:00.0 domain=0x0000 address=0xfffffff2c0 flags=0x0010]
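To see which PCI device triggers the fault and whether the address varies between runs, the log line above can be reduced to its device and address fields. This is a rough sketch that assumes the log format shown in this report; parse_io_page_faults is a hypothetical helper name:

```shell
# Read kernel log text on stdin and print "device address" for each
# AMD-Vi IO_PAGE_FAULT line. Lines that do not match are skipped.
# parse_io_page_faults is a hypothetical helper written for this report.
parse_io_page_faults() {
    sed -n 's/.*IO_PAGE_FAULT device=\([0-9a-fA-F:.]*\) .*address=\(0x[0-9a-fA-F]*\).*/\1 \2/p'
}

# Usage (reading dmesg may require root on some systems):
#   dmesg | parse_io_page_faults | sort | uniq -c
#   lspci -s 08:00.0    # identify the faulting device
```

For the line quoted above this yields device 08:00.0 and address 0xfffffff2c0. The lspci lookup then shows which GPU issued the faulting DMA access.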
awk -f scripts/ver_linux
GNU C                   9.2.0
GNU Make                4.3
Binutils                2.33.1
Util-linux              2.35.1
Mount                   2.35.1
Module-init-tools       26
E2fsprogs               1.45.5
Jfsutils                1.1.15
Reiserfsprogs           3.6.27
Xfsprogs                5.4.0
Bison                   3.5.1
Flex                    2.6.4
Linux C Library         2.30
Dynamic linker (ldd)    2.30
Linux C++ Library       6.0.27
Procps                  3.3.15
Net-tools               2.10
Kbd                     2.2.0
Console-tools           2.2.0
Sh-utils                8.31
Udev                    244
Modules Loaded          acpi_cpufreq aesni_intel agpgart ahci amdgpu asus_wmi async_memcpy async_pq async_raid6_recov async_tx async_xor battery bcache blake2b_generic btrfs ccp crc32c_generic crc32c_intel crc32_pclmul crc64 crct10dif_pclmul cryptd crypto_simd dca dm_mod dm_raid drm drm_kms_helper eeepc_wmi evdev fat fb_sys_fops ghash_clmulni_intel glue_helper gpio_amdpt gpu_sched hid hid_generic i2c_algo_bit i2c_piix4 igb input_leds ip_tables irqbypass jc42 joydev k10temp kvm kvm_amd libahci libata libcrc32c mac_hid macvlan macvtap mc md_mod mousedev mxm_wmi nls_cp437 nls_iso8859_1 pcspkr pinctrl_amd radeon raid1 raid456 raid6_pq rfkill rng_core scsi_mod sd_mod snd snd_aloop snd_hda_codec snd_hda_codec_hdmi snd_hda_core snd_hda_intel snd_hwdep snd_intel_dspcfg snd_pcm snd_rawmidi snd_timer snd_usb_audio snd_usbmidi_lib soundcore sparse_keymap syscopyarea sysfillrect sysimgblt tap ttm tun usbhid vfat vfio vfio_iommu_type1 vfio_pci vfio_virqfd vhost vhost_net wmi wmi_bmof xfs xhci_hcd xhci_pci xor x_tables
Known fixes:
Either reverting commit be62dbf554c5b50718a54a359372c148cd9975c7 on top of the 5.5 tag or disabling the IOMMU in the BIOS resolves the issue, and glxgears is then rendered correctly.