BAR writes are slow with certain machine configurations
Brief summary of the problem:
BAR (DEVICE_LOCAL | HOST_COHERENT) writes can be excessively slow for certain regions.
Hardware description:
- CPU: Ryzen 7 3700X (Zen 2)
- GPU: Radeon RX 5700 XT (PowerColor Red Devil)
- MB: Gigabyte X570 Aorus Elite
- System Memory: 16GB
System information:
- Distro name and Version: Arch Linux
- Kernel version: 5.15/5.16 tested
- Custom kernel: Arch stock/xanmod-tt/drm-next tested
- AMD official driver version: N/A (issue happens regardless of userland driver implementation)
How to reproduce the issue:
The linked benchmark measures BAR clearing speed while allocating up to 90% of VRAM. The VRAM size is hardcoded to 16GB, so manually adjust line 31 to match your card.
On my setup, roughly half of the regions can be cleared quickly (~1.5ms), while the rest are excessively slow (~20ms).
After a BIOS update, the slow regions improved to ~10ms, which is still unacceptably slow compared to the fast regions.
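For anyone who wants to look at per-region numbers without the linked benchmark, the measurement idea can be sketched roughly as follows. This is not the benchmark's code: the helper names `time_region_clears` and `slow_regions` and the 4x threshold are made up for illustration, and the buffer here is an ordinary `bytearray`, whereas the real test writes to a mapped DEVICE_LOCAL | HOST_COHERENT allocation.

```python
import time


def time_region_clears(buf, region_size):
    """Zero a writable buffer region by region; return per-region times in ms."""
    view = memoryview(buf)
    zeros = bytes(region_size)
    timings_ms = []
    for start in range(0, len(view), region_size):
        end = min(start + region_size, len(view))
        t0 = time.perf_counter()
        view[start:end] = zeros[: end - start]  # the timed "clear"
        timings_ms.append((time.perf_counter() - t0) * 1e3)
    return timings_ms


def slow_regions(timings_ms, factor=4.0):
    """Indices of regions slower than `factor` times the fastest region.

    The factor of 4 is an arbitrary assumption, not a value from the report.
    """
    baseline = min(timings_ms)
    return [i for i, t in enumerate(timings_ms) if t > factor * baseline]


if __name__ == "__main__":
    # Stand-in buffer in system RAM; a real run would pass the mapped BAR memory.
    stand_in = bytearray(64 * 1024 * 1024)
    timings = time_region_clears(stand_in, 8 * 1024 * 1024)
    print([round(t, 3) for t in timings], slow_regions(timings))
```

With timings like the ones reported here (a mix of ~1.5ms and ~20ms regions), `slow_regions` would return the indices of the ~20ms regions.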
The issue reproduces when both Above 4G decoding and Resizable BAR are enabled in the BIOS.
For me, it goes away if I either disable Resizable BAR in the BIOS or manually patch the kernel to rebind the BAR memory somewhere else (by patching out this check).
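Since the behavior depends on where the BAR ends up, it may help to attach the BAR placement when reporting. A minimal sketch for reading it from sysfs is below. It relies on the `resource` file under `/sys/bus/pci/devices/<address>/`, where each line holds start, end, and flags in hex. The helper names are made up for illustration, the device address in the usage comment is a placeholder, and the sample values are invented, not taken from my machine.

```python
def parse_pci_resources(text):
    """Parse the text of a sysfs PCI `resource` file.

    Returns a list of (index, start_address, size_in_bytes) tuples,
    skipping unused (all-zero) slots.
    """
    regions = []
    for index, line in enumerate(text.splitlines()):
        fields = line.split()
        if len(fields) != 3:
            continue  # not a "start end flags" line
        start, end, _flags = (int(f, 16) for f in fields)
        if start == 0 and end == 0:
            continue  # unused resource slot
        regions.append((index, start, end - start + 1))
    return regions


def above_4g(start):
    """True if the region starts at or above the 4 GiB boundary."""
    return start >= 1 << 32


if __name__ == "__main__":
    # Illustrative contents only; on a real system read e.g.
    # /sys/bus/pci/devices/0000:0b:00.0/resource (address is a placeholder).
    sample = (
        "0x0000007c00000000 0x0000007dffffffff 0x000000000014220c\n"
        "0x0000000000000000 0x0000000000000000 0x0000000000000000\n"
        "0x00000000fce00000 0x00000000fcefffff 0x0000000000040200\n"
    )
    for index, start, size in parse_pci_resources(sample):
        print(f"resource {index}: start={start:#x} size={size >> 20} MiB "
              f"above_4g={above_4g(start)}")
```

A region whose size matches the full VRAM size and which sits above 4G indicates that the resized BAR is in use.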
Related issues:
Past reports related to poor SAM performance are likely due to this issue. 10x upload time is a reasonable explanation for halved frame rate.
Another user on Discord with an X370 board and a Navi 2 GPU reports the same issue with a different configuration (no reBAR support in the BIOS, Above 4G decoding enabled, BAR resized by the kernel).