[Regression][Bisected][20.2][radeonsi] American Truck Simulator continually allocates memory until OOM
System information
System: Host: mcoffin-dev-tower Kernel: 5.9.0-rc6-1-amd-staging-drm-next-git-00279-g57016a79d712 x86_64 bits: 64
compiler: clang v: 11.0.0 Desktop: sway 1.5-b7f28cd6 dm: N/A Distro: Arch Linux
CPU: Info: 24-Core (3-Die) model: AMD Ryzen Threadripper 3960X bits: 64 type: MT MCP MCM arch: Zen 2 L2 cache: 12.0 MiB
flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm bogomips: 364269
Speed: 3254 MHz min/max: 2200/3800 MHz Core speeds (MHz): 1: 2603 2: 1946 3: 2198 4: 2197 5: 2196 6: 2195 7: 2196
8: 1864 9: 1864 10: 2798 11: 2051 12: 3595 13: 1947 14: 2051 15: 2196 16: 2199 17: 2192 18: 3307 19: 2132 20: 2128
21: 2194 22: 2192 23: 2184 24: 2195 25: 2195 26: 2196 27: 2195 28: 2195 29: 2195 30: 2193 31: 2195 32: 2195
33: 2198 34: 1863 35: 2197 36: 1861 37: 2794 38: 1989 39: 4445 40: 2017 41: 2195 42: 2191 43: 2196 44: 2196
45: 2196 46: 2196 47: 2196 48: 2190
Graphics: Device-1: Advanced Micro Devices [AMD/ATI] Navi 10 [Radeon RX 5600 OEM/5600 XT / 5700/5700 XT] driver: amdgpu
v: kernel bus ID: 03:00.0 chip ID: 1002:731f
Device-2: Advanced Micro Devices [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] vendor: XFX Pine
driver: amdgpu v: kernel bus ID: 21:00.0 chip ID: 1002:67df
Device-3: Valve type: USB driver: uvcvideo bus ID: 8-2.1.1:4 chip ID: 28de:2400
Display: server: X.Org 1.20.9 compositor: sway driver: amdgpu resolution: 1: 2560x1440~144Hz 2: 1920x1200~60Hz
s-dpi: 96
OpenGL: renderer: AMD Radeon RX 5700 XT (NAVI10 DRM 3.40.0 5.9.0-rc6-1-amd-staging-drm-next-git-00279-g57016a79d712
LLVM 11.0.0)
v: 4.6 Mesa 20.2.0-rc4 (git-6195f7b703) direct render: Yes
Describe the issue
American Truck Simulator, as of commit 283ad85944b5d9082f0ede7ab41fb353db53fee8, hangs and leaks memory until the system runs out of memory (OOM). It worked fine on all 20.1 versions.
Regression
Last release it worked on - 20.1.8
First bad commit - 283ad85944b5d9082f0ede7ab41fb353db53fee8
Reverting the bad commit on top of 20.2 (currently: 6195f7b70306a42d42e6115fb787cd1896d3cc62) resolves the issue. (patch used to revert in testing)
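For anyone who wants to reproduce the revert test without the attached patch, the following is a minimal sketch using `git revert` instead. The function name `revert_on_top` and the checkout path in the example are assumptions for illustration, not part of the original testing, and the revert may not apply cleanly on other branches.

```shell
# Sketch: check out a base commit and revert one commit on top of it.
# Arguments: <path to git checkout> <base commit> <commit to revert>
# revert_on_top is a hypothetical helper name, not something shipped with Mesa.
revert_on_top() {
    repo=$1
    base=$2
    bad=$3
    # Detach HEAD at the base so no existing branch is modified.
    git -C "$repo" checkout --quiet --detach "$base" &&
    # Create the revert commit with git's default message.
    git -C "$repo" revert --no-edit "$bad"
}

# Example invocation (the checkout path is an assumption):
# revert_on_top "$HOME/src/mesa" \
#     6195f7b70306a42d42e6115fb787cd1896d3cc62 \
#     283ad85944b5d9082f0ede7ab41fb353db53fee8
```

After the revert, Mesa still has to be rebuilt and installed before retesting the game.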
Bisect log:
# bad: [0b8f4381b1cfeb78a35dffc1aafc58658ef7a442] VERSION: bump for 20.2.0-rc1
# good: [e60a00a35653ef8d7eddc1905a66a74026ed843d] VERSION: bump to release 20.1.8
git bisect start 'mesa-20.2.0-rc1' 'mesa-20.1.8'
# skip: [3e1b93ec4fa31014c322b970f7d8a057fdec04fe] turnip: fix wrong substream size in parse_multisample_and_color_blend
git bisect skip 3e1b93ec4fa31014c322b970f7d8a057fdec04fe
# good: [167fa2887f0928042dcb21bbc2fa89ae9a29897d] nir/validate: validate intr->num_components
git bisect good 167fa2887f0928042dcb21bbc2fa89ae9a29897d
# skip: [9cc99baa4ad64685d8f24683613d836706713366] radv: add support for dynamic depth/stencil states
git bisect skip 9cc99baa4ad64685d8f24683613d836706713366
# skip: [786325fdb02b6561f243c82d359da8e5b3360a73] nouveau: Only call nir_lower_io on shader_in/out
git bisect skip 786325fdb02b6561f243c82d359da8e5b3360a73
# skip: [2ac5cce1a1325a15afcec54ff8ca90bae64c48aa] radv: require LLVM 11+ for GFX 10.3 if not using ACO
git bisect skip 2ac5cce1a1325a15afcec54ff8ca90bae64c48aa
# good: [9cc99baa4ad64685d8f24683613d836706713366] radv: add support for dynamic depth/stencil states
git bisect good 9cc99baa4ad64685d8f24683613d836706713366
# good: [589d8665f012805f589e2e0ab6e9e04f7a8da96f] ci: Use half as many parallel softpipe / virgl test jobs
git bisect good 589d8665f012805f589e2e0ab6e9e04f7a8da96f
# good: [ba9d502d246ec408761f6d44c6a3fde227ef87a6] freedreno/ir3: add missing track_ubo_use()
git bisect good ba9d502d246ec408761f6d44c6a3fde227ef87a6
# good: [24f55eb6e808cab74ff21aa809742dc644c5c900] freedreno/rnn: rework RNN_DEF_PATH construction
git bisect good 24f55eb6e808cab74ff21aa809742dc644c5c900
# good: [4640e7da04a253388e9214f4a88252e115bf84e6] ac/nir: consider an image load/store intrinsic's access
git bisect good 4640e7da04a253388e9214f4a88252e115bf84e6
# bad: [283ad85944b5d9082f0ede7ab41fb353db53fee8] radeonsi: call nir_split_array_vars/shrink_vec_array_vars/opt_find_array_copies
git bisect bad 283ad85944b5d9082f0ede7ab41fb353db53fee8
# good: [5ae7098ebab1d15fa903d8888a1a73058e5976ff] gallium/android: Rewrite backtrace helper for android
git bisect good 5ae7098ebab1d15fa903d8888a1a73058e5976ff
# good: [a7fe711a30f1c32d4e1e187a9a240be5b9527be6] vulkan: Allow global symbol HMI for Android
git bisect good a7fe711a30f1c32d4e1e187a9a240be5b9527be6
# good: [141b295311aed28d64a850531490d2044f5b6a78] freedreno: allow fence_fd fences to be recycled
git bisect good 141b295311aed28d64a850531490d2044f5b6a78
# good: [0294eaed809fb5117c45a4c3f2e686fea4e27196] radeonsi: extend workaround for KHR-GL45.texture_view.view_classes on gfx9
git bisect good 0294eaed809fb5117c45a4c3f2e686fea4e27196
# good: [47beee2eb3f1bf73d20e9695cfc06d50193ba6ca] radeonsi: reorder NIR optimizations
git bisect good 47beee2eb3f1bf73d20e9695cfc06d50193ba6ca
# first bad commit: [283ad85944b5d9082f0ede7ab41fb353db53fee8] radeonsi: call nir_split_array_vars/shrink_vec_array_vars/opt_find_array_copies
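The log above can be saved to a file and fed back to `git bisect replay` in a Mesa checkout to repeat the bisection. As a smaller convenience, here is a sketch of two helpers for summarizing a saved log. The helper names `first_bad_commit` and `count_steps` and the file name `bisect.log` are assumptions for illustration.

```shell
# Sketch: summarize a saved git bisect log.
# Both helper names are hypothetical; they only parse text and do not call git.

# Print the hash from the "# first bad commit: [...]" line, if present.
first_bad_commit() {
    sed -n 's/^# first bad commit: \[\([0-9a-f]*\)].*/\1/p' "$1"
}

# Count how many "git bisect <verdict> ..." steps the log contains.
# Arguments: <log file> <good|bad|skip>
count_steps() {
    grep -c "^git bisect $2 " "$1"
}

# Example usage (bisect.log is an assumed file name):
# first_bad_commit bisect.log
# count_steps bisect.log skip
```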
Log files as attachment
- dmesg: dmesg.log
- Backtrace: N/A because the program loops, continually allocating memory
- GPU hang details: N/A because the GPU doesn't hang
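Since no backtrace is available, the memory growth itself can be recorded instead. The following is a rough sketch that samples the resident set size of a process from `/proc` on Linux. The helper names `rss_kib` and `watch_rss` are assumptions for illustration, and the PID of the game process has to be looked up separately.

```shell
# Sketch: sample a process's resident memory from /proc (Linux only).
# rss_kib and watch_rss are hypothetical helper names.

# Print the VmRSS value (in KiB) of the given PID.
rss_kib() {
    awk '/^VmRSS:/ { print $2 }' "/proc/$1/status"
}

# Print "<unix timestamp> <RSS in KiB>" every <interval> seconds
# (default 1) until the process exits.
watch_rss() {
    pid=$1
    interval=${2:-1}
    while kill -0 "$pid" 2>/dev/null; do
        printf '%s %s\n' "$(date +%s)" "$(rss_kib "$pid")"
        sleep "$interval"
    done
}

# Example usage, with the game's PID substituted for <pid>:
# watch_rss <pid> 1 > rss.log
```

A steadily rising second column after pressing "Drive" would match the leak described in this report.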
Screenshots/video files (if applicable)
I can get some if necessary, but they would be largely uninteresting. The "crash" (hang?) occurs immediately after hitting the "Drive" button, before the loading screen appears.
Any extra information would be greatly appreciated
This happens on both a POLARIS10 card (RX 590) and a NAVI10 card (RX 5700 XT). Reverting the offending commit on top of 20.2 resolves the issue (see the Regression section).
I have tried both on native X11 and through Xwayland (sway), with the same results.