amdgpu: Locking bug within amdgpu_cs_submit() on drm-tip at v5.5-rc4
I got the following warning and stack trace while starting gnome-shell. The kernel is drm-tip at commit b70a5ffaee31, which is based on v5.5-rc4.
[ 77.207852] ------------[ cut here ]------------
[ 77.212511] DEBUG_LOCKS_WARN_ON(lock->magic != lock)
[ 77.212519] WARNING: CPU: 0 PID: 2358 at kernel/locking/mutex.c:938 __mutex_lock+0x8d5/0x8f0
[ 77.225907] Modules linked in: amdgpu(E) ebtable_filter(E) ebtables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) ip_tables(E) x_tables(E) af_packet(E) ppdev(E) parport_pc(E) parport(E) fuse(E) vmw_vsock_vmci_transport(E) vsoc)
[ 77.225935] glue_helper(E) thermal(E) button(E) btrfs(E) blake2b_generic(E) libcrc32c(E) xor(E) raid6_pq(E) hid_logitech_hidpp(E) hid_logitech_dj(E) crc32c_intel(E) sr_mod(E) cdrom(E) wmi(E) video(E) hid_generic(E) uas(E) usbhid(E) u)
[ 77.342262] CPU: 0 PID: 2358 Comm: gnome-shel:cs0 Tainted: G E 5.5.0-rc4-1-default+ #186
[ 77.351648] Hardware name: Dell Inc. OptiPlex 9020/0N4YC8, BIOS A24 10/24/2018
[ 77.358865] RIP: 0010:__mutex_lock+0x8d5/0x8f0
[ 77.363306] Code: c0 0f 84 a2 f7 ff ff 44 8b 05 67 4b cc 00 45 85 c0 0f 85 92 f7 ff ff 48 c7 c6 ba d4 38 92 48 c7 c7 af 5b 38 92 e8 1b b2 57 ff <0f> 0b e9 78 f7 ff ff 4d 89 fd 4c 8b bd 58 ff ff ff e9 17 fb ff ff
[ 77.382063] RSP: 0018:ffffbef34140ba30 EFLAGS: 00010286
[ 77.387286] RAX: 0000000000000028 RBX: 0000000000000000 RCX: 0000000000000000
[ 77.394416] RDX: 0000000000000002 RSI: ffffffff9111fe5d RDI: 0000000000000246
[ 77.401546] RBP: ffffbef34140baf0 R08: 0000000000000000 R09: 0000000000000028
[ 77.408676] R10: ffffffff9425b140 R11: 000000009425ad73 R12: 0000000000000000
[ 77.415804] R13: ffff9af21b861c20 R14: ffff9af222bfc678 R15: ffffbef34140bb80
[ 77.422934] FS: 00007ffb15b85700(0000) GS:ffff9af23ce00000(0000) knlGS:0000000000000000
[ 77.431019] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 77.436759] CR2: 00005587389f9028 CR3: 00000007e14d6006 CR4: 00000000001606f0
[ 77.443887] Call Trace:
[ 77.446335] ? __is_kernel_percpu_address+0x37/0xa0
[ 77.451212] ? module_assert_mutex_or_preempt+0x14/0x40
[ 77.456433] ? __module_address+0x28/0xf0
[ 77.460509] ? amdgpu_cs_submit.isra.0+0x77/0x350 [amdgpu]
[ 77.465992] ? rcu_read_lock_sched_held+0x32/0x40
[ 77.470693] ? __raw_spin_lock_init+0x2d/0x50
[ 77.475050] ? dma_fence_init+0xe1/0x110
[ 77.478972] ? drm_sched_fence_create+0xb1/0xc0 [gpu_sched]
[ 77.484588] ? amdgpu_cs_submit.isra.0+0x77/0x350 [amdgpu]
[ 77.490115] amdgpu_cs_submit.isra.0+0x77/0x350 [amdgpu]
[ 77.495467] amdgpu_cs_ioctl+0x2c6/0x580 [amdgpu]
[ 77.500223] ? amdgpu_cs_vm_handling+0x400/0x400 [amdgpu]
[ 77.505623] drm_ioctl_kernel+0x86/0xd0
[ 77.509458] drm_ioctl+0x1e4/0x36b
[ 77.512901] ? amdgpu_cs_vm_handling+0x400/0x400 [amdgpu]
[ 77.518300] ? sched_clock+0x5/0x10
[ 77.521785] ? sched_clock_cpu+0xc/0xa0
[ 77.525621] ? lockdep_hardirqs_on+0xf0/0x180
[ 77.530017] amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
[ 77.534632] do_vfs_ioctl+0x31f/0x750
[ 77.538297] ksys_ioctl+0x5e/0x90
[ 77.541609] __x64_sys_ioctl+0x16/0x20
[ 77.545356] do_syscall_64+0x5a/0x240
[ 77.549016] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 77.554066] RIP: 0033:0x7ffb2336e387
[ 77.557637] Code: 00 00 90 48 8b 05 f9 9a 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c9 9a 0c 00 f7 d8 64 89 01 48
[ 77.576394] RSP: 002b:00007ffb15b84888 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 77.583958] RAX: ffffffffffffffda RBX: 00007ffb15b848f0 RCX: 00007ffb2336e387
[ 77.591089] RDX: 00007ffb15b848f0 RSI: 00000000c0186444 RDI: 000000000000000f
[ 77.598218] RBP: 00000000c0186444 R08: 00007ffb15b84a00 R09: 0000000000000020
[ 77.605347] R10: 00007ffb15b84a00 R11: 0000000000000246 R12: 000055873874b510
[ 77.612475] R13: 000000000000000f R14: 000055873865898c R15: 0000558738650798
[ 77.619609] irq event stamp: 85
[ 77.622747] hardirqs last enabled at (85): [<ffffffff912c4500>] kmem_cache_alloc+0x1f0/0x630
[ 77.631267] hardirqs last disabled at (84): [<ffffffff912c4391>] kmem_cache_alloc+0x81/0x630
[ 77.639700] softirqs last enabled at (0): [<ffffffff91097f9a>] copy_process+0x71a/0x1df0
[ 77.647872] softirqs last disabled at (0): [<0000000000000000>] 0x0
[ 77.654131] ---[ end trace fbc5eb12c627d323 ]---
Edited by Thomas Zimmermann