ACO: GPU hang with Vega 56 in Strange Brigade benchmark and Witcher 3
Hi,
I get a GPU hang in Witcher 3 and in the Strange Brigade benchmark using master (up to b93a1952) when ACO is enabled. This does not happen with LLVM. This is a fairly new issue; both worked fine in the past.
Witcher 3 error message:
2020-03-14T16:21:47.494082+01:00 gamebox kernel: [27679.369998] amdgpu 0000:09:00.0: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:6 pasid:32785, for process witcher3.exe pid 31088 thread witcher3.exe pid 31234)
2020-03-14T16:21:47.494095+01:00 gamebox kernel: [27679.370001] amdgpu 0000:09:00.0: in page starting at address 0x000000010007d000 from client 27
2020-03-14T16:21:47.494096+01:00 gamebox kernel: [27679.370002] amdgpu 0000:09:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00601431
2020-03-14T16:21:47.494097+01:00 gamebox kernel: [27679.370003] amdgpu 0000:09:00.0: MORE_FAULTS: 0x1
2020-03-14T16:21:47.494097+01:00 gamebox kernel: [27679.370003] amdgpu 0000:09:00.0: WALKER_ERROR: 0x0
2020-03-14T16:21:47.494102+01:00 gamebox kernel: [27679.370011] amdgpu 0000:09:00.0: PERMISSION_FAULTS: 0x3
2020-03-14T16:21:47.494102+01:00 gamebox kernel: [27679.370012] amdgpu 0000:09:00.0: MAPPING_ERROR: 0x0
2020-03-14T16:21:47.494103+01:00 gamebox kernel: [27679.370012] amdgpu 0000:09:00.0: RW: 0x0
2020-03-14T16:21:58.076073+01:00 gamebox kernel: [27689.950733] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
2020-03-14T16:22:08.322322+01:00 gamebox kernel: [27700.198563] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=638980, emitted seq=638982
2020-03-14T16:22:08.322332+01:00 gamebox kernel: [27700.198635] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process witcher3.exe pid 31088 thread witcher3.exe pid 31234
2020-03-14T16:22:08.322333+01:00 gamebox kernel: [27700.198638] amdgpu 0000:09:00.0: GPU reset begin!
2020-03-14T16:22:08.493139+01:00 gamebox kernel: [27700.369508] amdgpu 0000:09:00.0: GPU BACO reset
2020-03-14T16:22:09.016073+01:00 gamebox kernel: [27700.890986] amdgpu 0000:09:00.0: GPU reset succeeded, trying to resume
2020-03-14T16:22:09.494005+01:00 gamebox kernel: [27701.369272] amdgpu 0000:09:00.0: ring gfx uses VM inv eng 0 on hub 0
2020-03-14T16:22:09.494006+01:00 gamebox kernel: [27701.369273] amdgpu 0000:09:00.0: ring comp_1.0.0 uses VM inv eng 1 on hub 0
2020-03-14T16:22:09.494006+01:00 gamebox kernel: [27701.369273] amdgpu 0000:09:00.0: ring comp_1.1.0 uses VM inv eng 4 on hub 0
2020-03-14T16:22:09.494007+01:00 gamebox kernel: [27701.369274] amdgpu 0000:09:00.0: ring comp_1.2.0 uses VM inv eng 5 on hub 0
2020-03-14T16:22:09.494008+01:00 gamebox kernel: [27701.369275] amdgpu 0000:09:00.0: ring comp_1.3.0 uses VM inv eng 6 on hub 0
2020-03-14T16:22:09.494009+01:00 gamebox kernel: [27701.369276] amdgpu 0000:09:00.0: ring comp_1.0.1 uses VM inv eng 7 on hub 0
2020-03-14T16:22:09.494010+01:00 gamebox kernel: [27701.369276] amdgpu 0000:09:00.0: ring comp_1.1.1 uses VM inv eng 8 on hub 0
2020-03-14T16:22:09.494011+01:00 gamebox kernel: [27701.369277] amdgpu 0000:09:00.0: ring comp_1.2.1 uses VM inv eng 9 on hub 0
Strange Brigade error message (unfortunately the log is cut off):
2020-03-14T19:20:56.694658+01:00 gamebox kernel: [10009.563944] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
2020-03-14T19:21:06.934645+01:00 gamebox kernel: [10019.803675] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
Strange Brigade RenderDoc capture: https://drive.google.com/open?id=1G9AAuoLU9HNYJuFYVnAF2AtvsGuK4MPX
Witcher 3 RenderDoc capture: https://drive.google.com/open?id=18xc1eNlCXYJKePC7mkiTPcnq3GHqsL9F
Many thanks!
Christian