MCE hardware error while playing games
Brief summary of the problem:
While playing games, I get random MCE (machine check exception) hardware errors, which cause hard reboots.
- CPU: AMD Ryzen 5 3600 6-Core
- GPU: AMD Radeon RX 5700 XT (NAVI10, DRM 3.40.0, 5.10.7-3-MANJARO, LLVM 11.0.1)
- System Memory: 16 GB
- Display(s): 1920x1080 144 Hz (DisplayPort) + 1280x1024 75 Hz (HDMI)
- Distro name and Version: Manjaro Linux
- Kernel version: 5.10.7-3-MANJARO
- AMD package version: Mesa 20.3.3
How to reproduce the issue:
The best way to reproduce the problem is to play Boneworks (a VR game). The issue happens almost 100% of the time when I reach the main menu. I am using a Valve Index.
It also happens in other VR games such as Half-Life Alyx, and I have experienced it in The Witcher 3 (non-VR) as well. With those games, however, it is really random: sometimes I can play for two hours straight with no problem, and sometimes it happens approximately once every hour.
Here is a short list of games where it happened:
- Witcher 3
- Half-Life Alyx (native Linux version)
- Skyrim VR
- Boneworks
Maybe others, but I am not sure.
This is also the computer I use for my daily work as a PHP developer. The MCE errors happen only in games, never in any other situation.
Note that a similar issue is reported here: https://bugzilla.kernel.org/show_bug.cgi?id=206903
- journalctl.log: this log covers the period from the reboot after one MCE up to the next MCE.
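
For anyone reading the attached log, here is a minimal sketch for pulling out only the machine-check lines. It assumes the kernel tags those lines with `[Hardware Error]`, which is its usual prefix for such messages; the function names and the default file name are only illustrative and not part of any existing tool.

```python
# Sketch: keep only the lines of a saved journal dump that carry the
# kernel's hardware-error tag. Names below are hypothetical helpers.

MARKER = "[Hardware Error]"  # assumed tag on machine-check lines


def hardware_error_lines(lines):
    """Return the lines containing MARKER, with trailing newlines removed."""
    return [line.rstrip("\n") for line in lines if MARKER in line]


def hardware_errors_in_file(path="journalctl.log"):
    """Read a saved log (path is an assumption) and return its tagged lines."""
    # errors="replace" avoids a crash on stray non-UTF-8 bytes in the dump.
    with open(path, encoding="utf-8", errors="replace") as fh:
        return hardware_error_lines(fh)
```

Calling `hardware_errors_in_file("journalctl.log")` and printing the result should leave only the entries relevant to the reboots, which is easier to compare against the linked kernel bug than the full journal.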