Asus AMD dGPU Hainan overheat and reboot
Brief summary of the problem:
< I have a problem on Ubuntu with a discrete GPU (Hainan chip): my Asus X555QG reboots every time I try to play games with DRI_PRIME=1. >
Hardware description:
- CPU: <AMD A12-9720P RADEON R7 4C+8G (4) @ 2.700GHz >
- GPU:
Graphics:
  Device-1: Advanced Micro Devices [AMD/ATI] Wani [Radeon R5/R6/R7 Graphics] driver: amdgpu v: 5.6.19
  Device-2: AMD Sun XT [Radeon HD 8670A/8670M/8690M / R5 M330 / M430 / Radeon 520 Mobile] driver: amdgpu v: 5.6.19
  Display: x11 server: X.Org 1.20.8 driver: amdgpu,radeon resolution: 1366x768~60Hz, 1280x1024~60Hz
  OpenGL: renderer: AMD Radeon R7 Graphics (CARRIZO DRM 3.40.0 5.4.0-58-generic LLVM 11.0.0) v: 4.6 Mesa 20.3
- System Memory: < 10990MiB >
- Display(s): <hdmi+vga+laptop screen>
- Type of Display Connection: <TODO: DP, HDMI, DVI, etc>
System information:
- Distro name and Version: <Ubuntu 20.04.1 x64>
- Kernel version: < 5.4.0-58-generic>
- AMD package version: <"No package">
How to reproduce the issue:
< Run any game with DRI_PRIME=1 and set the GPU profile to "performance" or "Auto"; the core clock simply climbs to 1030 MHz. >
< NOTE: I also posted this issue on the radeon-profile tracker, https://github.com/marazmista/radeon-profile/issues/247, and I am pasting some parts of it here:
I have passed the DC and DPM parameters to the kernel through GRUB, but I can't really configure much there. I tried CoreCtrl before radeon-profile, and there I only had two options: "Low" = 300 MHz (clk) & 300 MHz (mclk), and "High" = 1030 MHz (clk) & 900 MHz (mclk). There is also "Auto", but it always runs at the High state, 1030 MHz x 900 MHz, and this is what overheats my laptop, by the way.
At that point I decided to uninstall CoreCtrl and try Radeon Profile. It gives me more information and one more mode than CoreCtrl: 400 MHz (clk) & 600 MHz (mclk), which makes some games playable (I really appreciate that from Radeon Profile, thanks). But I still wonder: since this GPU reaches 105 °C at 1030 MHz, maybe there is a way to run at something lower than the High state (1030 MHz x 900 MHz) and higher than the Auto/Battery state (400 MHz x 600 MHz). I suppose it could be stable at 600 MHz x 600 MHz, and I would be happy with that, but writing to the pp_* files with $sudo echo > /pp_* has never worked for me so far :(
DRM_IOCTL_AMDGPU_INFO: Invalid argument
pp_dpm_get_mclk_od was not implemented.
pp_dpm_get_sclk_od was not implemented. <- this is the message that worries me
The other thing I was thinking about is a way to put temperature ranges on the events. The low-temperature event works, but the high-temperature event does not behave as I expect: I would like the "high" profile to apply only below 90 °C, and a way to raise the number of cycles so the GPU stays below 90 °C. Thanks for your attention, and sorry for my bad English.
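The temperature-range idea above can be sketched as a small hysteresis rule: drop to the low level at or above an upper threshold, and go back to the high level only once the temperature is at or below a lower one, so the profile does not flip back and forth around 90 °C. This is only an illustration, not how radeon-profile implements its events. The function name `pick_level`, the 80/90 °C thresholds, and the `card1` index in the usage comment are assumptions; the hwmon `temp1_input` file reports millidegrees Celsius.

```shell
#!/bin/sh
# pick_level TEMP_MILLI CURRENT: print the performance level to use next.
# (hypothetical helper, for illustration only)
# Thresholds are in millidegrees Celsius; both values are assumptions.
UPPER=90000   # at or above this, force "low"
LOWER=80000   # at or below this, allow "high" again

pick_level() {
    temp=$1
    current=$2
    if [ "$temp" -ge "$UPPER" ]; then
        echo low
    elif [ "$temp" -le "$LOWER" ]; then
        echo high
    else
        # Between the two thresholds: keep the active level (hysteresis).
        echo "$current"
    fi
}

# Example (assumption: the dGPU is card1; check /sys/class/drm/ first):
# pick_level "$(cat /sys/class/drm/card1/device/hwmon/hwmon*/temp1_input)" high
```

The two thresholds are kept apart on purpose: with a single 90 °C cut-off the level would toggle on every reading near that value.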
Edit: This output is from amdcovc (easy to copy and paste):
Adapter 0: Wani [Radeon R5/R6/R7 Graphics]
Device Topology: 0:1:0
Vendor ID: 4098 (0x1002)
Device ID: 39028 (0x9874)
Current CoreClock: 626 MHz
Current MemoryClock: 667 MHz
Core Overdrive: 0
Memory Overdrive: 0
Current Voltage: 875 mV
Performance Control: high
GPU Load: 3%
Current BusSpeed: 0
Current BusLanes: 0
Temperature (edge): 72°C
Critical temperature: 0 C
FanSpeed Min (Value): 0
FanSpeed Max (Value): 0
Current FanSpeed: -nan%
Controlled FanSpeed: no
Core Clocks:
300MHz
480MHz
533MHz
576MHz
626MHz
685MHz
720MHz
757MHz
Memory Clocks:
667MHz
933MHz
Adapter 1: Sun XT [Radeon HD 8670A/8670M/8690M / R5 M330 / M430 / Radeon 520 Mobile]
Device Topology: 3:0:0
Vendor ID: 4098 (0x1002)
Device ID: 26208 (0x6660)
Current CoreClock: 0 MHz
Current MemoryClock: 0 MHz
Core Overdrive: 0
Memory Overdrive: 0
Performance Control: auto
Current BusSpeed: 0
Current BusLanes: 0
Temperature (edge): 72°C
Critical temperature: 120 C
FanSpeed Min (Value): 0
FanSpeed Max (Value): 0
Current FanSpeed: -nan%
Controlled FanSpeed: no
With Radeon Profile, the R5 Sun XT (Hainan) reports these frequencies (clk x mclk):
300 MHz x 300 MHz (this is what I get in the low profile)
400 MHz x 600 MHz (this is auto + battery)
400 MHz x 900 MHz
800 MHz x 900 MHz (if I could get this as a fixed state I would be very happy, but it only happens during the credits in games; it quickly escalates to high and never stays at this one)
955 MHz x 900 MHz
1030 MHz x 900 MHz (this is what the auto, medium, and high profiles in the app use, and I always get a reboot when I go in-game)
There seem to be other combinations, but I don't know how to force them because I don't have some of the device files for that :3 and when I try with sudo echo, the output is "permission denied" or "bad parameter".
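One likely cause of the "permission denied" message, independent of the driver: with `sudo echo value > file`, the redirection is done by the unprivileged shell, not by `sudo`, so the write to a root-owned sysfs file fails. A minimal sketch of the usual workaround with `tee` is below. The helper name `write_sysfs` and the `card1` index are assumptions for illustration. `power_dpm_force_performance_level` is a documented amdgpu sysfs file, but which values this Hainan chip accepts is not confirmed here; a "bad parameter" style error may mean the write reached the driver and the value was rejected.

```shell
#!/bin/sh
# write_sysfs VALUE FILE: write VALUE into FILE, elevating only when needed.
# (hypothetical helper name, for illustration only)
write_sysfs() {
    value=$1
    file=$2
    if [ -w "$file" ]; then
        # Already writable (for example when running as root): redirect directly.
        printf '%s\n' "$value" > "$file"
    else
        # tee runs under sudo, so the privileged process opens the file.
        printf '%s\n' "$value" | sudo tee "$file" > /dev/null
    fi
}

# Example (assumption: the dGPU is card1; check /sys/class/drm/ first):
# write_sysfs low /sys/class/drm/card1/device/power_dpm_force_performance_level
```

Reading the file back with `cat` afterwards shows whether the driver accepted the value.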
And I almost forgot to say THANK YOU for your great work, and thanks for paying attention.
Attached files:
Xorg log: https://justpaste.me/OgzI1