shutdown threshold temperature sometimes isn't restored properly after hibernate
Submitted by Mr-4
Assigned to Nouveau Project
Description
This is what I got in the kernel log about an hour ago, after restoring from hibernate:
Aug 26 13:04:36 test1 kernel: nouveau E[ PFIFO][0000:01:00.0] DMA_PUSHER - ch 1 [Xorg[1928]] get 0x00021cfc put 0x0001dcc8 state 0x8002b8c8 (err: INVALID_CMD) push 0x00000000
Aug 26 13:04:36 test1 kernel: nouveau E[ PFIFO][0000:01:00.0] DMA_PUSHER - ch 1 [Xorg[1928]] get 0x0002daa8 put 0x00042324 state 0x80000000 (err: INVALID_CMD) push 0x5f000000
Aug 26 13:04:36 test1 kernel: nouveau E[ PFIFO][0000:01:00.0] DMA_PUSHER - ch 1 [Xorg[1928]] get 0x0004232c put 0x00008800 state 0x80000000 (err: INVALID_CMD) push 0xff010000
Aug 26 13:04:36 test1 kernel: nouveau E[ PFIFO][0000:01:00.0] DMA_PUSHER - ch 1 [Xorg[1928]] get 0x00008810 put 0x0000dd88 state 0x80000000 (err: INVALID_CMD) push 0xff010000
Aug 26 13:04:36 test1 kernel: nouveau E[ PFIFO][0000:01:00.0] DMA_PUSHER - ch 1 [Xorg[1928]] get 0x0000dd88 put 0x0000a0cc state 0x00000000 (err: NONE) push 0x4d011000
Aug 26 13:04:36 test1 kernel: nouveau E[ PFIFO][0000:01:00.0] DMA_PUSHER - ch 1 [Xorg[1928]] get 0x0000a0d0 put 0x00008800 state 0x80000000 (err: INVALID_CMD) push 0xff010000
Aug 26 13:04:36 test1 kernel: nouveau [ PTHERM][0000:01:00.0] temperature (0 C) hit the 'shutdown' threshold
Aug 26 13:04:36 test1 kernel: nouveau E[ PFIFO][0000:01:00.0] DMA_PUSHER - ch 1 [Xorg[1928]] get 0x00008810 put 0x80002264 state 0x80000000 (err: INVALID_CMD) push 0xff011000
Aug 26 13:04:36 test1 kernel: nouveau W[ PFIFO][0000:01:00.0] unknown intr 0x00010000, ch 3
Aug 26 13:04:36 test1 kernel: nouveau W[ PFIFO][0000:01:00.0] unknown intr 0x00010000, ch 9
[...]
Aug 26 13:04:42 test1 kernel: nouveau E[Xorg[1928]] failed to idle channel 0xcccc0000 [Xorg[1928]]
Aug 26 13:04:43 test1 acpid: exiting
Aug 26 13:04:45 test1 kernel: nouveau E[Xorg[1928]] failed to idle channel 0xcccc0000 [Xorg[1928]]
Aug 26 13:04:45 test1 kernel: nouveau W[ PFIFO][0000:01:00.0] unknown intr 0x00010000, ch 2
Aug 26 13:04:45 test1 kernel: nouveau W[ PFIFO][0000:01:00.0] unknown intr 0x00010000, ch 2
Aug 26 13:04:45 test1 kernel: nouveau W[ PFIFO][0000:01:00.0] unknown intr 0x00010000, ch 2
Aug 26 13:04:45 test1 kernel: nouveau W[ PFIFO][0000:01:00.0] unknown intr 0x00010000, ch 2
[...]
Aug 26 13:04:45 test1 kernel: nouveau E[ PFIFO][0000:01:00.0] still angry after 101 spins, halt
As the logs above show, the shutdown threshold temperature, the card's reported temperature, or both read as 0 C for some reason. As a result, my video card freezes for a couple of seconds and then shuts itself down.
My guess is that these values are not restored properly after hibernate. It doesn't happen often, though: this is the first time I have seen it in 20+ hibernate/restore cycles.
It is also worth pointing out that I have the nouveau patches for bug #66177 applied and that I am using automatic fan management mode, which has been working properly.
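
Since this is hard to reproduce, a small script along these lines could be used to check saved kernel logs for further occurrences. It is only a sketch: the regular expression is modelled on the single PTHERM line quoted above, and the function name and the 10 C cut-off for an "implausible" reading are my own assumptions, not anything taken from nouveau.

```python
import re

# Pattern modelled on the one PTHERM message in this report, e.g.
#   nouveau [ PTHERM][0000:01:00.0] temperature (0 C) hit the 'shutdown' threshold
# Other kernel versions may format the message differently.
PTHERM_RE = re.compile(
    r"nouveau\s+[A-Z]?\[\s*PTHERM\]\[(?P<dev>[0-9a-f:.]+)\]\s+"
    r"temperature \((?P<temp>-?\d+) C\) hit the '(?P<name>\w+)' threshold"
)

# Assumed cut-off: a running GPU reporting less than this is treated as a
# bogus reading. The value is arbitrary and chosen only for illustration.
MIN_PLAUSIBLE_C = 10


def suspicious_threshold_events(lines):
    """Return (device, threshold name, temperature) for every PTHERM
    threshold message whose reported temperature looks implausibly low."""
    events = []
    for line in lines:
        match = PTHERM_RE.search(line)
        if match and int(match.group("temp")) < MIN_PLAUSIBLE_C:
            events.append(
                (match.group("dev"), match.group("name"), int(match.group("temp")))
            )
    return events
```

Passing it an open log file (or any iterable of lines) returns one tuple per suspicious message, so an empty list after a resume would suggest that this particular symptom did not occur.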