Frequent Xorg lockups with GT710 after read or write faults
Submitted by ken moffat
Assigned to Nouveau Project
Link to original bug (#105814)
Description
This is my second problem on my new machine (Ryzen, GT710 (GK208B) because I needed a VGA connector; linux-4.15.12, gcc-7.3.0, glibc-2.27, mesa-17.3.7, xorg-server-1.19.6, nouveau-1.0.15).
I am seeing frequent lockups in X (keyboard and mouse stop working, but Magic SysRq still works). Arguably this is similar to #104448, but I'm using "regular" glibc.
There is no particular pattern to when this bug bites. The first time, I had just built icewm, rerun startx, and was evaluating the various themes. Later, I was installing the Firefox extension uBlock Origin. As with the other bug I just raised (#105813), I tried using the modesetting Xorg driver instead of nouveau, but the problem occurred again (I left it compiling and came back to a frozen xscreensaver). After that I turned off xscreensaver, and it has now been up for perhaps 2 hours, but will maybe go down again soon ;-(
I have extracted the following from my syslog for these failures. Several (but not all) show UNSUPPORTED_KIND. The 'scheduled for recovery' messages imply it might self-correct, but it doesn't seem to.
Mar 28 18:50:05 origin kernel: [ 7715.267689] nouveau 0000:26:00.0: fifo: FB_FLUSH_TIMEOUT
Mar 28 18:50:05 origin kernel: [ 7715.273911] nouveau 0000:26:00.0: fifo: write fault at 0003228000 engine 1b [CE2] client 18 [GR_CE] reason 0c [UNSUPPORTED_KIND] on channel 2 [003fbfa000 Xorg[22520]]
Mar 28 18:50:05 origin kernel: [ 7715.273921] nouveau 0000:26:00.0: fifo: channel 2: killed
Mar 28 18:50:05 origin kernel: [ 7715.273923] nouveau 0000:26:00.0: fifo: runlist 0: scheduled for recovery
Mar 28 18:50:05 origin kernel: [ 7715.273934] nouveau 0000:26:00.0: fifo: engine 0: scheduled for recovery
Mar 28 18:50:05 origin kernel: [ 7715.273938] nouveau 0000:26:00.0: fifo: engine 6: scheduled for recovery
Mar 28 18:50:05 origin kernel: [ 7715.273965] nouveau 0000:26:00.0: Xorg[22520]: channel 2 killed!
Mar 28 19:33:30 origin kernel: [ 2215.090852] nouveau 0000:26:00.0: fifo: read fault at 000171c000 engine 1b [CE2] client 18 [GR_CE] reason 02 [PTE] on channel 2 [003fbfa000 Xorg[1576]]
Mar 28 19:33:30 origin kernel: [ 2215.090861] nouveau 0000:26:00.0: fifo: channel 2: killed
Mar 28 19:33:30 origin kernel: [ 2215.090863] nouveau 0000:26:00.0: fifo: runlist 0: scheduled for recovery
Mar 28 19:33:30 origin kernel: [ 2215.090873] nouveau 0000:26:00.0: fifo: engine 0: scheduled for recovery
Mar 28 19:33:30 origin kernel: [ 2215.090877] nouveau 0000:26:00.0: fifo: engine 6: scheduled for recovery
Mar 28 19:33:30 origin kernel: [ 2215.090891] nouveau 0000:26:00.0: Xorg[1576]: channel 2 killed!
Mar 28 22:27:26 origin kernel: [ 3793.782622] nouveau 0000:26:00.0: fifo: CHSW_ERROR 00000004
Mar 28 22:27:27 origin kernel: [ 3794.949062] nouveau 0000:26:00.0: fifo: CHSW_ERROR 00000002
Mar 28 22:27:27 origin kernel: [ 3794.949066] nouveau 0000:26:00.0: fifo: FB_FLUSH_TIMEOUT
Mar 28 22:27:27 origin kernel: [ 3795.465905] nouveau 0000:26:00.0: fifo: CHSW_ERROR 00000002
Mar 28 22:27:27 origin kernel: [ 3795.465909] nouveau 0000:26:00.0: fifo: FB_FLUSH_TIMEOUT
Mar 28 22:27:27 origin kernel: [ 3795.532524] nouveau 0000:26:00.0: fifo: FB_FLUSH_TIMEOUT
Mar 28 22:27:27 origin kernel: [ 3795.537738] nouveau 0000:26:00.0: fifo: read fault at 0001f08000 engine 1b [CE2] client 18 [GR_CE] reason 0c [UNSUPPORTED_KIND] on channel 2 [003fbfa000 Xorg[1290]]
Mar 28 22:27:27 origin kernel: [ 3795.537745] nouveau 0000:26:00.0: fifo: channel 2: killed
Mar 28 22:27:27 origin kernel: [ 3795.537747] nouveau 0000:26:00.0: fifo: runlist 0: scheduled for recovery
Mar 28 22:27:27 origin kernel: [ 3795.537757] nouveau 0000:26:00.0: fifo: engine 0: scheduled for recovery
Mar 28 22:27:27 origin kernel: [ 3795.537761] nouveau 0000:26:00.0: fifo: engine 6: scheduled for recovery
Mar 29 21:19:49 origin kernel: [13387.804276] nouveau 0000:26:00.0: fifo: read fault at 0004740000 engine 1b [CE2] client 18 [GR_CE] reason 0c [UNSUPPORTED_KIND] on channel 2 [003fbfa000 Xorg[1343]]
Mar 29 21:19:49 origin kernel: [13387.804286] nouveau 0000:26:00.0: fifo: channel 2: killed
Mar 29 21:19:49 origin kernel: [13387.804289] nouveau 0000:26:00.0: fifo: runlist 0: scheduled for recovery
Mar 29 21:19:49 origin kernel: [13387.804299] nouveau 0000:26:00.0: fifo: engine 0: scheduled for recovery
Mar 29 21:19:49 origin kernel: [13387.804303] nouveau 0000:26:00.0: fifo: engine 6: scheduled for recovery
Mar 29 21:19:49 origin kernel: [13387.804308] nouveau 0000:26:00.0: Xorg[1343]: channel 2 killed!
(That last one was using modesetting; the others were using the Xorg nouveau driver.)
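
For anyone comparing their own logs against the excerpt above, here is a rough Python sketch that pulls the "read fault" / "write fault" lines out of a kernel log and counts them by fault kind and reason. The regular expression is inferred only from the four fault lines quoted in this report, so treat the pattern as an assumption: other kernel versions may format these messages differently. The names `FAULT_RE`, `parse_faults`, `tally` and `SAMPLE` are made up for illustration and are not part of nouveau or any tool.

```python
import re
from collections import Counter

# Pattern inferred from the fault lines quoted in this report (an assumption,
# not a documented format): kind, address, engine, client, reason, channel,
# instance address, process name and pid.
FAULT_RE = re.compile(
    r"fifo: (?P<kind>read|write) fault at (?P<addr>[0-9a-f]+) "
    r"engine (?P<engine_id>[0-9a-f]+) \[(?P<engine>[^\]]*)\] "
    r"client (?P<client_id>[0-9a-f]+) \[(?P<client>[^\]]*)\] "
    r"reason (?P<reason_id>[0-9a-f]+) \[(?P<reason>[^\]]*)\] "
    r"on channel (?P<channel>\d+) "
    r"\[(?P<inst>[0-9a-f]+) (?P<proc>[^\[\]]+)\[(?P<pid>\d+)\]\]"
)


def parse_faults(lines):
    """Return one dict per line that matches the fault pattern."""
    faults = []
    for line in lines:
        match = FAULT_RE.search(line)
        if match:
            faults.append(match.groupdict())
    return faults


def tally(faults):
    """Count faults by (kind, reason), e.g. ('read', 'PTE')."""
    return Counter((f["kind"], f["reason"]) for f in faults)


# The four fault lines from this report, plus one non-fault line that the
# parser should skip.
SAMPLE = [
    "nouveau 0000:26:00.0: fifo: FB_FLUSH_TIMEOUT",
    "nouveau 0000:26:00.0: fifo: write fault at 0003228000 engine 1b [CE2] "
    "client 18 [GR_CE] reason 0c [UNSUPPORTED_KIND] on channel 2 "
    "[003fbfa000 Xorg[22520]]",
    "nouveau 0000:26:00.0: fifo: read fault at 000171c000 engine 1b [CE2] "
    "client 18 [GR_CE] reason 02 [PTE] on channel 2 [003fbfa000 Xorg[1576]]",
    "nouveau 0000:26:00.0: fifo: read fault at 0001f08000 engine 1b [CE2] "
    "client 18 [GR_CE] reason 0c [UNSUPPORTED_KIND] on channel 2 "
    "[003fbfa000 Xorg[1290]]",
    "nouveau 0000:26:00.0: fifo: read fault at 0004740000 engine 1b [CE2] "
    "client 18 [GR_CE] reason 0c [UNSUPPORTED_KIND] on channel 2 "
    "[003fbfa000 Xorg[1343]]",
]

if __name__ == "__main__":
    for (kind, reason), count in sorted(tally(parse_faults(SAMPLE)).items()):
        print(f"{kind:5s} {reason:18s} {count}")
```

To run it against a real log instead of the sample, pass any iterable of lines to `parse_faults`, for example an open file object for a saved copy of the syslog.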