Second GPU removed in multi-GPU system
Hi,
I am facing a problem on a multi-GPU system: after a reboot, the second GPU is sometimes not available in Xorg. This happens in only about 1 out of 50 reboots.
I extended the logging in config/udev.c, and this is the output:
[ 9.148] (II) config/udev: handling udev action add
[ 9.148] (II) config/udev: Ignore adding device (/dev/dri/renderD128) because sysname (renderD128) does not start with card
[ 9.149] (II) config/udev: handling udev action add
[ 9.149] (II) config/udev: Ignore adding device (/dev/dri/renderD129) because sysname (renderD129) does not start with card
[ 9.149] (II) config/udev: handling udev action add
[ 9.149] (II) config/udev: removing GPU device /sys/devices/pci0000:00/0000:00:08.1/0000:04:00.0/drm/card1 /dev/dri/card1
[ 9.149] xf86: remove device 1 /sys/devices/pci0000:00/0000:00:08.1/0000:04:00.0/drm/card1
[ 9.149] failed to find screen to remove
[ 9.149] (II) config/udev: Ignore adding device (/dev/dri/card1) because it was already added by xf86platformProbe
[ 9.149] (II) config/udev: handling udev action add
[ 9.149] (II) config/udev: removing GPU device /sys/devices/pci0000:00/0000:00:01.1/0000:01:00.0/drm/card0 /dev/dri/card0
[ 9.149] xf86: remove device 0 /sys/devices/pci0000:00/0000:00:01.1/0000:01:00.0/drm/card0
[ 9.151] (II) UnloadModule: "amdgpu"
[ 9.151] (II) UnloadSubModule: "glamoregl"
[ 9.151] (II) Unloading glamoregl
[ 9.151] (II) UnloadSubModule: "fb"
[ 9.151] (II) Unloading fb
[ 9.151] (II) systemd-logind: releasing fd for 226:0
[ 9.152] (II) config/udev: Adding drm device (/dev/dri/card0)
[ 9.152] (II) xfree86: Adding drm device (/dev/dri/card0)
[ 9.153] (II) systemd-logind: got fd for /dev/dri/card0 226:0 fd 22 paused 0
[ 9.153] (II) LoadModule: "modesetting"
[ 9.154] (II) Loading /usr/lib/xorg/modules/drivers/modesetting_drv.so
[ 9.154] (II) Module modesetting: vendor="X.Org Foundation"
[ 9.154] compiled for 1.20.7, module version = 1.20.7
[ 9.154] Module class: X.Org Video Driver
[ 9.154] ABI class: X.Org Video Driver, version 24.1
[ 9.154] (II) systemd-logind: releasing fd for 226:0
[ 9.155] xf86: found device 1
I'm wondering why these events are even sent, because both GPUs are connected via PCI and are neither added nor removed at runtime.
Both GPUs are also detected by xf86 during startup:
[ 7.981] (II) systemd-logind: took control of session /org/freedesktop/login1/session/c1
[ 7.982] (II) xfree86: Adding drm device (/dev/dri/card0)
[ 7.985] (II) systemd-logind: got fd for /dev/dri/card0 226:0 fd 11 paused 0
[ 7.985] (II) xfree86: Adding drm device (/dev/dri/card1)
[ 8.006] (II) systemd-logind: got fd for /dev/dri/card1 226:1 fd 12 paused 0
[ 8.009] (--) PCI: (1@0:0:0) 1002:699f:103c:ea73 rev 129, Mem @ 0xe0000000/268435456, 0xf0000000/2097152, 0xfea00000/262144, I/O @ 0x0000f000/256, BIOS @ 0x????????/131072
[ 8.009] (--) PCI:*(4@0:0:0) 1002:15dd:103c:8522 rev 130, Mem @ 0xc0000000/268435456, 0xd0000000/2097152, 0xfe700000/524288, I/O @ 0x0000d000/256
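
Since the failure only shows up in roughly 1 out of 50 reboots, it may help to scan the log after each boot for the "removing GPU device" message seen in the first excerpt. Below is a minimal sketch in Python, assuming the message format shown there. The function name find_gpu_removals and the default log path are placeholders of mine, not part of the X server, and the log may be in a different location on other setups.

```python
import re
import sys

# Matches the "removing GPU device <syspath> <devpath>" message from the
# log excerpt above. Any leading line number before the timestamp is ignored.
REMOVE_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\(II\) config/udev: removing GPU device\s+"
    r"(?P<syspath>\S+)\s+(?P<devpath>\S+)"
)


def find_gpu_removals(lines):
    """Return a list of (timestamp, devpath) for each logged GPU removal."""
    removals = []
    for line in lines:
        match = REMOVE_RE.search(line)
        if match:
            removals.append((float(match.group("ts")), match.group("devpath")))
    return removals


if __name__ == "__main__":
    # Assumed default log location; pass the real path as the first argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/Xorg.0.log"
    with open(path, errors="replace") as log:
        for ts, dev in find_gpu_removals(log):
            print(f"{ts:.3f} {dev}")
```

Running this against the log of each boot should show whether the removal events appear on every boot or only on the boots where the second GPU ends up missing.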