Phoenix laptop with updated UEFI firmware does not boot Arch Linux unless you put vcn_4_0_2.bin.zst in your initramfs or disable early KMS (which is enabled by default)
I've noticed that Arch Linux does not boot on my laptop on the latest firmware with default settings:
The latest UEFI ships updated amdgpu firmware, and it requires you either to put /lib/firmware/amdgpu/vcn_4_0_2.bin.zst into your initramfs or to disable early KMS altogether.
If you don't do so gdm will fail to start.
The driver needs its firmware to be available when it loads. If you are using an initrd, the firmware needs to be in the initrd. This is always the case, regardless of the system BIOS version.
That somehow wasn't the case before. Early kms is the default in Arch Linux: if it needed firmware in the initramfs to be able to load gdm there would be tons of people complaining, including here: mesa/mesa#8044 (closed)
Before getting the new firmware for 8044 I didn't have any firmware in my initramfs, yet I was definitely able to load gdm and even do hardware decoding. Where did it get the firmware from? I guess the system BIOS somehow preloaded some firmware, otherwise I really have no idea why it was working. I wasn't able to load the experimental firmware until I put it into the initramfs, so I was definitely using early KMS.
I suspect the timing of when the rootfs is mounted is simply different from before. Can you share the journal from a failed boot? It will show a lot more about what happened.
Mar 24 21:06:00 arch-phoenix kernel: amdgpu 0000:c3:00.0: Direct firmware load for amdgpu/vcn_4_0_2.bin failed with error -2
Mar 24 21:06:00 arch-phoenix kernel: [drm:amdgpu_device_init [amdgpu]] *ERROR* early_init of IP block <vcn_v4_0> failed -19
Mar 24 21:06:00 arch-phoenix kernel: amdgpu 0000:c3:00.0: [drm:jpeg_v4_0_early_init [amdgpu]] JPEG decode is enabled in VM mode
Mar 24 21:06:00 arch-phoenix kernel: amdgpu 0000:c3:00.0: amdgpu: Fatal error during GPU init
Mar 24 21:06:00 arch-phoenix kernel: amdgpu 0000:c3:00.0: amdgpu: amdgpu: finishing device.
You can see amdgpu failed to load because the firmware wasn't available.
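For anyone checking their own journal for the same failure, here is a minimal Python sketch that pulls the firmware name and error code out of "Direct firmware load ... failed" lines. The function name `missing_firmware` and the sample list are illustrative assumptions; only the message format is taken from the log above. Error -2 is ENOENT (file not found), which is why the later -19 (ENODEV) follows.

```python
import errno
import re

# Message format copied from the kernel log quoted in this thread.
FW_FAIL = re.compile(
    r"Direct firmware load for (?P<name>\S+) failed with error (?P<err>-?\d+)"
)

def missing_firmware(journal_lines):
    """Return (firmware_name, errno_symbol) for every failed firmware load.

    Hypothetical helper for illustration, not part of any existing tool.
    """
    found = []
    for line in journal_lines:
        m = FW_FAIL.search(line)
        if m:
            code = abs(int(m.group("err")))
            # Translate the numeric code (2, 19, ...) into its symbolic name.
            found.append((m.group("name"), errno.errorcode.get(code, str(code))))
    return found

sample = [
    "Mar 24 21:06:00 arch-phoenix kernel: amdgpu 0000:c3:00.0: "
    "Direct firmware load for amdgpu/vcn_4_0_2.bin failed with error -2",
    "Mar 24 21:06:00 arch-phoenix kernel: amdgpu 0000:c3:00.0: "
    "amdgpu: Fatal error during GPU init",
]
print(missing_firmware(sample))
```

In practice the lines could come from saved `journalctl -k -b` output instead of the hard-coded sample.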
Mar 24 21:06:00 arch-phoenix systemd[1]: Starting Remount Root and Kernel File Systems...
Next you can see that your rootfs is mounted (where the firmware is presumably).
Mar 24 21:06:18 arch-phoenix systemd[1]: Started Session 1 of User gdm.
You can see that GDM only tried to start after all of that had happened.
I don't see anything that can be done here from amdgpu. Either keep amdgpu out of the initramfs so it gets loaded after rootfs is loaded or keep the related firmware for it in.
You can't have it split up and expect it to work every time.
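The "split" state described above can be expressed as a small check over the list of files in an initramfs image. This is only a sketch under assumptions: the function name, the example paths, and the set of compressed suffixes are illustrative, and the file listing is assumed to come from a tool such as `lsinitcpio`.

```python
def initramfs_is_consistent(paths, module="amdgpu", firmware="amdgpu/vcn_4_0_2.bin"):
    """Return False only for the split case: module inside, firmware outside.

    `paths` is a list of file paths found in the initramfs image.
    Hypothetical helper; the suffix list assumes the firmware may be stored
    uncompressed or compressed with zstd or xz.
    """
    has_module = any(f"/{module}.ko" in p for p in paths)
    wanted = (firmware, firmware + ".zst", firmware + ".xz")
    has_firmware = any(p.endswith(wanted) for p in paths)
    # Either the driver is loaded later from the rootfs, or its firmware
    # travels with it; anything else fails at early init.
    return has_firmware or not has_module

# Example paths are made up for illustration.
split_image = ["usr/lib/modules/6.8.1/kernel/drivers/gpu/drm/amd/amdgpu/amdgpu.ko.zst"]
fixed_image = split_image + ["usr/lib/firmware/amdgpu/vcn_4_0_2.bin.zst"]

print(initramfs_is_consistent(split_image))  # module without firmware
print(initramfs_is_consistent(fixed_image))  # module with firmware
print(initramfs_is_consistent([]))           # amdgpu kept out entirely
```

Both supported configurations pass the check; only the image that carries the driver without its firmware is flagged.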
If you want amdgpu out of the initramfs I suggest enabling simpledrm like Fedora does so you can use the framebuffer built by the UEFI GOP driver early in the boot.
One more thing - if you choose to use simpledrm in the initramfs instead of amdgpu, modern GDM will handle it properly to ensure that amdgpu is loaded by the time GDM starts.
See https://gitlab.gnome.org/GNOME/mutter/-/issues/2909 for more details.
I don't see anything that can be done here from amdgpu
I understand that. What I don't understand is how it managed to work before.
I put the firmware into my initramfs on the 6th of December 2023 because I wanted to update it, but before that I was already using early KMS and it worked just fine.
As you can see from the previous picture, early KMS has been the default in Arch Linux for more than a year. I installed this system in September of last year, so I was definitely using early KMS, which is also confirmed by the fact that on the 6th of December 2023 I had to add FILES=(/lib/firmware/amdgpu/vcn_4_0_2.bin.zst) to /etc/mkinitcpio.conf to test the new experimental VCN firmware. Somehow it was working before, not sure if it's because of the older system bios, because of the older kernel or what else.
If you want amdgpu out of the initramfs I suggest enabling simpledrm like Fedora does so you can use the framebuffer built by the UEFI GOP driver early in the boot.
simpledrm is already enabled in Arch Linux's kernel. I think the reason why they default to early kms has something to do with the Nvidia proprietary drivers. I've removed the kms hook because I see no advantage in loading amdgpu early in the boot, but I still wonder how is it possible that between September 2023 and at least the 6th of December 2023 it used to work nevertheless.
Unfortunately there is also little point in simpledrm for me, because I have been told that HP's BIOS forbids any kind of video output from the framebuffer built by the UEFI GOP driver, supposedly for security reasons. My HP Thunderbolt Dock G4 is also VERY slow to enable its video output afterwards, so 90% of the time I don't see any output at all on my external monitors until gdm loads.
By the way DP-MST is still very buggy on Phoenix, this is one of the many random crashes that I get even with latest 6.8.1:
journal-mst.log
Somehow it was working before, not sure if it's because of the older system bios, because of the older kernel or what else.
but I still wonder how is it possible that between September 2023 and at least the 6th of December 2023 it used to work nevertheless.
Look at historical journal output for boots where it worked. I suspect it's a race or ordering issue around when the filesystem that contains the firmware becomes available.
By the way DP-MST is still very buggy on Phoenix, this is one of the many random crashes that I get even with latest 6.8.1:
Stick to one issue per bug. If you have MST issues that should be a separate bug.
Look at historical journal output for boots that it worked
I've already looked for them, but I started putting the experimental firmware in the initramfs back in December 2023 and unfortunately the logs don't go that far back. I've looked at zfs snapshots as well, but the older ones have already been pruned. I'm sure there must be a dmesg of mine somewhere on the internet from that period, but I still have to find it.
I've also looked at the old live USB that I used to install this system back in September, in the hope that the regression was in the kernel and not in the BIOS, but unfortunately Arch installation media ship every possible firmware in their initramfs, which makes the test useless.
I also tried to downgrade the BIOS, but I can only do so up to late December builds because that was the first release to be distributed via LVFS. That didn't help either, but I cannot exclude that the bios might be the culprit because I used to use much older versions back then.
Honestly I don't know at this point. I just find it very strange that there aren't dozens of users complaining that their freshly installed Arch Linux systems don't work, because Arch Linux is currently definitely NOT compatible by default with Phoenix.
I'll try filing an issue in mkinitcpio and see what Arch maintainers think about this.
Honestly I don't know at this point, I just find very strange that there aren't dozens of users complaining that their freshly installed Arch Linux systems don't work because Arch Linux is currently definitely NOT compatible by default with Phoenix.
Maybe everyone else always puts the firmware in the initramfs? I think modifying the Arch initramfs building infrastructure to always include the matching amdgpu-related firmware will fix this bug whether amdgpu is loaded early or late.
The files are matched by globbing, and I made a backup of the original firmware before testing the experimental one, so /usr/lib/firmware/amdgpu/vcn_4_0_2.bin.zst.orig is being added instead of /usr/lib/firmware/amdgpu/vcn_4_0_2.bin.zst!
That explains why I've seen only a few reports of this, and only from people who tried the experimental firmware. I guess I'm not the only one making backups in the same directory.
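To illustrate the globbing problem, here is a rough Python sketch. It is not mkinitcpio's actual code: the function names, the `<name>*` pattern, and the directory listing are assumptions chosen to mirror the situation described above. A trailing-wildcard match accepts the `.orig` backup as a candidate, while matching only against known compression suffixes does not.

```python
from fnmatch import fnmatchcase

def glob_candidates(firmware_name, directory_listing):
    """Files matched by a '<name>*' style glob (assumed matching rule)."""
    pattern = firmware_name + "*"
    return sorted(f for f in directory_listing if fnmatchcase(f, pattern))

def exact_candidates(firmware_name, directory_listing, suffixes=("", ".zst", ".xz")):
    """Files matched only when the suffix is a known compression extension."""
    allowed = {firmware_name + s for s in suffixes}
    return sorted(f for f in directory_listing if f in allowed)

# Hypothetical contents of the firmware directory after making a backup in place.
listing = ["vcn_4_0_2.bin.zst", "vcn_4_0_2.bin.zst.orig", "vcn_4_0_3.bin.zst"]

print(glob_candidates("vcn_4_0_2.bin", listing))
print(exact_candidates("vcn_4_0_2.bin", listing))
```

With the wildcard, both the real file and the backup are candidates, so whichever one the tool picks decides whether the driver finds its firmware. Keeping backups outside the firmware directory avoids the ambiguity altogether.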
I'm closing this. Here is the other report for the DP-MST issues: #3301