Steam Deck Virtual Displays Freeze (amdgpu virtual_display=0000:04:00.0,2)
Brief summary of the problem:
Create /etc/modprobe.d/amdgpu.conf with the following contents:
options amdgpu virtual_display=0000:04:00.0,2
Then reboot.
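For readers unfamiliar with the option, my understanding of the amdgpu module parameter documentation is that the value is a PCI address followed by the number of CRTCs to expose, so the line above asks for 2 virtual CRTCs on device 0000:04:00.0. A minimal sketch for splitting such a line apart, so the address can be checked against sysfs, might look like this (the function name is hypothetical, and only a single address,count entry is handled):

```shell
#!/bin/sh
# Sketch only: split an "options amdgpu virtual_display=ADDR,COUNT" line
# into its PCI address and CRTC count. Multiple ';'-separated entries
# are not handled here.
parse_virtual_display() {
    value="${1#*virtual_display=}"   # drop everything up to and including '='
    addr="${value%%,*}"              # PCI address: text before the first comma
    crtcs="${value#*,}"              # CRTC count: text after the first comma
    echo "$addr $crtcs"
}

# Possible follow-up check on a running system:
#   readlink "/sys/bus/pci/devices/0000:04:00.0/driver"
# The link target should end in "amdgpu" if the address is the GPU.
```

This only confirms that the address in the option refers to the device bound to amdgpu; it says nothing about why the boot fails.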
The system appears to freeze on the next boot. I cannot capture any dmesg output because the reboot ends in an unresponsive black screen, and I cannot SSH into the machine, which suggests a broader failure than the display alone. To get back into Linux, I have to boot into another OS, mount the filesystem, and delete /etc/modprobe.d/amdgpu.conf.
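The recovery step can be sketched roughly as follows, run from the other OS after mounting the Deck's root filesystem. The mount point /mnt and the function name are assumptions for illustration. The sketch renames the file rather than deleting it, relying on modprobe only reading files in modprobe.d that end in .conf, so the option can be restored later for further testing:

```shell
#!/bin/sh
# Sketch only: disable the amdgpu override on a root filesystem mounted
# by a rescue OS. The default mount point (/mnt) is an assumption.
disable_amdgpu_override() {
    root="${1:-/mnt}"
    conf="$root/etc/modprobe.d/amdgpu.conf"
    if [ -f "$conf" ]; then
        # Files without a .conf suffix are ignored by modprobe.
        mv -- "$conf" "$conf.disabled"
        echo "disabled $conf"
    else
        echo "no override found at $conf" >&2
        return 1
    fi
}

# Example (the device name is a placeholder, not taken from this report):
#   mount /dev/<root-partition> /mnt
#   disable_amdgpu_override /mnt
```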
Background: I'm using Immersed and a Quest 3 VR headset as a replacement for physical monitors while traveling. Virtual displays allow me to create multiple large virtual monitors in a VR space.
I first encountered a similar problem using xrandr, which may or may not be related. Without virtual_display= set, if I add new modes and initialize unused heads (e.g. DisplayPort-0, DisplayPort-1), the video card seems to freeze completely in the same manner. Here's how to reproduce that:
xrandr --newmode "3440x1440" 419.50 3440 3696 4064 4688 1440 1443 1453 1493 -hsync +vsync
xrandr --addmode DisplayPort-0 3440x1440
xrandr --output eDP --mode 800x1280 --rotate right --pos 0x0 --output DisplayPort-0 --mode 3440x1440 --above eDP
After this last command, the display on my Steam Deck stops responding locally. I can SSH in, but even a shutdown command seems to hang. For the xrandr scenario, here are all new lines output by dmesg after executing the above commands:
[ 10.358868] [drm] Failed to add display topology, DTM TA is not initialized.
[ 25.735617] systemd-journald[329]: File /var/log/journal/5a3a4ecefd9d4e8a869e05d1b3d9d99a/user-1000.journal corrupted or uncleanly shut down, renaming and replacing.
[ 26.182078] Key type trusted registered
[ 26.201660] Key type encrypted registered
[ 28.827165] Bluetooth: RFCOMM TTY layer initialized
[ 28.827177] Bluetooth: RFCOMM socket layer initialized
[ 28.827182] Bluetooth: RFCOMM ver 1.11
[ 29.008012] wlo1: authenticate with 38:94:ed:5c:57:6e
[ 29.241490] cs35l41 spi-VLV1776:00: DSP1: Firmware version: 3
[ 29.241498] cs35l41 spi-VLV1776:00: DSP1: cirrus/cs35l41-dsp1-spk-prot.wmfw: Fri 24 Jun 2022 14:55:56 GMT Daylight Time
[ 29.352020] cs35l41 spi-VLV1776:00: DSP1: Firmware: 400a4 vendor: 0x2 v0.58.0, 2 algorithms
[ 29.352291] cs35l41 spi-VLV1776:00: DSP1: Protection: e:\workspace\workspace\tibranch_release_playback_6.76_2\ormis\staging\default_tunings\internal\CS35L53\Fixed_Attenuation_Mono_48000_29.78.0\full\Fixed_Attenuation_Mono_48000_29.78.0_full.bin
[ 29.356933] cs35l41 spi-VLV1776:00: DSP1: Legacy support not available
[ 29.360816] cs35l41 spi-VLV1776:01: DSP1: Firmware version: 3
[ 29.360826] cs35l41 spi-VLV1776:01: DSP1: cirrus/cs35l41-dsp1-spk-prot.wmfw: Fri 24 Jun 2022 14:55:56 GMT Daylight Time
[ 29.465924] wlo1: send auth to 38:94:ed:5c:57:6e (try 1/3)
[ 29.469267] wlo1: authenticated
[ 29.470411] wlo1: associate with 38:94:ed:5c:57:6e (try 1/3)
[ 29.471564] cs35l41 spi-VLV1776:01: DSP1: Firmware: 400a4 vendor: 0x2 v0.58.0, 2 algorithms
[ 29.471831] cs35l41 spi-VLV1776:01: DSP1: Protection: e:\workspace\workspace\tibranch_release_playback_6.76_2\ormis\staging\default_tunings\internal\CS35L53\Fixed_Attenuation_Mono_48000_29.78.0\full\Fixed_Attenuation_Mono_48000_29.78.0_full.bin
[ 29.476757] cs35l41 spi-VLV1776:01: DSP1: Legacy support not available
[ 29.580437] wlo1: associate with 38:94:ed:5c:57:6e (try 2/3)
[ 29.633905] wlo1: RX AssocResp from 38:94:ed:5c:57:6e (capab=0x1511 status=0 aid=4)
[ 29.634248] wlo1: associated
[ 29.709715] wlo1: Limiting TX power to 27 (30 - 3) dBm as advertised by 38:94:ed:5c:57:6e
[ 29.833952] IPv6: ADDRCONF(NETDEV_CHANGE): wlo1: link becomes ready
[ 34.349675] input: LOGI M240 Mouse as /devices/virtual/misc/uhid/0005:046D:B03A.0006/input/input20
[ 34.349888] hid-generic 0005:046D:B03A.0006: input,hidraw5: BLUETOOTH HID v0.07 Mouse [LOGI M240] on 2c:3b:70:ef:95:20
[ 268.159534] BUG: kernel NULL pointer dereference, address: 0000000000000054
[ 268.159544] #PF: supervisor read access in kernel mode
[ 268.159548] #PF: error_code(0x0000) - not-present page
[ 268.159551] PGD 0 P4D 0
[ 268.159556] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 268.159561] CPU: 2 PID: 1502 Comm: kworker/u32:12 Tainted: G OE 6.1.71-1-MANJARO #1 1187ded1140d09a102ea92cb86d9a9d11226c96d
[ 268.159567] Hardware name: Valve Jupiter/Jupiter, BIOS F7A0119 10/24/2023
[ 268.159570] Workqueue: events_unbound commit_work
[ 268.159579] RIP: 0010:resource_build_scaling_params+0x90/0xf50 [amdgpu]
[ 268.160109] Code: 00 00 41 8b 85 68 02 00 00 49 89 fc ba 09 00 00 00 83 f8 15 77 07 8b 14 85 00 dc 3d c2 49 8b 5c 24 08 41 89 94 24 a8 00 00 00 <8b> 43 54 01 83 3c 02 00 00 49 8b 44 24 08 8b 53 6c 01 90 40 02 00
[ 268.160113] RSP: 0018:ffffb7c6c85738b0 EFLAGS: 00010297
[ 268.160117] RAX: 0000000000000003 RBX: 0000000000000000 RCX: 0000000000000000
[ 268.160120] RDX: 0000000000000003 RSI: 0000000000000500 RDI: ffff8a1cca940a98
[ 268.160123] RBP: 0000000000000d70 R08: 0000000000000000 R09: 0000000000000000
[ 268.160125] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8a1cca940a98
[ 268.160127] R13: ffff8a1c64c4b000 R14: 0000000000000d70 R15: ffff8a1c1dcc0000
[ 268.160130] FS: 0000000000000000(0000) GS:ffff8a1f2fe80000(0000) knlGS:0000000000000000
[ 268.160134] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 268.160136] CR2: 0000000000000054 CR3: 0000000359010000 CR4: 0000000000350ee0
[ 268.160140] Call Trace:
[ 268.160146] <TASK>
[ 268.160152] ? __die_body.cold+0x1a/0x1f
[ 268.160159] ? page_fault_oops+0x15a/0x2d0
[ 268.160167] ? exc_page_fault+0x7c/0x180
[ 268.160173] ? asm_exc_page_fault+0x26/0x30
[ 268.160182] ? resource_build_scaling_params+0x90/0xf50 [amdgpu 5cda98c1cc6e1aeac9813d0a76ee13654aca08de]
[ 268.160690] dcn30_internal_validate_bw+0x906/0xa20 [amdgpu 5cda98c1cc6e1aeac9813d0a76ee13654aca08de]
[ 268.161204] dcn30_validate_bandwidth+0xb2/0x2b0 [amdgpu 5cda98c1cc6e1aeac9813d0a76ee13654aca08de]
[ 268.161727] dc_validate_global_state+0x30a/0x3e0 [amdgpu 5cda98c1cc6e1aeac9813d0a76ee13654aca08de]
[ 268.162252] dc_commit_state+0xb8/0x130 [amdgpu 5cda98c1cc6e1aeac9813d0a76ee13654aca08de]
[ 268.162758] amdgpu_dm_atomic_commit_tail+0x5d8/0x3750 [amdgpu 5cda98c1cc6e1aeac9813d0a76ee13654aca08de]
[ 268.163270] ? load_balance+0xac3/0xe60
[ 268.163279] ? task_numa_assign+0x250/0x270
[ 268.163286] ? psi_group_change+0x1bc/0x350
[ 268.163292] ? psi_task_switch+0xd6/0x230
[ 268.163298] ? __switch_to_asm+0x3e/0x60
[ 268.163304] ? finish_task_switch.isra.0+0x94/0x2f0
[ 268.163310] ? __schedule+0x378/0x12c0
[ 268.163316] ? __wake_up_common+0x76/0x180
[ 268.163322] ? schedule+0x5e/0xd0
[ 268.163325] ? wq_worker_running+0xe/0x50
[ 268.163331] ? schedule_timeout+0x11c/0x150
[ 268.163336] ? dma_fence_default_wait+0x93/0x280
[ 268.163342] ? __bpf_trace_dma_fence+0x10/0x10
[ 268.163347] ? wait_for_completion_timeout+0x13e/0x170
[ 268.163353] commit_tail+0x94/0x130
[ 268.163358] process_one_work+0x1c7/0x3a0
[ 268.163365] worker_thread+0x51/0x390
[ 268.163370] ? process_one_work+0x3a0/0x3a0
[ 268.163375] kthread+0xde/0x110
[ 268.163379] ? kthread_complete_and_exit+0x20/0x20
[ 268.163384] ret_from_fork+0x22/0x30
[ 268.163393] </TASK>
[ 268.163395] Modules linked in: ccm rfcomm ecb ecryptfs cbc encrypted_keys trusted asn1_encoder tee uhid qrtr cmac algif_hash algif_skcipher af_alg bnep ext4 intel_rapl_msr intel_rapl_common snd_acp5x_i2s snd_acp5x_pcm_dma edac_mce_amd mbcache joydev snd_soc_acp5x_mach jbd2 snd_sof_amd_rembrandt kvm_amd xhci_plat_hcd snd_sof_amd_renoir amdgpu snd_sof_amd_acp rtw88_8822ce kvm rtw88_8822c snd_sof_pci irqbypass snd_sof crct10dif_pclmul rtw88_pci dwc3 crc32_pclmul snd_sof_utils polyval_clmulni polyval_generic gf128mul snd_pci_ps rtw88_core roles ghash_clmulni_intel btusb sha512_ssse3 snd_hda_codec_hdmi snd_soc_cs35l41_spi snd_rpl_pci_acp6x ulpi snd_acp_pci snd_soc_cs35l41 sha1_ssse3 hid_multitouch udc_core snd_pci_acp6x btrtl snd_hda_intel snd_soc_wm_adsp gpu_sched mac80211 cs_dsp aesni_intel btbcm snd_soc_nau8821 snd_soc_cs35l41_lib drm_buddy snd_pci_acp5x snd_intel_dspcfg crypto_simd drm_ttm_helper snd_rn_pci_acp3x snd_intel_sdw_acpi libarc4 btintel cryptd btmtk snd_acp_config snd_soc_core
[ 268.163489] snd_hda_codec cfg80211 vfat ttm sp5100_tco bluetooth wdat_wdt fat snd_compress snd_hda_core snd_soc_acpi drm_display_helper pcspkr rapl ac97_bus snd_hwdep snd_pcm_dmaengine snd_pcm i2c_piix4 ecdh_generic cdc_acm snd_pci_acp3x crc16 mmc_block dwc3_pci rfkill cec ccp video snd_timer ltrf216a opt3001 wmi i2c_hid_acpi snd mousedev 8250_dw industrialio i2c_hid soundcore acpi_cpufreq mac_hid uinput v4l2loopback(OE) videodev mc i2c_dev crypto_user fuse dm_mod loop nfnetlink bpf_preload ip_tables x_tables usbhid btrfs blake2b_generic libcrc32c crc32c_generic xor raid6_pq serio_raw atkbd sdhci_pci libps2 vivaldi_fmap cqhci nvme sdhci crc32c_intel sha256_ssse3 nvme_core i8042 xhci_pci xhci_pci_renesas mmc_core nvme_common serio spi_amd
[ 268.163581] CR2: 0000000000000054
[ 268.163584] ---[ end trace 0000000000000000 ]---
[ 268.163586] RIP: 0010:resource_build_scaling_params+0x90/0xf50 [amdgpu]
[ 268.164088] Code: 00 00 41 8b 85 68 02 00 00 49 89 fc ba 09 00 00 00 83 f8 15 77 07 8b 14 85 00 dc 3d c2 49 8b 5c 24 08 41 89 94 24 a8 00 00 00 <8b> 43 54 01 83 3c 02 00 00 49 8b 44 24 08 8b 53 6c 01 90 40 02 00
[ 268.164091] RSP: 0018:ffffb7c6c85738b0 EFLAGS: 00010297
[ 268.164095] RAX: 0000000000000003 RBX: 0000000000000000 RCX: 0000000000000000
[ 268.164097] RDX: 0000000000000003 RSI: 0000000000000500 RDI: ffff8a1cca940a98
[ 268.164099] RBP: 0000000000000d70 R08: 0000000000000000 R09: 0000000000000000
[ 268.164102] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8a1cca940a98
[ 268.164104] R13: ffff8a1c64c4b000 R14: 0000000000000d70 R15: ffff8a1c1dcc0000
[ 268.164107] FS: 0000000000000000(0000) GS:ffff8a1f2fe80000(0000) knlGS:0000000000000000
[ 268.164110] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 268.164112] CR2: 0000000000000054 CR3: 0000000359010000 CR4: 0000000000350ee0
[ 268.164116] note: kworker/u32:12[1502] exited with irqs disabled
[ 268.164125] note: kworker/u32:12[1502] exited with preempt_count 1
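For anyone trying to reproduce this, one way to isolate only the lines that appear after the xrandr commands is to snapshot dmesg before and after over SSH and print the tail of the second snapshot. This is a sketch, not necessarily how the log above was produced. The function name and file paths are hypothetical, and it assumes the kernel ring buffer did not wrap between the two snapshots:

```shell
#!/bin/sh
# Sketch only: print the lines of a later dmesg snapshot that were not
# present in an earlier one. Assumes the later snapshot is the earlier
# one plus appended lines (no ring buffer wrap in between).
dmesg_new_lines() {
    before="$1"
    after="$2"
    seen=$(wc -l < "$before")          # number of lines already captured
    tail -n "+$((seen + 1))" "$after"  # everything after those lines
}

# Example over SSH (paths are placeholders):
#   sudo dmesg > /tmp/dmesg.before
#   ... run the xrandr commands on the Deck ...
#   sudo dmesg > /tmp/dmesg.after
#   dmesg_new_lines /tmp/dmesg.before /tmp/dmesg.after
```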
Hardware description:
Stock, original Steam Deck (non-OLED version)
System information:
- Distro name and Version: Manjaro
- Kernel version: 6.1.69-1-MANJARO