UBSAN: array-index-out-of-bounds in display/dc/bios/bios_parser2.c:145:46, index 8 is out of range for type 'atom_display_object_path_v2 [8]'
Overview
On boot, amdgpu's VBIOS parsing code in display/dc/bios/bios_parser2.c generates two out-of-bounds memory accesses, which are identified by UBSAN (the full log is attached at the end of this report).
UBSAN: array-index-out-of-bounds in /var/tmp/portage/sys-kernel/gentoo-kernel-6.10.8/work/linux-6.10/drivers/gpu/drm/amd/amdgpu/../display/dc/bios/bios_parser2.c:145:46
index 8 is out of range for type 'atom_display_object_path_v2 [8]'
UBSAN: array-index-out-of-bounds in /var/tmp/portage/sys-kernel/gentoo-kernel-6.10.8/work/linux-6.10/drivers/gpu/drm/amd/amdgpu/../display/dc/bios/bios_parser2.c:264:29
index 8 is out of range for type 'atom_display_object_path_v2 [8]'
After adding some debugging printk() code, I found that the value of number_of_path before the crash was 9:
tbl->v1_4->number_of_path: 9
Both the variable number_of_path and the array display_path[8] are defined in atomfirmware.h under struct display_object_info_table_v1_4.
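Thus, the maximum number of display_path[] entries supported is hardcoded to 8, but the number_of_path encountered on my machine was 9, so any loop bounded by number_of_path ends up reading display_path[8], one element past the end of the array.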
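The mismatch can be illustrated with a small userspace model. This is only a sketch: the struct and function names below are hypothetical, simplified stand-ins, not the real definitions from atomfirmware.h, and the loop counts the offending indexes instead of actually reading past the array.

```c
#include <stdint.h>

/* Hypothetical stand-in for the hardcoded array length of 8. */
#define MODEL_MAX_PATHS 8

/* Simplified placeholder for one display path entry (real layout differs). */
struct model_display_path {
	uint16_t object_id;
};

/* Simplified placeholder for the object info table. */
struct model_object_info_table {
	uint8_t number_of_path; /* count reported by the firmware */
	struct model_display_path display_path[MODEL_MAX_PATHS];
};

/*
 * Count how many indexes a loop bounded by number_of_path would visit
 * beyond the end of display_path[]. The array itself is never touched
 * here, so the model has no undefined behavior.
 */
static unsigned int model_count_oob_indexes(const struct model_object_info_table *tbl)
{
	unsigned int oob = 0;
	unsigned int i;

	for (i = 0; i < tbl->number_of_path; i++) {
		if (i >= MODEL_MAX_PATHS)
			oob++; /* a parser loop would read display_path[i] here */
	}
	return oob;
}
```

With number_of_path set to 9, the function reports one out-of-range index (index 8), which matches the index named in the UBSAN messages.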
This suggests two problems. First, the BIOS parser in bios_parser2.c appears to take the value advertised by AtomFirmware at face value without a sanity check, so a malformed AtomFirmware table can crash the driver across many GPU generations. This should never happen. The variable number_of_path must be range-checked in bios_parser2_construct() before it is used by other routines (and the driver should perhaps emit a warning as well if the value exceeds the hardcoded limit).
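A range check of this kind might look roughly like the following userspace sketch. The struct and helper names are hypothetical, and the limit is derived from the array itself rather than from a literal 8; in the kernel, ARRAY_SIZE() and pr_warn() would presumably take the place of the local macro and the returned flag.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local equivalent of the kernel's ARRAY_SIZE(), for this sketch only. */
#define SKETCH_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical, simplified table layout (not the real atomfirmware.h one). */
struct sketch_object_info_table {
	uint8_t number_of_path;
	uint16_t display_path[8];
};

/*
 * Clamp number_of_path to the capacity of display_path[].
 * Returns true if the firmware value was too large and had to be clamped,
 * so that the caller can log a warning.
 */
static bool sketch_sanitize_number_of_path(struct sketch_object_info_table *tbl)
{
	const size_t limit = SKETCH_ARRAY_SIZE(tbl->display_path);

	if (tbl->number_of_path <= limit)
		return false;

	tbl->number_of_path = (uint8_t)limit;
	return true;
}
```

Doing the check once at construction time means the later loops over display_path[] need no changes, since they only ever see a count that fits the array.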
The second question is whether a display_path length of 9 is in fact a legal value in theory, or whether it is just a malformed value from a bad OEM VBIOS. I suspect the latter, as the GPU in question is a non-standard variant of the AMD Radeon Pro VII made by some OEMs in China during the mining craze of the 2020s, chosen for its high GPGPU performance. The GPU has a single DisplayPort connector. The problem occurs regardless of whether a monitor is plugged in.
As a hack, I'm using the following patch to make the problem go away:
diff '--color=auto' -uprN linux-6.10.8-gentoo/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c linux-6.10.8-gentoo-amdhack/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c
--- linux-6.10.8-gentoo/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c 2024-09-05 20:21:56.935314853 +0000
+++ linux-6.10.8-gentoo-amdhack/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c 2024-09-05 22:03:21.483281432 +0000
@@ -3642,6 +3642,12 @@ static bool bios_parser2_construct(
return false;
bp->object_info_tbl.v1_4 = tbl_v1_4;
+
+ if (tbl_v1_4->number_of_path > 8) {
+ pr_warn("Firmware tbl_v1_4->number_of_path value %d is too large and has been ignored\n",
+ tbl_v1_4->number_of_path);
+ tbl_v1_4->number_of_path = 8;
+ }
} else if (bp->object_info_tbl.revision.major == 1
&& bp->object_info_tbl.revision.minor == 5) {
struct display_object_info_table_v1_5 *tbl_v1_5;
To make it acceptable for upstream:
- Both the tbl_v1_4 and tbl_v1_5 code paths must be checked.
- Add a MAX_NUMBER_OF_PATH macro in atomfirmware.h?
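One possible shape for these two points is sketched below. It assumes a shared MAX_NUMBER_OF_PATH macro that sizes the array in both table revisions, plus a single helper called from each code path. All struct and function names are hypothetical models, not the real atomfirmware.h definitions.

```c
#include <stdint.h>

/* Hypothetical shared limit, named after the suggestion above. */
#define MAX_NUMBER_OF_PATH 8

/* Simplified placeholder tables for the two revisions (real layouts differ). */
struct demo_table_v1_4 {
	uint8_t number_of_path;
	uint16_t display_path[MAX_NUMBER_OF_PATH];
};

struct demo_table_v1_5 {
	uint8_t number_of_path;
	uint32_t display_path[MAX_NUMBER_OF_PATH];
};

/*
 * Clamp a firmware-reported path count in place.
 * Returns 1 if the value exceeded MAX_NUMBER_OF_PATH and was clamped,
 * 0 otherwise. Taking a pointer to the count lets both revisions share
 * the same check.
 */
static int demo_clamp_number_of_path(uint8_t *number_of_path)
{
	if (*number_of_path <= MAX_NUMBER_OF_PATH)
		return 0;

	*number_of_path = MAX_NUMBER_OF_PATH;
	return 1;
}
```

Each branch of the revision check would then call the helper on its own table, for example demo_clamp_number_of_path(&tbl->number_of_path), and warn when it returns 1.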
Hardware description:
- GPU: AMD Radeon Pro VII (gfx906), mining special edition
- Type of Display Connection: No Connection, or DisplayPort
System information:
- Gentoo ~amd64
- Kernel version: 6.10.8
How to reproduce the issue:
- Boot Linux
- Check dmesg
Log files (for system lockups / game freezes / crashes)
[ 5.603200] ------------[ cut here ]------------
[ 5.603202] UBSAN: array-index-out-of-bounds in /var/tmp/portage/sys-kernel/gentoo-kernel-6.10.8/work/linux-6.10/drivers/gpu/drm/amd/amdgpu/../display/dc/bios/bios_parser2.c:145:46
[ 5.603206] index 8 is out of range for type 'atom_display_object_path_v2 [8]'
[ 5.603209] CPU: 0 PID: 467 Comm: kworker/0:4 Not tainted 6.10.8-gentoo-dist #1
[ 5.603213] Hardware name: HUANANZHI /X99-F8D, BIOS 5.11 03/22/2023
[ 5.603216] Workqueue: events work_for_cpu_fn
[ 5.603223] Call Trace:
[ 5.603226] <TASK>
[ 5.603228] dump_stack_lvl+0x64/0x80
[ 5.603237] __ubsan_handle_out_of_bounds+0x98/0xd0
[ 5.603243] dal_cmd_table_helper_encoder_id_to_atom2+0x222a/0x32b0 [amdgpu]
[ 5.603455] dc_process_hdcp_msg+0x1ca6/0x2bc0 [amdgpu]
[ 5.603580] dc_create+0x403/0x790 [amdgpu]
[ 5.603703] amdgpu_dm_update_connector_after_detect+0x129e/0x3dc0 [amdgpu]
[ 5.603856] ? phm_wait_for_register_unequal+0x62/0xa0 [amdgpu]
[ 5.604002] ? phm_wait_for_register_unequal+0x62/0xa0 [amdgpu]
[ 5.604153] amdgpu_dm_update_connector_after_detect+0x3732/0x3dc0 [amdgpu]
[ 5.604309] amdgpu_device_init+0x2320/0x2ec0 [amdgpu]
[ 5.604413] amdgpu_driver_load_kms+0x19/0xb0 [amdgpu]
[ 5.604517] amdgpu_drm_ioctl+0x78a/0xfa0 [amdgpu]
[ 5.604619] local_pci_probe+0x45/0xa0
[ 5.604624] work_for_cpu_fn+0x1a/0x30
[ 5.604628] process_one_work+0x17e/0x390
[ 5.604633] worker_thread+0x265/0x380
[ 5.604636] ? __pfx_worker_thread+0x10/0x10
[ 5.604639] kthread+0xd2/0x100
[ 5.604644] ? __pfx_kthread+0x10/0x10
[ 5.604647] ret_from_fork+0x34/0x50
[ 5.604652] ? __pfx_kthread+0x10/0x10
[ 5.604655] ret_from_fork_asm+0x1a/0x30
[ 5.604660] </TASK>
[ 5.604662] ---[ end trace ]---
[ 5.604673] ------------[ cut here ]------------
[ 5.604676] UBSAN: array-index-out-of-bounds in /var/tmp/portage/sys-kernel/gentoo-kernel-6.10.8/work/linux-6.10/drivers/gpu/drm/amd/amdgpu/../display/dc/bios/bios_parser2.c:264:29
[ 5.604680] index 8 is out of range for type 'atom_display_object_path_v2 [8]'
[ 5.604683] CPU: 0 PID: 467 Comm: kworker/0:4 Not tainted 6.10.8-gentoo-dist #1
[ 5.604686] Hardware name: HUANANZHI /X99-F8D, BIOS 5.11 03/22/2023
[ 5.604688] Workqueue: events work_for_cpu_fn
[ 5.604693] Call Trace:
[ 5.604695] <TASK>
[ 5.604697] dump_stack_lvl+0x64/0x80
[ 5.604701] __ubsan_handle_out_of_bounds+0x98/0xd0
[ 5.604704] dal_cmd_table_helper_encoder_id_to_atom2+0xc6e/0x32b0 [amdgpu]
[ 5.604885] link_destroy+0x51c/0xee0 [amdgpu]
[ 5.605066] link_create+0x1e0/0x240 [amdgpu]
[ 5.605245] ? dal_cmd_table_helper_encoder_id_to_atom2+0x222a/0x32b0 [amdgpu]
[ 5.605405] dc_process_hdcp_msg+0x1df1/0x2bc0 [amdgpu]
[ 5.605534] dc_create+0x403/0x790 [amdgpu]
[ 5.605661] amdgpu_dm_update_connector_after_detect+0x129e/0x3dc0 [amdgpu]
[ 5.605819] ? phm_wait_for_register_unequal+0x62/0xa0 [amdgpu]
[ 5.605968] ? phm_wait_for_register_unequal+0x62/0xa0 [amdgpu]
[ 5.606116] amdgpu_dm_update_connector_after_detect+0x3732/0x3dc0 [amdgpu]
[ 5.606270] amdgpu_device_init+0x2320/0x2ec0 [amdgpu]
[ 5.606374] amdgpu_driver_load_kms+0x19/0xb0 [amdgpu]
[ 5.606477] amdgpu_drm_ioctl+0x78a/0xfa0 [amdgpu]
[ 5.606578] local_pci_probe+0x45/0xa0
[ 5.606582] work_for_cpu_fn+0x1a/0x30
[ 5.606586] process_one_work+0x17e/0x390
[ 5.606589] worker_thread+0x265/0x380
[ 5.606592] ? __pfx_worker_thread+0x10/0x10
[ 5.606595] kthread+0xd2/0x100
[ 5.606600] ? __pfx_kthread+0x10/0x10
[ 5.606604] ret_from_fork+0x34/0x50
[ 5.606607] ? __pfx_kthread+0x10/0x10
[ 5.606611] ret_from_fork_asm+0x1a/0x30
[ 5.606615] </TASK>
[ 5.606621] ---[ end trace ]---