In the Linux kernel, the following vulnerability has been resolved:

drm/amdgpu/fence: Fix oops due to non-matching drm_sched init/fini

Currently amdgpu calls drm_sched_fini() from the fence driver sw fini
routine. That function is expected to be called only after the respective
init function, drm_sched_init(), has executed successfully.

We recently hit a driver probe failure on the Steam Deck, and
drm_sched_fini() was called even though its counterpart had never been
called, causing the following oops:

    amdgpu: probe of 0000:04:00.0 failed with error -110
    BUG: kernel NULL pointer dereference, address: 0000000000000090
    PGD 0 P4D 0
    Oops: 0002 [#1] PREEMPT SMP NOPTI
    CPU: 0 PID: 609 Comm: systemd-udevd Not tainted 6.2.0-rc3-gpiccoli #338
    Hardware name: Valve Jupiter/Jupiter, BIOS F7A0113 11/04/2022
    RIP: 0010:drm_sched_fini+0x84/0xa0 [gpu_sched]
    […]
    Call Trace:
     <TASK>
     amdgpu_fence_driver_sw_fini+0xc8/0xd0 [amdgpu]
     amdgpu_device_fini_sw+0x2b/0x3b0 [amdgpu]
     amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
     devm_drm_dev_init_release+0x49/0x70
     […]

To prevent that, check whether the drm_sched was properly initialized for a
given ring before calling its fini counterpart.

Ideally we’d use sched.ready for that check, since that field is the last
thing set by drm_sched_init(). But amdgpu seems to “override” the meaning
of that field: in the above oops, for example, it was a GFX ring causing
the crash, and sched.ready had been set to true in the ring init routine
regardless of the state of the DRM scheduler. Hence, we ended up using
sched.ops as per Christian’s suggestion [0], and also removed the
no_scheduler check [1].

[0] https://lore.kernel.org/amd-gfx/[email protected]/
[1] https://lore.kernel.org/amd-gfx/[email protected]/
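
The guard described above can be illustrated with a minimal userspace sketch. It is not the kernel patch itself: every `mock_` type and function below is a hypothetical stand-in for the corresponding kernel object, and the `assert()` in `mock_sched_fini()` merely models the NULL pointer dereference seen in the oops. The point it tries to show is that `ops` is only set by a successful init, whereas `ready` may be set independently by the driver, so only `ops` is a trustworthy guard.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct mock_sched_ops { int placeholder; };

struct mock_sched {
    const struct mock_sched_ops *ops; /* non-NULL only after a successful init */
    bool ready;                       /* a driver may set this on its own */
    int *job_queue;                   /* resource that fini touches */
};

static const struct mock_sched_ops mock_ops = { 0 };
static int mock_queue_storage;
static int mock_fini_calls;

/* Models a successful scheduler init: resources first, flags last. */
static void mock_sched_init(struct mock_sched *s)
{
    s->job_queue = &mock_queue_storage;
    s->ops = &mock_ops;
    s->ready = true;
}

/* Models scheduler teardown; it assumes init has already run. */
static void mock_sched_fini(struct mock_sched *s)
{
    assert(s->job_queue != NULL); /* stands in for the NULL dereference */
    *s->job_queue = 0;
    s->ready = false;
    mock_fini_calls++;
}

/* Models the fixed sw fini path: tear down only what was set up. */
static void mock_ring_sw_fini(struct mock_sched *s)
{
    if (s->ops) /* guarding on s->ready instead would be unsafe here */
        mock_sched_fini(s);
}
```

In this sketch, a ring whose `ready` flag is true but whose scheduler was never initialized is skipped by `mock_ring_sw_fini()`, while a properly initialized ring is torn down exactly once.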
OS | OS Version | Architecture | Package | Package Version | Filename |
---|---|---|---|---|---|
ubuntu | 22.04 | noarch | linux | < any | UNKNOWN |
ubuntu | 22.04 | noarch | linux-aws | < any | UNKNOWN |
ubuntu | 20.04 | noarch | linux-aws-5.15 | < any | UNKNOWN |
ubuntu | 22.04 | noarch | linux-azure | < any | UNKNOWN |
ubuntu | 20.04 | noarch | linux-azure-5.15 | < any | UNKNOWN |
ubuntu | 22.04 | noarch | linux-azure-fde | < 5.15.0-1038.45.1 | UNKNOWN |
ubuntu | 20.04 | noarch | linux-azure-fde-5.15 | < 5.15.0-1038.45~20.04.1.1 | UNKNOWN |
ubuntu | 22.04 | noarch | linux-gcp | < any | UNKNOWN |
ubuntu | 20.04 | noarch | linux-gcp-5.15 | < any | UNKNOWN |
ubuntu | 22.04 | noarch | linux-gke | < any | UNKNOWN |
git.kernel.org/linus/5ad7bbf3dba5c4a684338df1f285080f2588b535 (6.2-rc8)
git.kernel.org/stable/c/2bcbbef9cace772f5b7128b11401c515982de34b
git.kernel.org/stable/c/2e557c8ca2c585bdef591b8503ba83b85f5d0afd
git.kernel.org/stable/c/5ad7bbf3dba5c4a684338df1f285080f2588b535
launchpad.net/bugs/cve/CVE-2023-52738
nvd.nist.gov/vuln/detail/CVE-2023-52738
security-tracker.debian.org/tracker/CVE-2023-52738
www.cve.org/CVERecord?id=CVE-2023-52738