In the Linux kernel, the following vulnerability has been resolved:

mm/sparsemem: fix race in accessing memory_section->usage

The race below is observed on a PFN that falls into the device memory region, on a system whose memory configuration lays out the PFNs as [ZONE_NORMAL ZONE_DEVICE ZONE_NORMAL]. Because the start and end PFNs of the normal zone also span the device memory PFNs, compaction, once triggered, tries the device memory PFNs too, although it ends up as a NOP for them (because pfn_to_online_page() returns NULL for ZONE_DEVICE memory sections). If another core removes the section mappings for the ZONE_DEVICE region that contains the PFN compaction is currently operating on, the kernel crashes when CONFIG_SPARSEMEM_VMEMMAP is enabled. The crash logs can be seen at [1].

Step | compact_zone() | memunmap_pages |
---|---|---|
1 | __pageblock_pfn_to_page … | |
2 | (a) pfn_valid(): valid_section() // returns true | |
3 | | (b) __remove_pages() -> sparse_remove_section() -> section_deactivate(): [Free the array ms->usage and set ms->usage = NULL] |
4 | pfn_section_valid() [Access ms->usage, which is NULL] | |

NOTE: From the above, the race reduces to one between pfn_valid()/pfn_section_valid() and section deactivation with SPARSEMEM_VMEMMAP enabled.

Commit b943f045a9af ("mm/sparse: fix kernel crash with pfn_section_valid check") tried to address the same problem by clearing SECTION_HAS_MEM_MAP, expecting valid_section() to return false so that ms->usage is not accessed.

Fix this issue with the steps below:

a) Clear SECTION_HAS_MEM_MAP before freeing ->usage.
b) An RCU-protected read-side critical section will either return NULL when SECTION_HAS_MEM_MAP is cleared, or can successfully access ->usage.
c) Free ->usage with kfree_rcu() and set ms->usage = NULL. No attempt will be made to access ->usage after this, because SECTION_HAS_MEM_MAP is cleared and valid_section() therefore returns false.

Thanks to David/Pavan for their inputs on this patch.

[1] https://lore.kernel.org/linux-mm/[email protected]/
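The ordering in steps a) to c) can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel patch: it is single-threaded, every `model_` name is hypothetical, the flag value is illustrative, and the RCU grace period is simulated by an explicit function call instead of real `kfree_rcu()` machinery.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Illustrative value only; the kernel defines its own flag bits. */
#define MODEL_SECTION_HAS_MEM_MAP 0x2UL

struct model_usage {
    unsigned long subsection_map;   /* stands in for the ms->usage contents */
};

struct model_section {
    unsigned long flags;            /* stands in for the section flag bits */
    struct model_usage *usage;      /* stands in for ms->usage */
    struct model_usage *deferred;   /* stands in for the kfree_rcu() queue */
};

/* Reader side: roughly what valid_section() plus the ->usage access do. */
static struct model_usage *model_reader_enter(const struct model_section *ms)
{
    if (!(ms->flags & MODEL_SECTION_HAS_MEM_MAP))
        return NULL;                /* (b) flag cleared: never touch ->usage */
    return ms->usage;               /* (b) flag set: ->usage is still live */
}

/* Writer side: the ordering described in steps (a) and (c). */
static void model_section_deactivate(struct model_section *ms)
{
    ms->flags &= ~MODEL_SECTION_HAS_MEM_MAP;  /* (a) clear the flag first */
    ms->deferred = ms->usage;                 /* (c) defer the free ... */
    ms->usage = NULL;                         /* (c) ... and drop the pointer */
}

/* Stands in for the end of the grace period: only now is memory released. */
static void model_grace_period_end(struct model_section *ms)
{
    assert(ms->usage == NULL);      /* the pointer was dropped before freeing */
    free(ms->deferred);
    ms->deferred = NULL;
}

/* Returns 0 if the model behaves as the steps describe. */
static int model_run(void)
{
    struct model_section ms = { MODEL_SECTION_HAS_MEM_MAP, NULL, NULL };

    ms.usage = calloc(1, sizeof(*ms.usage));
    if (!ms.usage)
        return -1;
    ms.usage->subsection_map = 0xffUL;

    /* A reader enters before deactivation and keeps the pointer. */
    struct model_usage *held = model_reader_enter(&ms);
    model_section_deactivate(&ms);

    /* The held pointer stays backed by memory until the grace period ends. */
    if (!held || held->subsection_map != 0xffUL)
        return 1;

    /* A reader arriving after the flag is cleared gets NULL. */
    if (model_reader_enter(&ms) != NULL)
        return 2;

    model_grace_period_end(&ms);
    return 0;
}
```

Calling `model_run()` exercises both outcomes named in step b): a reader that entered early still reads valid memory, and a reader that arrives after the flag is cleared never dereferences `->usage`.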
On a Snapdragon SoC with the memory configuration mentioned above, where the PFNs are laid out as [ZONE_NORMAL ZONE_DEVICE ZONE_NORMAL], we see a number of issues daily while testing on a device farm. The log for this particular issue is below. Although the log does not point directly at the ms->usage access in pfn_section_valid(), loading this dump in the Lauterbach T32 tool shows the crash at that access.

[ 540.578056] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 540.578068] Mem abort info:
[ 540.578070] ESR = 0x0000000096000005
[ 540.578073] EC = 0x25: DABT (current EL), IL = 32 bits
[ 540.578077] SET = 0, FnV = 0
[ 540.578080] EA = 0, S1PTW = 0
[ 540.578082] FSC = 0x05: level 1 translation fault
[ 540.578085] Data abort info:
[ 540.578086] ISV = 0, ISS = 0x00000005
[ 540.578088] CM = 0, WnR = 0
[ 540.579431] pstate: 82400005 (Nzcv daif +PAN -UAO +TCO -DIT -SSBS BTYPE=--)
[ 540.579436] pc : __pageblock_pfn_to_page+0x6c/0x14c
[ 540.579454] lr : compact_zone+0x994/0x1058
[ 540.579460] sp : ffffffc03579b510
[ 540.579463] x29: ffffffc03579b510 x28: 0000000000235800 x27:000000000000000c
[ 540.579470] x26: 0000000000235c00 x25: 0000000000000068 x24:ffffffc03579b640
[ 540.579477] x23: 0000000000000001 x22: ffffffc03579b660 x21:0000000000000000
[ 540.579483] x20: 0000000000235bff x19: ffffffdebf7e3940 x18:ffffffdebf66d140
[ 540.579489] x17: 00000000739ba063 x16: 00000000739ba063 x15:00000000009f4bff
[ 540.579495] x14: 0000008000000000 x13: 0000000000000000 x12:0000000000000001
[ 540.579501] x11: 0000000000000000 x10: 0000000000000000 x9 :ffffff897d2cd440
[ 540.579507] x8 : 0000000000000000 x7 : 0000000000000000 x6 :ffffffc03579b5b4
[ 540.579512] x5 : 0000000000027f25 x4 : ffffffc03579b5b8 x3 :0000000000000 [truncated]
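The fault fields printed in the log can be cross-checked by decoding the ESR value by hand. The sketch below assumes the usual AArch64 ESR layout for a data abort (EC in bits [31:26], IL in bit 25, ISS in bits [24:0], with WnR in ISS bit 6 and the fault status code in ISS bits [5:0]). The struct and function names are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

struct esr_fields {
    unsigned ec;    /* exception class, ESR bits [31:26] */
    unsigned il;    /* instruction length bit, ESR bit 25 */
    unsigned iss;   /* instruction specific syndrome, ESR bits [24:0] */
    unsigned wnr;   /* write-not-read, ISS bit 6 (data aborts) */
    unsigned fsc;   /* fault status code, ISS bits [5:0] (data aborts) */
};

/* Split a raw ESR value into the fields the kernel log prints. */
static struct esr_fields decode_esr(uint64_t esr)
{
    struct esr_fields f;

    f.ec  = (unsigned)((esr >> 26) & 0x3f);
    f.il  = (unsigned)((esr >> 25) & 0x1);
    f.iss = (unsigned)(esr & 0x1ffffff);
    f.wnr = (f.iss >> 6) & 0x1;
    f.fsc = f.iss & 0x3f;
    return f;
}

/* Checks the decoder against the values printed in the log. */
static void check_logged_esr(void)
{
    struct esr_fields f = decode_esr(0x0000000096000005ULL);

    assert(f.ec == 0x25);   /* DABT (current EL) */
    assert(f.il == 1);      /* 32-bit instruction */
    assert(f.iss == 0x05);  /* ISS = 0x00000005 */
    assert(f.wnr == 0);     /* the faulting access was a read */
    assert(f.fsc == 0x05);  /* level 1 translation fault */
}
```

With WnR = 0 and a translation fault at virtual address 0, the decoded fields are consistent with a read through a NULL pointer, which matches the described access to ms->usage after it was set to NULL.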
Author | Note |
---|---|
rodrigo-zaiden | The fix for this issue introduces a new issue, CVE-2024-26639. USN-6765-1 for linux-oem-6.5 wrongly stated that this CVE was fixed in version 6.5.0-1022.23. That notice was revoked, and the fix status for linux-oem-6.5 was restored to its previous state. |
OS | OS Version | Architecture | Package | Package Version | Filename |
---|---|---|---|---|---|
ubuntu | 20.04 | noarch | linux | < any | UNKNOWN |
ubuntu | 22.04 | noarch | linux | < 5.15.0-106.116 | UNKNOWN |
ubuntu | 23.10 | noarch | linux | < 6.5.0-41.41 | UNKNOWN |
ubuntu | 20.04 | noarch | linux-aws | < any | UNKNOWN |
ubuntu | 22.04 | noarch | linux-aws | < 5.15.0-1061.67 | UNKNOWN |
ubuntu | 23.10 | noarch | linux-aws | < 6.5.0-1021.21 | UNKNOWN |
ubuntu | 20.04 | noarch | linux-aws-5.15 | < 5.15.0-1061.67~20.04.1 | UNKNOWN |
ubuntu | 22.04 | noarch | linux-aws-6.5 | < any | UNKNOWN |
ubuntu | 20.04 | noarch | linux-azure | < any | UNKNOWN |
ubuntu | 22.04 | noarch | linux-azure | < 5.15.0-1063.72 | UNKNOWN |
git.kernel.org/linus/5ec8e8ea8b7783fab150cf86404fc38cb4db8800 (6.8-rc1)
git.kernel.org/stable/c/3a01daace71b521563c38bbbf874e14c3e58adb7
git.kernel.org/stable/c/5ec8e8ea8b7783fab150cf86404fc38cb4db8800
git.kernel.org/stable/c/68ed9e33324021e9d6b798e9db00ca3093d2012a
git.kernel.org/stable/c/70064241f2229f7ba7b9599a98f68d9142e81a97
git.kernel.org/stable/c/90ad17575d26874287271127d43ef3c2af876cea
git.kernel.org/stable/c/b448de2459b6d62a53892487ab18b7d823ff0529
launchpad.net/bugs/cve/CVE-2023-52489
nvd.nist.gov/vuln/detail/CVE-2023-52489
security-tracker.debian.org/tracker/CVE-2023-52489
ubuntu.com/security/notices/USN-6766-1
ubuntu.com/security/notices/USN-6766-2
ubuntu.com/security/notices/USN-6766-3
ubuntu.com/security/notices/USN-6795-1
ubuntu.com/security/notices/USN-6818-1
ubuntu.com/security/notices/USN-6818-2
ubuntu.com/security/notices/USN-6818-3
ubuntu.com/security/notices/USN-6818-4
ubuntu.com/security/notices/USN-6819-1
ubuntu.com/security/notices/USN-6819-2
ubuntu.com/security/notices/USN-6819-3
ubuntu.com/security/notices/USN-6819-4
ubuntu.com/security/notices/USN-6828-1
www.cve.org/CVERecord?id=CVE-2023-52489