CVE-2024-41010 bpf: Fix too early release of tcx_entry

2024-07-17 06:10:12
Linux
github.com
linux kernel
vulnerability fix
bpf
tcx_entry
use after free
uaf
ingress qdisc
clsact qdisc
tc block
network namespace
tcf block
chain0
clsact destroy

AI Score: 6.5
Confidence: Low

SSVC
Exploitation: none
Automatable: no
Technical Impact: partial

In the Linux kernel, the following vulnerability has been resolved:

bpf: Fix too early release of tcx_entry

Pedro Pinto, and later independently Hyunwoo Kim and Wongi Lee, reported
that the tcx_entry can be released too early, leading to a use-after-free
(UAF) when an active old-style ingress or clsact qdisc with a shared tc
block is later replaced by another ingress or clsact instance.

One example sequence that triggers the UAF is as follows:

  1. A network namespace is created

  2. An ingress qdisc is created. This allocates a tcx_entry, and
    &tcx_entry->miniq is stored in the qdisc’s miniqp->p_miniq. At the
    same time, a tcf block with index 1 is created.

  3. chain0 is attached to the tcf block. chain0 must be connected to
    the block linked to the ingress qdisc to later reach the function
    tcf_chain0_head_change_cb_del() which triggers the UAF.

  4. A clsact qdisc is created and grafted. This causes the ingress qdisc
    created in step 2 to be removed, thus freeing the previously linked
    tcx_entry:

    rtnetlink_rcv_msg()
      => tc_modify_qdisc()
        => qdisc_create()
          => clsact_init() [a]
        => qdisc_graft()
          => qdisc_destroy()
            => __qdisc_destroy()
              => ingress_destroy() [b]
                => tcx_entry_free()
                  => kfree_rcu() // tcx_entry freed

  5. Finally, the network namespace is closed. This registers the
    cleanup_net worker, and during the process of releasing the
    remaining clsact qdisc, it accesses the tcx_entry that was
    already freed in step 4, causing the UAF to occur:

    cleanup_net()
      => ops_exit_list()
        => default_device_exit_batch()
          => unregister_netdevice_many()
            => unregister_netdevice_many_notify()
              => dev_shutdown()
                => qdisc_put()
                  => clsact_destroy() [c]
                    => tcf_block_put_ext()
                      => tcf_chain0_head_change_cb_del()
                        => tcf_chain_head_change_item()
                          => clsact_chain_head_change()
                            => mini_qdisc_pair_swap() // UAF

There are also other variants. The gist is to add an ingress (or clsact)
qdisc with a specific shared block, then replace that qdisc, wait for the
tcx_entry kfree_rcu() to be executed, and subsequently access the
currently active qdisc’s miniq one way or another.
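
The lifetime problem in steps 2, 4 and 5 can be sketched with a small userspace model. This is not kernel code: every type and function name below (flag_entry, flag_dev, flag_qdisc, flag_replay and so on) is hypothetical, attached BPF programs are ignored, and the entry is only marked as freed instead of being released, so that the stale access can be observed without undefined behaviour. The cached pointer stands in for the qdisc's p_miniq.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for tcx_entry with the pre-fix boolean flag. */
struct flag_entry {
    bool miniq_active; /* one flag shared by every qdisc on the device */
    bool freed;        /* set instead of calling free(), so a stale access
                          can be detected safely */
};

/* Hypothetical device: owns the storage and the current entry pointer. */
struct flag_dev {
    struct flag_entry slot;
    struct flag_entry *entry; /* NULL while no entry is allocated */
};

/* Hypothetical qdisc: caches a pointer into the entry, like p_miniq. */
struct flag_qdisc {
    struct flag_entry *cached;
};

/* Init: fetch the existing entry or create one, then mark it active. */
static void flag_qdisc_init(struct flag_dev *dev, struct flag_qdisc *q)
{
    assert(dev && q);
    if (!dev->entry) {
        dev->slot.freed = false;
        dev->entry = &dev->slot;
    }
    dev->entry->miniq_active = true;
    q->cached = dev->entry;
}

/* Destroy: clear the flag and release the entry if it is now inactive.
 * Returns true if the cached entry had already been released. */
static bool flag_qdisc_destroy(struct flag_dev *dev, struct flag_qdisc *q)
{
    struct flag_entry *e = q->cached;
    bool stale = e->freed;

    e->miniq_active = false;
    if (!e->miniq_active) { /* always true: a flag cannot count users */
        e->freed = true;
        dev->entry = NULL;
    }
    return stale;
}

/* Replays steps 2, 4 and 5; returns true if the last destroy was stale. */
static bool flag_replay(void)
{
    struct flag_dev dev = {0};
    struct flag_qdisc ingress, clsact;

    flag_qdisc_init(&dev, &ingress);          /* step 2 */
    flag_qdisc_init(&dev, &clsact);           /* step 4, [a] */
    flag_qdisc_destroy(&dev, &ingress);       /* step 4, [b]: entry released */
    return flag_qdisc_destroy(&dev, &clsact); /* step 5, [c] */
}
```

In this model flag_replay() reports a stale access, because the destroy at [b] clears the only flag while the second qdisc still holds a pointer into the entry.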

The fix turns the miniq_active boolean into a counter. At step 2 above,
the counter transitions from 0->1, at [a] from 1->2 (so that the miniq
object remains active during the replacement), at [b] from 2->1, and
finally at [c] from 1->0 with the eventual release. The reference counter
generally stays within [0,2], and it does not need to be atomic since all
access to the counter is protected by the rtnl mutex. With this in place,
the UAF no longer occurs and the tcx_entry is freed at the correct time.
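
The counter transitions described above can be checked with a minimal userspace sketch. It is an illustration under stated assumptions, not the kernel patch: all names (count_entry, count_dev, count_qdisc, count_replay) are hypothetical, attached BPF programs are ignored, and the entry is marked as freed rather than actually released so that a stale access would be detectable. No locking is modelled, mirroring the statement that the rtnl mutex serializes all access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for tcx_entry with miniq_active as a counter. */
struct count_entry {
    unsigned int miniq_active; /* number of qdiscs using the entry */
    bool freed;                /* set instead of calling free() */
};

/* Hypothetical device: owns the storage and the current entry pointer. */
struct count_dev {
    struct count_entry slot;
    struct count_entry *entry; /* NULL while no entry is allocated */
};

/* Hypothetical qdisc: caches a pointer into the entry, like p_miniq. */
struct count_qdisc {
    struct count_entry *cached;
};

/* Init: fetch the existing entry or create one, then take a reference. */
static void count_qdisc_init(struct count_dev *dev, struct count_qdisc *q)
{
    if (!dev->entry) {
        dev->slot.miniq_active = 0;
        dev->slot.freed = false;
        dev->entry = &dev->slot;
    }
    dev->entry->miniq_active++;
    q->cached = dev->entry;
}

/* Destroy: drop one reference; release only when the last user is gone.
 * Returns true if the cached entry had already been released. */
static bool count_qdisc_destroy(struct count_dev *dev, struct count_qdisc *q)
{
    struct count_entry *e = q->cached;
    bool stale = e->freed;

    assert(e->miniq_active > 0);
    e->miniq_active--;
    if (e->miniq_active == 0) {
        e->freed = true;
        dev->entry = NULL;
    }
    return stale;
}

/* Replays step 2, [a], [b] and [c], recording the counter after each.
 * Returns true if any destroy touched an already released entry. */
static bool count_replay(unsigned int seen[4])
{
    struct count_dev dev = {0};
    struct count_qdisc ingress, clsact;
    bool stale = false;

    count_qdisc_init(&dev, &ingress);                     /* step 2 */
    seen[0] = dev.slot.miniq_active;
    count_qdisc_init(&dev, &clsact);                      /* [a] */
    seen[1] = dev.slot.miniq_active;
    stale = count_qdisc_destroy(&dev, &ingress) || stale; /* [b] */
    seen[2] = dev.slot.miniq_active;
    stale = count_qdisc_destroy(&dev, &clsact) || stale;  /* [c] */
    seen[3] = dev.slot.miniq_active;
    return stale;
}
```

Calling count_replay() fills seen with the counter values after step 2, [a], [b] and [c]; in this model the entry is released only at [c], so no destroy observes a stale entry.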
