Conversation

@kernel-patches-daemon-bpf

Pull request for series with
subject: bpf: Introduce BPF_F_CPU and BPF_F_ALL_CPUS flags for percpu maps
version: 12
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=1027896

@kernel-patches-daemon-bpf

Upstream branch: 5262cb2
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1027896
version: 12

@kernel-patches-review-bot

commit f227d6ba8f6a3b241839089527c81d0466604c4e
Author: Leon Hwang <[email protected]>

bpf: Add BPF_F_CPU and BPF_F_ALL_CPUS flags support for percpu_hash and
lru_percpu_hash maps

This commit adds BPF_F_CPU and BPF_F_ALL_CPUS flag support to percpu_hash
and lru_percpu_hash maps, extending the functionality previously added to
percpu_array maps. The flags allow per-CPU updates and lookups through both
element-wise and batch APIs.

> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 01a99e3a3..f79d2ae27 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h

[ ... ]

> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> index c8a9b27f8..441ff5bc5 100644
> --- a/kernel/bpf/hashtab.c
> +++ b/kernel/bpf/hashtab.c

[ ... ]

> @@ -1249,6 +1258,15 @@ static long htab_lru_map_update_elem(struct bpf_map *map, void *key, void *value
>  	return ret;
>  }
>
> +static int htab_map_check_update_flags(bool onallcpus, u64 map_flags)
> +{
> +	if (unlikely(!onallcpus && map_flags > BPF_EXIST))
> +		return -EINVAL;
> +	if (unlikely(onallcpus && ((map_flags & BPF_F_LOCK) || (u32)map_flags > BPF_F_ALL_CPUS)))
> +		return -EINVAL;
> +	return 0;
> +}
> +

Can htab_map_check_update_flags() correctly validate the new flags?

When onallcpus=false (the common case for normal map_update_elem
operations via htab_percpu_map_update_elem()->
htab_map_update_elem_in_place()), the check at the first if statement
compares the full 64-bit map_flags value against BPF_EXIST (2).

When BPF_F_CPU (8) is set, or when the CPU number is embedded in the
upper 32 bits (e.g., 0x0000000100000008 for CPU 1), map_flags will be
greater than BPF_EXIST, causing this function to return -EINVAL.

This appears inconsistent with the equivalent check in
bpf_percpu_array_update() (commit 51f40f83b), which uses:

  if (unlikely((map_flags & BPF_F_LOCK) || (u32)map_flags > BPF_F_ALL_CPUS))

The percpu_array check casts to u32 to examine only the lower 32 bits,
allowing the CPU number in the upper 32 bits. The hash table check for
onallcpus=false doesn't perform this cast, so it will reject valid flag
combinations that include BPF_F_CPU.

Would this prevent the BPF_F_CPU functionality from working for normal
percpu_hash update operations?

> @@ -1262,9 +1280,9 @@ static long htab_map_update_elem_in_place(struct bpf_map *map, void *key,
>  	u32 key_size, hash;
>  	int ret;
>
> -	if (unlikely(map_flags > BPF_EXIST))
> -		/* unknown flags */
> -		return -EINVAL;
> +	ret = htab_map_check_update_flags(onallcpus, map_flags);
> +	if (unlikely(ret))
> +		return ret;
>
>  	WARN_ON_ONCE(!bpf_rcu_lock_held());
>

[ ... ]

> @@ -1324,9 +1342,9 @@ static long __htab_lru_percpu_map_update_elem(struct bpf_map *map, void *key,
>  	u32 key_size, hash;
>  	int ret;
>
> -	if (unlikely(map_flags > BPF_EXIST))
> -		/* unknown flags */
> -		return -EINVAL;
> +	ret = htab_map_check_update_flags(onallcpus, map_flags);
> +	if (unlikely(ret))
> +		return ret;

[ ... ]



AI reviewed your patch. Please fix the bug or reply by email explaining why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

In-Reply-To-Subject: bpf: Add BPF_F_CPU and BPF_F_ALL_CPUS flags support for percpu_hash and lru_percpu_hash maps
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/19708012130

@kernel-patches-review-bot

commit 6d8f6c801060f2691310ca6532b5ad6a30021594
Author: Leon Hwang <[email protected]>

bpf: Add BPF_F_CPU and BPF_F_ALL_CPUS flags support for percpu_cgroup_storage maps

This commit introduces BPF_F_CPU and BPF_F_ALL_CPUS flag support for
percpu_cgroup_storage maps, allowing per-CPU lookups and updates.

> diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h
> index aedf573bd..013f4db99 100644
> --- a/include/linux/bpf-cgroup.h
> +++ b/include/linux/bpf-cgroup.h

[ ... ]

> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index f79d2ae27..9e756db5e 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h

[ ... ]

> diff --git a/kernel/bpf/local_storage.c b/kernel/bpf/local_storage.c
> index 2ab4b60ff..1ccbf28b2 100644
> --- a/kernel/bpf/local_storage.c
> +++ b/kernel/bpf/local_storage.c
> @@ -180,7 +180,7 @@ static long cgroup_storage_update_elem(struct bpf_map *map, void *key,
>  }
>
>  int bpf_percpu_cgroup_storage_copy(struct bpf_map *_map, void *key,
> -				   void *value)
> +				   void *value, u64 map_flags)
>  {
>  	struct bpf_cgroup_storage_map *map = map_to_storage(_map);
>  	struct bpf_cgroup_storage *storage;
> @@ -198,11 +198,17 @@ int bpf_percpu_cgroup_storage_copy(struct bpf_map *_map, void *key,
>  	 * access 'value_size' of them, so copying rounded areas
>  	 * will not leak any kernel data
>  	 */
> +	if (map_flags & BPF_F_CPU) {
> +		cpu = map_flags >> 32;
> +		copy_map_value(_map, value, per_cpu_ptr(storage->percpu_buf, cpu));
> +		goto unlock;
> +	}

Should check_and_init_map_value() be called here after copy_map_value()?

The similar implementations in bpf_percpu_array_copy() and
bpf_percpu_hash_copy() both call check_and_init_map_value() after
copying a single CPU's value when BPF_F_CPU is set:

  bpf_percpu_array_copy():
    if (map_flags & BPF_F_CPU) {
        cpu = map_flags >> 32;
        copy_map_value(map, value, per_cpu_ptr(pptr, cpu));
        check_and_init_map_value(map, value);
        goto unlock;
    }

  bpf_percpu_hash_copy():
    if (map_flags & BPF_F_CPU) {
        cpu = map_flags >> 32;
        copy_map_value(map, value, per_cpu_ptr(pptr, cpu));
        check_and_init_map_value(map, value);
        goto out;
    }

The check_and_init_map_value() call initializes special BTF fields
(timers, kptrs, etc.) when copying map values to userspace. Without it,
these fields might not be properly initialized if percpu_cgroup_storage
map values contain such fields.

>  	size = round_up(_map->value_size, 8);
>  	for_each_possible_cpu(cpu) {
>  		copy_map_value_long(_map, value + off, per_cpu_ptr(storage->percpu_buf, cpu));
>  		off += size;
>  	}
> +unlock:
>  	rcu_read_unlock();
>  	return 0;
>  }

[ ... ]



AI reviewed your patch. Please fix the bug or reply by email explaining why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

In-Reply-To-Subject: bpf: Add BPF_F_CPU and BPF_F_ALL_CPUS flags support for percpu_cgroup_storage maps
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/19708012130

@kernel-patches-review-bot

commit 51f40f83b013ec90531e5d14a849755d40774bb0
Author: Leon Hwang <[email protected]>

bpf: Add BPF_F_CPU and BPF_F_ALL_CPUS flags support for percpu_array maps

This commit introduces support for BPF_F_CPU and BPF_F_ALL_CPUS flags in
percpu_array maps, enabling per-CPU and broadcast update/lookup operations
via the map syscall APIs.

> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index d84af3719..01a99e3a3 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h

[ ... ]

> diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
> index 1eeb31c5b..67e9e811d 100644
> --- a/kernel/bpf/arraymap.c
> +++ b/kernel/bpf/arraymap.c

[ ... ]

> @@ -398,10 +405,11 @@ int bpf_percpu_array_update(struct bpf_map *map, void *key, void *value,
>  	struct bpf_array *array = container_of(map, struct bpf_array, map);
>  	u32 index = *(u32 *)key;
>  	void __percpu *pptr;
> -	int cpu, off = 0;
> +	void *ptr, *val;
>  	u32 size;
> +	int cpu;
>
> -	if (unlikely(map_flags > BPF_EXIST))
> +	if (unlikely((map_flags & BPF_F_LOCK) || (u32)map_flags > BPF_F_ALL_CPUS))
>  		/* unknown flags */
>  		return -EINVAL;
>
> @@ -409,7 +417,7 @@ int bpf_percpu_array_update(struct bpf_map *map, void *key, void *value,
>  		/* all elements were pre-allocated, cannot insert a new one */
>  		return -E2BIG;
>
> -	if (unlikely(map_flags & BPF_NOEXIST))
> +	if (unlikely(map_flags == BPF_NOEXIST))
                              ^^

Does changing this from bitwise-AND to equality break flag combination
handling? The old code checked (map_flags & BPF_NOEXIST), which would
trigger for any flag combination including BPF_NOEXIST. The new code only
checks (map_flags == BPF_NOEXIST), so if userspace passes
(BPF_F_CPU | BPF_NOEXIST), the BPF_NOEXIST flag gets silently ignored
instead of returning -EEXIST as before.

While all elements in percpu_array maps are pre-allocated and BPF_NOEXIST
should always fail anyway, silently ignoring a flag the user explicitly set
seems inconsistent with the principle of rejecting invalid flag combinations
at validation time.

>  		/* all elements already exist */
>  		return -EEXIST;
>

[ ... ]



AI reviewed your patch. Please fix the bug or reply by email explaining why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

In-Reply-To-Subject: bpf: Add BPF_F_CPU and BPF_F_ALL_CPUS flags support for percpu_array maps
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/19708012130

@kernel-patches-daemon-bpf

Forwarding comment 3581755475 via email
In-Reply-To: [email protected]
Patch: https://patchwork.kernel.org/project/netdevbpf/patch/[email protected]/

@kernel-patches-daemon-bpf

Forwarding comment 3581759340 via email
In-Reply-To: [email protected]
Patch: https://patchwork.kernel.org/project/netdevbpf/patch/[email protected]/

@kernel-patches-daemon-bpf

Forwarding comment 3581762109 via email
In-Reply-To: [email protected]
Patch: https://patchwork.kernel.org/project/netdevbpf/patch/[email protected]/

@kernel-patches-daemon-bpf

Upstream branch: 688b745
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1027896
version: 12

Introduce BPF_F_CPU and BPF_F_ALL_CPUS flags and check them for the
following APIs:

* 'map_lookup_elem()'
* 'map_update_elem()'
* 'generic_map_lookup_batch()'
* 'generic_map_update_batch()'

Also, determine the correct value size for these APIs.

Acked-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Leon Hwang <[email protected]>
…maps

Introduce support for the BPF_F_ALL_CPUS flag in percpu_array maps to
allow updating values for all CPUs with a single value for both
update_elem and update_batch APIs.

Introduce support for the BPF_F_CPU flag in percpu_array maps to allow:

* updating the value for a specified CPU via both the update_elem and
update_batch APIs.
* looking up the value for a specified CPU via both the lookup_elem and
lookup_batch APIs.

The BPF_F_CPU flag is passed via:

* map_flags of lookup_elem and update_elem APIs along with embedded cpu
info.
* elem_flags of lookup_batch and update_batch APIs along with embedded
cpu info.

Signed-off-by: Leon Hwang <[email protected]>
…nd lru_percpu_hash maps

Introduce BPF_F_ALL_CPUS flag support for percpu_hash and lru_percpu_hash
maps to allow updating values for all CPUs with a single value for both
update_elem and update_batch APIs.

Introduce BPF_F_CPU flag support for percpu_hash and lru_percpu_hash
maps to allow:

* updating the value for a specified CPU via both the update_elem and
update_batch APIs.
* looking up the value for a specified CPU via both the lookup_elem and
lookup_batch APIs.

The BPF_F_CPU flag is passed via:

* map_flags along with embedded cpu info.
* elem_flags along with embedded cpu info.

Signed-off-by: Leon Hwang <[email protected]>
…ge maps

Copy map values using 'copy_map_value_long()', keeping the style
consistent with other percpu maps.

No functional change intended.

Signed-off-by: Leon Hwang <[email protected]>
@kernel-patches-daemon-bpf

Upstream branch: bd5bdd2
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1027896
version: 12

…_storage maps

Introduce BPF_F_ALL_CPUS flag support for percpu_cgroup_storage maps to
allow updating values for all CPUs with a single value for update_elem
API.

Introduce BPF_F_CPU flag support for percpu_cgroup_storage maps to
allow:

* updating the value for a specified CPU via the update_elem API.
* looking up the value for a specified CPU via the lookup_elem API.

The BPF_F_CPU flag is passed via map_flags along with embedded cpu info.

Signed-off-by: Leon Hwang <[email protected]>
Add libbpf support for the BPF_F_CPU flag for percpu maps by embedding the
cpu info into the high 32 bits of:

1. **flags**: bpf_map_lookup_elem_flags(), bpf_map__lookup_elem(),
   bpf_map_update_elem() and bpf_map__update_elem()
2. **opts->elem_flags**: bpf_map_lookup_batch() and
   bpf_map_update_batch()

The flag may instead be BPF_F_ALL_CPUS, but BPF_F_CPU and
BPF_F_ALL_CPUS must not be combined.

Behavior:

* If the flag is BPF_F_ALL_CPUS, the update is applied across all CPUs.
* If the flag is BPF_F_CPU, the update applies only to the specified CPU.
* If the flag is BPF_F_CPU, the lookup reads only from the specified CPU.
* Lookup does not support BPF_F_ALL_CPUS.

Acked-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Leon Hwang <[email protected]>
Add test coverage for the new BPF_F_CPU and BPF_F_ALL_CPUS flags support
in percpu maps. The following APIs are exercised:

* bpf_map_update_batch()
* bpf_map_lookup_batch()
* bpf_map_update_elem()
* bpf_map__update_elem()
* bpf_map_lookup_elem_flags()
* bpf_map__lookup_elem()

For lru_percpu_hash map, set max_entries to
'libbpf_num_possible_cpus() + 1' and only use the first
'libbpf_num_possible_cpus()' entries. This ensures a spare entry is always
available in the LRU free list, avoiding eviction.

When updating an existing key in lru_percpu_hash map:

1. l_new = prealloc_lru_pop();  /* Borrow from free list */
2. l_old = lookup_elem_raw();   /* Found, key exists */
3. pcpu_copy_value();           /* In-place update */
4. bpf_lru_push_free();         /* Return l_new to free list */

Also add negative tests to verify that non-percpu array and hash maps
reject the BPF_F_CPU and BPF_F_ALL_CPUS flags.

Signed-off-by: Leon Hwang <[email protected]>