-
Notifications
You must be signed in to change notification settings - Fork 10
[LTS 9.4] nfsd: don't ignore the return code of svc_proc_register() #564
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
PlaidCat
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this makes sense to me
![]()
bmastbergen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🥌
jira VULN-64896 cve CVE-2025-22026 commit-author Jeff Layton <[email protected]> commit 930b64c upstream-diff | nfsd underwent considerable architectural changes related to the exposition of network stats in the user space since `ciqlts9_4' branched off, which are assumed by the upstream fix 930b64c to be in place - see patches d98416c, 93483ac, 4b14885, e41ee44, 16fb980. This backport addresses the core of the issue without pulling in all of these changes, which is checking the value returned by `svc_proc_register'. Currently, nfsd_proc_stat_init() ignores the return value of svc_proc_register(). If the procfile creation fails, then the kernel will WARN when it tries to remove the entry later. Fix nfsd_proc_stat_init() to return the same type of pointer as svc_proc_register(), and fix up nfsd_net_init() to check that and fail the nfsd_net construction if it occurs. svc_proc_register() can fail if the dentry can't be allocated, or if an identical dentry already exists. The second case is pretty unlikely in the nfsd_net construction codepath, so if this happens, return -ENOMEM. Reported-by: [email protected] Closes: https://lore.kernel.org/linux-nfs/[email protected]/ Cc: [email protected] # v6.9 Signed-off-by: Jeff Layton <[email protected]> Signed-off-by: Chuck Lever <[email protected]> (cherry picked from commit 930b64c) Signed-off-by: Marcin Wcisło <[email protected]>
3e4c5d5 to
3d32877
Compare
|
kselftests–ciqlts94-CVE-2025-22026–fix–run1.log Same results, except for Module loading: |
bmastbergen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🥌
JIRA: https://issues.redhat.com/browse/RHEL-115358 CVE: CVE-2025-38267 commit 4fc78a7 Author: Steven Rostedt <[email protected]> Date: Wed May 28 12:15:55 2025 -0400 ring-buffer: Do not trigger WARN_ON() due to a commit_overrun When reading a memory mapped buffer the reader page is just swapped out with the last page written in the write buffer. If the reader page is the same as the commit buffer (the buffer that is currently being written to) it was assumed that it should never have missed events. If it does, it triggers a WARN_ON_ONCE(). But there just happens to be one scenario where this can legitimately happen. That is on a commit_overrun. A commit overrun is when an interrupt preempts an event being written to the buffer and then the interrupt adds so many new events that it fills and wraps the buffer back to the commit. Any new events would then be dropped and be reported as "missed_events". In this case, the next page to read is the commit buffer and after the swap of the reader page, the reader page will be the commit buffer, but this time there will be missed events and this triggers the following warning: ------------[ cut here ]------------ WARNING: CPU: 2 PID: 1127 at kernel/trace/ring_buffer.c:7357 ring_buffer_map_get_reader+0x49a/0x780 Modules linked in: kvm_intel kvm irqbypass CPU: 2 UID: 0 PID: 1127 Comm: trace-cmd Not tainted 6.15.0-rc7-test-00004-g478bc2824b45-dirty #564 PREEMPT Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 RIP: 0010:ring_buffer_map_get_reader+0x49a/0x780 Code: 00 00 00 48 89 fe 48 c1 ee 03 80 3c 2e 00 0f 85 ec 01 00 00 4d 3b a6 a8 00 00 00 0f 85 8a fd ff ff 48 85 c0 0f 84 55 fe ff ff <0f> 0b e9 4e fe ff ff be 08 00 00 00 4c 89 54 24 58 48 89 54 24 50 RSP: 0018:ffff888121787dc0 EFLAGS: 00010002 RAX: 00000000000006a2 RBX: ffff888100062800 RCX: ffffffff8190cb49 RDX: ffff888126934c00 RSI: 1ffff11020200a15 RDI: ffff8881010050a8 RBP: dffffc0000000000 R08: 0000000000000000 R09: ffffed1024d26982 R10: ffff888126934c17 R11: ffff8881010050a8 R12: ffff888126934c00 R13: ffff8881010050b8 R14: ffff888101005000 R15: ffff888126930008 FS: 00007f95c8cd7540(0000) GS:ffff8882b576e000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f95c8de4dc0 CR3: 0000000128452002 CR4: 0000000000172ef0 Call Trace: <TASK> ? __pfx_ring_buffer_map_get_reader+0x10/0x10 tracing_buffers_ioctl+0x283/0x370 __x64_sys_ioctl+0x134/0x190 do_syscall_64+0x79/0x1c0 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7f95c8de48db Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1c 48 8b 44 24 18 64 48 2b 04 25 28 00 00 RSP: 002b:00007ffe037ba110 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007ffe037bb2b0 RCX: 00007f95c8de48db RDX: 0000000000000000 RSI: 0000000000005220 RDI: 0000000000000006 RBP: 00007ffe037ba180 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00007ffe037bb6f8 R14: 00007f95c9065000 R15: 00005575c7492c90 </TASK> irq event stamp: 5080 hardirqs last enabled at (5079): [<ffffffff83e0adb0>] _raw_spin_unlock_irqrestore+0x50/0x70 hardirqs last disabled at (5080): [<ffffffff83e0aa83>] _raw_spin_lock_irqsave+0x63/0x70 softirqs last enabled at (4182): [<ffffffff81516122>] handle_softirqs+0x552/0x710 softirqs last disabled at (4159): [<ffffffff815163f7>] __irq_exit_rcu+0x107/0x210 ---[ end trace 0000000000000000 ]--- The above was triggered by running on a kernel with both lockdep and KASAN as well as kmemleak enabled and executing the following command: # perf record -o perf-test.dat -a -- trace-cmd record --nosplice -e all -p function hackbench 50 With perf interjecting a lot of interrupts and trace-cmd enabling all events as well as function tracing, with lockdep, KASAN and kmemleak enabled, it could cause an interrupt preempting an event being written to add enough events to wrap the buffer. trace-cmd was modified to have --nosplice use mmap instead of reading the buffer. The way to differentiate this case from the normal case of there only being one page written to where the swap of the reader page received that one page (which is the commit page), check if the tail page is on the reader page. The difference between the commit page and the tail page is that the tail page is where new writes go to, and the commit page holds the first write that hasn't been committed yet. In the case of an interrupt preempting the write of an event and filling the buffer, it would move the tail page but not the commit page. Have the warning only trigger if the tail page is also on the reader page, and also print out the number of events dropped by a commit overrun as that can not yet be safely added to the page so that the reader can see there were events dropped. Cc: [email protected] Cc: Mathieu Desnoyers <[email protected]> Cc: Vincent Donnefort <[email protected]> Link: https://lore.kernel.org/[email protected] Fixes: fe832be ("ring-buffer: Have mmapped ring buffer keep track of missed events") Reviewed-by: Masami Hiramatsu (Google) <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]> Signed-off-by: Jerome Marchand <[email protected]>
[LTS 9.4]
CVE-2025-22026
VULN-64896
Problem
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=930b64ca0c511521f0abdd1d57ce52b2a6e3476b
Affected: yes
The affected
nfsdmodule is enabled withCONFIG_NFSDwhich ismfor all config variantsThe return value of
svc_proc_register(…)function is clearly not being checked:kernel-src-tree/fs/nfsd/stats.c
Line 129 in 3713f88
Solution
Cherry-picking the mainline fix 930b64c is meaningless, as the code underwent considerable architectural changes related to the exposition of network stats in the user space since
ciqlts9_4branched off. Picking them as "cve-pre" would be impractical and risky, could also affect the/procinterface. It was chosen instead to apply a change equivalent to 930b64c in meaning but fittingciqlts9_4codebase. As a result it differs completely from the upstream.nfsd_stat_init(…)instead ofnfsd_proc_stat_init(…). It's logically the same function, but was renamed in 93483ac. Observe the evolution ofnfsd_stat_init(…)betweenciqlts9_4andkernel-mainlinefor the full context, with the last change coming from the upstream CVE-2025-22026 fix:kernel-src-tree/fs/nfsd/stats.c
Lines 121 to 132 in 3713f88
kernel-src-tree/fs/nfsd/stats.c
Lines 121 to 124 in 93483ac
kernel-src-tree/fs/nfsd/stats.c
Lines 118 to 123 in 16fb980
kernel-src-tree/fs/nfsd/stats.c
Lines 76 to 81 in 930b64c
intreturn value ofnfsd_stat_init(…)was left unchanged, unlike thevoid→struct proc_dir_entry *change in the upstream. The function in the older version already returns a status code so it could have been utilized. This nullified any changes to the header filefs/nfsd/stats.hfound in the upstream.-ENOMEMreturn code in case ofsvc_proc_register(…)call failure was retained, based on the justification given in the upstream fix which remained valid:nfsd_net_init(…)function offs/nfsd/nfsctl.cin casenfsd_proc_stat_init(…)failed: 930b64c#diff-540a099ea05e266ccb0e9fa515be1e018baf46cef98f0f169e509fa43e1b4878R2228-R2229In
ciqlts9_4this is not necessary, becausenfsd_stat_init(…)is not even used insidenfsd_net_init(…). Instead it's called ininit_nfsd()(and only there) in a way which already covers a jump to an appropriate cleanup routine whennfsd_stat_init(…)fails:kernel-src-tree/fs/nfsd/nfsctl.c
Lines 1578 to 1580 in 3713f88
As a result there are no changes to the
fs/nfsd/nfsctl.cfile, unlike in the upstream.kABI check: passed
Boot test: passed
boot-test.log
Kselftests: passed relative
Coverage
bpf(onlytest_lpm_map,test_lru_map,test_sock,test_verifier,test_cgroup_storage,test_tag,test_tcpnotify_user,test_sysctl),breakpoints(onlybreakpoint_test),capabilities,clone3,cpu-hotplug,cpufreq,drivers/dma-buf,drivers/net/bonding(all exceptbond_macvlan.sh),drivers/net/team,exec,filesystems/binderfs,filesystems/epoll,firmware,fpu,ftrace,futex,gpio,intel_pstate,iommu,ipc,ir,kcmp,kexec,kvm,landlock,lib,livepatch,membarrier,memfd,memory-hotplug,mincore,mount,mqueue,nci,net/forwarding(all exceptsch_red.sh,sch_ets.sh,sch_tbf_ets.sh,vxlan_bridge_1d_ipv6.sh,ip6gre_inner_v6_multipath.sh,mirror_gre_vlan_bridge_1q.sh,router_bridge_lag.sh,mirror_gre_bridge_1d_vlan.sh,bridge_igmp.sh,sch_tbf_root.sh,tc_police.sh,q_in_vni.sh,dual_vxlan_bridge.sh,ipip_hier_gre_keys.sh,gre_inner_v6_multipath.sh,sch_tbf_prio.sh,tc_actions.sh,router_bridge_1d_lag.sh),net/hsr,net/mptcp(all exceptuserspace_pm.sh,simult_flows.sh,mptcp_join.sh),net(all exceptreuseport_addr_any.sh,srv6_end_dt6_l3vpn_test.sh,srv6_end_flavors_test.sh,udpgro_fwd.sh,udpgso_bench.sh,fib_nexthops.sh,ip_defrag.sh,reuseaddr_conflict,srv6_end_dt46_l3vpn_test.sh,srv6_end_dt4_l3vpn_test.sh,gro.sh,xfrm_policy.sh,txtimestamp.sh),netfilter(all exceptnft_trans_stress.sh),nsfs,pid_namespace,pidfd,proc(all exceptproc-pid-vm,proc-uptime-001),pstore,ptrace,rlimits,rseq,seccomp,sgx,sigaltstack,size,splice,static_keys,syscall_user_dispatch,tc-testing,tdx,timens,timers,tmpfs,tpm2,tty,vDSO,x86,zramReference
kselftests–ciqlts9_4–run1.log
Patch
kselftests–ciqlts9_4-CVE-2025-22026–run1.log
kselftests–ciqlts9_4-CVE-2025-22026–run2.log
Comparison
The tests results between the reference and patched kernel are the same
Specific tests: passed
Considering that the change deals with module's initialization routine a simple loading of the
nfsdmodule constitutes a reasonable patch test, at least of the success branch.The nfsd statistics implemented by the modified file are available in
/proc/net/rpc/nfsd: