Commit Graph

1413286 Commits

Author SHA1 Message Date
Jakub Kicinski
869f3f7f1f Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2026-01-20 (ice, idpf)

For ice:
Cody Haas breaks dependency of needing both RSS key and LUT for
ice_get_rxfh() as ethtool ioctls do not always supply both.

Paul fixes issues related to devlink reload; adding missing deinit HW
call and moving hwmon exit function to the proper call chain.

For idpf:
Mina Almasry moves a register read call into the time sandwich to ensure
the register is properly flushed.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  idpf: read lower clock bits inside the time sandwich
  ice: fix devlink reload call trace
  ice: add missing ice_deinit_hw() in devlink reinit path
  ice: Fix persistent failure in ice_get_rxfh
====================

Link: https://patch.msgid.link/20260120224430.410377-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-21 19:54:21 -08:00
Vladimir Oltean
dfca045cd4 net: dsa: fix off-by-one in maximum bridge ID determination
Prior to the blamed commit, the bridge_num range was from
0 to ds->max_num_bridges - 1. After the commit, it is from
1 to ds->max_num_bridges.

So this check:
	if (bridge_num >= max)
		return 0;
must be updated to:
	if (bridge_num > max)
		return 0;

in order to allow the last bridge_num value (==max) to be used.

This is easiest visible when a driver sets ds->max_num_bridges=1.
The observed behaviour is that even the first created bridge triggers
the netlink extack "Range of offloadable bridges exceeded" warning, and
is handled in software rather than being offloaded.

Fixes: 3f9bb0301d ("net: dsa: make dp->bridge_num one-based")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20260120211039.3228999-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-21 19:52:29 -08:00
Justin Chen
bbb11b8d75 net: bcmasp: Fix network filter wake for asp-3.0
We need to apply the tx_chan_offset to the netfilter cfg channel or the
output channel will be incorrect for asp-3.0 and newer.

Fixes: e9f31435ee ("net: bcmasp: Add support for asp-v3.0")
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20260120192339.2031648-1-justin.chen@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-21 19:29:52 -08:00
Eric Dumazet
5f9b329096 bonding: provide a net pointer to __skb_flow_dissect()
After 3cbf4ffba5 ("net: plumb network namespace into __skb_flow_dissect")
we have to provide a net pointer to __skb_flow_dissect(),
either via skb->dev, skb->sk, or a user provided pointer.

In the following case, syzbot was able to cook a bare skb.

WARNING: net/core/flow_dissector.c:1131 at __skb_flow_dissect+0xb57/0x68b0 net/core/flow_dissector.c:1131, CPU#1: syz.2.1418/11053
Call Trace:
 <TASK>
  bond_flow_dissect drivers/net/bonding/bond_main.c:4093 [inline]
  __bond_xmit_hash+0x2d7/0xba0 drivers/net/bonding/bond_main.c:4157
  bond_xmit_hash_xdp drivers/net/bonding/bond_main.c:4208 [inline]
  bond_xdp_xmit_3ad_xor_slave_get drivers/net/bonding/bond_main.c:5139 [inline]
  bond_xdp_get_xmit_slave+0x1fd/0x710 drivers/net/bonding/bond_main.c:5515
  xdp_master_redirect+0x13f/0x2c0 net/core/filter.c:4388
  bpf_prog_run_xdp include/net/xdp.h:700 [inline]
  bpf_test_run+0x6b2/0x7d0 net/bpf/test_run.c:421
  bpf_prog_test_run_xdp+0x795/0x10e0 net/bpf/test_run.c:1390
  bpf_prog_test_run+0x2c7/0x340 kernel/bpf/syscall.c:4703
  __sys_bpf+0x562/0x860 kernel/bpf/syscall.c:6182
  __do_sys_bpf kernel/bpf/syscall.c:6274 [inline]
  __se_sys_bpf kernel/bpf/syscall.c:6272 [inline]
  __x64_sys_bpf+0x7c/0x90 kernel/bpf/syscall.c:6272
  do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
  do_syscall_64+0xec/0xf80 arch/x86/entry/syscall_64.c:94

Fixes: 58deb77cc5 ("bonding: balance ICMP echoes in layer3+4 mode")
Reported-by: syzbot+c46409299c70a221415e@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/696faa23.050a0220.4cb9c.001f.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Matteo Croce <mcroce@redhat.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20260120161744.1893263-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-21 19:27:56 -08:00
Taehee Yoo
04708606fd selftests: net: amt: wait longer for connection before sending packets
Both send_mcast4() and send_mcast6() use sleep 2 to wait for the tunnel
connection between the gateway and the relay, and for the listener
socket to be created in the LISTENER namespace.

However, tests sometimes fail because packets are sent before the
connection is fully established.

Increase the waiting time to make the tests more reliable, and use
wait_local_port_listen() to explicitly wait for the listener socket.

Fixes: c08e8baea7 ("selftests: add amt interface selftest script")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Link: https://patch.msgid.link/20260120133930.863845-1-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-21 19:20:30 -08:00
Andrey Vatoropin
8215794403 be2net: Fix NULL pointer dereference in be_cmd_get_mac_from_list
When the parameter pmac_id_valid argument of be_cmd_get_mac_from_list() is
set to false, the driver may request the PMAC_ID from the firmware of the
network card, and this function will store that PMAC_ID at the provided
address pmac_id. This is the contract of this function.

However, there is a location within the driver where both
pmac_id_valid == false and pmac_id == NULL are being passed. This could
result in dereferencing a NULL pointer.

To resolve this issue, it is necessary to pass the address of a stub
variable to the function.

Fixes: 95046b927a ("be2net: refactor MAC-addr setup code")
Signed-off-by: Andrey Vatoropin <a.vatoropin@crpt.ru>
Link: https://patch.msgid.link/20260120113734.20193-1-a.vatoropin@crpt.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-21 19:18:57 -08:00
Slark Xiao
2030c4358b Revert "net: wwan: mhi_wwan_mbim: Avoid -Wflex-array-member-not-at-end warning"
This reverts commit eeecf5d3a3.

This change lead to MHI WWAN device can't connect to internet.
I found a netwrok issue with kernel 6.19-rc4, but network works
well with kernel 6.18-rc1. After checking, this commit is the
root cause.

Before appliing this serial changes on MHI WWAN network, we shall
revert this change in case of v6.19 being impacted.

Fixes: eeecf5d3a3 ("net: wwan: mhi_wwan_mbim: Avoid -Wflex-array-member-not-at-end warning")
Signed-off-by: Slark Xiao <slark_xiao@163.com>
Link: https://patch.msgid.link/20260120072018.29375-1-slark_xiao@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-21 19:03:37 -08:00
Jeongjun Park
ba1096c315 netrom: fix double-free in nr_route_frame()
In nr_route_frame(), old_skb is immediately freed without checking if
nr_neigh->ax25 pointer is NULL. Therefore, if nr_neigh->ax25 is NULL,
the caller function will free old_skb again, causing a double-free bug.

Therefore, to prevent this, we need to modify it to check whether
nr_neigh->ax25 is NULL before freeing old_skb.

Cc: <stable@vger.kernel.org>
Reported-by: syzbot+999115c3bf275797dc27@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/69694d6f.050a0220.58bed.0029.GAE@google.com/
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
Link: https://patch.msgid.link/20260119063359.10604-1-aha310510@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-20 19:15:40 -08:00
Hariprasad Kelam
cdf8de9c6b Octeontx2-pf: Update xdp features
In recent testing, verification of XDP_REDIRECT and zero-copy features
failed because the driver is not setting the corresponding feature flags.

Fixes: efabce2901 ("octeontx2-pf: AF_XDP zero copy receive support")
Fixes: 66c0e13ad2 ("drivers: net: turn on XDP features")
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Link: https://patch.msgid.link/20260119100222.2267925-1-hkelam@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-20 18:59:57 -08:00
Laurent Vivier
c7159e960f usbnet: limit max_mtu based on device's hard_mtu
The usbnet driver initializes net->max_mtu to ETH_MAX_MTU before calling
the device's bind() callback. When the bind() callback sets
dev->hard_mtu based the device's actual capability (from CDC Ethernet's
wMaxSegmentSize descriptor), max_mtu is never updated to reflect this
hardware limitation).

This allows userspace (DHCP or IPv6 RA) to configure MTU larger than the
device can handle, leading to silent packet drops when the backend sends
packet exceeding the device's buffer size.

Fix this by limiting net->max_mtu to the device's hard_mtu after the
bind callback returns.

See https://gitlab.com/qemu-project/qemu/-/issues/3268 and
    https://bugs.passt.top/attachment.cgi?bugid=189

Fixes: f77f0aee4d ("net: use core MTU range checking in USB NIC drivers")
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Link: https://bugs.passt.top/show_bug.cgi?id=189
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Link: https://patch.msgid.link/20260119075518.2774373-1-lvivier@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-20 18:58:54 -08:00
Eric Dumazet
9a063f96d8 ipv6: annotate data-race in ndisc_router_discovery()
syzbot found that ndisc_router_discovery() could read and write
in6_dev->ra_mtu without holding a lock [1]

This looks fine, IFLA_INET6_RA_MTU is best effort.

Add READ_ONCE()/WRITE_ONCE() to document the race.

Note that we might also reject illegal MTU values
(mtu < IPV6_MIN_MTU || mtu > skb->dev->mtu) in a future patch.

[1]
BUG: KCSAN: data-race in ndisc_router_discovery / ndisc_router_discovery

read to 0xffff888119809c20 of 4 bytes by task 25817 on cpu 1:
  ndisc_router_discovery+0x151d/0x1c90 net/ipv6/ndisc.c:1558
  ndisc_rcv+0x2ad/0x3d0 net/ipv6/ndisc.c:1841
  icmpv6_rcv+0xe5a/0x12f0 net/ipv6/icmp.c:989
  ip6_protocol_deliver_rcu+0xb2a/0x10d0 net/ipv6/ip6_input.c:438
  ip6_input_finish+0xf0/0x1d0 net/ipv6/ip6_input.c:489
  NF_HOOK include/linux/netfilter.h:318 [inline]
  ip6_input+0x5e/0x140 net/ipv6/ip6_input.c:500
  ip6_mc_input+0x27c/0x470 net/ipv6/ip6_input.c:590
  dst_input include/net/dst.h:474 [inline]
  ip6_rcv_finish+0x336/0x340 net/ipv6/ip6_input.c:79
...

write to 0xffff888119809c20 of 4 bytes by task 25816 on cpu 0:
  ndisc_router_discovery+0x155a/0x1c90 net/ipv6/ndisc.c:1559
  ndisc_rcv+0x2ad/0x3d0 net/ipv6/ndisc.c:1841
  icmpv6_rcv+0xe5a/0x12f0 net/ipv6/icmp.c:989
  ip6_protocol_deliver_rcu+0xb2a/0x10d0 net/ipv6/ip6_input.c:438
  ip6_input_finish+0xf0/0x1d0 net/ipv6/ip6_input.c:489
  NF_HOOK include/linux/netfilter.h:318 [inline]
  ip6_input+0x5e/0x140 net/ipv6/ip6_input.c:500
  ip6_mc_input+0x27c/0x470 net/ipv6/ip6_input.c:590
  dst_input include/net/dst.h:474 [inline]
  ip6_rcv_finish+0x336/0x340 net/ipv6/ip6_input.c:79
...

value changed: 0x00000000 -> 0xe5400659

Fixes: 49b99da2c9 ("ipv6: add IFLA_INET6_RA_MTU to expose mtu value")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Rocco Yue <rocco.yue@mediatek.com>
Link: https://patch.msgid.link/20260118152941.2563857-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-20 18:37:45 -08:00
Eric Dumazet
8175dbf174 mISDN: annotate data-race around dev->work
dev->work can re read locklessly in mISDN_read()
and mISDN_poll(). Add READ_ONCE()/WRITE_ONCE() annotations.

BUG: KCSAN: data-race in mISDN_ioctl / mISDN_read

write to 0xffff88812d848280 of 4 bytes by task 10864 on cpu 1:
  misdn_add_timer drivers/isdn/mISDN/timerdev.c:175 [inline]
  mISDN_ioctl+0x2fb/0x550 drivers/isdn/mISDN/timerdev.c:233
  vfs_ioctl fs/ioctl.c:51 [inline]
  __do_sys_ioctl fs/ioctl.c:597 [inline]
  __se_sys_ioctl+0xce/0x140 fs/ioctl.c:583
  __x64_sys_ioctl+0x43/0x50 fs/ioctl.c:583
  x64_sys_call+0x14b0/0x3000 arch/x86/include/generated/asm/syscalls_64.h:17
  do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
  do_syscall_64+0xd8/0x2c0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

read to 0xffff88812d848280 of 4 bytes by task 10857 on cpu 0:
  mISDN_read+0x1f2/0x470 drivers/isdn/mISDN/timerdev.c:112
  do_loop_readv_writev fs/read_write.c:847 [inline]
  vfs_readv+0x3fb/0x690 fs/read_write.c:1020
  do_readv+0xe7/0x210 fs/read_write.c:1080
  __do_sys_readv fs/read_write.c:1165 [inline]
  __se_sys_readv fs/read_write.c:1162 [inline]
  __x64_sys_readv+0x45/0x50 fs/read_write.c:1162
  x64_sys_call+0x2831/0x3000 arch/x86/include/generated/asm/syscalls_64.h:20
  do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
  do_syscall_64+0xd8/0x2c0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

value changed: 0x00000000 -> 0x00000001

Fixes: 1b2b03f8e5 ("Add mISDN core files")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260118132528.2349573-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-20 18:37:41 -08:00
Jiawen Wu
3d778e65b4 net: txgbe: remove the redundant data return in SW-FW mailbox
For these two firmware mailbox commands, in txgbe_test_hostif() and
txgbe_set_phy_link_hostif(), there is no need to read data from the
buffer.

Under the current setting, OEM firmware will cause the driver to fail to
probe. Because OEM firmware returns more link information, with a larger
OEM structure txgbe_hic_ephy_getlink. However, the current driver does
not support the OEM function. So just fix it in the way that does not
involve reading the returned data.

Fixes: d84a3ff9aa ("net: txgbe: Restrict the use of mismatched FW versions")
Cc: stable@vger.kernel.org
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Link: https://patch.msgid.link/2914AB0BC6158DDA+20260119065935.6015-1-jiawenwu@trustnetic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-20 18:35:56 -08:00
Jakub Kicinski
9d11e974f5 Merge branch 'fix-some-bugs-in-the-flow-director-of-hns3-driver'
Jijie Shao says:

====================
fix some bugs in the flow director of HNS3 driver

This patchset fixes two bugs in the flow director:
1. Incorrect definition of HCLGE_FD_AD_COUNTER_NUM_M
2. Incorrect assignment of HCLGE_FD_AD_NXT_KEY
====================

Link: https://patch.msgid.link/20260119132840.410513-1-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-20 18:31:42 -08:00
Jijie Shao
f87e034d16 net: hns3: fix the HCLGE_FD_AD_NXT_KEY error setting issue
Use next_input_key instead of counter_id to set HCLGE_FD_AD_NXT_KEY.

Fixes: 1173286802 ("net: hns3: Add input key and action config support for flow director")
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Link: https://patch.msgid.link/20260119132840.410513-3-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-20 18:31:40 -08:00
Jijie Shao
d57c67c956 net: hns3: fix wrong GENMASK() for HCLGE_FD_AD_COUNTER_NUM_M
HCLGE_FD_AD_COUNTER_NUM_M should be at GENMASK(19, 13),
rather than at GENMASK(20, 13), because bit 20 is
HCLGE_FD_AD_NXT_STEP_B.

This patch corrects the wrong definition.

Fixes: 1173286802 ("net: hns3: Add input key and action config support for flow director")
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Link: https://patch.msgid.link/20260119132840.410513-2-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-20 18:31:40 -08:00
Russell King (Oracle)
5228e9faae net: stmmac: fix resume: calculate tso last_segment
Tao Wang reports that sometimes, after resume, stmmac can watchdog:
NETDEV WATCHDOG: CPU: x: transmit queue x timed out xx ms

When this occurs, the DMA transmit descriptors contain:
eth0: 221 [0x0000000876d10dd0]: 0x73660cbe 0x8 0x42 0xb04416a0
eth0: 222 [0x0000000876d10de0]: 0x77731d40 0x8 0x16a0 0x90000000

where descriptor 221 is the TSO header and 222 is the TSO payload.
tdes3 for descriptor 221 (0xb04416a0) has both bit 29 (first
descriptor) and bit 28 (last descriptor) set, which is incorrect.
The following packet also has bit 28 set, but isn't marked as a
first descriptor, and this causes the transmit DMA to stall.

This occurs because stmmac_tso_allocator() populates the first
descriptor, but does not set .last_segment correctly. There are two
places where this matters: one is later in stmmac_tso_xmit() where
we use it to update the TSO header descriptor. The other is in the
ring/chain mode clean_desc3() which is a performance optimisation.

Rather than using tx_q->tx_skbuff_dma[].last_segment to determine
whether the first descriptor entry is the only segment, calculate the
number of descriptor entries used. If there is only one descriptor,
then the first is also the last, so mark it as such.

Further work will be necessary to either eliminate .last_segment
entirely or set it correctly. Code analysis also indicates that a
similar issue exists with .is_jumbo. These will be the subject of
a future patch.

Reported-by: Tao Wang <tao03.wang@horizon.auto>
Fixes: c2837423cb ("net: stmmac: Rework TX Coalesce logic")
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vhq8O-00000005N5s-0Ke5@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-20 18:31:13 -08:00
David Yang
302e5b481c be2net: fix data race in be_get_new_eqd
In be_get_new_eqd(), statistics of pkts, protected by u64_stats_sync, are
read and accumulated in ignorance of possible u64_stats_fetch_retry()
events. Before the commit in question, these statistics were retrieved
one by one directly from queues. Fix this by reading them into temporary
variables first.

Fixes: 2094777041 ("be2net: set interrupt moderation for Skyhawk-R using EQ-DB")
Signed-off-by: David Yang <mmyangfl@gmail.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20260119153440.1440578-1-mmyangfl@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-20 18:30:02 -08:00
David Yang
5fbe395cd1 idpf: Fix data race in idpf_net_dim
In idpf_net_dim(), some statistics protected by u64_stats_sync, are read
and accumulated in ignorance of possible u64_stats_fetch_retry() events.
The correct way to copy statistics is already illustrated by
idpf_add_queue_stats(). Fix this by reading them into temporary variables
first.

Fixes: c2d548cad1 ("idpf: add TX splitq napi poll support")
Fixes: 3a8845af66 ("idpf: add RX splitq napi poll support")
Signed-off-by: David Yang <mmyangfl@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260119162720.1463859-1-mmyangfl@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-20 18:29:37 -08:00
David Yang
748a81c8ce net: hns3: fix data race in hns3_fetch_stats
In hns3_fetch_stats(), ring statistics, protected by u64_stats_sync, are
read and accumulated in ignorance of possible u64_stats_fetch_retry()
events. These statistics are already accumulated by
hns3_ring_stats_update(). Fix this by reading them into a temporary
buffer first.

Fixes: b20d7fe51e ("net: hns3: add some statitics info to tx process")
Signed-off-by: David Yang <mmyangfl@gmail.com>
Link: https://patch.msgid.link/20260119160759.1455950-1-mmyangfl@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-20 18:29:09 -08:00
Michel Lind
a917cd0a23 tools/net/ynl: Makefile's install target now installs ynltool
This tool is built by default, but was not being installed by default
when running `make install`. Fix this by calling ynltool's install
target.

Signed-off-by: Michel Lind <michel@michel-slm.name>
Link: https://patch.msgid.link/aWqr9gUT4hWZwwcI@mbp-m3-fedora.vm
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-20 17:10:50 -08:00
Krzysztof Kozlowski
a9f470594c nfc: MAINTAINERS: Orphan the NFC and look for new maintainers
NFC stack in Linux is in poor shape, with several bugs being discovered
last years via fuzzing, not much new development happening and limited
review and testing.  It requires some more effort than drive-by reviews
I have been offering last one or two years.

I don't have much time nor business interests to keep looking at NFC,
so let's drop me from the maintainers to clearly indicate that more
hands are needed.

Acked-by: Mark Greer <mgreer@animalcreek.com>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Link: https://patch.msgid.link/20260118094822.10126-2-krzysztof.kozlowski@oss.qualcomm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-20 16:05:38 -08:00
Daniel Golle
79912b256e net: phy: intel-xway: fix OF node refcount leakage
Automated review spotted am OF node reference count leakage when
checking if the 'leds' child node exists.

Call of_put_node() to correctly maintain the refcount.

Link: https://netdev-ai.bots.linux.dev/ai-review.html?id=20f173ba-0c64-422b-a663-fea4b4ad01d0
Fixes: 1758af47b9 ("net: phy: intel-xway: add support for PHY LEDs")
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Link: https://patch.msgid.link/e3275e1c1cdca7e6426bb9c11f33bd84b8d900c8.1768783208.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-20 15:57:21 -08:00
Mina Almasry
bdfc7b55ad idpf: read lower clock bits inside the time sandwich
PCIe reads need to be done inside the time sandwich because PCIe
writes may get buffered in the PCIe fabric and posted to the device
after the _postts completes. Doing the PCIe read inside the time
sandwich guarantees that the write gets flushed before the _postts
timestamp is taken.

Cc: lrizzo@google.com
Cc: namangulati@google.com
Cc: willemb@google.com
Cc: intel-wired-lan@lists.osuosl.org
Cc: milena.olech@intel.com
Cc: jacob.e.keller@intel.com

Fixes: 5cb8805d23 ("idpf: negotiate PTP capabilities and get PTP clock")
Suggested-by: Shachar Raindel <shacharr@google.com>
Signed-off-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-01-20 12:55:34 -08:00
Paul Greenwalt
d3f867e7a0 ice: fix devlink reload call trace
Commit 4da71a77fc ("ice: read internal temperature sensor") introduced
internal temperature sensor reading via HWMON. ice_hwmon_init() was added
to ice_init_feature() and ice_hwmon_exit() was added to ice_remove(). As a
result if devlink reload is used to reinit the device and then the driver
is removed, a call trace can occur.

BUG: unable to handle page fault for address: ffffffffc0fd4b5d
Call Trace:
 string+0x48/0xe0
 vsnprintf+0x1f9/0x650
 sprintf+0x62/0x80
 name_show+0x1f/0x30
 dev_attr_show+0x19/0x60

The call trace repeats approximately every 10 minutes when system
monitoring tools (e.g., sadc) attempt to read the orphaned hwmon sysfs
attributes that reference freed module memory.

The sequence is:
1. Driver load, ice_hwmon_init() gets called from ice_init_feature()
2. Devlink reload down, flow does not call ice_remove()
3. Devlink reload up, ice_hwmon_init() gets called from
   ice_init_feature() resulting in a second instance
4. Driver unload, ice_hwmon_exit() called from ice_remove() leaving the
   first hwmon instance orphaned with dangling pointer

Fix this by moving ice_hwmon_exit() from ice_remove() to
ice_deinit_features() to ensure proper cleanup symmetry with
ice_hwmon_init().

Fixes: 4da71a77fc ("ice: read internal temperature sensor")
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-01-20 12:55:34 -08:00
Paul Greenwalt
42fb5f3deb ice: add missing ice_deinit_hw() in devlink reinit path
devlink-reload results in ice_init_hw failed error, and then removing
the ice driver causes a NULL pointer dereference.

[  +0.102213] ice 0000:ca:00.0: ice_init_hw failed: -16
...
[  +0.000001] Call Trace:
[  +0.000003]  <TASK>
[  +0.000006]  ice_unload+0x8f/0x100 [ice]
[  +0.000081]  ice_remove+0xba/0x300 [ice]

Commit 1390b8b3d2 ("ice: remove duplicate call to ice_deinit_hw() on
error paths") removed ice_deinit_hw() from ice_deinit_dev(). As a result
ice_devlink_reinit_down() no longer calls ice_deinit_hw(), but
ice_devlink_reinit_up() still calls ice_init_hw(). Since the control
queues are not uninitialized, ice_init_hw() fails with -EBUSY.

Add ice_deinit_hw() to ice_devlink_reinit_down() to correspond with
ice_init_hw() in ice_devlink_reinit_up().

Fixes: 1390b8b3d2 ("ice: remove duplicate call to ice_deinit_hw() on error paths")
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-01-20 12:55:34 -08:00
Cody Haas
f406220eb8 ice: Fix persistent failure in ice_get_rxfh
Several ioctl functions have the ability to call ice_get_rxfh, however
all of these ioctl functions do not provide all of the expected
information in ethtool_rxfh_param. For example, ethtool_get_rxfh_indir does
not provide an rss_key. This previously caused ethtool_get_rxfh_indir to
always fail with -EINVAL.

This change draws inspiration from i40e_get_rss to handle this
situation, by only calling the appropriate rss helpers when the
necessary information has been provided via ethtool_rxfh_param.

Fixes: b66a972abb ("ice: Refactor ice_set/get_rss into LUT and key specific functions")
Signed-off-by: Cody Haas <chaas@riotgames.com>
Closes: https://lore.kernel.org/intel-wired-lan/CAH7f-UKkJV8MLY7zCdgCrGE55whRhbGAXvgkDnwgiZ9gUZT7_w@mail.gmail.com/
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-01-20 12:55:34 -08:00
Yun Lu
b97d5eedf4 netdevsim: fix a race issue related to the operation on bpf_bound_progs list
The netdevsim driver lacks a protection mechanism for operations on the
bpf_bound_progs list. When the nsim_bpf_create_prog() performs
list_add_tail, it is possible that nsim_bpf_destroy_prog() is
simultaneously performs list_del. Concurrent operations on the list may
lead to list corruption and trigger a kernel crash as follows:

[  417.290971] kernel BUG at lib/list_debug.c:62!
[  417.290983] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[  417.290992] CPU: 10 PID: 168 Comm: kworker/10:1 Kdump: loaded Not tainted 6.19.0-rc5 #1
[  417.291003] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[  417.291007] Workqueue: events bpf_prog_free_deferred
[  417.291021] RIP: 0010:__list_del_entry_valid_or_report+0xa7/0xc0
[  417.291034] Code: a8 ff 0f 0b 48 89 fe 48 89 ca 48 c7 c7 48 a1 eb ae e8 ed fb a8 ff 0f 0b 48 89 fe 48 89 c2 48 c7 c7 80 a1 eb ae e8 d9 fb a8 ff <0f> 0b 48 89 d1 48 c7 c7 d0 a1 eb ae 48 89 f2 48 89 c6 e8 c2 fb a8
[  417.291040] RSP: 0018:ffffb16a40807df8 EFLAGS: 00010246
[  417.291046] RAX: 000000000000006d RBX: ffff8e589866f500 RCX: 0000000000000000
[  417.291051] RDX: 0000000000000000 RSI: ffff8e59f7b23180 RDI: ffff8e59f7b23180
[  417.291055] RBP: ffffb16a412c9000 R08: 0000000000000000 R09: 0000000000000003
[  417.291059] R10: ffffb16a40807c80 R11: ffffffffaf9edce8 R12: ffff8e594427ac20
[  417.291063] R13: ffff8e59f7b44780 R14: ffff8e58800b7a05 R15: 0000000000000000
[  417.291074] FS:  0000000000000000(0000) GS:ffff8e59f7b00000(0000) knlGS:0000000000000000
[  417.291079] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  417.291083] CR2: 00007fc4083efe08 CR3: 00000001c3626006 CR4: 0000000000770ee0
[  417.291088] PKRU: 55555554
[  417.291091] Call Trace:
[  417.291096]  <TASK>
[  417.291103]  nsim_bpf_destroy_prog+0x31/0x80 [netdevsim]
[  417.291154]  __bpf_prog_offload_destroy+0x2a/0x80
[  417.291163]  bpf_prog_dev_bound_destroy+0x6f/0xb0
[  417.291171]  bpf_prog_free_deferred+0x18e/0x1a0
[  417.291178]  process_one_work+0x18a/0x3a0
[  417.291188]  worker_thread+0x27b/0x3a0
[  417.291197]  ? __pfx_worker_thread+0x10/0x10
[  417.291207]  kthread+0xe5/0x120
[  417.291214]  ? __pfx_kthread+0x10/0x10
[  417.291221]  ret_from_fork+0x31/0x50
[  417.291230]  ? __pfx_kthread+0x10/0x10
[  417.291236]  ret_from_fork_asm+0x1a/0x30
[  417.291246]  </TASK>

Add a mutex lock, to prevent simultaneous addition and deletion operations
on the list.

Fixes: 31d3ad8329 ("netdevsim: add bpf offload support")
Reported-by: Yinhao Hu <dddddd@hust.edu.cn>
Reported-by: Kaiyan Mei <M202472210@hust.edu.cn>
Signed-off-by: Yun Lu <luyun@kylinos.cn>
Link: https://patch.msgid.link/20260116095308.11441-1-luyun_611@163.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-01-20 17:18:52 +01:00
Michal Luczaj
5d54aa40c7 vsock/test: Do not filter kallsyms by symbol type
Blamed commit implemented logic to discover available vsock transports by
grepping /proc/kallsyms for known symbols. It incorrectly filtered entries
by type 'd'.

For some kernel configs having

    CONFIG_VIRTIO_VSOCKETS=m
    CONFIG_VSOCKETS_LOOPBACK=y

kallsyms reports

    0000000000000000 d virtio_transport	[vmw_vsock_virtio_transport]
    0000000000000000 t loopback_transport

Overzealous filtering might have affected vsock test suit, resulting in
insufficient/misleading testing.

Do not filter symbols by type. It never helped much.

Fixes: 3070c05b7a ("vsock/test: Introduce get_transports()")
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20260116-vsock_test-kallsyms-grep-v1-1-3320bc3346f2@rbox.co
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-01-20 16:45:39 +01:00
Jakub Kicinski
58bae918d7 Merge branch 'net-sched-teql-enforce-hierarchy-placement'
Jamal Hadi Salim says:

====================
net/sched: teql: Enforce hierarchy placement

GangMin Kim <km.kim1503@gmail.com> managed to create a UAF on qfq by inserting
teql as a child qdisc and exploiting a qlen sync issue.
teql is not intended to be used as a child qdisc. Lets enforce that rule in
patch #1. Although patch #1 fixes the issue, we prevent another potential qlen
exploit in qfq in patch #2 by enforcing the child's active status is not
determined by inspecting the qlen. In patch #3 we add a tdc test case.
====================

Link: https://patch.msgid.link/20260114160243.913069-1-jhs@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-19 12:06:45 -08:00
Victor Nogueira
2460f31e6e selftests/tc-testing: Try to add teql as a child qdisc
Add a selftest that attempts to add a teql qdisc as a qfq child.
Since teql _must_ be added as a root qdisc, the kernel should reject
this.

Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Link: https://patch.msgid.link/20260114160243.913069-4-jhs@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-19 12:06:42 -08:00
Jamal Hadi Salim
d837fbee92 net/sched: qfq: Use cl_is_active to determine whether class is active in qfq_rm_from_ag
This is more of a preventive patch to make the code more consistent and
to prevent possible exploits that employ child qlen manipulations on qfq.
use cl_is_active instead of relying on the child qdisc's qlen to determine
class activation.

Fixes: 462dbc9101 ("pkt_sched: QFQ Plus: fair-queueing service at DRR cost")
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20260114160243.913069-3-jhs@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-19 12:06:41 -08:00
Jamal Hadi Salim
50da4b9d07 net/sched: Enforce that teql can only be used as root qdisc
Design intent of teql is that it is only supposed to be used as root qdisc.
We need to check for that constraint.

Although not important, I will describe the scenario that unearthed this
issue for the curious.

GangMin Kim <km.kim1503@gmail.com> managed to concot a scenario as follows:

ROOT qdisc 1:0 (QFQ)
  ├── class 1:1 (weight=15, lmax=16384) netem with delay 6.4s
  └── class 1:2 (weight=1, lmax=1514) teql

GangMin sends a packet which is enqueued to 1:1 (netem).
Any invocation of dequeue by QFQ from this class will not return a packet
until after 6.4s. In the meantime, a second packet is sent and it lands on
1:2. teql's enqueue will return success and this will activate class 1:2.
Main issue is that teql only updates the parent visible qlen (sch->q.qlen)
at dequeue. Since QFQ will only call dequeue if peek succeeds (and teql's
peek always returns NULL), dequeue will never be called and thus the qlen
will remain as 0. With that in mind, when GangMin updates 1:2's lmax value,
the qfq_change_class calls qfq_deact_rm_from_agg. Since the child qdisc's
qlen was not incremented, qfq fails to deactivate the class, but still
frees its pointers from the aggregate. So when the first packet is
rescheduled after 6.4 seconds (netem's delay), a dangling pointer is
accessed causing GangMin's causing a UAF.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: GangMin Kim <km.kim1503@gmail.com>
Tested-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20260114160243.913069-2-jhs@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-19 12:06:41 -08:00
Jakub Kicinski
3ddf44626a Merge tag 'linux-can-fixes-for-6.19-20260116' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:

====================
pull-request: can 2026-01-16

The first patch is by me and sets the missing CAN device default
capabilities in the CAN device layer.

The next patch is by me, target the gs_usb driver and adds the missing
unanchor URB on usb_submit_urb() error.

The last 5 patches are also from me and fix the same USB-URB leak (as
in the gs_usb driver) in the affected CAN-USB driver: ems_usb,
esd_usb, kvaser_usb, mcba_usb and usb_8dev.

* tag 'linux-can-fixes-for-6.19-20260116' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
  can: usb_8dev: usb_8dev_read_bulk_callback(): fix URB memory leak
  can: mcba_usb: mcba_usb_read_bulk_callback(): fix URB memory leak
  can: kvaser_usb: kvaser_usb_read_bulk_callback(): fix URB memory leak
  can: esd_usb: esd_usb_read_bulk_callback(): fix URB memory leak
  can: ems_usb: ems_usb_read_bulk_callback(): fix URB memory leak
  can: gs_usb: gs_usb_receive_bulk_callback(): unanchor URL on usb_submit_urb() error
  can: dev: alloc_candev_mqs(): add missing default CAN capabilities
====================

Link: https://patch.msgid.link/20260116200323.366877-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-19 10:19:33 -08:00
Alok Tiwari
ab9b218a15 octeontx2: cn10k: fix RX flowid TCAM mask handling
The RX flowid programming initializes the TCAM mask to all ones, but
then overwrites it when clearing the MAC DA mask bits. This results
in losing the intended initialization and may affect other match fields.

Update the code to clear the MAC DA bits using an AND operation, making
the handling of mask[0] consistent with mask[1], where the field-specific
bits are cleared after initializing the mask to ~0ULL.

Fixes: 57d00d4364 ("octeontx2-pf: mcs: Match macsec ethertype along with DMAC")
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com>
Link: https://patch.msgid.link/20260116164724.2733511-1-alok.a.tiwari@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-19 10:12:10 -08:00
David Howells
2c28769a51 rxrpc: Fix recvmsg() unconditional requeue
If rxrpc_recvmsg() fails because MSG_DONTWAIT was specified but the call at
the front of the recvmsg queue already has its mutex locked, it requeues
the call - whether or not the call is already queued.  The call may be on
the queue because MSG_PEEK was also passed and so the call was not dequeued
or because the I/O thread requeued it.

The unconditional requeue may then corrupt the recvmsg queue, leading to
things like UAFs or refcount underruns.

Fix this by only requeuing the call if it isn't already on the queue - and
moving it to the front if it is already queued.  If we don't queue it, we
have to put the ref we obtained by dequeuing it.

Also, MSG_PEEK doesn't dequeue the call so shouldn't call
rxrpc_notify_socket() for the call if we didn't use up all the data on the
queue, so fix that also.

Fixes: 540b1c48c3 ("rxrpc: Fix deadlock between call creation and sendmsg/recvmsg")
Reported-by: Faith <faith@zellic.io>
Reported-by: Pumpkin Chang <pumpkin@devco.re>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Marc Dionne <marc.dionne@auristor.com>
cc: Nir Ohfeld <niro@wiz.io>
cc: Willy Tarreau <w@1wt.eu>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
cc: stable@kernel.org
Link: https://patch.msgid.link/95163.1768428203@warthog.procyon.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-19 10:07:06 -08:00
Jakub Kicinski
7b8e1a807c Merge branch 'ipvlan-addrs_lock-made-per-port'
Dmitry Skorodumov says:

====================
ipvlan: addrs_lock made per port

First patch fixes a rather minor issues that sometimes
ipvlan-addrs are modified without lock (because
for IPv6 addr can be sometimes added without RTNL)
====================

Link: https://patch.msgid.link/20260112142417.4039566-1-skorodumov.dmitry@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-19 10:03:34 -08:00
Dmitry Skorodumov
8becfe16e4 selftests: net: simple selftest for ipvtap
This is a simple ipvtap test to test handling
IP-address add/remove on ipvlan interface.

It creates a veth-interface and then creates several
network-namespace with ipvlan0 interface in it linked to veth.

Then it starts to add/remove addresses on ipvlan0 interfaces
in several threads.

At finish, it checks that there is no duplicated addresses.

Signed-off-by: Dmitry Skorodumov <skorodumov.dmitry@huawei.com>
Link: https://patch.msgid.link/20260112142417.4039566-3-skorodumov.dmitry@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-19 10:03:31 -08:00
Dmitry Skorodumov
d3ba321624 ipvlan: Make the addrs_lock be per port
Make the addrs_lock be per port, not per ipvlan dev.

Initial code seems to be written in the assumption,
that any address change must occur under RTNL.
But it is not so for the case of IPv6. So

1) Introduce per-port addrs_lock.

2) It was needed to fix places where it was forgotten
to take lock (ipvlan_open/ipvlan_close)

This appears to be a very minor problem though.
Since it's highly unlikely that ipvlan_add_addr() will
be called on 2 CPU simultaneously. But nevertheless,
this could cause:

1) False-negative of ipvlan_addr_busy(): one interface
iterated through all port->ipvlans + ipvlan->addrs
under some ipvlan spinlock, and another added IP
under its own lock. Though this is only possible
for IPv6, since looks like only ipvlan_addr6_event() can be
called without rtnl_lock.

2) Race since ipvlan_ht_addr_add(port) is called under
different ipvlan->addrs_lock locks

This should not affect performance, since add/remove IP
is a rare situation and spinlock is not taken on fast
paths.

Fixes: 8230819494 ("ipvlan: use per device spinlock to protect addrs list updates")
Signed-off-by: Dmitry Skorodumov <skorodumov.dmitry@huawei.com>
Reviewed-by: Paolo Abeni <pabeni@redhat.com>
Link: https://patch.msgid.link/20260112142417.4039566-2-skorodumov.dmitry@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-19 10:03:30 -08:00
Eric Dumazet
7a29f6bf60 l2tp: avoid one data-race in l2tp_tunnel_del_work()
We should read sk->sk_socket only when dealing with kernel sockets.

syzbot reported the following data-race:

BUG: KCSAN: data-race in l2tp_tunnel_del_work / sk_common_release

write to 0xffff88811c182b20 of 8 bytes by task 5365 on cpu 0:
  sk_set_socket include/net/sock.h:2092 [inline]
  sock_orphan include/net/sock.h:2118 [inline]
  sk_common_release+0xae/0x230 net/core/sock.c:4003
  udp_lib_close+0x15/0x20 include/net/udp.h:325
  inet_release+0xce/0xf0 net/ipv4/af_inet.c:437
  __sock_release net/socket.c:662 [inline]
  sock_close+0x6b/0x150 net/socket.c:1455
  __fput+0x29b/0x650 fs/file_table.c:468
  ____fput+0x1c/0x30 fs/file_table.c:496
  task_work_run+0x131/0x1a0 kernel/task_work.c:233
  resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
  __exit_to_user_mode_loop kernel/entry/common.c:44 [inline]
  exit_to_user_mode_loop+0x1fe/0x740 kernel/entry/common.c:75
  __exit_to_user_mode_prepare include/linux/irq-entry-common.h:226 [inline]
  syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:256 [inline]
  syscall_exit_to_user_mode_work include/linux/entry-common.h:159 [inline]
  syscall_exit_to_user_mode include/linux/entry-common.h:194 [inline]
  do_syscall_64+0x1e1/0x2b0 arch/x86/entry/syscall_64.c:100
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

read to 0xffff88811c182b20 of 8 bytes by task 827 on cpu 1:
  l2tp_tunnel_del_work+0x2f/0x1a0 net/l2tp/l2tp_core.c:1418
  process_one_work kernel/workqueue.c:3257 [inline]
  process_scheduled_works+0x4ce/0x9d0 kernel/workqueue.c:3340
  worker_thread+0x582/0x770 kernel/workqueue.c:3421
  kthread+0x489/0x510 kernel/kthread.c:463
  ret_from_fork+0x149/0x290 arch/x86/kernel/process.c:158
  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246

value changed: 0xffff88811b818000 -> 0x0000000000000000

Fixes: d00fa9adc5 ("l2tp: fix races with tunnel socket close")
Reported-by: syzbot+7312e82745f7fa2526db@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/6968b029.050a0220.58bed.0016.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: James Chapman <jchapman@katalix.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Link: https://patch.msgid.link/20260115092139.3066180-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-19 09:55:41 -08:00
Thadeu Lima de Souza Cascardo
f40ddcc0c0 Revert "nfc/nci: Add the inconsistency check between the input data length and count"
This reverts commit 068648aab7.

NFC packets may have NUL-bytes. Checking for string length is not a correct
assumption here. As long as there is a check for the length copied from
copy_from_user, all should be fine.

The fix only prevented the syzbot reproducer from triggering the bug
because the packet is not enqueued anymore and the code that triggers the
bug is not exercised.

The fix even broke
testing/selftests/nci/nci_dev, making all tests there fail. After the
revert, 6 out of 8 tests pass.

Fixes: 068648aab7 ("nfc/nci: Add the inconsistency check between the input data length and count")
Cc: stable@vger.kernel.org
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
Link: https://patch.msgid.link/20260113202458.449455-1-cascardo@igalia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-17 18:02:50 -08:00
Hamza Mahfooz
a92a6c50e3 net: sfp: add potron quirk to the H-COM SPP425H-GAB4 SFP+ Stick
This is another one of those XGSPON ONU sticks that's using the
X-ONU-SFPP internally, thus it also requires the potron quirk to avoid tx
faults. So, add an entry for it in sfp_quirks[].

Cc: stable@vger.kernel.org
Signed-off-by: Hamza Mahfooz <someguy@effective-light.com>
Link: https://patch.msgid.link/20260113232957.609642-1-someguy@effective-light.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-17 16:45:40 -08:00
David Yang
b47adaab8b veth: fix data race in veth_get_ethtool_stats
In veth_get_ethtool_stats(), some statistics protected by
u64_stats_sync, are read and accumulated in ignorance of possible
u64_stats_fetch_retry() events. These statistics, peer_tq_xdp_xmit and
peer_tq_xdp_xmit_err, are already accumulated by veth_xdp_xmit(). Fix
this by reading them into a temporary buffer first.

Fixes: 5fe6e56776 ("veth: rely on peer veth_rq for ndo_xdp_xmit accounting")
Signed-off-by: David Yang <mmyangfl@gmail.com>
Link: https://patch.msgid.link/20260114122450.227982-1-mmyangfl@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-17 16:22:18 -08:00
Jakub Kicinski
6a5e5a3da3 Merge branch 'fou-gue-fix-skb-memleak-with-inner-protocol-0'
Kuniyuki Iwashima says:

====================
fou/gue: Fix skb memleak with inner protocol 0.

syzbot reported memleak for a GUE packet with its inner
protocol number 0.

Patch 1 fixes the issue, and patch 3 fixes the same issue
in FOU.
====================

Link: https://patch.msgid.link/20260115172533.693652-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-17 16:00:28 -08:00
Kuniyuki Iwashima
7a9bc9e3f4 fou: Don't allow 0 for FOU_ATTR_IPPROTO.
fou_udp_recv() has the same problem mentioned in the previous
patch.

If FOU_ATTR_IPPROTO is set to 0, skb is not freed by
fou_udp_recv() nor "resubmit"-ted in ip_protocol_deliver_rcu().

Let's forbid 0 for FOU_ATTR_IPPROTO.

Fixes: 23461551c0 ("fou: Support for foo-over-udp RX path")
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260115172533.693652-4-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-17 16:00:24 -08:00
Kuniyuki Iwashima
68578370f9 tools: ynl: Specify --no-line-number in ynl-regen.sh.
If grep.lineNumber is enabled in .gitconfig,

  [grep]
  lineNumber = true

ynl-regen.sh fails with the following error:

  $ ./tools/net/ynl/ynl-regen.sh -f
  ...
  ynl_gen_c.py: error: argument --mode: invalid choice: '4:' (choose from user, kernel, uapi)
  	GEN 4:	net/ipv4/fou_nl.c

Let's specify --no-line-number explicitly.

Fixes: be5bea1cc0 ("net: add basic C code generators for Netlink")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260115172533.693652-3-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-17 16:00:23 -08:00
Kuniyuki Iwashima
9a56796ad2 gue: Fix skb memleak with inner IP protocol 0.
syzbot reported skb memleak below. [0]

The repro generated a GUE packet with its inner protocol 0.

gue_udp_recv() returns -guehdr->proto_ctype for "resubmit"
in ip_protocol_deliver_rcu(), but this only works with
non-zero protocol number.

Let's drop such packets.

Note that 0 is a valid number (IPv6 Hop-by-Hop Option).

I think it is not practical to encap HOPOPT in GUE, so once
someone starts to complain, we could pass down a resubmit
flag pointer to distinguish two zeros from the upper layer:

  * no error
  * resubmit HOPOPT

[0]
BUG: memory leak
unreferenced object 0xffff888109695a00 (size 240):
  comm "syz.0.17", pid 6088, jiffies 4294943096
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 40 c2 10 81 88 ff ff 00 00 00 00 00 00 00 00  .@..............
  backtrace (crc a84b336f):
    kmemleak_alloc_recursive include/linux/kmemleak.h:44 [inline]
    slab_post_alloc_hook mm/slub.c:4958 [inline]
    slab_alloc_node mm/slub.c:5263 [inline]
    kmem_cache_alloc_noprof+0x3b4/0x590 mm/slub.c:5270
    __build_skb+0x23/0x60 net/core/skbuff.c:474
    build_skb+0x20/0x190 net/core/skbuff.c:490
    __tun_build_skb drivers/net/tun.c:1541 [inline]
    tun_build_skb+0x4a1/0xa40 drivers/net/tun.c:1636
    tun_get_user+0xc12/0x2030 drivers/net/tun.c:1770
    tun_chr_write_iter+0x71/0x120 drivers/net/tun.c:1999
    new_sync_write fs/read_write.c:593 [inline]
    vfs_write+0x45d/0x710 fs/read_write.c:686
    ksys_write+0xa7/0x170 fs/read_write.c:738
    do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
    do_syscall_64+0xa4/0xf80 arch/x86/entry/syscall_64.c:94
    entry_SYSCALL_64_after_hwframe+0x77/0x7f

Fixes: 37dd024779 ("gue: Receive side for Generic UDP Encapsulation")
Reported-by: syzbot+4d8c7d16b0e95c0d0f0d@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/6965534b.050a0220.38aacd.0001.GAE@google.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260115172533.693652-2-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-17 16:00:23 -08:00
Simon Horman
ff77379468 docs: netdev: refine 15-patch limit
The 15 patch limit is intended by the maintainers to cover
all outstanding patches on the mailing list on a per-tree basis.
Not just those in a single patchset. Document this practice accordingly.

Signed-off-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260115-15-minutes-of-fame-v2-1-70cbf0883aff@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-17 15:59:25 -08:00
Raju Rangoju
c158f985cf amd-xgbe: avoid misleading per-packet error log
On the receive path, packet can be damaged because of buffer
overflow in Rx FIFO. Avoid misleading per-packet error log when
packet->errors is set, this can flood the log. Instead, rely on the
standard rtnl_link_stats64 stats.

Fixes: c5aa9e3b81 ("amd-xgbe: Initial AMD 10GbE platform driver")
Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>
Link: https://patch.msgid.link/20260114163037.2062606-1-Raju.Rangoju@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-17 15:42:48 -08:00
Thomas Fourier
d998b0e5af octeontx2: Fix otx2_dma_map_page() error return code
0 is a valid DMA address [1] so using it as the error value can lead to
errors.  The error value of dma_map_XXX() functions is DMA_MAPPING_ERROR
which is ~0.  The callers of otx2_dma_map_page() use dma_mapping_error()
to test the return value of otx2_dma_map_page(). This means that they
would not detect an error in otx2_dma_map_page().

Make otx2_dma_map_page() return the raw value of dma_map_page_attrs().

[1] https://lore.kernel.org/all/f977f68b-cec5-4ab7-b4bd-2cf6aca46267@intel.com

Fixes: caa2da34fd ("octeontx2-pf: Initialize and config queues")
Cc: <stable@vger.kernel.org>
Signed-off-by: Thomas Fourier <fourier.thomas@gmail.com>
Link: https://patch.msgid.link/20260114123107.42387-2-fourier.thomas@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-17 15:40:06 -08:00