author Stefano Brivio <sbrivio@redhat.com> 2024-03-15 12:07:52 +0100
committer Stefano Brivio <sbrivio@redhat.com> 2024-03-18 08:56:32 +0100
commit f00b153414b1e57e41cfb49cf0ac15c747f6c910
tree 3702f8e95d55fe4fd4864b01486189f3e142674c /netlink.c
parent d3eb0d7b59f6a1f3e78efc04c44e1c700b907332
netlink: Don't try to get further datagrams in nl_route_dup() on NLMSG_DONE
Martin reports that, with Fedora Linux kernel version kernel-core-6.9.0-0.rc0.20240313gitb0546776ad3f.4.fc41.x86_64, including commit 87d381973e49 ("genetlink: fit NLMSG_DONE into same read() as families"), pasta doesn't exit once the network namespace is gone.

Actually, pasta is completely non-functional, at least with default options, because nl_route_dup(), which duplicates routes from the parent namespace into the target namespace at start-up, is stuck on a second receive operation for RTM_GETROUTE.

However, with that commit, the kernel is now able to fit the whole response, including the NLMSG_DONE message, into a single datagram, so no further messages will be received.

It turns out that commit 4d6e9d0816e2 ("netlink: Always process all responses to a netlink request") accidentally relied on the fact that we would always get at least two datagrams as a response to RTM_GETROUTE.

That is, the test to check if we expect another datagram, is based on the 'status' variable, which is 0 if we just parsed NLMSG_DONE, but we'll also expect another datagram if NLMSG_OK on the last message is false. But NLMSG_OK with a zero length is always false.

The problem is that we don't distinguish if status is zero because we got a NLMSG_DONE message, or because we processed all the available datagram bytes.

Introduce an explicit check on NLMSG_DONE. We should probably refactor this slightly, for example by introducing a special return code from nl_status(), but this is probably the least invasive fix for the issue at hand.

Reported-by: Martin Pitt <mpitt@redhat.com>
Link: https://github.com/containers/podman/issues/22052
Fixes: 4d6e9d0816e2 ("netlink: Always process all responses to a netlink request")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Tested-by: Paul Holzinger <pholzing@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Diffstat (limited to 'netlink.c')
-rw-r--r-- netlink.c | 3
1 files changed, 2 insertions, 1 deletions
diff --git a/netlink.c b/netlink.c
index 9e7cccb..20de9b3 100644
--- a/netlink.c
+++ b/netlink.c
@@ -525,7 +525,8 @@ int nl_route_dup(int s_src, unsigned int ifi_src,
 		}
 	}
 
-	if (!NLMSG_OK(nh, status) || status > 0) {
+	if (nh->nlmsg_type != NLMSG_DONE &&
+	    (!NLMSG_OK(nh, status) || status > 0)) {
 		/* Process any remaining datagrams in a different
 		 * buffer so we don't overwrite the first one.
 		 */
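
The ambiguity described in the commit message can be reproduced outside passt with a small model. The sketch below is only an illustration, not code from netlink.c: the helpers more_old(), more_new() and model_first_datagram() are hypothetical names, the messages carry no payload, and 'status' is simplified to "0 once NLMSG_DONE was parsed, bytes received otherwise". It relies only on the NLMSG_OK(), NLMSG_NEXT() and NLMSG_HDRLEN macros from <linux/netlink.h>.

```c
#include <stdbool.h>
#include <string.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Old form of the test, wrapped in a hypothetical helper */
static bool more_old(const struct nlmsghdr *nh, int status)
{
	return !NLMSG_OK(nh, status) || status > 0;
}

/* New form of the test, with the explicit NLMSG_DONE check */
static bool more_new(const struct nlmsghdr *nh, int status)
{
	return nh->nlmsg_type != NLMSG_DONE &&
	       (!NLMSG_OK(nh, status) || status > 0);
}

/* Simplified model of parsing the first datagram: one payload-less
 * RTM_NEWROUTE header, optionally followed by NLMSG_DONE in the same
 * datagram. Reports whether each form of the test would go on to wait
 * for another datagram.
 */
void model_first_datagram(bool done_in_same_datagram,
			  bool *old_waits, bool *new_waits)
{
	struct nlmsghdr msgs[3];	/* spare zeroed slot past the data */
	struct nlmsghdr *nh = msgs;
	int len, left, status;

	memset(msgs, 0, sizeof(msgs));
	msgs[0].nlmsg_len = NLMSG_HDRLEN;
	msgs[0].nlmsg_type = RTM_NEWROUTE;
	len = NLMSG_HDRLEN;

	if (done_in_same_datagram) {
		msgs[1].nlmsg_len = NLMSG_HDRLEN;
		msgs[1].nlmsg_type = NLMSG_DONE;
		len += NLMSG_HDRLEN;
	}

	/* Assumption: status is 0 after NLMSG_DONE, bytes received otherwise */
	status = left = len;
	for (; NLMSG_OK(nh, left); nh = NLMSG_NEXT(nh, left)) {
		if (nh->nlmsg_type == NLMSG_DONE) {
			status = 0;
			break;
		}
	}

	*old_waits = more_old(nh, status);
	*new_waits = more_new(nh, status);
}
```

Calling model_first_datagram(true, ...) models a kernel that fits NLMSG_DONE into the first datagram: the old test still asks for another datagram, because NLMSG_OK() with a zero length is false, while the new test does not. With false, the datagram ends without NLMSG_DONE and both forms of the test agree that more data is expected.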