passt/udp_vu.c, branch 2025_06_11.0293c6f

udp: Merge vhost-user and "buf" listening socket paths

2025-04-07T19:43:52+00:00

udp_buf_listen_sock_data() and udp_vu_listen_sock_data() now have
effectively identical structure.  The forwarding functions used for flow
specific sockets (udp_buf_sock_to_tap(), udp_vu_sock_to_tap() and
udp_sock_to_sock()) also now take a number of datagrams.  This means we
can re-use them for the listening socket path, just passing '1' so they
handle a single datagram at a time.

This allows us to merge both the vhost-user and "buf" listening socket
paths into a single, simpler udp_listen_sock_data().

Signed-off-by: David Gibson 
Signed-off-by: Stefano Brivio

udp: Split spliced forwarding path from udp_buf_reply_sock_data()

2025-04-07T19:41:32+00:00

udp_buf_reply_sock_data() can handle forwarding data either from socket
to socket ("splicing") or from socket to tap.  It has a test on each
datagram for which case we're in, but that will be the same for everything
in the batch.

Split out the spliced path into a separate udp_sock_to_sock() function.
This leaves udp_{buf,vu}_reply_sock_data() handling only forwards from
socket to tap, so rename and simplify them accordingly.

This makes the code slightly longer for now, but will allow future cleanups
to shrink it back down again.

Signed-off-by: David Gibson 
[sbrivio: Fix typos in comments to udp_sock_recv() and
 udp_vu_listen_sock_data()]
Signed-off-by: Stefano Brivio

udp: Parameterize number of datagrams handled by udp_*_reply_sock_data()

2025-04-07T19:31:54+00:00

Both udp_buf_reply_sock_data() and udp_vu_reply_sock_data() internally
decide what the maximum number of datagrams they will forward is.  We have
some upcoming reasons to allow the caller to decide that instead, so make
the maximum number of datagrams a parameter for both of them.
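
As a rough illustration of a caller-supplied datagram limit, here is a
minimal sketch.  It is not the passt code: demo_recv_n(), struct demo_batch
and the DEMO_* constants are hypothetical names, and a real implementation
might batch with recvmmsg() rather than loop over recv().

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>

#define DEMO_MAX_DGRAMS	8	/* Illustrative batch capacity */
#define DEMO_DGRAM_SIZE	64	/* Illustrative per-datagram buffer size */

/* Hypothetical receive buffers for one batch of datagrams */
struct demo_batch {
	char data[DEMO_MAX_DGRAMS][DEMO_DGRAM_SIZE];
	ssize_t len[DEMO_MAX_DGRAMS];
};

/* Receive up to @n datagrams from socket @s into @b without blocking.
 * The caller, not this function, decides the limit; it is only clamped
 * to the capacity of the batch.  Returns the number of datagrams read.
 */
static int demo_recv_n(int s, struct demo_batch *b, int n)
{
	int i;

	assert(b);

	if (n > DEMO_MAX_DGRAMS)
		n = DEMO_MAX_DGRAMS;

	for (i = 0; i < n; i++) {
		b->len[i] = recv(s, b->data[i], DEMO_DGRAM_SIZE, MSG_DONTWAIT);
		if (b->len[i] < 0)
			break;	/* Nothing more queued, or an error */
	}

	return i;
}
```

A caller that wants one datagram at a time passes 1, while a caller that
wants to drain a batch passes a larger limit.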

Signed-off-by: David Gibson 
Signed-off-by: Stefano Brivio

udp: Polish udp_vu_sock_info() and remove from vu specific code

2025-04-07T19:29:23+00:00

udp_vu_sock_info() uses MSG_PEEK to look ahead at the next datagram to be
received and gets its source address.  Currently we only use it in the
vhost-user path, but there's nothing inherently vhost-user specific about
it.  We have upcoming uses for it elsewhere so rename and move to udp.c.

While we're there, polish its error reporting a little.
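
The MSG_PEEK technique described above can be sketched as follows.  This is
a simplified illustration, not the function from the patch: demo_peek_src()
is a hypothetical name, and it uses a generic struct sockaddr_storage as an
assumption about the address type.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>

/* Peek at the next datagram queued on socket @s and store its source
 * address in @src, leaving the datagram in the receive queue.
 * No iovec is supplied because only the address is of interest.
 * Returns 0 on success, -1 on failure (errno set by recvmsg()).
 */
static int demo_peek_src(int s, struct sockaddr_storage *src)
{
	struct msghdr msg;

	assert(src);

	memset(&msg, 0, sizeof(msg));
	memset(src, 0, sizeof(*src));
	msg.msg_name = src;
	msg.msg_namelen = sizeof(*src);

	if (recvmsg(s, &msg, MSG_PEEK | MSG_DONTWAIT) < 0)
		return -1;

	return 0;
}
```

Because of MSG_PEEK, a later recv() or recvmsg() on the same socket still
returns the same datagram, so the peek can be used to pick a flow before
the data is actually read.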

Signed-off-by: David Gibson 
[sbrivio: Drop excess newline before udp_sock_recv()]
Signed-off-by: Stefano Brivio

udp: Simplify updates to UDP flow timestamp

2025-04-02T09:30:26+00:00

Since UDP has no built-in knowledge of connections, the only way we
know when we're done with a UDP flow is a timeout with no activity.
To keep track of this, struct udp_flow includes a timestamp to record
the last time we saw traffic on the flow.

For data from listening sockets and from tap, this is done implicitly via
udp_flow_from_{sock,tap}() but for reply sockets it's done explicitly.
However, that logic is duplicated between the vhost-user and "buf" paths.
Make it common in udp_reply_sock_handler() instead.

Technically this is a behavioural change: previously, if we got an EPOLLIN
event but there wasn't actually any data, we wouldn't update the timestamp;
now we will.  This should be harmless: if there's an EPOLLIN we expect
there to be data, and even if there isn't, the worst we can do is mildly
delay the cleanup of a stale flow.
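
The activity timestamp scheme can be sketched like this.  The names
(struct demo_flow, demo_flow_touch(), demo_flow_expired()) and the timeout
value are hypothetical and stand in for the real struct udp_flow logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

#define DEMO_FLOW_TIMEOUT	180	/* Seconds; illustrative value only */

/* Hypothetical, stripped-down stand-in for struct udp_flow */
struct demo_flow {
	time_t ts;	/* Last time we saw traffic on this flow */
};

/* Record activity on the flow.  In the scheme described above, the common
 * reply socket handler would call this once per EPOLLIN event, whether or
 * not data turns out to be available.
 */
static void demo_flow_touch(struct demo_flow *f, const struct timespec *now)
{
	assert(f && now);
	f->ts = now->tv_sec;
}

/* Return true if the flow has been idle for longer than the timeout */
static bool demo_flow_expired(const struct demo_flow *f,
			      const struct timespec *now)
{
	assert(f && now);
	return now->tv_sec - f->ts > DEMO_FLOW_TIMEOUT;
}
```

A spurious touch only pushes the expiry of an idle flow back by at most one
timeout period, which is why the extra update is harmless.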

Signed-off-by: David Gibson 
Signed-off-by: Stefano Brivio

udp: Share more logic between vu and non-vu reply socket paths

2025-03-26T20:34:28+00:00

Share some additional miscellaneous logic between the vhost-user and "buf"
paths for data on udp reply sockets.  The biggest piece is error handling
of cases where we can't forward between the two pifs of the flow.  We also
make common some more simple logic locating the correct flow and its
parameters.

This adds some lines of code due to extra comment lines, but nonetheless
reduces logic duplication.

Signed-off-by: David Gibson 
Signed-off-by: Stefano Brivio

udp_vu: Factor things out of udp_vu_reply_sock_data() loop

2025-03-26T20:34:26+00:00

At the start of every cycle of the loop in udp_vu_reply_sock_data() we:
 - ASSERT that uflow is not NULL
 - Check if the target pif is PIF_TAP
 - Initialize the v6 boolean

However, all of these depend only on the flow, which doesn't change across
the loop.  This is probably a duplication from udp_vu_listen_sock_data(),
where the flow can be different for each packet.  For the reply socket
case, however, factor that logic out of the loop.

Signed-off-by: David Gibson 
Signed-off-by: Stefano Brivio

udp: Simplify checking of epoll event bits

2025-03-26T20:34:23+00:00

udp_{listen,reply}_sock_handler() can accept both EPOLLERR and EPOLLIN
events.  However, unlike most epoll event handlers we don't check the
event bits right there.  EPOLLERR is checked within udp_sock_errs() which
we call unconditionally.  Checking EPOLLIN is still more buried: it is
checked within both udp_sock_recv() and udp_vu_sock_recv().

We can simplify the logic and pass fewer extraneous parameters around by
moving the checking of the event bits to the top level event handlers.

This makes udp_{buf,vu}_{listen,reply}_sock_handler() no longer general
event handlers, but specific to EPOLLIN events, meaning new data.  So,
rename those functions to udp_{buf,vu}_{listen,reply}_sock_data() to better
reflect their function.
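
The resulting handler shape might look roughly like the sketch below.  It
is an illustration under assumptions: demo_sock_handler(), demo_sock_errs(),
demo_sock_data() and struct demo_stats are hypothetical stand-ins, with
counters in place of real error draining and data forwarding.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>

/* Hypothetical bookkeeping, so the effect of each event is observable */
struct demo_stats {
	int errs_calls;	/* Times the error path ran */
	int data_calls;	/* Times the data path ran */
};

/* Stand-in for draining the socket error queue; returns < 0 if fatal */
static int demo_sock_errs(struct demo_stats *st)
{
	st->errs_calls++;
	return 0;
}

/* Stand-in for an EPOLLIN-only "data" function: it no longer needs to
 * see the event mask at all.
 */
static void demo_sock_data(struct demo_stats *st)
{
	st->data_calls++;
}

/* Top-level handler: the only place that inspects the event bits */
static void demo_sock_handler(struct demo_stats *st, uint32_t events)
{
	assert(st);

	if (events & EPOLLERR) {
		if (demo_sock_errs(st) < 0)
			return;	/* Unrecoverable: skip the data path */
	}

	if (events & EPOLLIN)
		demo_sock_data(st);
}
```

With the checks done once at the top, the inner functions can drop the
events parameter entirely.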

Signed-off-by: David Gibson 
Signed-off-by: Stefano Brivio

udp: Common invocation of udp_sock_errs() for vhost-user and "buf" paths

2025-03-26T20:34:11+00:00

The vhost-user and non-vhost-user paths for both udp_listen_sock_handler()
and udp_reply_sock_handler() are more or less completely separate.  Both,
however, start with essentially the same invocation of udp_sock_errs(), so
that can be made common.

Signed-off-by: David Gibson 
Signed-off-by: Stefano Brivio

udp: create and send ICMPv4 to local peer when applicable

2025-03-07T01:21:19+00:00

When a local peer sends a UDP message to a non-existing port on an
existing remote host, that host will return an ICMP message containing
the error code ICMP_PORT_UNREACH, plus the header and the first eight
bytes of the original message. If the sender socket has been connected,
it uses this message to issue a "Connection Refused" event to the user.

Until now, we have only read such events from the externally facing
socket, but we don't forward them back to the local sender because
we cannot read the ICMP message directly to user space. Because of
this, the local peer will hang and wait for a response that never
arrives.

We now fix this for IPv4 by recreating and forwarding a correct ICMP
message back to the internal sender. We synthesize the message based
on the information in the extended error structure, plus the returned
part of the original message body.

Note that for the sake of completeness, we even produce ICMP messages
for other error codes. We have noticed that at least ICMP_PROT_UNREACH
is propagated as an error event back to the user.

Reviewed-by: David Gibson 
Signed-off-by: Jon Maloy 
[sbrivio: fix cppcheck warning: udp_send_conn_fail_icmp4() doesn't
 modify 'in', it can be declared as const]
Signed-off-by: Stefano Brivio