passt, branch 2024_12_11.09478d5

treewide: Dodge dynamic memory allocation in strerror() from glibc > 2.40

2024-12-11T11:21:23+00:00

With glibc commit 25a5eb4010df ("string: strerror, strsignal cannot
use buffer after dlmopen (bug 32026)"), strerror() now needs, at least
on x86, the getrandom() and brk() system calls, in order to fill in
the locale-translated error message. But getrandom() and brk() are not
allowed by our seccomp profiles.

This became visible on Fedora Rawhide with the "podman login and
logout" Podman tests, defined at test/e2e/login_logout_test.go in the
Podman source tree, where pasta would terminate upon printing error
descriptions (at least the ones related to the SO_ERROR queue for
spliced connections).

Avoid dynamic memory allocation by calling strerrordesc_np() instead,
which is a GNU function returning a static, untranslated version of
the error description. If it's not available, keep calling strerror(),
which at that point should be simple enough as to be usable (at least,
that's currently the case for musl).

Reported-by: Paul Holzinger 
Link: https://github.com/containers/podman/issues/24804
Analysed-by: Paul Holzinger 
Signed-off-by: Stefano Brivio 
Reviewed-by: David Gibson 
Tested-by: Paul Holzinger

pasta: make it possible to disable socket splicing

2024-12-11T00:47:37+00:00

During testing it is sometimes useful to force traffic which would
normally be forwared by socket splicing through the tap interface.

In this commit, we add a command switch enabling such funtionality
for inbound local traffic.

For outbound local traffic this is much trickier, if even possible,
so leave that for a later commit.

Suggested-by: David Gibson 
Signed-off-by: Jon Maloy 
Reviewed-by: David Gibson 
Signed-off-by: Stefano Brivio

tap: Call vu_init() with --fd

2024-12-10T11:26:56+00:00

We need to initialize vhost-user structures with --fd too.

Signed-off-by: Laurent Vivier 
Reviewed-by: David Gibson 
Signed-off-by: Stefano Brivio

tap: Use a common function to start a new connection

2024-12-10T11:26:34+00:00

Merge code from tap_backend_init(), tap_sock_tun_init() and
tap_listen_handler() to set epoll_ref entry and to add it
to epollfd.

No functionality change

Signed-off-by: Laurent Vivier 
Reviewed-by: David Gibson 
Signed-off-by: Stefano Brivio

udp_vu: update segment size

2024-12-05T20:08:58+00:00

In udp_vu_sock_recv(), collect a segment with a size defined to
IP_MAX_MTU + ETH_HLEN + sizeof(struct virtio_net_hdr_mrg_rxbuf)

The original version double counted the IP header: IP_MAX_MTU includes
the IP header, and so did hdrlen.

Signed-off-by: Laurent Vivier 
Signed-off-by: Stefano Brivio

flow: Remove over-zealous sanity checks in flow_sidx_hash()

2024-12-05T20:08:58+00:00

In flow_sidx_hash() we verify that the flow we're hashing doesn't have an
unspecified endpoint address, or zero for either port.  The hash table only
works if we're looking for exact matches of address and port, and this is
attempting to catch any cases where we might have left address or port
unpopulated or filled with a wildcard.

This doesn't really work though, because there are cases where unspecified
addresses or zero ports are correct:
 * We already use unspecified addresses for our address in cases where we
   don't know the specific local address for that side, and exclude the
   obvious extra check on side->oaddr for that reason.
 * Zero port numbers aren't strictly forbidden over the wire.  We forbid
   them for TCP & UDP because they can't safely be handled on the socket
   side.  However for ICMP a zero id, which goes in the port field is
   valid.
 * Possible future flow types (for example, for multicast protocols) might
   legitimately have an unspecified address.

Although it makes them easier to miss, these sorts of sanity checks really
have to be done at the protocol / flow type layer, and we already do so.
Remove the checks in flow_sidx_hash() other than checking that the pif
is specified.

Reported-by: Stefan 
Link: https://bugs.passt.top/show_bug.cgi?id=105
Signed-off-by: David Gibson 
Signed-off-by: Stefano Brivio

udp: Improve detail of UDP endpoint sanity checking

2024-12-05T20:08:58+00:00

In udp_flow_new() we reject a flow if the endpoint isn't unicast, or it has
a zero endpoint port.  Those conditions aren't strictly illegal, but we
can't safely handle them at present:
 * Multicast UDP endpoints are certainly possible, but our current flow
   tracking only makes sense for simple unicast flows - we'll need
   different handling if we want to handle multicast flows in future
 * It's not entirely clear if port 0 is RFC-ishly correct, but for socket
   interfaces port 0 sometimes has a special meaning such as "pick the port
   for me, kernel".  That makes flows on port 0 unsafe to forward in the
   usual way.

For the same reason we also can't safely handle port 0 as our port.  In
principle that's also true for our address, however in the case of flows
initiated from a socket, we may not know our address since the socket
could be bound to 0.0.0.0 or ::, so we can only verify that our address
is unicast for flows initiated from the tap side.

Refine the current check in udp_flow_new() to slightly more detailed checks
in udp_flow_from_sock() and udp_flow_from_tap() to make what is and isn't
handled clearer.  This makes this checking more similar to what we do for
TCP connections.

Signed-off-by: David Gibson 
Signed-off-by: Stefano Brivio

perf/passt_vu_tcp: Make it shine

2024-11-28T14:06:44+00:00

Signed-off-by: Stefano Brivio

tcp_vu: Compute IPv4 header checksum if dlen changes

2024-11-28T13:03:16+00:00

In tcp_vu_data_from_sock() we compute IPv4 header checksum only
for the first and the last packets, and re-use the first packet checksum
for all the other packets as the content of the header doesn't change.

It's more accurate to check the dlen value to know if the checksum
should change as dlen is the only information that can change in the
loop.

Signed-off-by: Laurent Vivier 
Signed-off-by: Stefano Brivio

Makefile: Use make internal string functions

2024-11-28T13:03:16+00:00

TARGET_ARCH is computed from '$(CC) -dumpmachine' using external
bash commands like echo, cut, tr and sed. This can be done using
make internal string functions.

Signed-off-by: Laurent Vivier 
Reviewed-by: David Gibson 
Signed-off-by: Stefano Brivio