passt/test, branch 2025_01_21.4f2c8e7

test/pasta_podman: Run Podman tests on a single CPU thread

2025-01-15T21:16:32+00:00

Increasingly often, I'm getting occasional failures of the same type
as https://github.com/containers/podman/issues/24147. I guess it
mostly depends on the system load.

It will be a while until I'll actually run tests on a kernel
including my fix for it, kernel commit a502ea6fa94b ("udp: Deal with
race between UDP socket address change and rehash"), so add a horrible
workaround using taskset(1), for the moment.

Signed-off-by: Stefano Brivio

perf/passt_vu_tcp: Make it shine

2024-11-28T14:06:44+00:00

Signed-off-by: Stefano Brivio

test: Add tests for passt in vhost-user mode

2024-11-27T15:49:21+00:00

Run functional and performance tests for vhost-user mode as well. For
functional tests, we add passt_vu and passt_vu_in_ns as symbolic links
to their non-vhost-user counterparts, as no differences are intended
but we want to distinguish them in test logs.

For performance tests, instead, we add separate perf/passt_vu_tcp and
perf/passt_vu_udp files, as we need longer test duration, as well as
higher UDP sending bandwidths and larger TCP windows, to actually get
the highest throughput vhost-user mode offers.

For valgrind tests, vhost-user mode needs two extra system calls:
statx and readlink. Add them as EXTRA_SYSCALLS for the valgrind
target.

Signed-off-by: Stefano Brivio 
Signed-off-by: Laurent Vivier 
Reviewed-by: David Gibson

test: Improve logic for waiting for SLAAC & DAD to complete in NDP tests

2024-11-26T07:30:18+00:00

Since 9a0e544f05bf the NDP tests attempt to explicitly wait for DAD to
complete, rather than just having a hard coded sleep.  However, the
conditions we use are a bit sloppy and allow for a number of possible cases
where it might not work correctly.  Stefano seems to be hitting one of
these (though I'm not sure which) with some later patches.

 - We wait for *lack* of a tentative address, so if the first check occurs
   before we have even a tentative address it will bypass the delay
 - It's not entirely clear if the permanent address will always appear
   as soon as the tentative address disappears
 - We weren't filtering on interface
 - We were doing the filtering with ip-address options rather than in jq.
   However in at least in some circumstances this seems to result in an
   empty .addr_info field, rather than omitting it entirely, which could
   cause us to get the wrong result

So, instead, explicitly wait for the address we need to be present: an
RA provided address on the external interface.  While we're here we remove
the requirement that it have global scope: the "kernel_ra" check is already
sufficient to make sure this address comes from an NDP RA, not something
else.  If it's not the global scope address we expect, better to check it
and fail, rather than keep waiting.

Fixes: 9a0e544f05bf ("test: Improve test for NDP assigned prefix")
Signed-off-by: David Gibson 
Signed-off-by: Stefano Brivio

test/perf: Select a single IPv6 namespace address in pasta tests

2024-11-26T07:30:18+00:00

By dropping the filter on prefix length, commit 910f4f910301
("test: Don't require 64-bit prefixes in perf tests") broke tests on
setups where two global unicast IPv6 addresses are available, which
is the typical case when the "host" is a VM running under passt with
addresses from SLAAC and DHCPv6, because two addresses will be
returned.

Pick the first one instead. We don't really care about the prefix
length, any of these addresses will work.

Fixes: 910f4f910301 ("test: Don't require 64-bit prefixes in perf tests")
Link: https://archives.passt.top/passt-dev/20241119214344.6b4a5b3a@elisabeth/
Signed-off-by: Stefano Brivio 
Reviewed-by: David Gibson

test: Improve test for NDP assigned prefix

2024-11-07T11:47:37+00:00

In the NDP tests we search explicitly for a guest address with prefix
length 64.  AFAICT this is an attempt to specifically find the SLAAC
assigned address, rather than something assigned by other means.  We can do
that more explicitly by checking for .protocol == "kernel_ra". however.

The SLAAC prefixes we assigned *will* always be 64-bit, that's hard-coded
into our NDP implementation.  RFC4862 doesn't really allow anything else
since the interface identifiers for an Ethernet-like link are 64-bits.

Let's actually verify that, rather than just assuming it, by extracting the
prefix length assigned in the guest and checking it as well.

Signed-off-by: David Gibson 
Signed-off-by: Stefano Brivio

test: Don't require 64-bit prefixes in perf tests

2024-11-07T11:47:34+00:00

When determining the namespace's IPv6 address in the perf test setup, we
explicitly filter for addresses with a 64-bit prefix length.  There's no
real reason we need that - as long as it's a global address we can use it.
I suspect this was copied without thinking from a similar example in the
NDP tests, where the 64-bit prefix length _is_ meaningful (though it's not
entirely clear if the handling is correct there either).

Signed-off-by: David Gibson 
Signed-off-by: Stefano Brivio

test: Make nstool hold robust against interruptions to control clients

2024-11-07T11:47:30+00:00

Currently nstool die()s on essentially any error.  In most cases that's
fine for our purposes.  However, it's a problem when in "hold" mode and
getting an IO error on an accept()ed socket.  This could just indicate that
the control client aborted prematurely, in which case we don't want to
kill of the namespace we're holding.

Adjust these to print an error, close() the control client socket and
carry on.  In addition, we need to explicitly ignore SIGPIPE in order not
to be killed by an abruptly closed client connection.

Signed-off-by: David Gibson 
Signed-off-by: Stefano Brivio

test: Rename propagating signal handler

2024-11-07T11:47:27+00:00

nstool in "exec" mode will propagate some signals (specifically SIGTERM) to
the process in the namespace it executes.  The signal handler which
accomplishes this is called simply sig_handler().  However, it turns out
we're going to need some other signal handlers, so rename this to the more
specific sig_propagate().

Signed-off-by: David Gibson 
Signed-off-by: Stefano Brivio

test: Adjust misplaced sleeps in two_guests code

2024-11-05T22:46:38+00:00

Most of our transfer tests using socat use 'sleep' waaiting for the server
side to be ready before starting the client.  However in two_guests/basic
the sleep is in the wrong place: rather than being between starting the
server and starting the client, it's after waiting for the server to
complete.  This causes occasional hangs when the client runs before the
server is ready - in that case the receiving guest sends an RST, which we
don't (currently) propagate back to the sender.

Signed-off-by: David Gibson 
Signed-off-by: Stefano Brivio