passt/test, branch 2024_03_20.71dd405

test: Fix passt.mbuto for cases where /usr/sbin doesn't exist

2024-01-16T20:48:31+00:00

f0ccca74 ("test: make passt.mbuto script more robust") is supposed to make
mbuto more robust by standardizing on always putting things in /usr/sbin
with /sbin a symlink to it.  This matters because different distros have
different conventions about how the two are used.

However, the logic there requires that /usr/sbin at least exists to start
with.  This isn't always the case with Fedora derived mbuto images.
Ironically the DIRS variable ensures that /sbin exists, although we then
remove it, but doesn't require /usr/sbin to exist.  Fix that up so that
the new logic will work with Fedora.

Fixes: f0ccca741f64 ("test: make passt.mbuto script more robust")
Signed-off-by: David Gibson 
Signed-off-by: Stefano Brivio

test: make passt.mbuto script more robust

2023-12-27T18:33:31+00:00

Creation of a symbolic link from /sbin to /usr/sbin fails if /sbin
exists and is non-empty. This is the case on Ubuntu-23.04.

We fix this by removing /sbin before creating the link.

Signed-off-by: Jon Maloy 
Reviewed-by: David Gibson 
Signed-off-by: Stefano Brivio

test: Select first reported IPv6 address for guest/host comparison

2023-12-27T18:28:35+00:00

If we run passt nested (a guest connected via passt to a guest
connected via passt to the host), the first guest (L1) typically has
two IPv6 addresses on the same interface: one formed from the prefix
assigned via SLAAC, and another one assigned via DHCPv6 (to match the
address on the host).

When we select addresses for comparison, in this case, we have
multiple global unicast addresses -- again, on the same interface.
Selecting the first reported one on both host and guest is not
entirely correct (in theory, the order might differ), but works
reasonably well.

Use the trick from 5beef085978e ("test: Only select a single
interface or gateway in tests") to ask jq(1) for the first address
returned by the query.

Signed-off-by: Stefano Brivio

test: Make handling of shell prompts with escapes a little more reliable

2023-12-07T06:24:48+00:00

When using the old-style "pane" methods of executing commands during the
tests, we need to scan the shell output for prompts in order to tell when
commands have finished.  This is inherently unreliable because commands
could output things that look like prompts, and prompts might not look like
we expect them to.  The only way to really fix this is to use a better way
of dispatching commands, like the newer "context" system.

However, it's awkward to convert everything to "context" right at the
moment, so we're still relying on some tests that do work most of the time.
It is, however, particularly sensitive to fancy coloured prompts using
escape sequences.  Currently we try to handle this by stripping actual
ESC characters with tr, then looking for some common variants.

We can do a bit better: instead strip all escape sequences using sed before
looking for our prompt.  Or, at least, any one using [a-zA-Z] as the
terminating character. Strictly speaking ANSI escapes can be terminated by
any character in 0x40..0x7e, which isn't easily expressed in a regexp.
This should capture all common ones, though.

With this transformation we can simplify the list of patterns we then look
for as a prompt, removing some redundant variants.

Signed-off-by: David Gibson 
Signed-off-by: Stefano Brivio

test: Avoid hitting guestfish command length limits

2023-12-04T08:51:26+00:00

In test/prepare-distro-img.sh we use guestfish to tweak our distro guest
images to be suitable.  Part of this is using a 'copy-in' directive to copy
in the source files for passt itself.  Currently we copy in all the files
with a single 'copy-in', since it allows listing multiple files.  However
it turns out that the number of arguments it can accept is fairly limited
and our current list of files is already very close to that limit.

Instead, expand our list of files to one copy-in per file, avoiding that
limitation.  This isn't much slower, because all the commands still run in
a single invocation of guestfish itself.

Signed-off-by: David Gibson 
Signed-off-by: Stefano Brivio

valgrind: Adjust suppression for MSG_TRUNC with NULL buffer

2023-11-19T08:10:12+00:00

valgrind complains if we pass a NULL buffer to recv(), even if we use
MSG_TRUNC, in which case it's actually safe.  For a long time we've had
a valgrind suppression for this.  It singles out the recv() in
tcp_sock_consume(), the only place we use MSG_TRUNC.

However, tcp_sock_consume() only has a single caller, which makes it a
prime candidate for inlining.  If inlined, it won't appear on the stack and
valgrind won't match the correct suppression.

It appears that certain compiler versions (for example gcc-13.2.1 in
Fedora 39) will inline this function even with the -O0 we use for valgrind
builds.  This breaks the suppression leading to a spurious failure in the
tests.

There's not really any way to adjust the suppression itself without making
it overly broad (we don't want to match other recv() calls).  So, as a hack
explicitly prevent inlining of this function when we're making a valgrind
build.  To accomplish this add an explicit -DVALGRIND when making such a
build.

Signed-off-by: David Gibson 
Signed-off-by: Stefano Brivio

test/lib/perf_report: Fix up table highlight for pasta's local flows

2023-11-10T05:35:54+00:00

As commit 29269705239f ("test/perf: Small MTUs for spliced TCP
aren't interesting") drops all columns for TCP test MTUs except for
one, in throughput test for pasta's local flows, the first column we
need to highlight in that table is now the second one.

Signed-off-by: Stefano Brivio

test/perf: Simplify calculation of "omit" time for TCP throughput

2023-11-07T08:56:24+00:00

For the TCP throughput tests, we use iperf3's -O "omit" option which
ignores results for the given time at the beginning of the test.  Currently
we calculate this as 1/6th of the test measurement time.  The purpose of
-O, however, is to skip over the TCP slow start period, which in no way
depends on the overall length of the test.

The slow start time is roughly speaking
    log_2 ( max_window_size / MSS ) * round_trip_time
These factors all vary between tests and machines we're running on, but we
can estimate some reasonable bounds for them:
  * The maximum window size is bounded by the buffer sizes at each end,
    which shouldn't exceed 16MiB
  * The mss varies with the MTU we use, but the smallest we use in tests is
    ~256 bytes
  * Round trip time will vary with the system, but with these essentially
    local transfers it will typically be well under 1ms (on my laptop it is
    closer to 0.03ms)

That gives a worst case slow start time of about 16ms.  Setting an omit
time of 0.1s uniformly is therefore more than enough, and substantially
smaller than what we calculate now for the default case (10s / 6 ~= 1.7s).

This reduces total time for the standard benchmark run by around 30s.

Signed-off-by: David Gibson 
Signed-off-by: Stefano Brivio

test/perf: Remove unnecessary --pacing-timer options

2023-11-07T08:56:21+00:00

We always set --pacing-timer when invoking iperf3.  However, the iperf3
man page implies this is only relevant for the -b option.  We only use the
-b option for the UDP tests, not TCP, so remove --pacing-timer from the TCP
cases.

Signed-off-by: David Gibson 
Signed-off-by: Stefano Brivio

test/perf: "MTU" changes in passt_tcp host to guest aren't useful

2023-11-07T08:56:18+00:00

The TCP packet size used on the passt L2 link (qemu socket) makes a huge
difference to passt/pasta throughput; many of passt's overheads (chiefly
syscalls) are per-packet.

That packet size is largely determined by the MTU on the L2 link, so we
benchmark for a number of different MTUs.  That works well for the guest to
host transfers.  For the host to guest transfers, we purport to test for
different MTUs, but we're not actually adjusting anything interesting.

The host to guest transfers adjust the MTU on the "host's" (actually ns)
loopback interface.  However, that only affects the packet size for the
socket going to passt, not the packet size for the L2 link that passt
manages - passt can and will repack the stream into packets of its own
size.  Since the depacketization on that socket is handled by the kernel it
doesn't have a lot of bearing on passt's performance.

We can't fix this by changing the L2 link MTU from the guest side (as we do
for guest to host), because that would only change the guest's view of the
MTU, passt would still think it has the large MTU.  We could test this by
using the --mtu option to passt, but that would require restarting passt
for each run, which is awkward in the current setup.  So, for now, drop all
the "small MTU" tests for host to guest.

Signed-off-by: David Gibson 
Signed-off-by: Stefano Brivio