<feed xmlns='http://www.w3.org/2005/Atom'>
<title>passt/test, branch 2023_11_07.56d9f6d</title>
<subtitle>Plug A Simple Socket Transport</subtitle>
<link rel='alternate' type='text/html' href='https://passt.top/passt/'/>
<entry>
<title>test/perf: Simplify calculation of "omit" time for TCP throughput</title>
<updated>2023-11-07T08:56:24+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2023-11-06T07:08:33+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=53ff387156380bee9acb5fe2ca62af97b9ccce36'/>
<id>53ff387156380bee9acb5fe2ca62af97b9ccce36</id>
<content type='text'>
For the TCP throughput tests, we use iperf3's -O "omit" option which
ignores results for the given time at the beginning of the test.  Currently
we calculate this as 1/6th of the test measurement time.  The purpose of
-O, however, is to skip over the TCP slow start period, which in no way
depends on the overall length of the test.

The slow start time is roughly speaking
    log_2 ( max_window_size / MSS ) * round_trip_time
These factors all vary between tests and machines we're running on, but we
can estimate some reasonable bounds for them:
  * The maximum window size is bounded by the buffer sizes at each end,
    which shouldn't exceed 16MiB
  * The MSS varies with the MTU we use, but the smallest we use in tests is
    ~256 bytes
  * Round trip time will vary with the system, but with these essentially
    local transfers it will typically be well under 1ms (on my laptop it is
    closer to 0.03ms)

That gives a worst case slow start time of about 16ms.  Setting an omit
time of 0.1s uniformly is therefore more than enough, and substantially
smaller than what we calculate now for the default case (10s / 6 ~= 1.7s).
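
As a rough check of the arithmetic above, the estimate can be sketched in
Python.  The function name and the choice of 1ms as the round trip bound are
illustrative assumptions, not part of the test scripts:

```python
import math

def slow_start_time(max_window_bytes, mss_bytes, rtt_seconds):
    """Rough slow start duration: log2(max_window / MSS) round trips."""
    round_trips = math.log2(max_window_bytes / mss_bytes)
    return round_trips * rtt_seconds

# Bounds quoted above: 16MiB window, 256 byte MSS, 1ms round trip (assumed cap)
worst_case = slow_start_time(16 * 1024 * 1024, 256, 0.001)
print(f"worst case slow start: {worst_case * 1000:.0f}ms")
```

With these bounds the window doubles 16 times, so the estimate comes to
16ms, comfortably below the 0.1s omit time.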

This reduces total time for the standard benchmark run by around 30s.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
For the TCP throughput tests, we use iperf3's -O "omit" option which
ignores results for the given time at the beginning of the test.  Currently
we calculate this as 1/6th of the test measurement time.  The purpose of
-O, however, is to skip over the TCP slow start period, which in no way
depends on the overall length of the test.

The slow start time is roughly speaking
    log_2 ( max_window_size / MSS ) * round_trip_time
These factors all vary between tests and machines we're running on, but we
can estimate some reasonable bounds for them:
  * The maximum window size is bounded by the buffer sizes at each end,
    which shouldn't exceed 16MiB
  * The MSS varies with the MTU we use, but the smallest we use in tests is
    ~256 bytes
  * Round trip time will vary with the system, but with these essentially
    local transfers it will typically be well under 1ms (on my laptop it is
    closer to 0.03ms)

That gives a worst case slow start time of about 16ms.  Setting an omit
time of 0.1s uniformly is therefore more than enough, and substantially
smaller than what we calculate now for the default case (10s / 6 ~= 1.7s).

This reduces total time for the standard benchmark run by around 30s.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>test/perf: Remove unnecessary --pacing-timer options</title>
<updated>2023-11-07T08:56:21+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2023-11-06T07:08:32+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=aa0bb9f471627c10b3c71fc9dacd06c642d88ad6'/>
<id>aa0bb9f471627c10b3c71fc9dacd06c642d88ad6</id>
<content type='text'>
We always set --pacing-timer when invoking iperf3.  However, the iperf3
man page implies this is only relevant for the -b option.  We only use the
-b option for the UDP tests, not TCP, so remove --pacing-timer from the TCP
cases.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We always set --pacing-timer when invoking iperf3.  However, the iperf3
man page implies this is only relevant for the -b option.  We only use the
-b option for the UDP tests, not TCP, so remove --pacing-timer from the TCP
cases.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>test/perf: "MTU" changes in passt_tcp host to guest aren't useful</title>
<updated>2023-11-07T08:56:18+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2023-11-06T07:08:31+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=498108ad70e95f8485b9be3e480f2c9e2fed713a'/>
<id>498108ad70e95f8485b9be3e480f2c9e2fed713a</id>
<content type='text'>
The TCP packet size used on the passt L2 link (qemu socket) makes a huge
difference to passt/pasta throughput; many of passt's overheads (chiefly
syscalls) are per-packet.

That packet size is largely determined by the MTU on the L2 link, so we
benchmark for a number of different MTUs.  That works well for the guest to
host transfers.  For the host to guest transfers, we purport to test for
different MTUs, but we're not actually adjusting anything interesting.

The host to guest transfers adjust the MTU on the "host's" (actually ns)
loopback interface.  However, that only affects the packet size for the
socket going to passt, not the packet size for the L2 link that passt
manages - passt can and will repack the stream into packets of its own
size.  Since the depacketization on that socket is handled by the kernel it
doesn't have a lot of bearing on passt's performance.

We can't fix this by changing the L2 link MTU from the guest side (as we do
for guest to host), because that would only change the guest's view of the
MTU; passt would still think it has the large MTU.  We could test this by
using the --mtu option to passt, but that would require restarting passt
for each run, which is awkward in the current setup.  So, for now, drop all
the "small MTU" tests for host to guest.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The TCP packet size used on the passt L2 link (qemu socket) makes a huge
difference to passt/pasta throughput; many of passt's overheads (chiefly
syscalls) are per-packet.

That packet size is largely determined by the MTU on the L2 link, so we
benchmark for a number of different MTUs.  That works well for the guest to
host transfers.  For the host to guest transfers, we purport to test for
different MTUs, but we're not actually adjusting anything interesting.

The host to guest transfers adjust the MTU on the "host's" (actually ns)
loopback interface.  However, that only affects the packet size for the
socket going to passt, not the packet size for the L2 link that passt
manages - passt can and will repack the stream into packets of its own
size.  Since the depacketization on that socket is handled by the kernel it
doesn't have a lot of bearing on passt's performance.

We can't fix this by changing the L2 link MTU from the guest side (as we do
for guest to host), because that would only change the guest's view of the
MTU; passt would still think it has the large MTU.  We could test this by
using the --mtu option to passt, but that would require restarting passt
for each run, which is awkward in the current setup.  So, for now, drop all
the "small MTU" tests for host to guest.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>test/perf: Explicitly control UDP packet length, instead of MTU</title>
<updated>2023-11-07T08:56:16+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2023-11-06T07:08:30+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=f94adb121afcf7d5b3cc300fccbdf2247a907f63'/>
<id>f94adb121afcf7d5b3cc300fccbdf2247a907f63</id>
<content type='text'>
Packet size can make a big difference to UDP throughput, so it makes sense
to measure it for a variety of different sizes.  Currently we do this by
adjusting the MTU on the relevant interface before running iperf3.

However, the UDP packet size has no inherent connection to the MTU - it's
controlled by the sender, and the MTU just affects whether the packet will
make it through or be fragmented.  The only reason adjusting the MTU works
is because iperf3 bases its default packet size on the (path) MTU.

We can test this more simply by using the -l option to the iperf3 client
to directly control the packet size, instead of adjusting the MTU.

As well as simplifying, this lets us test different packet sizes for host to
ns traffic.  We couldn't do that previously because we don't have
permission to change the MTU on the host.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Packet size can make a big difference to UDP throughput, so it makes sense
to measure it for a variety of different sizes.  Currently we do this by
adjusting the MTU on the relevant interface before running iperf3.

However, the UDP packet size has no inherent connection to the MTU - it's
controlled by the sender, and the MTU just affects whether the packet will
make it through or be fragmented.  The only reason adjusting the MTU works
is because iperf3 bases its default packet size on the (path) MTU.

We can test this more simply by using the -l option to the iperf3 client
to directly control the packet size, instead of adjusting the MTU.

As well as simplifying, this lets us test different packet sizes for host to
ns traffic.  We couldn't do that previously because we don't have
permission to change the MTU on the host.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>test/perf: Small MTUs for spliced TCP aren't interesting</title>
<updated>2023-11-07T08:56:13+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2023-11-06T07:08:29+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=29269705239fc3102a42950a5d84c3c3acd619d0'/>
<id>29269705239fc3102a42950a5d84c3c3acd619d0</id>
<content type='text'>
Currently we make TCP throughput measurements for spliced connections with
a number of different MTU values.  However, the results from this aren't
really interesting.

Unlike with tap connections, spliced connections only involve the loopback
interface on host and container, not a "real" external interface.  lo
typically has an MTU of 65535 and there is very little reason to ever
change that.  So, the measurements for smaller MTUs are rarely going to be
relevant.

In addition, the fact that we can offload all the {de,}packetization to the
kernel with splice(2) means that the throughput difference between these
MTUs isn't very great anyway.

Remove the short MTUs and only show spliced throughput for the normal
65535 byte loopback MTU.  This reduces runtime of the performance tests on
my laptop by about 1 minute (out of ~24 minutes).

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently we make TCP throughput measurements for spliced connections with
a number of different MTU values.  However, the results from this aren't
really interesting.

Unlike with tap connections, spliced connections only involve the loopback
interface on host and container, not a "real" external interface.  lo
typically has an MTU of 65535 and there is very little reason to ever
change that.  So, the measurements for smaller MTUs are rarely going to be
relevant.

In addition, the fact that we can offload all the {de,}packetization to the
kernel with splice(2) means that the throughput difference between these
MTUs isn't very great anyway.

Remove the short MTUs and only show spliced throughput for the normal
65535 byte loopback MTU.  This reduces runtime of the performance tests on
my laptop by about 1 minute (out of ~24 minutes).

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>test/perf: Start iperf3 server less often</title>
<updated>2023-11-07T08:56:10+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2023-11-06T07:08:28+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=e516809a74ffd495481a7adf6b565181861a41f9'/>
<id>e516809a74ffd495481a7adf6b565181861a41f9</id>
<content type='text'>
Currently we start both the iperf3 server(s) and client(s) afresh each time
we want to make a bandwidth measurement.  That's not really necessary as
usually a whole batch of bandwidth measurements can use the same server.

Split up the iperf3 directive into 3 directives: iperf3s to start the
server, iperf3 to make a measurement and iperf3k to kill the server, so
that we can start the server less often.  This - and more importantly, the
reduced number of waits for the server to be ready - reduces runtime of the
performance tests on my laptop by about 4 minutes (out of ~28 minutes).

For now we still restart the server between IPv4 and IPv6 tests.  That's
because in some cases the latency measurements we make in between use the
same ports.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently we start both the iperf3 server(s) and client(s) afresh each time
we want to make a bandwidth measurement.  That's not really necessary as
usually a whole batch of bandwidth measurements can use the same server.

Split up the iperf3 directive into 3 directives: iperf3s to start the
server, iperf3 to make a measurement and iperf3k to kill the server, so
that we can start the server less often.  This - and more importantly, the
reduced number of waits for the server to be ready - reduces runtime of the
performance tests on my laptop by about 4 minutes (out of ~28 minutes).

For now we still restart the server between IPv4 and IPv6 tests.  That's
because in some cases the latency measurements we make in between use the
same ports.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>test/perf: Get iperf3 stats from client side</title>
<updated>2023-11-07T08:56:06+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2023-11-06T07:08:27+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=f9ff6678d4bbf5d9c80c1c6f784c3955468c09d6'/>
<id>f9ff6678d4bbf5d9c80c1c6f784c3955468c09d6</id>
<content type='text'>
iperf3 generates statistics about its run on both the client and server
sides.  They don't have exactly the same information, but both have the
pieces we need (AFAICT the server communicates some information to the
client over the control socket, so the most important information is in the
client side output, even if measured by the server).

Currently we use the server side information for our measurements. Using
the client side information has several advantages though:

 * We can directly wait for the client to complete and we know we'll have
   the output we want.  We don't need to sleep to give the server time to
   write out the results.
 * That in turn means we can wrap up as soon as the client is done, we
   don't need to wait overlong to make sure everything is finished.
 * The slightly different organisation of the data in the client output
   means that we always want the same json value, rather than requiring
   slightly different ones for UDP and TCP.

The fact that we avoid some extra delays speeds up the overall run of the
perf tests by around 7 minutes (out of around 35 minutes) on my laptop.

The fact that we no longer unconditionally kill client and server after
a certain time means that the client could run indefinitely if the server
doesn't respond.  We mitigate that by setting a 1s connect timeout on the
client.  This isn't foolproof - if we get an initial response, but then
lose connectivity this could still run indefinitely, however it does cover
by far the most likely failure cases.  --snd-timeout would provide more
robustness, but I've hit odd failures when trying to use it.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
iperf3 generates statistics about its run on both the client and server
sides.  They don't have exactly the same information, but both have the
pieces we need (AFAICT the server communicates some information to the
client over the control socket, so the most important information is in the
client side output, even if measured by the server).

Currently we use the server side information for our measurements. Using
the client side information has several advantages though:

 * We can directly wait for the client to complete and we know we'll have
   the output we want.  We don't need to sleep to give the server time to
   write out the results.
 * That in turn means we can wrap up as soon as the client is done, we
   don't need to wait overlong to make sure everything is finished.
 * The slightly different organisation of the data in the client output
   means that we always want the same json value, rather than requiring
   slightly different ones for UDP and TCP.

The fact that we avoid some extra delays speeds up the overall run of the
perf tests by around 7 minutes (out of around 35 minutes) on my laptop.

The fact that we no longer unconditionally kill client and server after
a certain time means that the client could run indefinitely if the server
doesn't respond.  We mitigate that by setting a 1s connect timeout on the
client.  This isn't foolproof - if we get an initial response, but then
lose connectivity this could still run indefinitely, however it does cover
by far the most likely failure cases.  --snd-timeout would provide more
robustness, but I've hit odd failures when trying to use it.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>test/perf: Remove stale iperf3c/iperf3s directives</title>
<updated>2023-11-07T08:56:03+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2023-11-06T07:08:26+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=8a41a8b20f0f5a6c497bccbb41198277f9a865f8'/>
<id>8a41a8b20f0f5a6c497bccbb41198277f9a865f8</id>
<content type='text'>
Some older revisions used separate iperf3c and iperf3s test directives to
invoke the iperf3 client and server.  Those were combined into a single
iperf3 directive some time ago, but a couple of places still have the old
syntax.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Some older revisions used separate iperf3c and iperf3s test directives to
invoke the iperf3 client and server.  Those were combined into a single
iperf3 directive some time ago, but a couple of places still have the old
syntax.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>test: Add Podman system test with bats for pasta</title>
<updated>2023-09-07T09:25:41+00:00</updated>
<author>
<name>Stefano Brivio</name>
<email>sbrivio@redhat.com</email>
</author>
<published>2023-08-23T13:51:49+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=ee58f37db060535bee298bc98f61497eac37f152'/>
<id>ee58f37db060535bee298bc98f61497eac37f152</id>
<content type='text'>
Ugly as hell, but we keep breaking things otherwise, and I keep
forgetting to run this manually (as long as it's based on my local
Podman setup, that's the only alternative).

We need to clone the Podman repository as distribution packages don't
contain test scripts, typically. While at it, build the latest
version which is what really matters.

As we're planning anyway to revamp the test framework, I'd be
inclined to just add this without too many thoughts, and have it as
a nice-to-have requirement reminder for the new framework.

Link: https://github.com/containers/podman/pull/19699
Suggested-by: Paul Holzinger &lt;pholzing@redhat.com&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Reviewed-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Ugly as hell, but we keep breaking things otherwise, and I keep
forgetting to run this manually (as long as it's based on my local
Podman setup, that's the only alternative).

We need to clone the Podman repository as distribution packages don't
contain test scripts, typically. While at it, build the latest
version which is what really matters.

As we're planning anyway to revamp the test framework, I'd be
inclined to just add this without too many thoughts, and have it as
a nice-to-have requirement reminder for the new framework.

Link: https://github.com/containers/podman/pull/19699
Suggested-by: Paul Holzinger &lt;pholzing@redhat.com&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Reviewed-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>test/nstool: Fix fd leak in accept() loop</title>
<updated>2023-05-23T15:06:32+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2023-05-23T02:25:43+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=e3b19530e4a689f9f8e417ebf737dfca2340342b'/>
<id>e3b19530e4a689f9f8e417ebf737dfca2340342b</id>
<content type='text'>
nstool loops on accept(), but failed to close the accepted socket fds
before continuing on.  So, with repeated commands it would eventually die
with an EMFILE.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
nstool loops on accept(), but failed to close the accepted socket fds
before continuing on.  So, with repeated commands it would eventually die
with an EMFILE.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
