passt - Plug A Simple Socket Transport

	Commit message	Author	Age	Files	Lines
*	conf: Allow address remapped to host to be configured	David Gibson	2024-08-21	4	-58/+60
	Because the host and guest share the same IP address with passt/pasta, it's not possible for the guest to directly address the host. Therefore we allow packets from the guest going to a special "NAT to host" address to be redirected to the host, appearing there as though they have both source and destination address of loopback.
	Currently that special address is always the address of the default gateway (or none). That can be a problem if we want that gateway to be addressable by the guest. Therefore, allow the special "NAT to host" address to be overridden on the command line with a new --map-host-loopback option.
	In order to exercise and test it, update the passt_in_ns and perf tests to use this option and give different mapping addresses for the two layers of the environment.
	Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	test: Reconfigure IPv6 address after changing MTU	David Gibson	2024-08-21	1	-0/+4
	In the TCP throughput tests, we adjust the guest's MTU in order to test various packet sizes. Some of those are below 1280 which causes IPv6 to be deconfigured on the guest interface. When we increase it above 1280 again, IPv6 is re-enabled and we get an address in the right prefix with NDP, but we don't get exactly the expected address back - that's only communicated with --config-net or DHCPv6.
	With changes to how we handle NAT this can cause some of the IPv6 tests to fail, because they don't use the address that passt/pasta expects, and the guest doesn't initiate any traffic which allows us to learn what the new address is.
	Work around this by re-invoking dhclient -6 between adjusting the MTU and running IPv6 test cases.
	Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	test: Speed up by cutting on eye candy and performance test duration	Stefano Brivio	2024-08-15	4	-28/+28
	We have a number of delays when we switch to new layouts that were added to make the tests visually easier to follow, together with blinking status bars. Shorten the delays and avoid blinking the status bar if $FAST is set to 1 (no demo mode).
	Shorten delays in busy loops to 10ms, instead of 100ms, and skip the one-second fixed delay when we wait for the status of a command.
	Cut the duration of throughput and latency tests to one second, down from ten. Somewhat surprisingly, the results we get are rather consistent, and not significantly different from what we'd get with 10 seconds.
	This, together with Podman's commit 20f3e8909e3a ("test/system: pasta_test_do add explicit port check"), cuts the time needed on my setup for a full test run from approximately 37 minutes to...:
	  $ time ./run
	  [exited]
	  PASS: 165, FAIL: 0
	  Log at /home/sbrivio/passt/test/test_logs/test.log
	  real 15m34.253s
	  user 0m0.011s
	  sys 0m0.011s
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
	Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
	Tested-by: David Gibson <david@gibson.dropbear.id.au>
*	test: iperf3 3.16 introduces multiple threads, drop our own implementation of that	Stefano Brivio	2024-07-25	4	-122/+117
	Starting from iperf3 version 3.16, -P / --parallel spawns multiple clients as separate threads, instead of multiple streams serviced by the same thread.
	So we can drop our lib/test implementation to spawn several iperf3 client and server processes and finally simplify things quite a bit.
	Adjust number of threads and UDP sending bandwidth to values that seem to be more or less matching previous throughput tests on my setup.
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
	Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
	Tested-by: David Gibson <david@gibson.dropbear.id.au>
*	test/perf: Simplify calculation of "omit" time for TCP throughput	David Gibson	2023-11-07	2	-2/+2
	For the TCP throughput tests, we use iperf3's -O "omit" option which ignores results for the given time at the beginning of the test. Currently we calculate this as 1/6th of the test measurement time. The purpose of -O, however, is to skip over the TCP slow start period, which in no way depends on the overall length of the test.
	The slow start time is roughly speaking
	  log_2 ( max_window_size / MSS ) * round_trip_time
	These factors all vary between tests and machines we're running on, but we can estimate some reasonable bounds for them:
	  * The maximum window size is bounded by the buffer sizes at each end, which shouldn't exceed 16MiB
	  * The MSS varies with the MTU we use, but the smallest we use in tests is ~256 bytes
	  * Round trip time will vary with the system, but with these essentially local transfers it will typically be well under 1ms (on my laptop it is closer to 0.03ms)
	That gives a worst case slow start time of about 16ms. Setting an omit time of 0.1s uniformly is therefore more than enough, and substantially smaller than what we calculate now for the default case (10s / 6 ~= 1.7s).
	This reduces total time for the standard benchmark run by around 30s.
	Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
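	The 16ms worst-case figure above can be checked by plugging the stated bounds into the formula. This is only a worked instance of the estimate in the commit message, taking the 16MiB window, the ~256 byte MSS and the 1ms round trip time at face value as upper (or lower) bounds:

```latex
\[
t_{\text{slow start}}
  \approx \log_2\!\left(\frac{\text{max\_window\_size}}{\text{MSS}}\right) \cdot \text{RTT}
  = \log_2\!\left(\frac{2^{24}\,\text{B}}{2^{8}\,\text{B}}\right) \cdot 1\,\text{ms}
  = 16 \cdot 1\,\text{ms}
  = 16\,\text{ms}
\]
```

	With the 0.03ms round trip time quoted for the laptop, the same expression gives roughly 16 * 0.03ms = 0.48ms, so the 0.1s omit time leaves a wide margin in both cases.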
*	test/perf: Remove unnecessary --pacing-timer options	David Gibson	2023-11-07	2	-3/+3
	We always set --pacing-timer when invoking iperf3. However, the iperf3 man page implies this is only relevant for the -b option. We only use the -b option for the UDP tests, not TCP, so remove --pacing-timer from the TCP cases.
	Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	test/perf: "MTU" changes in passt_tcp host to guest aren't useful	David Gibson	2023-11-07	1	-29/+8
	The TCP packet size used on the passt L2 link (qemu socket) makes a huge difference to passt/pasta throughput; many of passt's overheads (chiefly syscalls) are per-packet. That packet size is largely determined by the MTU on the L2 link, so we benchmark for a number of different MTUs. That works well for the guest to host transfers.
	For the host to guest transfers, we purport to test for different MTUs, but we're not actually adjusting anything interesting. The host to guest transfers adjust the MTU on the "host's" (actually ns) loopback interface. However, that only affects the packet size for the socket going to passt, not the packet size for the L2 link that passt manages - passt can and will repack the stream into packets of its own size. Since the depacketization on that socket is handled by the kernel it doesn't have a lot of bearing on passt's performance.
	We can't fix this by changing the L2 link MTU from the guest side (as we do for guest to host), because that would only change the guest's view of the MTU, passt would still think it has the large MTU. We could test this by using the --mtu option to passt, but that would require restarting passt for each run, which is awkward in the current setup.
	So, for now, drop all the "small MTU" tests for host to guest.
	Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	test/perf: Explicitly control UDP packet length, instead of MTU	David Gibson	2023-11-07	2	-94/+75
	Packet size can make a big difference to UDP throughput, so it makes sense to measure it for a variety of different sizes. Currently we do this by adjusting the MTU on the relevant interface before running iperf3.
	However, the UDP packet size has no inherent connection to the MTU - it's controlled by the sender, and the MTU just affects whether the packet will make it through or be fragmented. The only reason adjusting the MTU works is because iperf3 bases its default packet size on the (path) MTU.
	We can test this more simply by using the -l option to the iperf3 client to directly control the packet size, instead of adjusting the MTU.
	As well as simplifying, this lets us test different packet sizes for host to ns traffic. We couldn't do that previously because we don't have permission to change the MTU on the host.
	Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	test/perf: Small MTUs for spliced TCP aren't interesting	David Gibson	2023-11-07	1	-52/+1
	Currently we make TCP throughput measurements for spliced connections with a number of different MTU values. However, the results from this aren't really interesting.
	Unlike with tap connections, spliced connections only involve the loopback interface on host and container, not a "real" external interface. lo typically has an MTU of 65535 and there is very little reason to ever change that. So, the measurements for smaller MTUs are rarely going to be relevant.
	In addition, the fact that we can offload all the {de,}packetization to the kernel with splice(2) means that the throughput difference between these MTUs isn't very great anyway.
	Remove the short MTUs and only show spliced throughput for the normal 65535 byte loopback MTU. This reduces runtime of the performance tests on my laptop by about 1 minute (out of ~24 minutes).
	Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	test/perf: Start iperf3 server less often	David Gibson	2023-11-07	4	-90/+175
	Currently we start both the iperf3 server(s) and client(s) afresh each time we want to make a bandwidth measurement. That's not really necessary as usually a whole batch of bandwidth measurements can use the same server.
	Split up the iperf3 directive into 3 directives: iperf3s to start the server, iperf3 to make a measurement and iperf3k to kill the server, so that we can start the server less often. This - and more importantly, the reduced number of waits for the server to be ready - reduces runtime of the performance tests on my laptop by about 4 minutes (out of ~28 minutes).
	For now we still restart the server between IPv4 and IPv6 tests. That's because in some cases the latency measurements we make in between use the same ports.
	Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	test/perf: Remove stale iperf3c/iperf3s directives	David Gibson	2023-11-07	2	-6/+1
	Some older revisions used separate iperf3c and iperf3s test directives to invoke the iperf3 client and server. Those were combined into a single iperf3 directive some time ago, but a couple of places still have the old syntax.
	Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	passt: Relicense to GPL 2.0, or any later version	Stefano Brivio	2023-04-06	4	-4/+4
	In practical terms, passt doesn't benefit from the additional protection offered by the AGPL over the GPL, because it's not suitable to be executed over a computer network.
	Further, restricting the distribution under the version 3 of the GPL wouldn't provide any practical advantage either, as far as the passt codebase is concerned, and might cause unnecessary compatibility dilemmas.
	Change licensing terms to the GNU General Public License Version 2, or any later version, with written permission from all current and past contributors, namely: myself, David Gibson, Laine Stump, Andrea Bolognani, Paul Holzinger, Richard W.M. Jones, Chris Kuhn, Florian Weimer, Giuseppe Scrivano, Stefan Hajnoczi, and Vasiliy Ulyanov.
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	test/perf/pasta_tcp: Add host to namespace cases for traffic via tap	Stefano Brivio	2023-01-05	1	-0/+57
	Similarly to UDP cases, these were missing as it wasn't clear, when the other tests were introduced, if using the global address of a namespace, from the host, should have resulted in connections being routed via the tap interface.
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
	Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
*	test/perf/pasta_udp: Add host to namespace cases for traffic via tap	Stefano Brivio	2023-01-05	1	-0/+37
	These were missing as it wasn't clear, when the other tests were introduced, if using the global address of a namespace, from the host, should have resulted in traffic being routed via the tap interface (as opposed to the loopback interface).
	We have now clarified that's actually the case. Use the same values and thresholds as the tests for loopback traffic, as throughput figures currently indicate there isn't much difference.
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
	Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
*	test/perf: Finally drop workaround for virtio_net TX stall	Stefano Brivio	2022-11-04	2	-30/+0
	Now that we require 13c6be96618c ("net: stream: add unix socket") in qemu to run the tests, we can also assume that commit df8d07081718 ("virtio-net: fix bottom-half packet TX on asynchronous completion") is present, as it was merged before that one.
	This fixes the issue we attempted to work around in passt TCP and UDP performance tests: finally drop that stuff.
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	test/perf: Wait for neper servers in guest to be ready before starting client	Stefano Brivio	2022-09-23	2	-0/+6
	Starting tcp_rr, tcp_crr, udp_rr servers in the guest takes a bit longer than starting the corresponding clients on the host, and we end up starting clients before servers unless we add a delay there.
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	test/perf: Disable periodic throughput reports to avoid vhost hang	Stefano Brivio	2022-09-22	4	-4/+4
	It appears that if we run throughput tests with one-second periodic reports, the sending side of the vhost channel used for SSH-based command dispatch occasionally stops working altogether.
	I haven't investigated this further: all I see is that output is truncated at some point, and doesn't resume. If we use gzip compression (ssh -C) this happens less frequently, but it still happens, seemingly indicating the issue is probably related to vhost itself.
	Disable periodic reports in iperf3 clients. The -i options were actually redundant, so remove them from both test files as well as from test_iperf3().
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	test/perf: Switch performance test duration to 10 seconds instead of 30	Stefano Brivio	2022-09-22	4	-4/+4
	It looks like the workaround for the virtio_net TX hang issue is working less reliably with the new command dispatch mechanism; I'm not sure why. Switch to 10 seconds, at least for the moment.
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
	Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
*	test/perf: Always use /sbin/sysctl in tcp test	Stefano Brivio	2022-09-22	2	-6/+6
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
	Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
*	test/perf: Check for /sbin/sysctl with which(1), not simply sysctl	Stefano Brivio	2022-09-22	3	-4/+4
	Otherwise, we're depending on having /sbin in $PATH. For some reason I didn't completely grasp, with the new command dispatch mechanism that's not the case anymore, even if I have /sbin in $PATH in the parent shell.
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
	Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
*	test: Rewrite test_iperf3	David Gibson	2022-09-07	2	-3/+3
	test_iperf3() is a pretty inscrutable mess of nested background processes. It has a number of ugly sleeps needed to wait for things to complete. Rewrite it to be cleaner:
	  * Use the construct (a & b & wait) to run 'a' and 'b' in parallel, but then wait for them both to complete before continuing
	  * This allows us to wait for both the server and client to finish, rather than sleeping
	  * Use jq to do all the math we need to get the final result, rather than jq followed by some complicated 'bc' mangling
	Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
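	The (a & b & wait) construct named in the first bullet can be sketched in isolation as follows. This is a minimal illustration, not the code from test_iperf3(): the server and client functions are hypothetical stand-ins that only sleep and print, and fractional sleep durations are assumed to be supported by the local sleep(1).

```shell
#!/bin/sh
# Hypothetical stand-ins for the real server and client commands.
server() { sleep 0.2; echo "server done"; }
client() { sleep 0.1; echo "client done"; }

# Both jobs are started in the background inside one subshell, and the
# subshell's 'wait' blocks until all of its background children have exited,
# so nothing after the closing parenthesis runs before both are finished.
( server & client & wait )
echo "both finished"
```

	Because the wait happens inside the subshell, it only covers the two jobs started there and doesn't interfere with other background jobs of the calling script.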
*	test: Parameterize run time for throughput performance tests	David Gibson	2022-09-07	4	-83/+89
	Currently all the throughput tests are run for 30s. This is reflected in both the actual parameters given to the iperf commands, but also in the matching sleeps in test_iperf3.
	Allow this to be adjusted more easily with a new parameter to test_iperf3.
	Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
	[sbrivio: Reflect new parameter in comment to test_iperf3()]
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	test: Combine iperf3c and iperf3s into a single DSL command	David Gibson	2022-09-07	4	-150/+79
	These two commands in the DSL to run an iperf client and server are always used together, and some of the parameters must match between them. The iperf3s must also be run more or less immediately after iperf3c, since iperf3c will run a client in the background after a sleep and requires a server to be running before it will work.
	A bunch of things can be made cleaner if we make a single DSL command that runs both sides of the test. For now make the combined command work exactly like the two commands together did, warts and all.
	This does lose the ability for the DSL scripts to give additional options to the iperf3 server, but we weren't using that anyway.
	Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
*	tests: Explicitly list test files in test/run, remove "onlyfor" support	David Gibson	2022-07-14	4	-4/+0
	Currently test/run uses wildcards to run all of the tests in a directory. However, that wildcard list is filtered down by the "onlyfor" directives in the test files... usually to a single file.
	Therefore, just explicitly list the files we really want to run for this test mode. This makes it easier to see at the top level what tests will be executed, and to change that list temporarily while debugging specific failures.
	This means the "onlyfor" directive no longer has any purpose, and we can remove it. "onlyfor" was also the only user of the $MODE variable, so we can remove that too.
	Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
*	test: Embed script for dhclient(8) in mbuto(1) profile	Stefano Brivio	2022-07-14	2	-2/+2
	David reports that dhclient-script(8) on Fedora needs a number of binaries that are not included in PROGS of the current mbuto profile, and we would also need to include hostnamectl(1) there, which will fail without a systemd init.
	Embed a minimal script for dhclient(8) in the profile itself, written to /sbin/dhclient-script at boot, to just check what we need to check out of DHCP and DHCPv6 functionality.
	While at it, drop busybox and logger from PROGS, as we don't need them, and add hostname(1). While DHCP option 12 isn't supported yet by the DHCP implementation in passt, we should probably add it soon.
	Note: owing to the simplicity of this script, we now need to bring up the interface before starting dhclient: add this in test scripts where it's not the case yet.
	Reported-by: David Gibson <david@gibson.dropbear.id.au>
	Suggested-by: David Gibson <david@gibson.dropbear.id.au>
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	Tweak dhclient arguments for readability	David Gibson	2022-06-15	2	-2/+2
	A number of tests and examples use dhclient in both IPv4 and IPv6 modes. We use "dhclient -6" for IPv6, but usually just "dhclient" for IPv4. Add an explicit "-4" argument to make it clearer.
	In addition, when dhclient is run from within pasta it usually won't be "real" root, and so will not have access to write the default global pid file. This results in a mostly harmless but irritating error:
	  Can't create /var/run/dhclient.pid: Permission denied
	We can avoid that by using the --no-pid flag to dhclient.
	Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
*	Don't abbreviate ip(8) arguments in examples and tests	David Gibson	2022-06-15	4	-12/+12
	ip(8)'s ability to take abbreviated arguments (e.g. "li sh" instead of "link show") is very handy when using it interactively, but it doesn't make for very readable scripts and examples when shown that way.
	Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
*	test/perf/pasta_udp: Drop redundant assignment of ::1 to loopback interface	Stefano Brivio	2022-05-19	1	-1/+0
	There are a few occurrences of this assignment, which are needed to re-add ::1 as loopback address after the MTU has been increased back from a value below 1280 bytes. This one, however, is redundant, and causes an error in the execution.
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	test/perf: Work-around for virtio_net hang before long streams from guest	Stefano Brivio	2022-03-29	2	-0/+30
	I didn't have time to investigate the root cause for the virtio_net TX hang yet. Add a quick work-around for the time being.
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	test/perf/passt_udp: Drop threshold for 256B test	Stefano Brivio	2022-02-21	1	-2/+2
	That test fails sometimes: it looks like iperf3 is still sending initial messages that are too big. I'll need to figure out why, but given that 256 bytes is not really an expected MTU, drop the thresholds to zero for the time being.
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	test/perf/passt_tcp: Drop iperf3 window size for host-to-guest tests	Stefano Brivio	2022-02-01	1	-10/+10
	With a recent 5.15 kernel, passing a huge window size to iperf3 with lower MTUs makes iperf3 stop sending packets after a few seconds -- I haven't investigated this in detail, but the window size will be adjusted dynamically anyway and not passing it doesn't actually affect throughput, so simply drop the option.
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	perf/passt_udp: Lower failure throughput thresholds with big MTUs	Stefano Brivio	2022-01-26	1	-4/+4
	The throughput results in this test look quite variable; slightly lower figures look reasonable anyway.
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	test/perf: Actually load passt enough to test UDP performance	Stefano Brivio	2021-10-21	2	-26/+26
	With recent improvements, we're not CPU-bound at all while testing UDP performance. Give the VM more memory and CPUs, forward two additional ports, start up to four threads in parallel, and give single iperf3 threads higher bandwidth targets.
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	test/perf: Try sourcing maximum scaling frequency from cpufreq	Stefano Brivio	2021-10-21	4	-4/+14
	On most recent CPUs, that's a better indication of all-core turbo frequency, or non-turbo frequency, than /proc/cpuinfo.
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
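	A "try cpufreq first, then fall back to /proc/cpuinfo" lookup could look roughly like the sketch below. It is not the code from this commit: the max_freq_mhz function name is hypothetical, and the choice of the sysfs scaling_max_freq attribute (reported by the kernel in kHz) and of the first "cpu MHz" line of /proc/cpuinfo are assumptions about what the test scripts read.

```shell
#!/bin/sh
# max_freq_mhz SYSFS_FILE CPUINFO_FILE  (hypothetical helper)
# Prints a CPU frequency in MHz, preferring the cpufreq sysfs value.
max_freq_mhz() {
	if [ -r "$1" ]; then
		# cpufreq sysfs attributes are expressed in kHz
		echo $(( $(cat "$1") / 1000 ))
	else
		# Fall back to the first "cpu MHz" line, keeping the integer part.
		# Prints nothing on architectures without that field.
		awk -F': *' '/^cpu MHz/ { printf "%d\n", $2; exit }' "$2"
	fi
}
```

	On a typical Linux system this might be called as max_freq_mhz /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq /proc/cpuinfo, with cpu0 assumed to be representative of the other cores.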
*	perf/passt_udp: Don't overshoot UDP bandwidth excessively on larger MTUs	Stefano Brivio	2021-10-19	1	-2/+2
	...performance with 64KiB MTUs might look worse than with 9000 bytes on some configurations.
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	perf/passt_tcp: Don't exceed typical L3 cache sizes with buffers	Stefano Brivio	2021-10-19	1	-8/+8
	...we might see misleading rate drops with larger MTUs otherwise.
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	test/perf: Use CPU frequency from /proc/cpuinfo instead of cpupower(1)	Stefano Brivio	2021-10-19	4	-8/+8
	Get it to work also in nested virtualisation environments.
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	test: Add CI/demo scripts	Stefano Brivio	2021-09-27	4	-0/+859
	Not really quick, definitely dirty.
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>