passt - Plug A Simple Socket Transport

	Commit message	Author	Age	Files	Lines
*	test/lib: Consistent cols, rows, poster attributes for asciinema player	Stefano Brivio	2022-04-07	2	-2/+2
\| \| \| \|	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	arch: Pointer to local outside scope, CWE-562	Stefano Brivio	2022-04-07	1	-5/+5
\| \| \| \| \| \| \| \| \|	Reported by Coverity: if we fail to run the AVX2 version, once execve() fails, we had already replaced argv[0] with the new stack-allocated path string, and that's then passed back to main(). Use a static variable instead. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
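A minimal sketch of the fix described above, assuming a hypothetical helper arch_path_set() and a hypothetical ".avx2" suffix (neither name is taken from the source). Because the buffer is static, argv[0] stays valid after the helper returns, for instance once execve() has failed and control goes back to main():

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define ARCH_PATH_MAX	4096	/* hypothetical buffer size */

/* Hypothetical helper: point argv[0] at an arch-specific binary name.
 * A stack-allocated buffer here would dangle once the function returns
 * (CWE-562); static storage lives for the whole run of the program.
 * Meant to be called once: argv[0] must not already point at 'path'.
 */
static void arch_path_set(char **argv)
{
	static char path[ARCH_PATH_MAX];
	int n;

	n = snprintf(path, sizeof(path), "%s.avx2", argv[0]);
	if (n < 0 || (size_t)n >= sizeof(path))
		return;		/* error or truncation: leave argv[0] alone */

	argv[0] = path;
}
```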
*	udp: Out-of-bounds read, CWE-125 in udp_timer()	Stefano Brivio	2022-04-07	1	-1/+1
\| \| \| \| \| \| \|	Not an actual issue due to how it's typically stored, but udp_act can also be used for ports 65528-65535. Reported by Coverity. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	tcp: False "Out-of-bounds read" positive, CWE-125	Stefano Brivio	2022-04-07	1	-1/+5
\| \| \| \| \| \| \|	Reported by Coverity: it doesn't see that tcp{4,6}_l2_buf_used are set to zero by tcp_l2_data_buf_flush(), repeat that explicitly here. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	tcp, tcp_splice: False "Negative array index read" positives, CWE-129	Stefano Brivio	2022-04-07	2	-12/+24
\| \| \| \| \| \|	A flag or event bit is always set by callers. Reported by Coverity. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	tcp_splice: Logically dead code, CWE-561	Stefano Brivio	2022-04-07	1	-7/+1
\| \| \| \| \| \|	Reported by Coverity. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	tcp: Dereference null return value, CWE-476	Stefano Brivio	2022-04-07	1	-1/+1
\| \| \| \| \| \|	Not an issue with a sane kernel behaviour. Reported by Coverity. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	conf, tap: False "Buffer not null terminated" positives, CWE-170	Stefano Brivio	2022-04-07	2	-6/+6
\| \| \| \| \| \| \|	Those strings are actually guaranteed to be NUL-terminated. Reported by Coverity. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	conf: False "Assign instead of compare" positive, CWE-481	Stefano Brivio	2022-04-07	1	-1/+1
\| \| \| \| \| \| \|	This really just needs to be an assignment before line_read() -- turn it into a for loop. Reported by Coverity. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	treewide: Argument cannot be negative, CWE-687	Stefano Brivio	2022-04-07	4	-22/+30
\| \| \| \| \| \|	Actually harmless. Reported by Coverity. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	passt: Improper use of negative value (CWE-394)	Stefano Brivio	2022-04-07	1	-5/+14
\| \| \| \| \| \|	Reported by Coverity. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	conf, packet: Operands don't affect result, CWE-569	Stefano Brivio	2022-04-07	2	-3/+8
\| \| \| \| \| \|	Reported by Coverity. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	tap: Resource leak, CWE-404	Stefano Brivio	2022-04-07	1	-1/+4
\| \| \| \| \| \|	Reported by Coverity. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	treewide: Unchecked return value from library, CWE-252	Stefano Brivio	2022-04-07	8	-55/+116
\| \| \| \| \| \| \|	All instances were harmless, but it might be useful to have some debug messages here and there. Reported by Coverity. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	tcp: False "Untrusted loop bound" positive, CWE-606	Stefano Brivio	2022-04-05	1	-0/+2
\| \| \| \| \| \| \| \|	Field doff in struct tcp_hdr is 4 bits wide, so optlen in tcp_tap_handler() is already bound, but make that explicit. Reported by Coverity. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
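As an illustration of why the bound already holds: a 4-bit data offset counts 32-bit words, so the header is at most 60 bytes long and options take at most 40. The sketch below uses a hypothetical helper, tcp_optlen(), and hypothetical constant names; it is not the code from tcp_tap_handler():

```c
#include <assert.h>
#include <stdint.h>

#define TCP_HDR_MIN	20	/* bytes: TCP header without options */
#define TCP_DOFF_MAX	15	/* largest value a 4-bit field can hold */

/* Hypothetical helper: length of TCP options in bytes for a given data
 * offset, or -1 if the offset is out of range. The upper check is
 * redundant for a real 4-bit field, but it makes the bound explicit.
 */
static int tcp_optlen(uint8_t doff)
{
	if (doff > TCP_DOFF_MAX || doff * 4 < TCP_HDR_MIN)
		return -1;

	return doff * 4 - TCP_HDR_MIN;	/* 0 to 40 */
}
```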
*	passt: Ignoring number of bytes read, CWE-252	Stefano Brivio	2022-04-05	1	-2/+3
\| \| \| \| \| \|	Harmless, assuming sane kernel behaviour. Reported by Coverity. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	treewide: Invalid type in argument to printf format specifier, CWE-686	Stefano Brivio	2022-04-05	4	-32/+32
\| \| \| \| \| \|	Harmless except for two bad debugging prints. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	passt.1, qrap.1: Update links to qemu out-of-tree patch	Stefano Brivio	2022-04-01	2	-2/+2
\| \| \| \|	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	README: Fix link to contrib/debian	Stefano Brivio	2022-03-30	1	-1/+1
\| \| \| \|	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	hooks: Copy .webp diagram versions too	Stefano Brivio	2022-03-30	1	-0/+1
\| \| \| \|	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	README: Drop red notice about early development phase	Stefano Brivio	2022-03-30	1	-3/+1
\| \| \| \| \| \|	Famous last words: it should be tested enough by now. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	contrib: Add example of Debian package files	Stefano Brivio	2022-03-30	7	-1/+66
\| \| \| \| \| \| \|	...using dh_apparmor to ship and apply AppArmor profiles. Tried on current Debian testing (Bookworm, 12). Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	contrib: Add example spec file for Fedora	Stefano Brivio	2022-03-30	1	-0/+95
\| \| \| \| \| \| \|	...with SELinux package, too. Tested on Fedora 35, but it should work on pretty much any version. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	tap: Re-read from tap in tap_handler_pasta() on buffer full	Stefano Brivio	2022-03-30	1	-2/+9
\| \| \| \| \| \| \| \|	read() will return zero if we pass a zero length, which makes no sense: instead, track explicitly that we exhausted the buffer, flush packets to handlers and redo. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	tap: Allow ioctl() and openat() for tap_ns_tun() re-initialisation	Stefano Brivio	2022-03-30	2	-1/+3
\| \| \| \| \| \| \|	If the tun interface disappears, we'll call tap_ns_tun() after the seccomp profile is applied: add ioctl() and openat() to it. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	tap, tcp, udp, icmp: Cut down on some oversized buffers	Stefano Brivio	2022-03-29	6	-31/+72
\| \| \| \| \| \| \| \| \|	The existing sizes provide no measurable differences in throughput and packet rates at this point. They were probably needed as batched implementations were not complete, but they can be decreased quite a bit now. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	passt, pasta: Add examples of SELinux policy modules	Stefano Brivio	2022-03-29	7	-0/+364
\| \| \| \| \| \|	These should cover any reasonably common use case in distributions. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	passt, pasta: Add examples of AppArmor policies	Stefano Brivio	2022-03-29	2	-0/+125
\| \| \| \| \| \|	These should cover any reasonably common use case in distributions. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	tcp: Fix warning by gcc 5.4 on ppc64le about comparison in CONN_OR_NULL()	Stefano Brivio	2022-03-29	1	-13/+13
\| \| \| \| \| \| \|	...we don't really need two extra bits, but it's easier to organise things differently than to silence this. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	passt: Accurate error reporting for sandbox()	Stefano Brivio	2022-03-29	1	-10/+26
\| \| \| \| \| \| \|	It's actually quite easy to make it fail depending on the environment: report errors accurately here. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	Makefile: Allow implicit test for bugprone-suspicious-string-compare checker	Stefano Brivio	2022-03-29	1	-4/+1
\| \| \| \|	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	treewide: Fix android-cloexec-* clang-tidy warnings, re-enable checks	Stefano Brivio	2022-03-29	8	-31/+30
\| \| \| \|	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	udp: Move flags before ts in struct udp_tap_port, avoid end padding	Stefano Brivio	2022-03-29	1	-3/+3
\| \| \| \|	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
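The effect of moving a small member before a larger, more strictly aligned one can be sketched with two hypothetical layouts. The member names follow the commit subject, but the full definition of struct udp_tap_port is not shown here, so both structs are assumptions:

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical "before" layout: flags comes last, so the compiler pads
 * the end of the struct up to the alignment of ts.
 */
struct port_padded {
	int sock;
	time_t ts;
	unsigned char flags;
};

/* Hypothetical "after" layout: flags fills part of the hole between
 * sock and ts, and no end padding is needed.
 */
struct port_packed {
	int sock;
	unsigned char flags;
	time_t ts;
};
```

With a 4-byte int and an 8-byte time_t, as on typical 64-bit Linux targets, the first layout should take 24 bytes and the second 16.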
*	treewide: Mark constant references as const	Stefano Brivio	2022-03-29	29	-168/+192
\| \| \| \|	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	treewide: Add include guards	Stefano Brivio	2022-03-29	15	-0/+75
\| \| \| \| \| \| \|	...at the moment, just for consistency with packet.h, icmp.h, tcp.h and udp.h. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	treewide: Packet abstraction with mandatory boundary checks	Stefano Brivio	2022-03-29	23	-700/+999
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement a packet abstraction providing boundary and size checks based on packet descriptors: packets stored in a buffer can be queued into a pool (without storage of its own), and data can be retrieved referring to an index in the pool, specifying offset and length. Checks ensure data is not read outside the boundaries of buffer and descriptors, and that packets added to a pool are within the buffer range with valid offset and indices. This implies a wider rework: usage of the "queueing" part of the abstraction mostly affects tap_handler_{passt,pasta}() functions and their callees, while the "fetching" part affects all the guest or tap facing implementations: TCP, UDP, ICMP, ARP, NDP, DHCP and DHCPv6 handlers. Suggested-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
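The idea can be sketched as follows. This is a simplified illustration, not the interface from packet.h: struct pool, pool_add(), pool_get() and POOL_SIZE are hypothetical names, and packets are added by offset into the buffer rather than by pointer.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE	8	/* hypothetical number of descriptors */

struct desc {
	size_t off;		/* offset of packet in buffer */
	size_t len;		/* packet length */
};

struct pool {
	const char *buf;	/* storage is external to the pool */
	size_t buf_size;
	size_t count;
	struct desc pkt[POOL_SIZE];
};

/* Queue a packet: reject it if the pool is full or if the packet is not
 * entirely inside the buffer.
 */
static int pool_add(struct pool *p, size_t off, size_t len)
{
	if (p->count >= POOL_SIZE)
		return -1;
	if (off > p->buf_size || len > p->buf_size - off)
		return -1;

	p->pkt[p->count].off = off;
	p->pkt[p->count].len = len;
	p->count++;

	return 0;
}

/* Fetch 'len' bytes at 'offset' inside packet 'idx', or NULL if the
 * index is invalid or the range exceeds the packet.
 */
static const void *pool_get(const struct pool *p, size_t idx,
			    size_t offset, size_t len)
{
	if (idx >= p->count)
		return NULL;
	if (offset > p->pkt[idx].len || len > p->pkt[idx].len - offset)
		return NULL;

	return p->buf + p->pkt[idx].off + offset;
}
```

Both checks compare lengths against the remaining space (size minus offset) instead of adding offset and length first, so that the sums cannot wrap around.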
*	util: Fix function declaration style of write_pidfile()	Stefano Brivio	2022-03-29	1	-1/+2
\| \| \| \|	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	tcp, tcp_splice: Use less awkward syntax to swap in/out sockets from pools	Stefano Brivio	2022-03-29	2	-12/+10
\| \| \| \|	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	dhcp: Minimum option length implied by RFC 951 is 60 bytes, not 62	Stefano Brivio	2022-03-29	1	-3/+5
\| \| \| \| \| \| \|	In section 3 ("Packet Format"), "vend" is 64 bytes long, minus the magic that's 60 bytes, not 62. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	tcp: Fit struct tcp_conn into a single 64-byte cacheline	Stefano Brivio	2022-03-29	2	-137/+166
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	...by:
	- storing the chained-hash next connection pointer as numeric reference rather than as pointer
	- storing the MSS as 14-bit value, and rounding it
	- using only the effective amount of bits needed to store the hash bucket number
	- explicitly limiting window scaling factors to 4-bit values (maximum factor is 14, from RFC 7323)
	- scaling SO_SNDBUF values, and using an 8-bit representation for the duplicate ACK sequence
	- keeping window values unscaled, as received and sent
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
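Two of these techniques, an index in place of a pointer and 4-bit window scaling factors, might look like the sketch below. The struct is a hypothetical, much reduced stand-in for struct tcp_conn: the field names, CONN_MAX and conn_next() are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CONN_MAX	2048	/* hypothetical table size */

struct conn {
	signed int next_index	:12;	/* next entry in bucket, -1: none */
	unsigned int ws_from_tap:4;	/* scaling factor, 0 to 14 */
	unsigned int ws_to_tap	:4;
	uint16_t wnd_from_tap;		/* unscaled, as received */
	uint16_t wnd_to_tap;		/* unscaled, as sent */
};

static struct conn conns[CONN_MAX];

/* A 12-bit signed index addresses 2048 entries, where a pointer would
 * take 64 bits on a 64-bit target.
 */
static struct conn *conn_next(const struct conn *c)
{
	return c->next_index < 0 ? NULL : &conns[c->next_index];
}

static_assert(sizeof(struct conn) <= 64, "must fit one cacheline");
```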
*	README: Update Interfaces and Availability sections	Stefano Brivio	2022-03-29	1	-4/+9
\| \| \| \|	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	README: Avoid "here" links	Stefano Brivio	2022-03-29	1	-20/+19
\| \| \| \| \| \|	They look a bit lame: rephrase sentences to avoid them. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	test/perf: Work-around for virtio_net hang before long streams from guest	Stefano Brivio	2022-03-29	2	-0/+30
\| \| \| \| \| \| \|	I didn't have time to investigate the root cause for the virtio_net TX hang yet. Add a quick work-around for the time being. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	tcp_splice: Close sockets right away on high number of open files	Stefano Brivio	2022-03-29	5	-7/+27
\| \| \| \| \| \| \| \| \| \| \| \| \|	We can't take for granted that the hard limit for open files is high enough to allow deferring the closing of sockets to a timer. Store the value of RLIMIT_NOFILE we set at start, and use it to understand if we're approaching the limit with pending, spliced TCP connections. If that's the case, close sockets right away as soon as they're not needed, instead of deferring this task to a timer. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
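A rough sketch of the approach, under stated assumptions: nofile_init() and close_early() are hypothetical names, the sketch only reads the limit back rather than setting it, and the "half of the limit" threshold is made up for illustration, as the commit message doesn't state the actual criterion.

```c
#include <assert.h>
#include <sys/resource.h>

static rlim_t nofile_limit;	/* hypothetical: stored once at start */

/* Store the current soft limit for open files. Returns 0 on success. */
static int nofile_init(void)
{
	struct rlimit lim;

	if (getrlimit(RLIMIT_NOFILE, &lim))
		return -1;

	nofile_limit = lim.rlim_cur;
	return 0;
}

/* Decide whether sockets should be closed right away instead of being
 * left to a timer. The threshold is an assumption, see above.
 */
static int close_early(rlim_t limit, rlim_t pending_fds)
{
	return pending_fds > limit / 2;
}
```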
*	tcp: Rework timers to use timerfd instead of periodic bitmap scan	Stefano Brivio	2022-03-29	5	-241/+288
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With a lot of concurrent connections, the bitmap scan approach is not really sustainable. Switch to per-connection timerfd timers, set based on events and on two new flags, ACK_FROM_TAP_DUE and ACK_TO_TAP_DUE. Timers are added to the common epoll list, and implement the existing timeouts. While at it, drop the CONN_ prefix from flag names, otherwise they get quite long, and fix the logic to decide if a connection has a local, possibly unreachable endpoint: we shouldn't go through the rest of tcp_conn_from_tap() if we reset the connection due to a successful bind(2), and we'll get EACCES if the port number is low. Suggested-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	tcp, udp, util: Enforce 24-bit limit on socket numbers	Stefano Brivio	2022-03-29	5	-1/+42
\| \| \| \| \| \| \|	This should never happen, but there are no formal guarantees: ensure socket numbers are below SOCKET_MAX. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
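A check of this kind might look like the following sketch. SOCKET_MAX is named in the commit message, but its definition here, SOCKET_REF_BITS and the helper sock_checked() are assumptions made for illustration.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

#define SOCKET_REF_BITS	24	/* assumed width of stored socket numbers */
#define SOCKET_MAX	(1 << SOCKET_REF_BITS)

/* Hypothetical wrapper: open a socket, and refuse to use it if its
 * number doesn't fit in 24 bits. Returns the socket, or -1 on failure.
 */
static int sock_checked(int domain, int type, int protocol)
{
	int s = socket(domain, type, protocol);

	if (s < 0)
		return -1;

	if (s >= SOCKET_MAX) {
		close(s);
		return -1;
	}

	return s;
}
```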
*	test, seccomp, Makefile: Switch to valgrind runs for passt functional tests	Stefano Brivio	2022-03-29	8	-14/+99
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pass to seccomp.sh a list of additional syscalls valgrind needs as EXTRA_SYSCALLS in a new 'valgrind' make target, and add corresponding support in seccomp.sh itself. In test setup functions, start passt with valgrind, but not for performance tests. Add tests checking that valgrind exits without errors after all the other tests in the group are done. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	test: Add asciinema(1) as requirement for CI in README	Stefano Brivio	2022-03-28	1	-1/+1
\| \| \| \|	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	Makefile: Enable a few hardening flags	Stefano Brivio	2022-03-28	1	-2/+8
\| \| \| \| \| \| \|	They don't have a measurable performance impact and make things a bit safer. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	udp: Use flags for local, loopback, and configured unicast binds	Stefano Brivio	2022-03-28	1	-25/+23
\| \| \| \| \| \| \| \| \| \| \|	There's no value in keeping a separate timestamp for activity and for aging of local binds, given that they have the same timeout. Reduce that to a single timestamp, with a flag indicating the local bind. Also use flags instead of separate int fields for loopback and configured unicast address usage as source address. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>