passt - Plug A Simple Socket Transport

	Commit message (Collapse)	Author	Age	Files	Lines
*	flow: Don't crash if guest attempts to connect to port 02024_08_14.61c0b0d	David Gibson	2024-08-14	1	-8/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Using a zero port on TCP or UDP is dubious, and we can't really deal with forwarding such a flow within the constraints of the socket API. Hence we ASSERT()ed that we had non-zero ports in flow_hash(). The intention was to make sure that the protocol code sanitizes such ports before completing a flow entry. Unfortunately, flow_hash() is also called on new packets to see if they have an existing flow, so the unsanitized guest packet can crash passt with the assert. Correct this by moving the assert from flow_hash() to flow_sidx_hash() which is only used on entries already in the table, not on unsanitized data. Reported-by: Matt Hamilton <matt@thmail.io> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	conf: Don't ignore -t and -u options after -D	David Gibson	2024-08-14	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	f6d5a5239264 moved handling of -D into a later loop. However as a side effect it moved this from a switch block to an if block. I left a couple of 'break' statements that don't make sense in the new context. They should be 'continue' so that we go onto the next option, rather than leaving the loop entirely. Fixes: f6d5a5239264 ("conf: Delay handling -D option until after addresses are configured") Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	ndp.c: Turn NDP responder into more declarative implementation	AbdAlRahman Gad	2024-08-13	3	-79/+246
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- Add structs for NA, RA, NS, MTU, prefix info, option header, link-layer address, RDNSS, DNSSL and link-layer for RA message. - Turn NA message from purely imperative, going byte by byte, to declarative by filling it's struct. - Turn part of RA message into declarative. - Move packet_add() to be before the call of ndp() in tap6_handler() if the protocol of the packet is ICMPv6. - Add a pool of packets as an additional parameter to ndp(). - Check the size of NS packet with packet_get() before sending an NA packet. - Add documentation for the structs. - Add an enum for NDP option types. Link: https://bugs.passt.top/show_bug.cgi?id=21 Signed-off-by: AbdAlRahman Gad <abdobngad@gmail.com> [sbrivio: Minor coding style fixes] Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	conf: Delay handling -D option until after addresses are configured	David Gibson	2024-08-12	1	-39/+45
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	add_dns[46]() rely on the gateway address and c->no_map_gw being already initialised, in order to properly handle DNS servers which need NAT to be accessed from the guest. Usually these are called from get_dns() which is well after the addresses are configured, so that's fine. However, they can also be called earlier if an explicit -D command line option is given. In this case no_map_gw and/or c->ip[46].gw may not get be initialised properly, leading to this doing the wrong thing. Luckily we already have a second pass of option parsing for things which need addresses to already be configured. Move handling of -D to there. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	Correct inaccurate comments on ip[46]_ctx::addr	David Gibson	2024-08-12	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \|	These fields are described as being an address for an external, routable interface. That's not necessarily the case when using -a. But, more importantly, saying where the value comes from is not as useful as what it's used for. The real purpose of this field is as the address which we assign to the guest via DHCP or --config-net. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	log: Don't prefix message with timestamp on --debug if it's a continuation	Stefano Brivio	2024-08-12	3	-18/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we prefix the second part of messages printed through logmsg_perror() by the timestamp, on debug, we'll have two timestamps and a weird separator in the result, such as this beauty: 0.0013: Failed to clone process with detached namespaces0.0013: : Operation not permitted Add a parameter to logmsg() and vlogmsg() which indicates a message continuation. If that's set, don't print the timestamp in vlogmsg(). Link: https://github.com/moby/moby/issues/48257#issuecomment-2282875092 Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
*	conf: Stop parsing options at first non-option argument	Stefano Brivio	2024-08-08	2	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Given that pasta supports specifying a command to be executed on the command line, even without the usual -- separator as long as there's no ambiguity, we shouldn't eat up options that are not meant for us. Paul reports, for instance, that with: pasta --config-net ip -6 route -6 is taken by pasta to mean --ipv6-only, and we execute 'ip route'. That's because getopt_long(), by default, shuffles the argument list to shift non-option arguments at the end. Avoid that by adding '+' at the beginning of 'optstring'. Reported-by: Paul Holzinger <pholzing@redhat.com> Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
*	passt, util: Close any open file that the parent might have leaked	Stefano Brivio	2024-08-08	6	-7/+59
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a parent accidentally or due to implementation reasons leaks any open file, we don't want to have access to them, except for the file passed via --fd, if any. This is the case for Podman when Podman's parent leaks files into Podman: it's not practical for Podman to close unrelated files before starting pasta, as reported by Paul. Use close_range(2) to close all open files except for standard streams and the one from --fd. Given that parts of conf() depend on other files to be already opened, such as the epoll file descriptor, we can't easily defer this to a more convenient point, where --fd was already parsed. Introduce a minimal, duplicate version of --fd parsing to keep this simple. As we need to check that the passed --fd option doesn't exceed INT_MAX, because we'll parse it with strtol() but file descriptor indices are signed ints (regardless of the arguments close_range() take), extend the existing check in the actual --fd parsing in conf(), also rejecting file descriptors numbers that match standard streams, while at it. Suggested-by: Paul Holzinger <pholzing@redhat.com> Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Paul Holzinger <pholzing@redhat.com>
*	nstool: Propagate SIGTERM to processes executed in the namespace	David Gibson	2024-08-07	1	-2/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Particularly in shell it's sometimes natural to save the pid from a process run and later kill it. If doing this with nstool exec, however, it will kill nstool itself, not the program it is running, which isn't usually what you want or expect. Address this by having nstool propagate SIGTERM to its child process. It may make sense to propagate some other signals, but some introduce extra complications, so we'll worry about them when and if it seems useful. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	nstool: Fix some trivial typos	David Gibson	2024-08-07	1	-2/+2
\| \| \| \| \|	Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	log: Avoid duplicate calls to logtime()	David Gibson	2024-08-07	1	-9/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We use logtime() to get a timestamp for the log in two places: - in vlogmsg(), which is used only for debug_print messages - in logfile_write() which is only used messages to the log file These cases are mutually exclusive, so we don't ever print the same message with different timestamps, but that's not particularly obvious to see. It's possible future tweaks to logging logic could mean we log to two different places with different timestamps, which would be confusing. Refactor to have a single logtime() call in vlogmsg() and use it for all the places we need it. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	log: Handle errors from clock_gettime()	David Gibson	2024-08-07	1	-15/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	clock_gettime() can, theoretically, fail, although it probably won't until 2038 on old 32-bit systems. Still, it's possible someone could run with a wildly out of sync clock, or new errors could be added, or it could fail due to a bug in libc or the kernel. We don't handle this well. In the debug_print case in vlogmsg we'll just ignore the failure, and print a timestamp based on uninitialised garbage. In logfile_write() we exit early and won't log anything at all, which seems like a good way to make an already weird situation undebuggable. Add some helpers to instead handle this by using "<error>" in place of a timestamp if something goes wrong with clock_gettime(). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	log: Correct formatting of timestamps	David Gibson	2024-08-07	1	-12/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	logtime_fmt_and_arg() is a rather odd macro, producing both a format string and an argument, which can only be used in quite specific printf() like formulations. It also has a significant bug: it tries to display 4 digits after the decimal point (so down to tenths of milliseconds) using %04i. But the field width in printf() is always a minimum not maximum field width, so this will not truncate the given value, but will redisplay the entire tenth-of-milliseconds difference again after the decimal point. Replace the macro with an snprintf() like function which will format the timestamp, and use an explicit % to correct the display. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> [sbrivio: Make logtime_fmt() static] Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	util: Some corrections for timespec_diff_us	David Gibson	2024-08-07	2	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The comment for timespec_diff_us() claims it will wrap after 2^64µs. This is incorrect for two reasons: * It returns a long long, which is probably 64-bits, but might not be * It returns a signed value, so even if it is 64 bits it will wrap after 2^63µs Correct the comment and use an explicitly 64-bit type to avoid that imprecision. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	conf, pasta: Make -g and -a skip route/addresses copy for matching IP ↵	Stefano Brivio	2024-08-07	4	-22/+36
	version only Paul reports that setting IPv4 address and gateway manually, using --address and --gateway, causes pasta to fail inserting IPv6 routes in a setup where multiple, inter-dependent IPv6 routes are present on the host. That's because, currently, any -g option implies --no-copy-routes altogether, and any -a implies --no-copy-addrs. Limit this implication to the matching IP version, instead, by having two copies of no_copy_routes and no_copy_addrs in the context structure, separately for IPv4 and IPv6. While at it, change them to 'bool': we had them as 'int' because getopt_long() used to set them directly, but it hasn't been the case for a while already. Reported-by: Paul Holzinger <pholzing@redhat.com> Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>