passt - Plug A Simple Socket Transport

	Commit message (Collapse)	Author	Age	Files	Lines
*	conf, netlink: Don't require a default route to start	Stefano Brivio	2024-03-18	1	-16/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There might be isolated testing environments where default routes and global connectivity are not needed, a single interface has all non-loopback addresses and routes, and still passt and pasta are expected to work. In this case, it's pretty obvious what our upstream interface should be, so go ahead and select the only interface with at least one route, disabling DHCP and implying --no-map-gw as the documentation already states. If there are multiple interfaces with routes, though, refuse to start, because at that point it's really not clear what we should do. Reported-by: Martin Pitt <mpitt@redhat.com> Link: https://github.com/containers/podman/issues/21896 Signed-off-by: Stefano brivio <sbrivio@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
*	passt.1: --{no-,}dhcp-dns and --{no-,}dhcp-search don't take addresses	Stefano Brivio	2024-03-14	1	-4/+4
\| \| \| \| \| \| \| \|	...they are simple enable/disable options. Fixes: 89678c515755 ("conf, udp: Introduce basic DNS forwarding") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
*	pasta: Don't try to watch namespaces in procfs with inotify, use timer instead2024_02_19.ff22a78	Stefano Brivio	2024-02-19	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We watch network namespace entries to detect when we should quit (unless --no-netns-quit is passed), and these might stored in a tmpfs typically mounted at /run/user/UID or /var/run/user/UID, or found in procfs at /proc/PID/ns/. Currently, we try to use inotify for any possible location of those entries, but inotify, of course, doesn't work on pseudo-filesystems (see inotify(7)). The man page reflects this: the description of --no-netns-quit implies that we won't quit anyway if the namespace is not "bound to the filesystem". Well, we won't quit, but, since commit 9e0dbc894813 ("More deterministic detection of whether argument is a PID, PATH or NAME"), we try. And, indeed, this is harmless, as the caveat from that commit message states. Now, it turns out that Buildah, a tool to create container images, sharing its codebase with Podman, passes a procfs entry to pasta, and expects pasta to exit once the network namespace is not needed anymore, that is, once the original container process, also spawned by Buildah, terminates. Get this to work by using the timer fallback mechanism if the namespace name is passed as a path belonging to a pseudo-filesystem. This is expected to be procfs, but I covered sysfs and devpts pseudo-filesystems as well, because nothing actually prevents creating this kind of directory structure and links there. Note that fstatfs(), according to some versions of man pages, was apparently "deprecated" by the LSB. My reasoning for using it is essentially this: https://lore.kernel.org/linux-man/f54kudgblgk643u32tb6at4cd3kkzha6hslahv24szs4raroaz@ogivjbfdaqtb/t/#u ...that is, there was no such thing as an LSB deprecation, and anyway there's no other way to get the filesystem type. Also note that, while it might sound more obvious to detect the filesystem type using fstatfs() on the file descriptor itself (c->pasta_netns_fd), the reported filesystem type for it is nsfs, no matter what path was given to pasta. If we use the parent directory, we'll typically have either tmpfs or procfs reported. If the target namespace is given as a PID, or as a PID-based procfs entry, we don't risk races if this PID is recycled: our handle on /proc/PID/ns will always refer to the original namespace associated with that PID, and we don't re-open this entry from procfs to check it. There's, however, a remaining race possibility if the parent process is not the one associated to the network namespace we operate on: in that case, the parent might pass a procfs entry associated to a PID that was recycled by the time we parse it. This can't happen if the namespace PID matches the one of the parent, because we detach from the controlling terminal after parsing the namespace reference. To avoid this type of race, if desired, we could add the option for the parent to pass a PID file descriptor, that the parent obtained via pidfd_open(). This is beyond the scope of this change. Update the man page to reflect that, even if the target network namespace is passed as a procfs path or a PID, we'll now quit when the procfs entry is gone. Reported-by: Paul Holzinger <pholzing@redhat.com> Link: https://github.com/containers/podman/pull/21563#issuecomment-1948200214 Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	conf, passt.1: Exit if we can't bind a forwarded port, except for -[tu] all	Stefano Brivio	2024-02-16	1	-3/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	...or similar, that is, if only excluded ranges are given (implying we'll forward any other available port). In that case, we'll usually forward large sets of ports, and it might be inconvenient for the user to skip excluding single ports that are already taken. The existing behaviour, that is, exiting only if we fail to bind all the ports for one given forwarding option, turns out to be problematic for several aspects raised by Paul: - Podman merges ranges anyway, so we might fail to bind all the ports from a specific range given by the user, but we'll not fail anyway because Podman merges it with another one where we succeed to bind at least one port. At the same time, there should be no semantic difference between multiple ranges given by a single option and multiple ranges given as multiple options: it's unexpected and not documented - the user might actually rely on a given port to be forwarded to a given container or a virtual machine, and if connections are forwarded to an unrelated process, this might raise security concerns - given that we can try and fail to bind multiple ports before exiting (in case we can't bind any), we don't have a specific error code we can return to the user, so we don't give the user helpful indication as to why we couldn't bind ports. Exit as soon as we fail to create or bind a socket for a given forwarded port, and report the actual error. Keep the current behaviour, however, in case the user wants to forward all the (available) ports for a given protocol, or all the ports with excluded ranges only. There, it's more reasonable that the user is expecting partial failures, and it's probably convenient that we continue with the ports we could forward. Update the manual page to reflect the new behaviour, and the old behaviour too in the cases where we keep it. Suggested-by: Paul Holzinger <pholzing@redhat.com> Link: https://github.com/containers/podman/pull/21563#issuecomment-1937024642 Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Tested-by: Paul Holzinger <pholzing@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
*	netlink: Add support to fetch default gateway from multipath routes	Stefano Brivio	2024-02-09	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the default route for a given IP version is a multipath one, instead of refusing to start because there's no RTA_GATEWAY attribute in the set returned by the kernel, we can just pick one of the paths. To make this somewhat less arbitrary, pick the path with the highest weight, if weights differ. Reported-by: Ed Santiago <santiago@redhat.com> Link: https://github.com/containers/podman/issues/20927 Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
*	udp,pasta: Periodically scan for ports to automatically forward	David Gibson	2023-11-19	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	pasta supports automatic port forwarding, where we look for listening sockets in /proc/net (in both namespace and outside) and establish port forwarding to match. For TCP we do this scan both at initial startup, then periodically thereafter. For UDP however, we currently only scan at start. So unlike TCP we won't update forwarding to handle services that start after pasta has begun. There's no particular reason for that, other than that we didn't implement it. So, remove that difference, by scanning for new UDP forwards periodically too. The logic is basically identical to that for TCP, but it needs some changes to handle the mildly different data structures in the UDP case. Link: https://bugs.passt.top/show_bug.cgi?id=45 Link: https://github.com/rootless-containers/rootlesskit/issues/383 Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	conf, pasta: With --config-net, copy all addresses by default	Stefano Brivio	2023-05-23	1	-0/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use the newly-introduced NL_DUP mode for nl_addr() to copy all the addresses associated to the template interface in the outer namespace, unless --no-copy-addrs (also implied by -a) is given. This option is introduced as deprecated right away: it's not expected to be of any use, but it's helpful to keep it around for a while to debug any suspected issue with this change. This is done mostly for consistency with routes. It might partially cover the issue at: https://bugs.passt.top/show_bug.cgi?id=47 Support multiple addresses per address family for some use cases, but not the originally intended one: we'll still use a single outbound address (unless the routing table specifies different preferred source addresses depending on the destination), regardless of the address used in the target namespace. Link: https://bugs.passt.top/show_bug.cgi?id=47 Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
*	conf: Don't exit if sourced default route has no gateway	Stefano Brivio	2023-05-23	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we use a template interface without a gateway on the default route, we can still offer almost complete functionality, except that, of course, we can't map the gateway address to the outer namespace or host, and that we have no obvious server address or identifier for use in DHCP's siaddr and option 54 (Server identifier, mandatory). Continue, if we have a default route but no default gateway, and imply --no-map-gw and --no-dhcp in that case. NDP responder and DHCPv6 should be able to work as usual because we require a link-local address to be present, and we'll fall back to that. Together with the previous commits implementing an actual copy of routes from the outer namespace, this should finally fix the operation of 'pasta --config-net' for cases where we have a default route on the host, but no default gateway, as it's the case for tap-style routes, including typical Wireguard endpoints. Reported-by: me@yawnt.com Link: https://bugs.passt.top/show_bug.cgi?id=49 Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
*	conf, pasta: With --config-net, copy all routes by default	Stefano Brivio	2023-05-23	1	-0/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use the newly-introduced NL_DUP mode for nl_route() to copy all the routes associated to the template interface in the outer namespace, unless --no-copy-routes (also implied by -g) is given. This option is introduced as deprecated right away: it's not expected to be of any use, but it's helpful to keep it around for a while to debug any suspected issue with this change. Otherwise, we can't use default gateways which are not, address-wise, on the same subnet as the container, as reported by Callum. Reported-by: Callum Parsey <callum@neoninteger.au> Link: https://github.com/containers/podman/issues/18539 Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
*	correct -6 option in manpage2023_05_09.96f8d55	lemmi	2023-05-09	1	-1/+1
\| \| \| \| \| \|	Signed-off-by: lemmi <lemmi@nerd2nerd.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	passt: Relicense to GPL 2.0, or any later version	Stefano Brivio	2023-04-06	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In practical terms, passt doesn't benefit from the additional protection offered by the AGPL over the GPL, because it's not suitable to be executed over a computer network. Further, restricting the distribution under the version 3 of the GPL wouldn't provide any practical advantage either, as long as the passt codebase is concerned, and might cause unnecessary compatibility dilemmas. Change licensing terms to the GNU General Public License Version 2, or any later version, with written permission from all current and past contributors, namely: myself, David Gibson, Laine Stump, Andrea Bolognani, Paul Holzinger, Richard W.M. Jones, Chris Kuhn, Florian Weimer, Giuseppe Scrivano, Stefan Hajnoczi, and Vasiliy Ulyanov. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	conf: Allow binding to ports on an interface without a specific address	Stefano Brivio	2023-03-29	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Somebody might want to bind listening sockets to a specific interface, but not a specific address, and there isn't really a reason to prevent that. For example: -t %eth0/2022 Alternatively, we support options such as -t 0.0.0.0%eth0/2022 and -t ::%eth0/2022, but not together, for the same port. Enable this kind of syntax and add examples to the man page. Reported-by: Paul Holzinger <pholzing@redhat.com> Link: https://github.com/containers/podman/issues/14425#issuecomment-1485192195 Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	passt.1: Fix description of --mtu option	Stefano Brivio	2023-03-17	1	-2/+4
\| \| \| \| \| \| \| \|	By default, 65520 bytes are advertised, and zero disables DHCP and NDP options. Fixes: ec2b58ea4dc4 ("conf, dhcp, ndp: Fix message about default MTU, make NDP consistent") Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	conf, icmp, tcp, udp: Add options to bind to outbound address and interface	Stefano Brivio	2023-03-09	1	-2/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I didn't notice earlier: libslirp (and slirp4netns) supports binding outbound sockets to specific IPv4 and IPv6 addresses, to force the source addresse selection. If we want to claim feature parity, we should implement that as well. Further, Podman supports specifying outbound interfaces as well, but this is simply done by resolving the primary address for an interface when the network back-end is started. However, since kernel version 5.7, commit c427bfec18f2 ("net: core: enable SO_BINDTODEVICE for non-root users"), we can actually bind to a specific interface name, which doesn't need to be validated in advance. Implement -o / --outbound ADDR to bind to IPv4 and IPv6 addresses, and --outbound-if4 and --outbound-if6 to bind IPv4 and IPv6 sockets to given interfaces. Given that it probably makes little sense to select addresses and routes from interfaces different than the ones given for outbound sockets, also assign those as "template" interfaces, by default, unless explicitly overridden by '-i'. For ICMP and UDP, we call sock_l4() to open outbound sockets, as we already needed to bind to given ports or echo identifiers, and we can bind() a socket only once: there, pass address (if any) and interface (if any) for the existing bind() and setsockopt() calls. For TCP, in general, we wouldn't otherwise bind sockets. Add a specific helper to do that. For UDP outbound sockets, we need to know if the final destination of the socket is a loopback address, before we decide whether it makes sense to bind the socket at all: move the block mangling the address destination before the creation of the socket in the IPv4 path. This was already the case for the IPv6 path. Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
*	passt.1: Fix typo, improve wording in examples of port forwarding specifiers	Stefano Brivio	2023-02-16	1	-12/+17
\| \| \| \| \| \| \| \| \| \| \| \|	Based on a patch from Laine, and reports from Laine and Yalan: fix the "22-80:32-90" example, and improve wording for the other ones: instead of using "to" to denote the end of a range, use "between ... and", so that it's clear we're not referring to target ports. Reported-by: Laine Stump <laine@redhat.com> Reported-by: Yalan Zhang <yalzhang@redhat.com> Fixes: da20f57f19dc ("passt, qrap: Add man pages") Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	passt, tap: Add --fd option	Richard W.M. Jones	2022-11-25	1	-0/+10
\| \| \| \| \| \| \| \| \|	This passes a fully connected stream socket to passt. Signed-off-by: Richard W.M. Jones <rjones@redhat.com> [sbrivio: reuse fd_tap instead of adding a new descriptor, imply --one-off on --fd, add to optstring and usage()] Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	passt, qrap, README: Update notes and documentation for AF_UNIX support in qemu	Stefano Brivio	2022-11-04	1	-6/+2
\| \| \| \| \| \| \| \|	We can't get rid of qrap quite yet, but at least we should start telling users it's not going to be needed anymore starting from qemu 7.2. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	passt.1: Fix typo: "addressses", reported by Lintian	Stefano Brivio	2022-10-27	1	-1/+1
\| \| \| \|	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	conf, passt.1: Don't imply --foreground with --debug	Stefano Brivio	2022-10-27	1	-3/+2
\| \| \| \| \| \| \| \| \| \|	Having -f implied by -d (and --trace) usually saves some typing, but debug mode in background (with a log file) is quite useful if pasta is started by Podman, and is probably going to be handy for passt with libvirt later, too. Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
*	passt.1: Add David to AUTHORS2022_10_15.b3f3591	Stefano Brivio	2022-10-15	1	-2/+2
\| \| \| \| \| \| \|	I just realised while reading the man page. Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
*	conf: Bind inbound ports with CAP_NET_BIND_SERVICE before isolate_user()	Stefano Brivio	2022-10-15	1	-7/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Even if CAP_NET_BIND_SERVICE is granted, we'll lose the capability in the target user namespace as we isolate the process, which means we're unable to bind to low ports at that point. Bind inbound ports, and only those, before isolate_user(). Keep the handling of outbound ports (for pasta mode only) after the setup of the namespace, because that's where we'll bind them. To this end, initialise the netlink socket for the init namespace before isolate_user() as well, as we actually need to know the addresses of the upstream interface before binding ports, in case they're not explicitly passed by the user. As we now call nl_sock_init() twice, checking its return code from conf() twice looks a bit heavy: make it exit(), instead, as we can't do much if we don't have netlink sockets. While at it: - move the v4_only && v6_only options check just after the first option processing loop, as this is more strictly related to option parsing proper - update the man page, explaining that CAP_NET_BIND_SERVICE is not the preferred way to bind ports, because passt and pasta can be abused to allow other processes to make effective usage of it. Add a note about the recommended sysctl instead - simplify nl_sock_init_do() now that it's called once for each case Reported-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	conf, tcp, udp: Allow specification of interface to bind to	Stefano Brivio	2022-10-15	1	-2/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since kernel version 5.7, commit c427bfec18f2 ("net: core: enable SO_BINDTODEVICE for non-root users"), we can bind sockets to interfaces, if they haven't been bound yet (as in bind()). Introduce an optional interface specification for forwarded ports, prefixed by %, that can be passed together with an address. Reported use case: running local services that use ports we want to have externally forwarded: https://github.com/containers/podman/issues/14425 Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
*	conf, tap: Add option to quit once the client closes the connection	Stefano Brivio	2022-10-15	1	-0/+5
\| \| \| \| \| \| \| \|	This is practical to avoid explicit lifecycle management in users, e.g. libvirtd, and is trivial to implement. Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
*	conf, log, Makefile: Add versioning information	Stefano Brivio	2022-10-15	1	-0/+4
\| \| \| \| \| \| \|	Add a --version option displaying that, and also include this information in the log files. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	log, conf: Add support for logging to file	Stefano Brivio	2022-10-14	1	-2/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In some environments, such as KubeVirt pods, we might not have a system logger available. We could choose to run in foreground, but this takes away the convenient synchronisation mechanism derived from forking to background when interfaces are ready. Add optional logging to file with -l/--log-file and --log-size. Unfortunately, this means we need to duplicate features that are more appropriately implemented by a system logger, such as rotation. Keep that reasonably simple, by using fallocate() with range collapsing where supported (Linux kernel >= 3.15, extent-based ext4 and XFS) and falling back to an unsophisticated block-by-block moving of entries toward the beginning of the file once we reach the (mandatory) size limit. While at it, clarify the role of LOG_EMERG in passt.c. Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
*	Allow --userns when pasta spawns a command	David Gibson	2022-09-13	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \|	Currently --userns is only allowed when pasta is attaching to an existing netns or PID, and is prohibited when creating a new netns by spawning a command or shell. With the new handling of userns, this check isn't neccessary. I'm not sure if there's any use case for --userns with a spawned command, but it's strictly more flexible and requires zero extra code, so we might as well. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
*	Split checking for root from dropping root privilege	David Gibson	2022-09-13	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	check_root() both checks to see if we are root (in the init namespace), and if we are drops to an unprivileged user. To make future cleanups simpler, split the checking for root (now in check_root()) from the actual dropping of privilege (now in drop_root()). Note that this does slightly alter semantics. Previously we would only setuid() if we were originally root (in the init namespace). Now we will always setuid() and setgid(), though it won't actually change anything if we weren't privileged to begin with. This also means that we will now always attempt to switch to the user specified with --runas, even if we aren't (init namespace) root to begin with. Obviously this will fail with an error if we weren't privileged to start with. --help and the man page are updated accordingly. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
*	Allow pasta to take a command to execute	David Gibson	2022-08-30	1	-5/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	When not given an existing PID or network namspace to attach to, pasta spawns a shell. Most commands which can spawn a shell in an altered environment can also run other commands in that same environment, which can be useful in automation. Allow pasta to do the same thing; it can be given an arbitrary command to run in the network and user namespace which pasta creates. If neither a command nor an existing PID or netns to attach to is given, continue to spawn a default shell, as before. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
*	Use explicit --netns option rather than multiplexing with PID	David Gibson	2022-08-30	1	-3/+13
\| \| \| \| \| \| \| \| \| \| \| \|	When attaching to an existing namespace, pasta can take a PID or the name or path of a network namespace as a non-option parameter. We disambiguate based on what the parameter looks like. Make this more explicit by using a --netns option for explicitly giving the path or name, and treating a non-option argument always as a PID. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> [sbrivio: Fix typo in man page] Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	Remove --nsrun-dir option	David Gibson	2022-08-30	1	-6/+0
\| \| \| \| \| \| \| \| \| \| \| \| \|	pasta can identify a netns as a "name", which is to say a path relative to (usually) /run/netns, which is the place that ip(8) creates persistent network namespaces. Alternatively a full path to a netns can be given. The --nsrun-dir option allows the user to change the standard path where netns names are resolved. However, there's no real point to this, if the user wants to override the location of the netns, they can just as easily use the full path to specify the netns. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
*	Correct manpage for --userns	David Gibson	2022-08-30	1	-3/+2
\| \| \| \| \| \| \| \| \|	The man page states that the --userns option can be given either as a path or as a name relative to --nsrun-dir. This is not correct: as the name suggests --nsrun-dir is (correctly) used only for netns resolution, not userns resolution. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
*	conf: Use "-D none" and "-S none" instead of missing empty option arguments	David Gibson	2022-08-30	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Both the -D (--dns) and -S (--search) options take an optional argument. If the argument is omitted the option is disabled entirely. However, handling the optional argument requires some ugly special case handling if it's the last option on the command line, and has potential ambiguity with non-option arguments used with pasta. It can also make it more confusing to read command lines. Simplify the logic here by replacing the non-argument versions with an explicit "-D none" or "-S none". Signed-off-by: David Gibson <david@gibson.dropbear.id.au> [sbrivio: Reworked logic to exclude redundant/conflicting options] Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	conf: Make the argument to --pcap option mandatory	David Gibson	2022-08-30	1	-10/+0
\| \| \| \| \| \| \| \| \| \| \|	The --pcap or -p option can be used with or without an argument. If given, the argument gives the name of the file to save a packet trace to. If omitted, we generate a default name in /tmp. Generating the default name isn't particularly useful though, since making a suitable name can easily be done by the caller. Remove this feature. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
*	passt.1: Default host interfaces are now selected based on IP version	Stefano Brivio	2022-07-30	1	-6/+7
\| \| \| \| \| \| \|	Reflect the changes from commit 4b2e018d70f3 ("Allow different external interfaces for IPv4 and IPv6 connectivity") into the manual. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	conf: Allow to specify ranges and ports excluded from given ranges	Stefano Brivio	2022-07-14	1	-2/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is useful in environments where we want to forward a large number of ports, or all non-ephemeral ones, and some other service running on the host needs a few selected ports. I'm using ~ as prefix for the specification of excluded ranges and ports to avoid the need for explicit command line quoting. Ranges and ports can be excluded from given ranges by adding them in the comma-separated list, prefixed by ~. Some quick examples: -t 5000-6000,~5555: forward ports 5000 to 6000, but not 5555 -t ~20000-20010: forward all non-ephemeral, allowed ports, except for ports 20000 to 20010 ...more details in usage message and man page. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	Invoke specific qemu-system-* binaries	David Gibson	2022-07-14	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A lot of tests and examples invoke qemu with the command "kvm". However, as far as I can tell, "kvm" being aliased to the appropriate qemu system binary is Debian specific. The binary names from qemu upstream - qemu-system-$ARCH - also aren't universal, but they are more common (they should be good for both Debian and Fedora at least). In order to still get KVM acceleration when available, we use the option "-M accel=kvm:tcg" to tell qemu to try using either KVM or TCG in that order A number of the places we invoked "kvm" are expecting specifically an x86 guest, and so it's also safer to explicitly invoke qemu-system-x86_64. Some others appear to be independent of the target arch (just wanting the same arch as the host to allow KVM acceleration). Although I suspect there may be more subtle x86 specific options in the qemu command lines, attempt to preserve arch independence by using $(uname -m). Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
*	Use dhclient instead of udhcpc	David Gibson	2022-06-15	1	-10/+2
\| \| \| \| \| \| \| \| \| \| \| \|	For some reason, the passt/pasta tests and examples use dhclient for DHCPv6, but in most cases use udhcpc for DHCPv4. Change it to use dhclient for both DHCPv4 and DHCPv6. This means one less tool we need for testing, plus dhclient is easily available on Fedora whereas udhcpc is not. Note that the passt tests still rely on udhcpc indirectly because mbuto wants to put it into the guest images it generates. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
*	Tweak dhclient arguments for readability	David Gibson	2022-06-15	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	A number of tests and examples use dhclient in both IPv4 and IPv6 modes. We use "dhclient -6" for IPv6, but usually just "dhclient" for IPv4. Add an explicit "-4" argument to make it more clear and explicit. In addition, when dhclient is run from within pasta it usually won't be "real" root, and so will not have access to write the default global pid file. This results in a mostly harmless but irritating error: Can't create /var/run/dhclient.pid: Permission denied We can avoid that by using the --no-pid flag to dhclient. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
*	conf: Add --runas option, changing to given UID and GID if started as root	Stefano Brivio	2022-05-19	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	On some systems, user and group "nobody" might not be available. The new --runas option allows to override the default "nobody" choice if started as root. Now that we allow this, drop the initgroups() call that was used to add any additional groups for the given user, as that might now grant unnecessarily broad permissions. For instance, several distributions have a "kvm" group to allow regular user access to /dev/kvm, and we don't need that in passt or pasta. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	conf, tcp, udp: Allow address specification for forwarded ports	Stefano Brivio	2022-05-01	1	-2/+10
\| \| \| \| \| \| \| \| \| \| \| \| \|	This feature is available in slirp4netns but was missing in passt and pasta. Given that we don't do dynamic memory allocation, we need to bind sockets while parsing port configuration. This means we need to process all other options first, as they might affect addressing and IP version support. It also implies a minor rework of how TCP and UDP implementations bind sockets. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	passt.1, qrap.1: Update links to qemu out-of-tree patch	Stefano Brivio	2022-04-01	1	-1/+1
\| \| \| \|	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	conf, util, tap: Implement --trace option for extra verbose logging	Stefano Brivio	2022-03-25	1	-0/+5
\| \| \| \| \| \| \| \|	--debug can be a bit too noisy, especially as single packets or socket messages are logged: implement a new option, --trace, implying --debug, that enables all debug messages. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	passt.1: Drop duplicate --dns section	Stefano Brivio	2022-02-23	1	-11/+1
\| \| \| \|	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	conf, ndp: Disable router advertisements on --config-net	Stefano Brivio	2022-02-23	1	-1/+3
\| \| \| \| \| \| \| \| \|	If we statically configure a default route, and also advertise it for SLAAC, the kernel will try moments later to add the same route: ICMPv6: RA: ndisc_router_discovery failed to add default route Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	man page: Update REPORTING BUGS section	Stefano Brivio	2022-02-21	1	-4/+5
\| \| \| \|	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	pasta: By default, quit if filesystem-bound net namespace goes away	Stefano Brivio	2022-02-21	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \|	This should be convenient for users managing filesystem-bound network namespaces: monitor the base directory of the namespace and exit if the namespace given as PATH or NAME target is deleted. We can't add an inotify watch directly on the namespace directory, that won't work with nsfs. Add an option to disable this behaviour, --no-netns-quit. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	conf, udp: Introduce basic DNS forwarding	Stefano Brivio	2022-02-21	1	-10/+53
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For compatibility with libslirp/slirp4netns users: introduce a mechanism to map, in the UDP routines, an address facing guest or namespace to the first IPv4 or IPv6 address resulting from configuration as resolver. This can be enabled with the new --dns-forward option. This implies that sourcing and using DNS addresses and search lists, passed via command line or read from /etc/resolv.conf, is not bound anymore to DHCP/DHCPv6/NDP usage: for example, pasta users might just want to use addresses from /etc/resolv.conf as mapping target, while not passing DNS options via DHCP. Reflect this in all the involved code paths by differentiating DHCP/DHCPv6/NDP usage from DNS configuration per se, and in the new options --dhcp-dns, --dhcp-search for pasta, and --no-dhcp-dns, --no-dhcp-search for passt. This should be the last bit to enable substantial compatibility between slirp4netns.sh and slirp4netns(1): pass the --dns-forward option from the script too. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	passt, pasta: Namespace-based sandboxing, defer seccomp policy application	Stefano Brivio	2022-02-21	1	-7/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	To reach (at least) a conceptually equivalent security level as implemented by --enable-sandbox in slirp4netns, we need to create a new mount namespace and pivot_root() into a new (empty) mountpoint, so that passt and pasta can't access any filesystem resource after initialisation. While at it, also detach IPC, PID (only for passt, to prevent vulnerabilities based on the knowledge of a target PID), and UTS namespaces. With this approach, if we apply the seccomp filters right after the configuration step, the number of allowed syscalls grows further. To prevent this, defer the application of seccomp policies after the initialisation phase, before the main loop, that's where we expect bad things to happen, potentially. This way, we get back to 22 allowed syscalls for passt and 34 for pasta, on x86_64. While at it, move #syscalls notes to specific code paths wherever it conceptually makes sense. We have to open all the file handles we'll ever need before sandboxing: - the packet capture file can only be opened once, drop instance numbers from the default path and use the (pre-sandbox) PID instead - /proc/net/tcp{,v6} and /proc/net/udp{,v6}, for automatic detection of bound ports in pasta mode, are now opened only once, before sandboxing, and their handles are stored in the execution context - the UNIX domain socket for passt is also bound only once, before sandboxing: to reject clients after the first one, instead of closing the listening socket, keep it open, accept and immediately discard new connection if we already have a valid one Clarify the (unchanged) behaviour for --netns-only in the man page. To actually make passt and pasta processes run in a separate PID namespace, we need to unshare(CLONE_NEWPID) before forking to background (if configured to do so). Introduce a small daemon() implementation, __daemon(), that additionally saves the PID file before forking. While running in foreground, the process itself can't move to a new PID namespace (a process can't change the notion of its own PID): mention that in the man page. For some reason, fork() in a detached PID namespace causes SIGTERM and SIGQUIT to be ignored, even if the handler is still reported as SIG_DFL: add a signal handler that just exits. We can now drop most of the pasta_child_handler() implementation, that took care of terminating all processes running in the same namespace, if pasta started a shell: the shell itself is now the init process in that namespace, and all children will terminate once the init process exits. Issuing 'echo $$' in a detached PID namespace won't return the actual namespace PID as seen from the init namespace: adapt demo and test setup scripts to reflect that. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	passt: Fork into background also if not running from a terminal	Stefano Brivio	2021-10-21	1	-1/+1
\| \| \| \| \| \| \| \| \|	This is actually annoying: there's no way to make it fork into background when running from a script. However, it's always possible to keep it in foreground with -f. Make it simpler, and always fork into background if -f is not given. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	LICENSES: Add license text files, add missing notices, fix SPDX tags	Stefano Brivio	2021-10-20	1	-0/+3
\| \| \| \| \| \| \| \| \| \|	SPDX tags don't replace license files. Some notices were missing and some tags were not according to the SPDX specification, too. Now reuse --lint from the REUSE tool (https://reuse.software/) passes. Reported-by: Martin Hauke <mardnh@gmx.de> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>