passt - Plug A Simple Socket Transport

	Commit message (Collapse)	Author	Age	Files	Lines
*	passt, qrap, README: Update notes and documentation for AF_UNIX support in qemu	Stefano Brivio	2022-11-04	1	-6/+2
\| \| \| \| \| \| \| \|	We can't get rid of qrap quite yet, but at least we should start telling users it's not going to be needed anymore starting from qemu 7.2. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	passt.1: Fix typo: "addressses", reported by Lintian	Stefano Brivio	2022-10-27	1	-1/+1
\| \| \| \|	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	conf, passt.1: Don't imply --foreground with --debug	Stefano Brivio	2022-10-27	1	-3/+2
\| \| \| \| \| \| \| \| \| \|	Having -f implied by -d (and --trace) usually saves some typing, but debug mode in background (with a log file) is quite useful if pasta is started by Podman, and is probably going to be handy for passt with libvirt later, too. Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
*	passt.1: Add David to AUTHORS2022_10_15.b3f3591	Stefano Brivio	2022-10-15	1	-2/+2
\| \| \| \| \| \| \|	I just realised while reading the man page. Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
*	conf: Bind inbound ports with CAP_NET_BIND_SERVICE before isolate_user()	Stefano Brivio	2022-10-15	1	-7/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Even if CAP_NET_BIND_SERVICE is granted, we'll lose the capability in the target user namespace as we isolate the process, which means we're unable to bind to low ports at that point. Bind inbound ports, and only those, before isolate_user(). Keep the handling of outbound ports (for pasta mode only) after the setup of the namespace, because that's where we'll bind them. To this end, initialise the netlink socket for the init namespace before isolate_user() as well, as we actually need to know the addresses of the upstream interface before binding ports, in case they're not explicitly passed by the user. As we now call nl_sock_init() twice, checking its return code from conf() twice looks a bit heavy: make it exit(), instead, as we can't do much if we don't have netlink sockets. While at it: - move the v4_only && v6_only options check just after the first option processing loop, as this is more strictly related to option parsing proper - update the man page, explaining that CAP_NET_BIND_SERVICE is not the preferred way to bind ports, because passt and pasta can be abused to allow other processes to make effective usage of it. Add a note about the recommended sysctl instead - simplify nl_sock_init_do() now that it's called once for each case Reported-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	conf, tcp, udp: Allow specification of interface to bind to	Stefano Brivio	2022-10-15	1	-2/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since kernel version 5.7, commit c427bfec18f2 ("net: core: enable SO_BINDTODEVICE for non-root users"), we can bind sockets to interfaces, if they haven't been bound yet (as in bind()). Introduce an optional interface specification for forwarded ports, prefixed by %, that can be passed together with an address. Reported use case: running local services that use ports we want to have externally forwarded: https://github.com/containers/podman/issues/14425 Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
*	conf, tap: Add option to quit once the client closes the connection	Stefano Brivio	2022-10-15	1	-0/+5
\| \| \| \| \| \| \| \|	This is practical to avoid explicit lifecycle management in users, e.g. libvirtd, and is trivial to implement. Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
*	conf, log, Makefile: Add versioning information	Stefano Brivio	2022-10-15	1	-0/+4
\| \| \| \| \| \| \|	Add a --version option displaying that, and also include this information in the log files. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	log, conf: Add support for logging to file	Stefano Brivio	2022-10-14	1	-2/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In some environments, such as KubeVirt pods, we might not have a system logger available. We could choose to run in foreground, but this takes away the convenient synchronisation mechanism derived from forking to background when interfaces are ready. Add optional logging to file with -l/--log-file and --log-size. Unfortunately, this means we need to duplicate features that are more appropriately implemented by a system logger, such as rotation. Keep that reasonably simple, by using fallocate() with range collapsing where supported (Linux kernel >= 3.15, extent-based ext4 and XFS) and falling back to an unsophisticated block-by-block moving of entries toward the beginning of the file once we reach the (mandatory) size limit. While at it, clarify the role of LOG_EMERG in passt.c. Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
*	Allow --userns when pasta spawns a command	David Gibson	2022-09-13	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \|	Currently --userns is only allowed when pasta is attaching to an existing netns or PID, and is prohibited when creating a new netns by spawning a command or shell. With the new handling of userns, this check isn't neccessary. I'm not sure if there's any use case for --userns with a spawned command, but it's strictly more flexible and requires zero extra code, so we might as well. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
*	Split checking for root from dropping root privilege	David Gibson	2022-09-13	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	check_root() both checks to see if we are root (in the init namespace), and if we are drops to an unprivileged user. To make future cleanups simpler, split the checking for root (now in check_root()) from the actual dropping of privilege (now in drop_root()). Note that this does slightly alter semantics. Previously we would only setuid() if we were originally root (in the init namespace). Now we will always setuid() and setgid(), though it won't actually change anything if we weren't privileged to begin with. This also means that we will now always attempt to switch to the user specified with --runas, even if we aren't (init namespace) root to begin with. Obviously this will fail with an error if we weren't privileged to start with. --help and the man page are updated accordingly. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
*	Allow pasta to take a command to execute	David Gibson	2022-08-30	1	-5/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	When not given an existing PID or network namspace to attach to, pasta spawns a shell. Most commands which can spawn a shell in an altered environment can also run other commands in that same environment, which can be useful in automation. Allow pasta to do the same thing; it can be given an arbitrary command to run in the network and user namespace which pasta creates. If neither a command nor an existing PID or netns to attach to is given, continue to spawn a default shell, as before. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
*	Use explicit --netns option rather than multiplexing with PID	David Gibson	2022-08-30	1	-3/+13
\| \| \| \| \| \| \| \| \| \| \| \|	When attaching to an existing namespace, pasta can take a PID or the name or path of a network namespace as a non-option parameter. We disambiguate based on what the parameter looks like. Make this more explicit by using a --netns option for explicitly giving the path or name, and treating a non-option argument always as a PID. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> [sbrivio: Fix typo in man page] Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	Remove --nsrun-dir option	David Gibson	2022-08-30	1	-6/+0
\| \| \| \| \| \| \| \| \| \| \| \| \|	pasta can identify a netns as a "name", which is to say a path relative to (usually) /run/netns, which is the place that ip(8) creates persistent network namespaces. Alternatively a full path to a netns can be given. The --nsrun-dir option allows the user to change the standard path where netns names are resolved. However, there's no real point to this, if the user wants to override the location of the netns, they can just as easily use the full path to specify the netns. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
*	Correct manpage for --userns	David Gibson	2022-08-30	1	-3/+2
\| \| \| \| \| \| \| \| \|	The man page states that the --userns option can be given either as a path or as a name relative to --nsrun-dir. This is not correct: as the name suggests --nsrun-dir is (correctly) used only for netns resolution, not userns resolution. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
*	conf: Use "-D none" and "-S none" instead of missing empty option arguments	David Gibson	2022-08-30	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Both the -D (--dns) and -S (--search) options take an optional argument. If the argument is omitted the option is disabled entirely. However, handling the optional argument requires some ugly special case handling if it's the last option on the command line, and has potential ambiguity with non-option arguments used with pasta. It can also make it more confusing to read command lines. Simplify the logic here by replacing the non-argument versions with an explicit "-D none" or "-S none". Signed-off-by: David Gibson <david@gibson.dropbear.id.au> [sbrivio: Reworked logic to exclude redundant/conflicting options] Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	conf: Make the argument to --pcap option mandatory	David Gibson	2022-08-30	1	-10/+0
\| \| \| \| \| \| \| \| \| \| \|	The --pcap or -p option can be used with or without an argument. If given, the argument gives the name of the file to save a packet trace to. If omitted, we generate a default name in /tmp. Generating the default name isn't particularly useful though, since making a suitable name can easily be done by the caller. Remove this feature. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
*	passt.1: Default host interfaces are now selected based on IP version	Stefano Brivio	2022-07-30	1	-6/+7
\| \| \| \| \| \| \|	Reflect the changes from commit 4b2e018d70f3 ("Allow different external interfaces for IPv4 and IPv6 connectivity") into the manual. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	conf: Allow to specify ranges and ports excluded from given ranges	Stefano Brivio	2022-07-14	1	-2/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is useful in environments where we want to forward a large number of ports, or all non-ephemeral ones, and some other service running on the host needs a few selected ports. I'm using ~ as prefix for the specification of excluded ranges and ports to avoid the need for explicit command line quoting. Ranges and ports can be excluded from given ranges by adding them in the comma-separated list, prefixed by ~. Some quick examples: -t 5000-6000,~5555: forward ports 5000 to 6000, but not 5555 -t ~20000-20010: forward all non-ephemeral, allowed ports, except for ports 20000 to 20010 ...more details in usage message and man page. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	Invoke specific qemu-system-* binaries	David Gibson	2022-07-14	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A lot of tests and examples invoke qemu with the command "kvm". However, as far as I can tell, "kvm" being aliased to the appropriate qemu system binary is Debian specific. The binary names from qemu upstream - qemu-system-$ARCH - also aren't universal, but they are more common (they should be good for both Debian and Fedora at least). In order to still get KVM acceleration when available, we use the option "-M accel=kvm:tcg" to tell qemu to try using either KVM or TCG in that order A number of the places we invoked "kvm" are expecting specifically an x86 guest, and so it's also safer to explicitly invoke qemu-system-x86_64. Some others appear to be independent of the target arch (just wanting the same arch as the host to allow KVM acceleration). Although I suspect there may be more subtle x86 specific options in the qemu command lines, attempt to preserve arch independence by using $(uname -m). Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
*	Use dhclient instead of udhcpc	David Gibson	2022-06-15	1	-10/+2
\| \| \| \| \| \| \| \| \| \| \| \|	For some reason, the passt/pasta tests and examples use dhclient for DHCPv6, but in most cases use udhcpc for DHCPv4. Change it to use dhclient for both DHCPv4 and DHCPv6. This means one less tool we need for testing, plus dhclient is easily available on Fedora whereas udhcpc is not. Note that the passt tests still rely on udhcpc indirectly because mbuto wants to put it into the guest images it generates. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
*	Tweak dhclient arguments for readability	David Gibson	2022-06-15	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	A number of tests and examples use dhclient in both IPv4 and IPv6 modes. We use "dhclient -6" for IPv6, but usually just "dhclient" for IPv4. Add an explicit "-4" argument to make it more clear and explicit. In addition, when dhclient is run from within pasta it usually won't be "real" root, and so will not have access to write the default global pid file. This results in a mostly harmless but irritating error: Can't create /var/run/dhclient.pid: Permission denied We can avoid that by using the --no-pid flag to dhclient. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
*	conf: Add --runas option, changing to given UID and GID if started as root	Stefano Brivio	2022-05-19	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	On some systems, user and group "nobody" might not be available. The new --runas option allows to override the default "nobody" choice if started as root. Now that we allow this, drop the initgroups() call that was used to add any additional groups for the given user, as that might now grant unnecessarily broad permissions. For instance, several distributions have a "kvm" group to allow regular user access to /dev/kvm, and we don't need that in passt or pasta. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	conf, tcp, udp: Allow address specification for forwarded ports	Stefano Brivio	2022-05-01	1	-2/+10
\| \| \| \| \| \| \| \| \| \| \| \| \|	This feature is available in slirp4netns but was missing in passt and pasta. Given that we don't do dynamic memory allocation, we need to bind sockets while parsing port configuration. This means we need to process all other options first, as they might affect addressing and IP version support. It also implies a minor rework of how TCP and UDP implementations bind sockets. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	passt.1, qrap.1: Update links to qemu out-of-tree patch	Stefano Brivio	2022-04-01	1	-1/+1
\| \| \| \|	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	conf, util, tap: Implement --trace option for extra verbose logging	Stefano Brivio	2022-03-25	1	-0/+5
\| \| \| \| \| \| \| \|	--debug can be a bit too noisy, especially as single packets or socket messages are logged: implement a new option, --trace, implying --debug, that enables all debug messages. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	passt.1: Drop duplicate --dns section	Stefano Brivio	2022-02-23	1	-11/+1
\| \| \| \|	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	conf, ndp: Disable router advertisements on --config-net	Stefano Brivio	2022-02-23	1	-1/+3
\| \| \| \| \| \| \| \| \|	If we statically configure a default route, and also advertise it for SLAAC, the kernel will try moments later to add the same route: ICMPv6: RA: ndisc_router_discovery failed to add default route Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	man page: Update REPORTING BUGS section	Stefano Brivio	2022-02-21	1	-4/+5
\| \| \| \|	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	pasta: By default, quit if filesystem-bound net namespace goes away	Stefano Brivio	2022-02-21	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \|	This should be convenient for users managing filesystem-bound network namespaces: monitor the base directory of the namespace and exit if the namespace given as PATH or NAME target is deleted. We can't add an inotify watch directly on the namespace directory, that won't work with nsfs. Add an option to disable this behaviour, --no-netns-quit. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	conf, udp: Introduce basic DNS forwarding	Stefano Brivio	2022-02-21	1	-10/+53
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For compatibility with libslirp/slirp4netns users: introduce a mechanism to map, in the UDP routines, an address facing guest or namespace to the first IPv4 or IPv6 address resulting from configuration as resolver. This can be enabled with the new --dns-forward option. This implies that sourcing and using DNS addresses and search lists, passed via command line or read from /etc/resolv.conf, is not bound anymore to DHCP/DHCPv6/NDP usage: for example, pasta users might just want to use addresses from /etc/resolv.conf as mapping target, while not passing DNS options via DHCP. Reflect this in all the involved code paths by differentiating DHCP/DHCPv6/NDP usage from DNS configuration per se, and in the new options --dhcp-dns, --dhcp-search for pasta, and --no-dhcp-dns, --no-dhcp-search for passt. This should be the last bit to enable substantial compatibility between slirp4netns.sh and slirp4netns(1): pass the --dns-forward option from the script too. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	passt, pasta: Namespace-based sandboxing, defer seccomp policy application	Stefano Brivio	2022-02-21	1	-7/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	To reach (at least) a conceptually equivalent security level as implemented by --enable-sandbox in slirp4netns, we need to create a new mount namespace and pivot_root() into a new (empty) mountpoint, so that passt and pasta can't access any filesystem resource after initialisation. While at it, also detach IPC, PID (only for passt, to prevent vulnerabilities based on the knowledge of a target PID), and UTS namespaces. With this approach, if we apply the seccomp filters right after the configuration step, the number of allowed syscalls grows further. To prevent this, defer the application of seccomp policies after the initialisation phase, before the main loop, that's where we expect bad things to happen, potentially. This way, we get back to 22 allowed syscalls for passt and 34 for pasta, on x86_64. While at it, move #syscalls notes to specific code paths wherever it conceptually makes sense. We have to open all the file handles we'll ever need before sandboxing: - the packet capture file can only be opened once, drop instance numbers from the default path and use the (pre-sandbox) PID instead - /proc/net/tcp{,v6} and /proc/net/udp{,v6}, for automatic detection of bound ports in pasta mode, are now opened only once, before sandboxing, and their handles are stored in the execution context - the UNIX domain socket for passt is also bound only once, before sandboxing: to reject clients after the first one, instead of closing the listening socket, keep it open, accept and immediately discard new connection if we already have a valid one Clarify the (unchanged) behaviour for --netns-only in the man page. To actually make passt and pasta processes run in a separate PID namespace, we need to unshare(CLONE_NEWPID) before forking to background (if configured to do so). Introduce a small daemon() implementation, __daemon(), that additionally saves the PID file before forking. While running in foreground, the process itself can't move to a new PID namespace (a process can't change the notion of its own PID): mention that in the man page. For some reason, fork() in a detached PID namespace causes SIGTERM and SIGQUIT to be ignored, even if the handler is still reported as SIG_DFL: add a signal handler that just exits. We can now drop most of the pasta_child_handler() implementation, that took care of terminating all processes running in the same namespace, if pasta started a shell: the shell itself is now the init process in that namespace, and all children will terminate once the init process exits. Issuing 'echo $$' in a detached PID namespace won't return the actual namespace PID as seen from the init namespace: adapt demo and test setup scripts to reflect that. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	passt: Fork into background also if not running from a terminal	Stefano Brivio	2021-10-21	1	-1/+1
\| \| \| \| \| \| \| \| \|	This is actually annoying: there's no way to make it fork into background when running from a script. However, it's always possible to keep it in foreground with -f. Make it simpler, and always fork into background if -f is not given. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	LICENSES: Add license text files, add missing notices, fix SPDX tags	Stefano Brivio	2021-10-20	1	-0/+3
\| \| \| \| \| \| \| \| \| \|	SPDX tags don't replace license files. Some notices were missing and some tags were not according to the SPDX specification, too. Now reuse --lint from the REUSE tool (https://reuse.software/) passes. Reported-by: Martin Hauke <mardnh@gmx.de> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	conf: Add -P, --pid, to specify a file where own PID is written to	Stefano Brivio	2021-10-14	1	-0/+5
\| \| \| \|	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	conf, tcp, udp: Add --no-map-gw to disable mapping gateway address to host	Stefano Brivio	2021-10-14	1	-1/+6
\| \| \| \|	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	doc: Add to man page tip to grant passt the CAP_NET_BIND_SERVICE capability	Stefano Brivio	2021-10-14	1	-1/+6
\| \| \| \|	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	doc: Fix up note about missing tcpi_snd_wnd in man page	Stefano Brivio	2021-10-14	1	-7/+3
\| \| \| \| \| \| \| \|	The behaviour without tcpi_snd_wnd changed: the only difference now is the advertised window, which corresponds to the queried sending buffer size. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	conf, tap: Split netlink and pasta functions, allow interface configuration	Stefano Brivio	2021-10-14	1	-0/+11
\| \| \| \| \| \| \| \| \| \|	Move netlink routines to their own file, and use netlink to configure or fetch all the information we need, except for the TUNSETIFF ioctl. Move pasta-specific functions to their own file as well, add parameters and calls to configure the tap interface in the namespace. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	pasta: Allow specifying paths and names of namespaces	Giuseppe Scrivano	2021-10-07	1	-6/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Based on a patch from Giuseppe Scrivano, this adds the ability to: - specify paths and names of target namespaces to join, instead of a PID, also for user namespaces, with --userns - request to join or create a network namespace only, without entering or creating a user namespace, with --netns-only - specify the base directory for netns mountpoints, with --nsrun-dir Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com> [sbrivio: reworked logic to actually join the given namespaces when they're not created, implemented --netns-only and --nsrun-dir, updated pasta demo script and man page] Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	conf, tcp: Periodic detection of bound ports for pasta port forwarding	Stefano Brivio	2021-09-27	1	-7/+8
\| \| \| \| \| \| \| \| \| \|	Detecting bound ports at start-up time isn't terribly useful: do this periodically instead, if configured. This is only implemented for TCP at the moment, UDP is somewhat more complicated: leave a TODO there. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
*	passt, qrap: Add man pages	Stefano Brivio	2021-09-01	1	-0/+707
	Signed-off-by: Stefano Brivio <sbrivio@redhat.com>