path: root/conf.c
Commit message (Author, Age, Files, Lines)
* Consolidate validation of pasta namespace options (David Gibson, 2022-09-13, 1 file, -41/+42)
  There are a number of different ways to specify namespaces for pasta to use. Some combinations are valid and some are not. Currently validation for these is spread across several places: conf_ns_pid() validates PID options specifically. Near its callsite in conf() several other checks are made. Some additional checks are made in conf_ns_open() and finally there's a check just before the call to pasta_start_ns().

  This is quite hard to follow. Make it easier by putting all the validation logic together in a new conf_pasta_ns() function, which subsumes conf_ns_pid(). This reveals that some of the checks were redundant with each other, so remove those.

  For good measure, rename conf_netns() to conf_netns_opt() to make it clearer it's handling just the --netns option specifically, not overall configuration of the netns.

  Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* Move self-isolation code into a separate file (David Gibson, 2022-09-13, 1 file, -0/+1)
  passt/pasta contains a number of routines designed to isolate passt from the rest of the system for security. These are spread through util.c and passt.c. Move them together into a new isolation.c file.

  Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* Safer handling if we can't open /proc/self/uid_map (David Gibson, 2022-09-13, 1 file, -2/+6)
  passt is allowed to run as "root" (UID 0) in a user namespace, but not as real root in the init namespace. We read /proc/self/uid_map to determine if we're in the init namespace or not.

  If we're unable to open /proc/self/uid_map we assume we're ok and continue running as UID 0. This seems unwise. The only instances I can think of where uid_map won't be available are if the host kernel doesn't support namespaces, or /proc is not mounted. In neither case is it safe to assume we're "not really" root and continue (although in practice we'd likely fail for other reasons pretty soon anyway).

  Therefore, fail with an error in this case, instead of carrying on.

  Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
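The check described in this commit can be illustrated with a minimal sketch. The helper names uid_map_line_is_init() and in_init_userns() are hypothetical, and parsing with sscanf() is an assumption made for illustration, not necessarily how conf.c does it. The sketch relies on the fact that the initial user namespace maps the whole UID range onto itself ("0 0 4294967295"), and it reports failure to read the map as a distinct error instead of assuming safety.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if a uid_map line is the identity mapping
 * of the full UID range, as seen in the initial user namespace, 0 otherwise.
 */
int uid_map_line_is_init(const char *line)
{
	unsigned long inside, outside, count;

	if (sscanf(line, "%lu %lu %lu", &inside, &outside, &count) != 3)
		return 0;

	return inside == 0 && outside == 0 && count == 4294967295UL;
}

/* Hypothetical helper: returns 1 in the init namespace, 0 in another user
 * namespace, -1 if uid_map can't be opened or read. Callers should treat -1
 * as fatal instead of assuming it's safe to continue as UID 0.
 */
int in_init_userns(void)
{
	FILE *f = fopen("/proc/self/uid_map", "r");
	char line[128];
	int ret;

	if (!f)
		return -1;

	ret = fgets(line, sizeof(line), f) ? uid_map_line_is_init(line) : -1;
	fclose(f);

	return ret;
}
```

A caller running as UID 0 would then exit with an error on -1, refuse to continue as root on 1, and carry on only on 0.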
* Consolidate determination of UID/GID to run as (David Gibson, 2022-09-13, 1 file, -8/+73)
  Currently the logic to work out what UID and GID we will run as is spread across conf(). If --runas is specified it's handled in conf_runas(), otherwise it's handled by check_root(), which depends on initialization of the uid and gid variables by either conf() itself or conf_runas().

  Make this clearer by putting all the UID and GID logic into a single conf_ugid() function.

  Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* Split checking for root from dropping root privilege (David Gibson, 2022-09-13, 1 file, -2/+3)
  check_root() both checks to see if we are root (in the init namespace), and if we are drops to an unprivileged user. To make future cleanups simpler, split the checking for root (now in check_root()) from the actual dropping of privilege (now in drop_root()).

  Note that this does slightly alter semantics. Previously we would only setuid() if we were originally root (in the init namespace). Now we will always setuid() and setgid(), though it won't actually change anything if we weren't privileged to begin with. This also means that we will now always attempt to switch to the user specified with --runas, even if we aren't (init namespace) root to begin with. Obviously this will fail with an error if we weren't privileged to start with. --help and the man page are updated accordingly.

  Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* Don't store UID & GID persistently in the context structure (David Gibson, 2022-09-13, 1 file, -3/+5)
  c->uid and c->gid are first set in conf(), and last used in check_root() itself called from conf(). Therefore these don't need to be fields in the long lived context structure and can instead be locals in conf().

  Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* conf: Fix getopt_long() optstring for current semantics of -D, -S, -p (Stefano Brivio, 2022-09-02, 1 file, -2/+2)
  Declaring them as required_argument in the longopts array specifies validation, but doesn't affect how optind is increased after parsing their values.

  Currently, passing one of these options as last option causes pasta to handle their own values as path to a binary to execute.

  Fixes: aae2a9bbf7d1 ("conf: Use "-D none" and "-S none" instead of missing empty option arguments")
  Fixes: bf95322fc1ef ("conf: Make the argument to --pcap option mandatory")
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
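The effect described here can be reproduced with a small sketch. The function leftover_args() and its hard-coded command line ("pasta -D none") are made up for illustration, and the optstrings in the usage note are assumptions, not the ones in conf.c. For a short option, getopt_long() decides whether to consume the next argument from the optstring alone ("D:" versus "D::"), regardless of what the longopts entry declares.

```c
#include <getopt.h>
#include <stddef.h>

/* Hypothetical demo: parse the fixed command line "pasta -D none" with the
 * given optstring, and return how many arguments are left as non-options.
 */
int leftover_args(const char *optstring)
{
	static const struct option longopts[] = {
		{ "dns", required_argument, NULL, 'D' },
		{ NULL, 0, NULL, 0 },
	};
	char arg0[] = "pasta", arg1[] = "-D", arg2[] = "none";
	char *argv[] = { arg0, arg1, arg2, NULL };
	int argc = 3;

	optind = 0;	/* glibc extension: fully reset the getopt state */
	opterr = 0;	/* no diagnostics on standard error */

	while (getopt_long(argc, argv, optstring, longopts, NULL) != -1)
		;

	/* Anything from optind onwards was not consumed as option value */
	return argc - optind;
}
```

With "D::" (optional argument), "none" is not attached to -D and is left over as a non-option, which pasta would then take as a command to run. With "D:" it is consumed as the option's value and nothing is left over.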
* Allow pasta to take a command to execute (David Gibson, 2022-08-30, 1 file, -9/+18)
  When not given an existing PID or network namespace to attach to, pasta spawns a shell. Most commands which can spawn a shell in an altered environment can also run other commands in that same environment, which can be useful in automation.

  Allow pasta to do the same thing; it can be given an arbitrary command to run in the network and user namespace which pasta creates. If neither a command nor an existing PID or netns to attach to is given, continue to spawn a default shell, as before.

  Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* Use explicit --netns option rather than multiplexing with PID (David Gibson, 2022-08-30, 1 file, -25/+60)
  When attaching to an existing namespace, pasta can take a PID or the name or path of a network namespace as a non-option parameter. We disambiguate based on what the parameter looks like. Make this more explicit by using a --netns option for explicitly giving the path or name, and treating a non-option argument always as a PID.

  Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
  [sbrivio: Fix typo in man page]
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* More deterministic detection of whether argument is a PID, PATH or NAME (David Gibson, 2022-08-30, 1 file, -82/+90)
  pasta takes as its only non-option argument either a PID to attach to the namespaces of, a PATH to a network namespace or a NAME of a network namespace (relative to /run/netns). Currently to determine which it is we try all 3 in that order, and if anything goes wrong we move onto the next.

  This has the potential to cause very confusing failure modes. e.g. if the argument is intended to be a network namespace name, but a (non-namespace) file of the same name exists in the current directory.

  Make behaviour more predictable by choosing how to treat the argument based only on the argument's contents, not anything else on the system:
  - If it's a decimal integer treat it as a PID
  - Otherwise, if it has no '/' characters, treat it as a netns name (ip-netns doesn't allow '/' in netns names)
  - Otherwise, treat it as a netns path

  If you want to open a persistent netns in the current directory, you can use './netns'.

  This also allows us to split the parsing of the PID|PATH|NAME option from the actual opening of the namespaces. In turn that allows us to put the opening of existing namespaces next to the opening of new namespaces in pasta_start_ns. That makes the logical flow easier to follow and will enable later cleanups.

  Caveats:
  - The separation of functions mean we will always generate the basename and dirname for the netns_quit system, even when using PID namespaces. This is pointless, since the netns_quit system doesn't work for non persistent namespaces, but is harmless.

  Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
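The three rules listed in this commit message can be sketched as a pure function of the argument string. The enum and the name classify_ns_arg() are hypothetical, and the treatment of an empty string (classified as a name here) is an assumption the commit message doesn't cover.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical result type for the three cases in the commit message */
enum ns_arg_type { NS_ARG_PID, NS_ARG_NAME, NS_ARG_PATH };

/* Classify the argument by its contents only, never by what exists on the
 * system: all decimal digits means PID, no '/' means netns name, anything
 * else is a netns path.
 */
enum ns_arg_type classify_ns_arg(const char *arg)
{
	const char *p;

	for (p = arg; *p && isdigit((unsigned char)*p); p++)
		;

	if (p != arg && !*p)		/* non-empty, digits only */
		return NS_ARG_PID;

	if (!strchr(arg, '/'))		/* ip-netns names can't contain '/' */
		return NS_ARG_NAME;

	return NS_ARG_PATH;
}
```

Under these rules "1234" is a PID, "mynetns" is a name resolved relative to /run/netns, and "./netns" is a path, so a file called "mynetns" in the current directory can no longer change the outcome.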
* Move ENOENT error message into conf_ns_opt() (David Gibson, 2022-08-30, 1 file, -2/+1)
  After calling conf_ns_opt() we check for -ENOENT and print an error message, but conf_ns_opt() prints messages for other errors itself. For consistency move the ENOENT message into conf_ns_opt() as well.

  Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* Remove --nsrun-dir option (David Gibson, 2022-08-30, 1 file, -20/+4)
  pasta can identify a netns as a "name", which is to say a path relative to (usually) /run/netns, which is the place that ip(8) creates persistent network namespaces. Alternatively a full path to a netns can be given.

  The --nsrun-dir option allows the user to change the standard path where netns names are resolved. However, there's no real point to this: if the user wants to override the location of the netns, they can just as easily use the full path to specify the netns.

  Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* conf: Use "-D none" and "-S none" instead of missing empty option arguments (David Gibson, 2022-08-30, 1 file, -24/+32)
  Both the -D (--dns) and -S (--search) options take an optional argument. If the argument is omitted the option is disabled entirely. However, handling the optional argument requires some ugly special case handling if it's the last option on the command line, and has potential ambiguity with non-option arguments used with pasta. It can also make it more confusing to read command lines.

  Simplify the logic here by replacing the non-argument versions with an explicit "-D none" or "-S none".

  Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
  [sbrivio: Reworked logic to exclude redundant/conflicting options]
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* conf: Make the argument to --pcap option mandatory (David Gibson, 2022-08-30, 1 file, -15/+3)
  The --pcap or -p option can be used with or without an argument. If given, the argument gives the name of the file to save a packet trace to. If omitted, we generate a default name in /tmp.

  Generating the default name isn't particularly useful though, since making a suitable name can easily be done by the caller. Remove this feature.

  Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* Don't unnecessarily avoid CLOEXEC flags [tag: 2022_08_24.60ffc5b] (David Gibson, 2022-08-24, 1 file, -7/+3)
  There are several places in the passt code where we have lint overrides because we're not adding CLOEXEC flags to open or other operations. Comments suggest this is because it's before we fork() into the background but we'll need those file descriptors after we're in the background.

  However, as the name suggests CLOEXEC closes on exec(), not on fork(). The only places we exec() are either super early, to invoke the avx2 version of the binary, or when we start a shell in pasta mode, which certainly *doesn't* require the fds in question.

  Add the CLOEXEC flag in those places, and remove the lint overrides.

  Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* conf: Fix incorrect bounds checking for sock_path parameter (David Gibson, 2022-08-24, 1 file, -1/+1)
  Looks like a copy-paste error where we're checking against the size of the pcap field, rather than the sock_path field.

  Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* Make substructures for IPv4 and IPv6 specific context information (David Gibson, 2022-07-30, 1 file, -90/+94)
  The context structure contains a batch of fields specific to IPv4 and to IPv6 connectivity. Split those out into a sub-structure.

  This allows the conf_ip4() and conf_ip6() functions, which take the entire context but touch very little of it, to be given more specific parameters, making it clearer what it affects without stepping through the code.

  Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* Separate IPv4 and IPv6 configuration (David Gibson, 2022-07-30, 1 file, -68/+80)
  After recent changes, conf_ip() now has essentially entirely disjoint paths for IPv4 and IPv6 configuration. So, it's cleaner to split them out into different functions conf_ip4() and conf_ip6().

  Splitting these out also lets us make the interface a bit nicer, having them return success or failure directly, rather than manipulating c->v4 and c->v6 to indicate success/failure of the two versions.

  Since these functions may also initialize the interface index for each protocol, it turns out we can then drop c->v4 and c->v6 entirely, replacing tests on those with tests on whether c->ifi4 or c->ifi6 is non-zero (since a 0 interface index is never valid).

  Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
  [sbrivio: Whitespace fixes]
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* Clarify semantics of c->v4 and c->v6 variables (David Gibson, 2022-07-30, 1 file, -39/+20)
  The v4 and v6 fields of the context structure can be confusing, because they change meaning part way through the code: Before conf_ip(), they are booleans which indicate whether the -4 or -6 options have been given. After conf_ip() they are DISABLED|ENABLED|PROBE enums which indicate whether the IP version is available (which means both that it was allowed on the command line and we were able to configure it).

  The PROBE variant of the enum is only used locally within conf_ip() and since recent changes there it no longer has a real purpose different from ENABLED.

  Simplify this all by making the context fields always just a boolean indicating the availability of the IP version. They both default to 1, but can be set to 0 by either command line options or configuration failures. We use some local variables in conf() for tracking the state of the command line options on their own.

  Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
  [sbrivio: Minor coding style fix in conf.c]
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* Move passt mac_guest init to be more symmetric with pasta (David Gibson, 2022-07-30, 1 file, -3/+0)
  In pasta mode, the guest's MAC address is set up in pasta_ns_conf() called from tap_sock_tun_init(). If we have a guest MAC configured with --ns-mac-addr, this will set the given MAC on the kernel tuntap device, or if we haven't configured one it will update our record of the guest MAC to the kernel assigned one from the device.

  For passt, we don't initially know the guest's MAC until we receive packets from it, so we have to initially use a broadcast address. This is - oddly - set up in an entirely different place, in conf_ip() conditional on the mode.

  Move it to the logically matching place for passt - tap_sock_unix_init().

  Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* Initialize host side MAC when in IPv6 only mode (David Gibson, 2022-07-30, 1 file, -9/+12)
  When sending packets to the guest we need a source MAC address, which we currently take from the host side interface we're using (though it's basically arbitrary). However if not given on the command line this MAC is initialized in an IPv4 specific path, and will end up as 00:00:00:00:00:00 when running "passt -6".

  The MAC address is also used for IPv6 packets, though. Interestingly, we largely seem to get away with using an all-zero MAC, but it's probably not a good idea. Make the IPv6 path pick the MAC address from its interface if the IPv4 path hasn't already done so.

  While we're there, use the existing MAC_IS_ZERO macro to make the code a little clearer.

  Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* Separately locate external interfaces for IPv4 and IPv6 (David Gibson, 2022-07-30, 1 file, -2/+17)
  Now that the back end allows passt/pasta to use different external interfaces for IPv4 and IPv6, use that to do the right thing in the case that the host has IPv4 and IPv6 connectivity via different interfaces. If the user hasn't explicitly chosen an interface, separately search for a suitable external interface for each protocol.

  As a bonus, this substantially simplifies the external interface probe. It also eliminates a subtle confusing case where in some circumstances we would pick the first interface in interface index order, and sometimes in order of routes returned from netlink. On some network configurations that could cause tests to fail, because the logic in the tests was subtly different (it always used route order).

  Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* Allow different external interfaces for IPv4 and IPv6 connectivity (David Gibson, 2022-07-30, 1 file, -17/+21)
  It's quite plausible for a host to have both IPv4 and IPv6 connectivity, but only via different interfaces. For example, this will happen in the case that IPv6 connectivity is via a tunnel (e.g. 6in4 or 6rd). It would also happen in the case that IPv4 access is via a tunnel on an otherwise IPv6 only local network, which is a setup that might become more common in the post IPv4 address exhaustion world.

  It turns out there's no real need for passt/pasta to get its IPv4 and IPv6 connectivity via the same interface, so we can handle this situation fairly easily. Change the core to allow separate external interfaces for IPv4 and IPv6. We don't actually set these separately for now.

  Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* conf: Reset range endpoints after parsing one excluded port specifier [tag: 2022_07_14.b86cd00] (Stefano Brivio, 2022-07-14, 1 file, -0/+1)
  I forgot to reset the range endpoints after parsing an item of the comma-separated list in commit 220759efb89a ("conf: Allow to specify ranges and ports excluded from given ranges") -- fix that.

  Fixes: 220759efb89a ("conf: Allow to specify ranges and ports excluded from given ranges")
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* conf: Allow to specify ranges and ports excluded from given ranges (Stefano Brivio, 2022-07-14, 1 file, -11/+101)
  This is useful in environments where we want to forward a large number of ports, or all non-ephemeral ones, and some other service running on the host needs a few selected ports.

  I'm using ~ as prefix for the specification of excluded ranges and ports to avoid the need for explicit command line quoting.

  Ranges and ports can be excluded from given ranges by adding them in the comma-separated list, prefixed by ~. Some quick examples:
  - -t 5000-6000,~5555: forward ports 5000 to 6000, but not 5555
  - -t ~20000-20010: forward all non-ephemeral, allowed ports, except for ports 20000 to 20010

  ...more details in usage message and man page.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* conf: Fix initialisation of IPv6 unicast and link-local addresses (Stefano Brivio, 2022-07-14, 1 file, -2/+2)
  In commit 675174d4ba25 ("conf, tap: Split netlink and pasta functions, allow interface configuration"), I broke the initial setting of the observed IPv6 addresses in two ways:
  - the size copied from the configured addresses corresponds to an IPv4 address, not to an IPv6 address
  - the observed link-local address is initialised to the configured unicast address, not the link-local one

  If we haven't seen the guest using some type of addresses yet, we should default to the configured values, hence these initial settings: fix both.

  This resulted in UDP flows to the guest from a unique local address on the network not working before the guest shows passt a valid address itself, as reported by Alona.

  Reported-by: Alona Paz <alkaplan@redhat.com>
  Link: https://bugs.passt.top/show_bug.cgi?id=16
  Fixes: 675174d4ba25 ("conf, tap: Split netlink and pasta functions, allow interface configuration")
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* Handle the case of a DNS server on localhost (David Gibson, 2022-07-14, 1 file, -0/+16)
  By default, passt detects the nameserver used by the host system by reading /etc/resolv.conf, and advertises that to the guest via DHCP.

  However this breaks down if the host's nameserver is local (on 127.0.0.1 or ::1); connecting to localhost on the guest won't reach the host's nameserver. Using a local nameserver is a reasonably common case when using dnsmasq or similar to merge name resolution on a home network with name resolution from an organization-private VPN.

  We already have the gateway mapping support to allow reaching host-local services from the guest via the address of the default gateway. Add code to detect the case of a local DNS server and use the gateway mapping to advertise it usefully to the guest.

  Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
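The substitution described here can be sketched for IPv4 as follows. The function dns_to_advertise() is hypothetical, addresses are plain host-order integers for brevity, and treating the whole 127.0.0.0/8 block as loopback is an assumption (the commit message only mentions 127.0.0.1 and ::1).

```c
#include <stdint.h>

/* Hypothetical helper: given the host's nameserver and the default gateway
 * (IPv4, host byte order), return the address to advertise to the guest.
 * A loopback nameserver is replaced by the gateway address, which the
 * existing gateway mapping redirects to the host.
 */
uint32_t dns_to_advertise(uint32_t nameserver, uint32_t gateway)
{
	if ((nameserver >> 24) == 127)	/* 127.0.0.0/8, assumed loopback */
		return gateway;

	return nameserver;
}
```

For example, with a nameserver at 127.0.0.1 and a gateway at 192.168.1.1, the guest would be told to use 192.168.1.1, and a non-local nameserver would be passed through unchanged.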
* Parse resolv.conf with new lineread implementation (David Gibson, 2022-07-06, 1 file, -8/+14)
  Switch the resolv.conf parsing in conf.c to use the new lineread implementation. This means that it can now handle a resolv.conf file which contains blank lines.

  There are quite a few other fragilities with the resolv.conf parsing, but that's out of scope for this patch.

  Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* conf: In conf_runas(), on static builds, group information is also unused (Stefano Brivio, 2022-06-18, 1 file, -0/+1)
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* conf: Fix one Coverity CID 258163 warning, work around another one (Stefano Brivio, 2022-05-20, 1 file, -5/+3)
  In conf_runas(), Coverity reports that we might dereference uid and gid despite possibly being NULL (CWE-476) because of the check after the first sscanf(). They can't be NULL, but I actually wanted to check that UID and GID are non-zero (the user could otherwise pass --runas root:root and defy the whole mechanism).

  Later on, we have the same type of warning for 'gr': it's compared against NULL, so it might be NULL, which is actually the case: but in that case, we don't dereference it, because we'll return -ENOENT right away. Rewrite the clause to silence the warning.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* conf: Add --runas option, changing to given UID and GID if started as root (Stefano Brivio, 2022-05-19, 1 file, -0/+70)
  On some systems, user and group "nobody" might not be available. The new --runas option allows to override the default "nobody" choice if started as root.

  Now that we allow this, drop the initgroups() call that was used to add any additional groups for the given user, as that might now grant unnecessarily broad permissions. For instance, several distributions have a "kvm" group to allow regular user access to /dev/kvm, and we don't need that in passt or pasta.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* conf, tcp, udp: Allow address specification for forwarded ports (Stefano Brivio, 2022-05-01, 1 file, -25/+95)
  This feature is available in slirp4netns but was missing in passt and pasta.

  Given that we don't do dynamic memory allocation, we need to bind sockets while parsing port configuration. This means we need to process all other options first, as they might affect addressing and IP version support. It also implies a minor rework of how TCP and UDP implementations bind sockets.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* conf, tap: False "Buffer not null terminated" positives, CWE-170 (Stefano Brivio, 2022-04-07, 1 file, -3/+3)
  Those strings are actually guaranteed to be NULL-terminated. Reported by Coverity.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* conf: False "Assign instead of compare" positive, CWE-481 (Stefano Brivio, 2022-04-07, 1 file, -1/+1)
  This really just needs to be an assignment before line_read() -- turn it into a for loop. Reported by Coverity.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* conf, packet: Operands don't affect result, CWE-569 (Stefano Brivio, 2022-04-07, 1 file, -2/+5)
  Reported by Coverity.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* treewide: Fix android-cloexec-* clang-tidy warnings, re-enable checks (Stefano Brivio, 2022-03-29, 1 file, -1/+5)
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* treewide: Mark constant references as const (Stefano Brivio, 2022-03-29, 1 file, -2/+6)
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* conf, util, tap: Implement --trace option for extra verbose logging (Stefano Brivio, 2022-03-25, 1 file, -0/+15)
  --debug can be a bit too noisy, especially as single packets or socket messages are logged: implement a new option, --trace, implying --debug, that enables all debug messages.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* conf, ndp: Disable router advertisements on --config-net (Stefano Brivio, 2022-02-23, 1 file, -0/+3)
  If we statically configure a default route, and also advertise it for SLAAC, the kernel will try moments later to add the same route:

    ICMPv6: RA: ndisc_router_discovery failed to add default route

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* passt: Drop PASST_LEGACY_NO_OPTIONS sections (Stefano Brivio, 2022-02-22, 1 file, -12/+0)
  ...nobody uses those builds anymore.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* pasta: By default, quit if filesystem-bound net namespace goes away (Stefano Brivio, 2022-02-21, 1 file, -10/+33)
  This should be convenient for users managing filesystem-bound network namespaces: monitor the base directory of the namespace and exit if the namespace given as PATH or NAME target is deleted. We can't add an inotify watch directly on the namespace directory, that won't work with nsfs.

  Add an option to disable this behaviour, --no-netns-quit.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* conf, udp: Introduce basic DNS forwarding (Stefano Brivio, 2022-02-21, 1 file, -21/+81)
  For compatibility with libslirp/slirp4netns users: introduce a mechanism to map, in the UDP routines, an address facing guest or namespace to the first IPv4 or IPv6 address resulting from configuration as resolver. This can be enabled with the new --dns-forward option.

  This implies that sourcing and using DNS addresses and search lists, passed via command line or read from /etc/resolv.conf, is not bound anymore to DHCP/DHCPv6/NDP usage: for example, pasta users might just want to use addresses from /etc/resolv.conf as mapping target, while not passing DNS options via DHCP.

  Reflect this in all the involved code paths by differentiating DHCP/DHCPv6/NDP usage from DNS configuration per se, and in the new options --dhcp-dns, --dhcp-search for pasta, and --no-dhcp-dns, --no-dhcp-search for passt.

  This should be the last bit to enable substantial compatibility between slirp4netns.sh and slirp4netns(1): pass the --dns-forward option from the script too.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* conf: Given IPv4 address and no netmask, assign RFC 790-style classes (Stefano Brivio, 2022-02-21, 1 file, -10/+10)
  Provide a sane default, instead of /0, if an address is given, and it doesn't correspond to any host address we could find via netlink.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
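A classful default of this kind can be sketched as a mapping from the first octet to a prefix length. The function classful_prefix_len() is hypothetical and takes a host-order address. The /8, /16 and /24 boundaries follow the class A, B and C ranges of RFC 790, while the /32 fallback for anything above class C is an assumption of this sketch.

```c
#include <stdint.h>

/* Hypothetical helper: default prefix length for an IPv4 address (host byte
 * order) given without a netmask, based on its historical address class.
 */
int classful_prefix_len(uint32_t addr)
{
	uint8_t first = (uint8_t)(addr >> 24);

	if (first < 128)	/* class A: 0.0.0.0 to 127.255.255.255 */
		return 8;
	if (first < 192)	/* class B: 128.0.0.0 to 191.255.255.255 */
		return 16;
	if (first < 224)	/* class C: 192.0.0.0 to 223.255.255.255 */
		return 24;

	return 32;		/* assumption: single host for anything else */
}
```

So an address such as 10.0.0.1 given without a netmask would default to /8, 172.16.0.1 to /16 and 192.168.1.1 to /24, none of which is the /0 that the commit replaces.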
* conf: Don't print configuration on --quiet (Stefano Brivio, 2022-02-21, 1 file, -1/+2)
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* Makefile, conf, passt: Drop passt4netns references, explicit argc check (Stefano Brivio, 2022-02-21, 1 file, -6/+6)
  Nobody currently calls this as passt4netns, that was the name I used before 'pasta', drop any reference before it's too late.

  While at it, explicitly check that argc is bigger than or equal to one, just as a defensive measure: argv[0] being NULL is not an issue anyway.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* passt, pasta: Namespace-based sandboxing, defer seccomp policy application (Stefano Brivio, 2022-02-21, 1 file, -21/+24)
  To reach (at least) a conceptually equivalent security level as implemented by --enable-sandbox in slirp4netns, we need to create a new mount namespace and pivot_root() into a new (empty) mountpoint, so that passt and pasta can't access any filesystem resource after initialisation.

  While at it, also detach IPC, PID (only for passt, to prevent vulnerabilities based on the knowledge of a target PID), and UTS namespaces.

  With this approach, if we apply the seccomp filters right after the configuration step, the number of allowed syscalls grows further. To prevent this, defer the application of seccomp policies after the initialisation phase, before the main loop, that's where we expect bad things to happen, potentially. This way, we get back to 22 allowed syscalls for passt and 34 for pasta, on x86_64.

  While at it, move #syscalls notes to specific code paths wherever it conceptually makes sense.

  We have to open all the file handles we'll ever need before sandboxing:
  - the packet capture file can only be opened once, drop instance numbers from the default path and use the (pre-sandbox) PID instead
  - /proc/net/tcp{,v6} and /proc/net/udp{,v6}, for automatic detection of bound ports in pasta mode, are now opened only once, before sandboxing, and their handles are stored in the execution context
  - the UNIX domain socket for passt is also bound only once, before sandboxing: to reject clients after the first one, instead of closing the listening socket, keep it open, accept and immediately discard new connection if we already have a valid one

  Clarify the (unchanged) behaviour for --netns-only in the man page.

  To actually make passt and pasta processes run in a separate PID namespace, we need to unshare(CLONE_NEWPID) before forking to background (if configured to do so). Introduce a small daemon() implementation, __daemon(), that additionally saves the PID file before forking. While running in foreground, the process itself can't move to a new PID namespace (a process can't change the notion of its own PID): mention that in the man page.

  For some reason, fork() in a detached PID namespace causes SIGTERM and SIGQUIT to be ignored, even if the handler is still reported as SIG_DFL: add a signal handler that just exits.

  We can now drop most of the pasta_child_handler() implementation, that took care of terminating all processes running in the same namespace, if pasta started a shell: the shell itself is now the init process in that namespace, and all children will terminate once the init process exits.

  Issuing 'echo $$' in a detached PID namespace won't return the actual namespace PID as seen from the init namespace: adapt demo and test setup scripts to reflect that.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* passt: Address new clang-tidy warnings from LLVM 13.0.1 (Stefano Brivio, 2022-01-30, 1 file, -2/+2)
  clang-tidy from LLVM 13.0.1 reports some new warnings from these checkers:
  - altera-unroll-loops, altera-id-dependent-backward-branch: ignore for the moment being, add a TODO item
  - bugprone-easily-swappable-parameters: ignore, nothing to do about those
  - readability-function-cognitive-complexity: ignore for the moment being, add a TODO item
  - altera-struct-pack-align: ignore, alignment is forced in protocol headers
  - concurrency-mt-unsafe: ignore for the moment being, add a TODO item

  Fix bugprone-implicit-widening-of-multiplication-result warnings, though, that's doable and they seem to make sense.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* conf: Fix support for --stderr as short option (-e) (Stefano Brivio, 2022-01-27, 1 file, -1/+9)
  I forgot --stderr could also be -e, fix handling.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* seccomp: Add a number of alternate and per-arch syscalls (Stefano Brivio, 2022-01-26, 1 file, -1/+1)
  Depending on the C library, but not necessarily in all the functions we use, statx() might be used instead of stat(), getdents() instead of getdents64(), readlinkat() instead of readlink(), openat() instead of open().

  On aarch64, it's clone() and not fork(), and dup3() instead of dup2() -- just allow the existing alternative instead of dealing with per-arch selections.

  Since glibc commit 9a7565403758 ("posix: Consolidate fork implementation"), we need to allow set_robust_list() for fork()/clone(), even in a single-threaded context.

  On some architectures, epoll_pwait() is provided instead of epoll_wait(), but never both. Same with newfstat() and fstat(), sigreturn() and rt_sigreturn(), getdents64() and getdents(), readlink() and readlinkat(), unlink() and unlinkat(), whereas pipe() might not be available, but pipe2() always is, exclusively or not.

  Seen on Fedora 34: newfstatat() is used on top of fstat().

  syslog() is an actual system call on some glibc/arch combinations, instead of a connect()/send() implementation.

  On ppc64 and ppc64le, _llseek(), recv(), send() and getuid() are used. For ppc64 only: ugetrlimit() for the getrlimit() implementation, plus sigreturn() and fcntl64().

  On s390x, additionally, we need to allow socketcall() (on top of socket()), and sigreturn() also for passt (not just for pasta).

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* conf, pasta: Explicitly pass CLONE_{NEWUSER,NEWNET} to setns() (Stefano Brivio, 2022-01-26, 1 file, -2/+2)
  Only allow the intended types of namespaces to be joined via setns() as a defensive measure.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
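The defensive measure can be sketched with the Linux setns() call, which takes the expected namespace type as its second argument. The wrapper join_userns_netns() is hypothetical and assumes the caller has already opened the two namespace file descriptors, for example from /proc/PID/ns/user and /proc/PID/ns/net. Passing a specific CLONE_* flag instead of 0 makes the kernel reject a descriptor that refers to any other kind of namespace.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Hypothetical wrapper: join a user namespace, then a network namespace,
 * given already-open namespace file descriptors. Returns 0 on success, -1 on
 * failure with errno set by setns().
 */
int join_userns_netns(int fd_user, int fd_net)
{
	/* A non-zero nstype restricts which namespace type may be joined */
	if (setns(fd_user, CLONE_NEWUSER))
		return -1;

	if (setns(fd_net, CLONE_NEWNET))
		return -1;

	return 0;
}
```

With an nstype of 0, any namespace descriptor would be accepted, so a path that unexpectedly pointed at, say, a mount namespace would be joined silently.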