| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
| |
If we define die() as a variadic macro, passing __VA_ARGS__ to err(),
and calling exit() outside err() itself, we can drop the workarounds
introduced in commit 36f0199f6ef4 ("conf, tap: Silence two false
positive invalidFunctionArg from cppcheck").
Suggested-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Andrea reports that with a Fedora 37 guest running on a Fedora 37
host, both using systemd-resolved, with passt connecting them,
running with default options, DNS queries don't work.
systemd-resolved on the host is reachable only at the loopback
address 127.0.0.53.
We advertise the default gateway address to the guest as resolver,
because our local address is of course unreachable from there, which
means we see DNS queries directed to the default gateway, and we
redirect them to 127.0.0.1. However, systemd-resolved doesn't answer
on 127.0.0.1.
To fix this, set @dns_match to the address of the default gateway,
unless a different resolver address is explicitly configured, so that
we know we explicitly have to map DNS queries, in this case, to the
address of the local resolver.
This means that in udp_tap_handler() we need to check, first, if
the destination address of packets matches @dns_match: even if it's
the address of the local gateway, we want to map that to a specific
address, which isn't necessarily 127.0.0.1.
Do the same for IPv6 for consistency, even though IPv6 defines a
single loopback address.
Reported-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Tested-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The logic handling which resolvers we add, and whether to add them,
is getting rather cramped in get_dns(): split it into separate
functions.
No functional changes intended.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Tested-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ouch, I accidentally pushed the previous change without running the
tests:
- we need to check, in conf_ports(), that udp_sock_init()
managed to bind at least a port, not the opposite
- for -T and -U, we have no way to know if we'll manage to bind
the port later, so never report an error for those
Fixes: 3d0de2c1d727 ("conf, tcp, udp: Exit if we fail to bind sockets for all given ports")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
| |
The newly introduced die() calls exit(), but cppcheck doesn't see it
and warns about possibly invalid arguments used after the check which
triggers die(). Add return statements to silence the warnings.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
passt supports ranges of forwarded ports as well as 'all' for TCP and
UDP, so it might be convenient to proceed if we fail to bind only
some of the desired ports.
But if we fail to bind even a single port for a given specification,
we're clearly, unexpectedly, conflicting with another network
service. In that case, report failure and exit.
Reported-by: Yalan Zhang <yalzhang@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
| |
Signed-off-by: Laine Stump <laine@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
| |
...and return void to simplify the caller.
Signed-off-by: Laine Stump <laine@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
| |
Again, it can then be made to return void, simplifying the caller.
Signed-off-by: Laine Stump <laine@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
| |
As with conf_ports, this allows us to make the function return void.
Signed-off-by: Laine Stump <laine@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rather than having conf_ports() (possibly) log an error, and then
letting the caller log the entire usage() message and exit, just log
the errors and exit immediately (using die()).
For some errors, conf_ports would previously not log any specific
message, leaving it up to the user to determine the problem by
guessing. We replace all of those silent returns with die()
(logging a specific error), thus permitting us to make conf_ports()
return void, which simplifies the caller.
While modifying the two callers to conf_ports() to not check for a
return value, we can further simplify the code by removing the check
for a non-null optarg, as that is guaranteed to never happen (due to
prior calls to getopt_long() with "argument required" for all relevant
options - getopt_long() would have already caught this error).
Signed-off-by: Laine Stump <laine@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Nearly all of the calls to usage() in conf() occur immediately after
logging a more detailed error message, and the fact that these errors
are occuring indicates that the user has already seen the passt usage
message (or read the manpage). Spamming the logfile with the complete
contents of the usage message serves only to obscure the more detailed
error message. The only time when the full usage message should be output
is if the user explicitly asks for it with -h (or its synonyms)
As a start to eliminating the excessive calls to usage(), this patch
replaces most calls to err() followed by usage() with a call to die()
instead. A few other usage() calls remain, but their removal involves
bit more nuance that should be properly explained in separate commit
messages.
Signed-off-by: Laine Stump <laine@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
| |
This passes a fully connected stream socket to passt.
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
[sbrivio: reuse fd_tap instead of adding a new descriptor,
imply --one-off on --fd, add to optstring and usage()]
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
tcp_sock_init*() can create either sockets listening on the host, or in
the pasta network namespace (with @ns==1). There are, however, a number
of differences in how these two cases work in practice though. "ns"
sockets are only used in pasta mode, and they always lead to spliced
connections only. The functions are also only ever called in "ns" mode
with a NULL address and interface name, and it doesn't really make sense
for them to be called any other way.
Later changes will introduce further differences in behaviour between these
two cases, so it makes more sense to use separate functions for creating
the ns listening sockets than the regular external/host listening sockets.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
| |
Some style issues and a typo.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Given that we use just the first valid DNS resolver address
configured, or read from resolv.conf(5) on the host, to forward DNS
queries to, in case --dns-forward is used, we don't need to duplicate
dns[] to dns_send[]:
- rename dns_send[] back to dns[]: those are the resolvers we
advertise to the guest/container
- for forwarding purposes, instead of dns[], use a single field (for
each protocol version): dns_host
- and rename dns_fwd to dns_match, so that it's clear this is the
address we are matching DNS queries against, to decide if they need
to be forwarded
Suggested-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
|
|
| |
Reported-by: Paul Holzinger <pholzing@redhat.com>
Fixes: dd09cceaee21 ("Minor improvements to IPv4 netmask handling")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With --dns-forward, if the host has a loopback address configured as
DNS server, we should actually use it to forward queries, but, if
--no-map-gw is passed, we shouldn't offer the same address via DHCP,
NDP and DHCPv6, because it's not going to be reachable.
Problematic configuration:
* systemd-resolved configuring the usual 127.0.0.53 on the host: we
read that from /etc/resolv.conf
* --dns-forward specified with an unrelated address, for example
198.51.100.1
We still want to forward queries to 127.0.0.53, if we receive one
directed to 198.51.100.1, so we can't drop 127.0.0.53 from our list:
we want to use it for forwarding. At the same time, we shouldn't
offer 127.0.0.53 to the guest or container either.
With this change, I'm only covering the case of automatically
configured DNS servers from /etc/resolv.conf. We could extend this to
addresses configured with command-line options, but I don't really
see a likely use case at this point.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Seen in a Google Compute Engine environment with a machine configured
via cloud-init-dhcp, while testing Podman integration for pasta: the
assigned address has a /32 netmask, and there's a default route,
which can be added on the host because there's another route, also
/32, pointing to the default gateway. For example, on the host:
ip -4 address add 10.156.0.2/32 dev eth0
ip -4 route add 10.156.0.1/32 dev eth0
ip -4 route add default via 10.156.0.1
This is not a valid configuration as far as I can tell: if the
address is configured as /32, it shouldn't be used to reach a gateway
outside its derived netmask. However, Linux allows that, and
everything works.
The problem comes when pasta --config-net sources address and default
route from the host, and it can't configure the route in the target
namespace because the gateway is invalid. That is, we would skip
configuring the first route in the example, which results in the
equivalent of doing:
ip -4 address add 10.156.0.2/32 dev eth0
ip -4 route add default via 10.156.0.1
where, at this point, 10.156.0.1 is unreachable, and hence invalid
as a gateway.
Sourcing more routes than just the default is doable, but probably
undesirable: pasta users want to provide connectivity to a container,
not reflect exactly whatever trickery is configured on the host.
Add a consistency check and an adjustment: if the configured default
gateway is not reachable, shrink the given netmask until we can reach
it.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We recently corrected some errors handling the endianness of IPv4
addresses. These are very easy errors to make since although we mostly
store them in network endianness, we sometimes need to manipulate them in
host endianness.
To reduce the chances of making such mistakes again, change to always using
a (struct in_addr) instead of a bare in_addr_t or uint32_t to store network
endian addresses. This makes it harder to accidentally do arithmetic or
comparisons on such addresses as if they were host endian.
We introduce a number of IN4_IS_ADDR_*() helpers to make it easier to
directly work with struct in_addr values. This has the additional benefit
of making the IPv4 and IPv6 paths more visually similar.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This macro checks if an IPv4 address is in the loopback network
(127.0.0.0/8). There are two places where we open code an identical check,
use the macro instead.
There are also a number of places we specifically exclude the loopback
address (127.0.0.1), but we should actually be excluding anything in the
loopback network. Change those sites to use the macro as well.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are several minor problems with our parsing of IPv4 netmasks (-n).
First, we don't reject nonsensical netmasks like 0.255.0.255. Address this
structurally by using prefix length instead of netmask as the primary
variable, only converting (and validating) when we need to. This has the
added benefit of making some things more uniform with the IPv6 path.
Second, when the user specifies a prefix length, we truncate the output
from strtol() to an integer, which means we would treat -n 4294967320 as
valid (equivalent to 24). Fix types to check for this.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The INADDR_LOOPBACK constant is in host endianness, and similarly the
IN_MULTICAST macro expects a host endian address. However, there are some
places in passt where we use those with network endian values. This means
that passt will incorrectly allow you to set 127.0.0.1 or a multicast
address as the guest address or DNS forwarding address. Add the necessary
conversions to correct this.
INADDR_ANY and INADDR_BROADCAST logically behave the same way, although
because they're palindromes it doesn't have an effect in practice. Change
them to be logically correct while we're there, though.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Having -f implied by -d (and --trace) usually saves some typing, but
debug mode in background (with a log file) is quite useful if pasta
is started by Podman, and is probably going to be handy for passt
with libvirt later, too.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 84fec4e998b6 ("Clean up parsing of port ranges") drops the
strspn() call before the parsing of excluded port ranges, because now
we're checking against any stray characters at every step.
However, that also has the effect of passing ~ as first character to
the new parse_port_range(), which makes no sense: we already checked
that ~ is the first character before the call, so skip it.
Alona reported this output:
Invalid port specifier ~15000,~15001,~15006,~15008,~15020,~15021,~15090
while the whole specifier is indeed valid.
Reported-by: Alona Paz <alkaplan@redhat.com>
Fixes: 84fec4e998b6 ("Clean up parsing of port ranges")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Even if CAP_NET_BIND_SERVICE is granted, we'll lose the capability in
the target user namespace as we isolate the process, which means
we're unable to bind to low ports at that point.
Bind inbound ports, and only those, before isolate_user(). Keep the
handling of outbound ports (for pasta mode only) after the setup of
the namespace, because that's where we'll bind them.
To this end, initialise the netlink socket for the init namespace
before isolate_user() as well, as we actually need to know the
addresses of the upstream interface before binding ports, in case
they're not explicitly passed by the user.
As we now call nl_sock_init() twice, checking its return code from
conf() twice looks a bit heavy: make it exit(), instead, as we
can't do much if we don't have netlink sockets.
While at it:
- move the v4_only && v6_only options check just after the first
option processing loop, as this is more strictly related to
option parsing proper
- update the man page, explaining that CAP_NET_BIND_SERVICE is
*not* the preferred way to bind ports, because passt and pasta
can be abused to allow other processes to make effective usage
of it. Add a note about the recommended sysctl instead
- simplify nl_sock_init_do() now that it's called once for each
case
Reported-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When in passt mode, or pasta mode spawning a command, we create a userns
for ourselves. This is used both to isolate the pasta/passt process itself
and to run the spawned command, if any.
Since eed17a47 "Handle userns isolation and dropping root at the same time"
we've handled both cases the same, configuring the UID and GID mappings in
the new userns to map whichever UID we're running as to root within the
userns.
This mapping is desirable when spawning a shell or other command, so that
the user gets a root shell with reasonably clear abilities within the
userns and netns. It's not necessarily essential, though. When not
spawning a shell, it doesn't really have any purpose: passt itself doesn't
need to be root and can operate fine with an unmapped user (using some of
the capabilities we get when entering the userns instead).
Configuring the uid_map can cause problems if passt is running with any
capabilities in the initial namespace, such as CAP_NET_BIND_SERVICE to
allow it to forward low ports. In this case the kernel makes files in
/proc/pid owned by root rather than the starting user to prevent the user
from interfering with the operation of the capability-enhanced process.
This includes uid_map meaning we are not able to write to it.
Whether this behaviour is correct in the kernel is debatable, but in any
case we might as well avoid problems by only initializing the user mappings
when we really want them.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current implementation of drop_caps() doesn't really work because it
attempts to drop capabilities from the bounding set. That's not the set
that really matters, it's about limiting the abilities of things we might
later exec() rather than our own capabilities. It also requires
CAP_SETPCAP which we won't usually have.
Replace it with a new version which uses setcap(2) to drop capabilities
from the effective and permitted sets. For now we leave the inheritable
set as is, since we don't want to preclude the user from passing
inheritable capabilities to the command spawed by pasta.
Correctly dropping caps reveals that we were relying on some capabilities
we'd supposedly dropped. Re-divide the dropping of capabilities between
isolate_initial(), isolate_user() and isolate_prefork() to make this work.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
| |
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
|
| |
Fixes: 745a9ba4284c ("pasta: By default, quit if filesystem-bound net namespace goes away")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since kernel version 5.7, commit c427bfec18f2 ("net: core: enable
SO_BINDTODEVICE for non-root users"), we can bind sockets to
interfaces, if they haven't been bound yet (as in bind()).
Introduce an optional interface specification for forwarded ports,
prefixed by %, that can be passed together with an address.
Reported use case: running local services that use ports we want
to have externally forwarded:
https://github.com/containers/podman/issues/14425
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
|
|
|
| |
This is practical to avoid explicit lifecycle management in users,
e.g. libvirtd, and is trivial to implement.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
|
|
| |
Add a --version option displaying that, and also include this
information in the log files.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In some environments, such as KubeVirt pods, we might not have a
system logger available. We could choose to run in foreground, but
this takes away the convenient synchronisation mechanism derived from
forking to background when interfaces are ready.
Add optional logging to file with -l/--log-file and --log-size.
Unfortunately, this means we need to duplicate features that are more
appropriately implemented by a system logger, such as rotation. Keep
that reasonably simple, by using fallocate() with range collapsing
where supported (Linux kernel >= 3.15, extent-based ext4 and XFS) and
falling back to an unsophisticated block-by-block moving of entries
toward the beginning of the file once we reach the (mandatory) size
limit.
While at it, clarify the role of LOG_EMERG in passt.c.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This originated as a result of copy and paste to introduce a second
stage for processing options related to port forwarding, has already
bitten David in the past, and just gave me hours of fun.
As a matter of fact, the second set of optstring assignments was
already incorrect, but it didn't matter because the first one was
more restrictive, not allowing optional arguments for -P, -D, -S.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
|
|
|
| |
Logging to file is going to add some further complexity that we don't
want to squeeze into util.c.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
|
|
|
|
|
|
| |
strtok() is non-reentrant and old-fashioned, so cppcheck would complains
about its use in conf.c if it weren't suppressed. We're single threaded
and strtok() is convenient though, so it's not really worth reworking at
this time. Convert this to an inline suppression so it's adjacent to the
code its annotating.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The counter 'i' is used in a number of places in conf_ports(), but in one
of those we unnecessarily shadow it in an inner scope. We could re-use the
same 'i' every time, but each use is logically separate, so instead remove
the outer declaration and declare it locally in each of the clauses where
we need it.
While we're there change it from a signed to unsigned int, since it's used
to iterate over port numbers which are generally treated as unsigned.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
conf_runas() handles several of the different possible cases for the
--runas argument in a slightly odd order. Although it can parse both
numeric UIDs/GIDs and user/group names, it can't parse a numeric UID
combined with a group name or vice versa. That's not obviously useful, but
it's slightly surprising gap to have.
Rework the parsing to be more systematic: first split the option into
user and (optional) group parts, then separately parse each part as either
numeric or a name. As a bonus this removes some clang-tidy warnings.
While we're there also add cppcheck suppressions for getpwnam() and
getgrnam(). It complains about those because they're not reentrant.
passt is single threaded though, and is always likely to be during
this initialization code, even if we multithread later.
There were some existing suppressions for these in the cppcheck
invocation but they're no longer up to date. Replace them with inline
suppressions which, being next to the code, are more likely to stay
correct.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
conf_ports() parses ranges of ports for the -t, -u, -T and -U options.
The code is quite difficult to the follow, to the point that clang-tidy
and cppcheck disagree on whether one of the pointers can be NULL at some
points.
Rework the code with the use of two new helper functions:
* parse_port_range() operates a bit like strtoul(), but can parse a whole
port range specification (e.g. '80' or '1000-1015')
* next_chunk() does the necessary wrapping around strchr() to advance to
just after the next given delimiter, while cleanly handling if there
are no more delimiters
The new version is easier to follow, and also removes some cppcheck
warnings.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Port numbers (for both TCP and UDP) are 16-bit, and so fit exactly into a
'short'. USHRT_MAX is therefore the maximum port number and this is widely
used in the code. Unfortunately, a lot of those places don't actually
want the maximum port number (USHRT_MAX == 65535), they want the total
number of ports (65536). This leads to a number of potentially nasty
consequences:
* We have buffer overruns on the port_fwd::delta array if we try to use
port 65535
* We have similar potential overruns for the tcp_sock_* arrays
* Interestingly udp_act had the correct size, but we can calculate it in
a more direct manner
* We have a logical overrun of the ports bitmap as well, although it will
just use an unused bit in the last byte so isnt harmful
* Many loops don't consider port 65535 (which does mitigate some but not
all of the buffer overruns above)
* In udp_invert_portmap() we incorrectly compute the reverse port
translation for return packets
Correct all these by using a new NUM_PORTS defined explicitly for this
purpose.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
|
|
|
|
|
|
| |
Port numbers are unsigned values, but we're storing them in (signed) int
variables in some places. This isn't actually harmful, because int is
large enough to hold the entire range of ports. However in places we don't
want to use an in_port_t (usually to avoid overflow on the last iteration
of a loop) it makes more conceptual sense to use an unsigned int. This will
also avoid some problems with later cleanups.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
|
|
|
|
| |
conf_ports() switches on the optname argument to select the target array
for several updates. Now that all these maps are in a common structure, we
can simplify by just passing in a pointer to the whole struct port_fwd to
update.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
|
|
|
|
|
| |
Now that we've delayed initialization of the UDP specific "reverse" map
until udp_init(), the only difference between the various 'remap' functions
used in conf_ports() is which array they target. So, simplify by open
coding the logic into conf_ports() with a pointer to the correct mapping
array.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The configuration for how to forward ports in and out of the guest/ns is
divided between several different variables. For each connect direction
and protocol we have a mode in the udp/tcp context structure, a bitmap
of which ports to forward also in the context structure and an array of
deltas to apply if the outward facing and inward facing port numbers are
different. This last is a separate global variable, rather than being in
the context structure, for no particular reason. UDP also requires an
additional array which has the reverse mapping used for return packets.
Consolidate these into a re-used substructure in the context structure.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
enum conf_port_type is local to conf.c and is used to track the port
forwarding mode during configuration. We don't keep it around in the
context structure, however the 'init_detect_ports' and 'ns_detect_ports'
fields in the context are based solely on this. Rather than changing
encoding, just include the forwarding mode into the context structure.
Move the type definition to a new port_fwd.h, which is kind of trivial at
the moment but will have more stuff later.
While we're there, "conf_port_type" doesn't really convey that this enum is
describing how port forwarding is configured. Rename it to port_fwd_mode.
The variables (now fields) of this type also have mildly confusing names
since it's not immediately obvious whether 'ns' and 'init' refer to the
source or destination of the packets. Use "in" (host to guest / init to
ns) and "out" (guest to host / ns to init) instead.
This has the added bonus that we no longer have locals 'udp_init' and
'tcp_init' which shadow global functions.
In addition, add a typedef 'port_fwd_map' for a bitmap of each port number,
which is used in several places.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reported by David but also by Coverity (CWE-119):
In conf_ports: Out-of-bounds access to a buffer
...not in practice, because the allocation size is rounded up
anyway, but not nice either.
Reported-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently --userns is only allowed when pasta is attaching to an existing
netns or PID, and is prohibited when creating a new netns by spawning a
command or shell.
With the new handling of userns, this check isn't neccessary. I'm not sure
if there's any use case for --userns with a spawned command, but it's
strictly more flexible and requires zero extra code, so we might as well.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
passt/pasta can interact with user namespaces in a number of ways:
1) With --netns-only we'll remain in our original user namespace
2) With --userns or a PID option to pasta we'll join either the given
user namespace or that of the PID
3) When pasta spawns a shell or command we'll start a new user namespace
for the command and then join it
4) With passt we'll create a new user namespace when we sandbox()
ourself
However (3) and (4) turn out to have essentially the same effect. In both
cases we create one new user namespace. The spawned command starts there,
and passt/pasta itself will live there from sandbox() onwards.
Because of this, we can simplify user namespace handling by moving the
userns handling earlier, to the same point we drop root in the original
namespace. Extend the drop_user() function to isolate_user() which does
both.
After switching UID and GID in the original userns, isolate_user() will
either join or create the userns we require. When we spawn a command with
pasta_start_ns()/pasta_setup_ns() we no longer need to create a userns,
because we're already made one. sandbox() likewise no longer needs to
create (or join) an userns because we're already in the one we need.
We no longer need c->pasta_userns_fd, since the fd is only used locally
in isolate_user(). Likewise we can replace c->netns_only with a local
in conf(), since it's not used outside there.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
conf_ns_open() opens file descriptors for the namespaces pasta needs, but
it doesnt really have anything to do with configuration any more. For
better clarity, move it to pasta.c and rename it pasta_open_ns(). This
makes the symmetry between it and pasta_start_ns() more clear, since these
represent the two basic ways that pasta can operate, either attaching to
an existing namespace/process or spawning a new one.
Since its no longer validating options, the errors it could return
shouldn't cause a usage message. Just exit directly with an error instead.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|