author | Stefano Brivio <sbrivio@redhat.com> | 2023-05-21 15:03:31 +0200 |
---|---|---|
committer | Stefano Brivio <sbrivio@redhat.com> | 2023-05-23 16:13:28 +0200 |
commit | 770d1a4502dd214c75b1418b49c0f51fdbb2ad8e (patch) | |
tree | 15940dba5c5556877e54375c32a2115503a65bac | |
parent | b0e450aa8500648cc89ca4f68b6a6155267ccd9d (diff) | |
isolation: Initially Keep CAP_SETFCAP if running as UID 0 in non-init
If pasta spawns a child process while running as UID 0, which is only
allowed from a non-init namespace, we need to keep CAP_SETFCAP before
pasta_start_ns() is called: otherwise, starting from Linux 5.12, we
won't be able to update /proc/self/uid_map with the intended mapping
(from 0 to 0). See user_namespaces(7).
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
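
The "non-init namespace" condition in the message above can be illustrated with a small check on the contents of /proc/self/uid_map. This is only a sketch: user_namespaces(7) shows the initial user namespace exposing a single identity mapping of the full 32-bit range ("0 0 4294967295"), and the helper below treats anything else as non-init. The name uid_map_is_init() is hypothetical and this is not passt's ns_is_init(); it takes the file contents as a string so it can be exercised without reading /proc.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper (not passt's ns_is_init()): return true if the given
 * uid_map contents consist of exactly one extent mapping UID 0 to UID 0
 * over the full 32-bit range, as shown for the initial user namespace in
 * user_namespaces(7). This is a heuristic, not an authoritative test.
 */
bool uid_map_is_init(const char *map)
{
	unsigned long long inside, outside, count;
	int used = 0;

	/* "%n" records how much input was consumed, including trailing
	 * whitespace, and is not counted in the return value.
	 */
	if (sscanf(map, "%llu %llu %llu %n",
		   &inside, &outside, &count, &used) != 3)
		return false;

	/* Anything left over means more than one mapping extent */
	if (map[used] != '\0')
		return false;

	return inside == 0 && outside == 0 && count == 4294967295ULL;
}
```

A caller would read /proc/self/uid_map into a buffer and pass it in; for example, uid_map_is_init("0 1000 1\n") is false, which is the kind of situation where the message says CAP_SETFCAP has to be kept if the process is also running as UID 0.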
-rw-r--r-- | isolation.c | 17 |
1 files changed, 14 insertions, 3 deletions
diff --git a/isolation.c b/isolation.c
index 5f89047..19932bf 100644
--- a/isolation.c
+++ b/isolation.c
@@ -177,6 +177,8 @@ static void clamp_caps(void)
  */
 void isolate_initial(void)
 {
+	uint64_t keep;
+
 	/* We want to keep CAP_NET_BIND_SERVICE in the initial
 	 * namespace if we have it, so that we can forward low ports
 	 * into the guest/namespace
@@ -193,9 +195,18 @@ void isolate_initial(void)
 	 * further capabilites in isolate_user() and
 	 * isolate_prefork().
 	 */
-	drop_caps_ep_except(BIT(CAP_NET_BIND_SERVICE) |
-			    BIT(CAP_SETUID) | BIT(CAP_SETGID) |
-			    BIT(CAP_SYS_ADMIN) | BIT(CAP_NET_ADMIN));
+	keep = BIT(CAP_NET_BIND_SERVICE) | BIT(CAP_SETUID) | BIT(CAP_SETGID) |
+	       BIT(CAP_SYS_ADMIN) | BIT(CAP_NET_ADMIN);
+
+	/* Since Linux 5.12, if we want to update /proc/self/uid_map to create
+	 * a mapping from UID 0, which only happens with pasta spawning a child
+	 * from a non-init user namespace (pasta can't run as root), we need to
+	 * retain CAP_SETFCAP too.
+	 */
+	if (!ns_is_init() && !geteuid())
+		keep |= BIT(CAP_SETFCAP);
+
+	drop_caps_ep_except(keep);
 }
 
 /**
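
For readers who want to reason about the new mask in isolation, the logic of the hunk can be approximated as a pure function. This is a minimal sketch under stated assumptions: cap_bit() stands in for passt's BIT() macro, initial_keep_mask() is a hypothetical name, and the namespace and UID checks are passed in as parameters instead of calling ns_is_init() and geteuid(). Nothing here drops capabilities; it only computes which bits would be kept.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/types.h>
#include <linux/capability.h>

/* Bit for capability number cap in a 64-bit mask (stand-in for BIT()) */
static uint64_t cap_bit(int cap)
{
	assert(cap >= 0 && cap < 64);
	return UINT64_C(1) << cap;
}

/* Hypothetical pure variant of the mask computed in isolate_initial():
 * in_init_ns and euid replace the ns_is_init() and geteuid() calls.
 */
uint64_t initial_keep_mask(bool in_init_ns, uid_t euid)
{
	uint64_t keep = cap_bit(CAP_NET_BIND_SERVICE) | cap_bit(CAP_SETUID) |
			cap_bit(CAP_SETGID) | cap_bit(CAP_SYS_ADMIN) |
			cap_bit(CAP_NET_ADMIN);

	/* Only UID 0 in a non-init user namespace keeps CAP_SETFCAP */
	if (!in_init_ns && euid == 0)
		keep |= cap_bit(CAP_SETFCAP);

	return keep;
}
```

Taking the two conditions as arguments keeps the sketch testable without privileges: initial_keep_mask(false, 0) includes the CAP_SETFCAP bit, while initial_keep_mask(true, 0) and initial_keep_mask(false, 1000) do not.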