path: root/README.md
Commit message | Author | Age | Files | Lines
* README: Add links to Debian package tracker | Stefano Brivio | 2022-11-16 | 1 | -7/+10
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
  Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
* passt, qrap, README: Update notes and documentation for AF_UNIX support in qemu | Stefano Brivio | 2022-11-04 | 1 | -6/+4
  We can't get rid of qrap quite yet, but at least we should start telling users it's not going to be needed anymore starting from qemu 7.2.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* README: Add Podman, vhost-user links, and links to Bugzilla queries | Stefano Brivio | 2022-10-27 | 1 | -4/+8
  Unfortunately Bugzilla doesn't enable sharing of queries to unregistered users:

  https://bugzilla.mozilla.org/show_bug.cgi?id=400063

  ...but we can still use ugly search links.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* log, conf: Add support for logging to file | Stefano Brivio | 2022-10-14 | 1 | -1/+1
  In some environments, such as KubeVirt pods, we might not have a system logger available. We could choose to run in foreground, but this takes away the convenient synchronisation mechanism derived from forking to background when interfaces are ready.

  Add optional logging to file with -l/--log-file and --log-size.

  Unfortunately, this means we need to duplicate features that are more appropriately implemented by a system logger, such as rotation. Keep that reasonably simple, by using fallocate() with range collapsing where supported (Linux kernel >= 3.15, extent-based ext4 and XFS) and falling back to an unsophisticated block-by-block moving of entries toward the beginning of the file once we reach the (mandatory) size limit.

  While at it, clarify the role of LOG_EMERG in passt.c.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
  Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
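The fallback rotation mentioned in the commit above (moving entries toward the beginning of the file once the size limit is reached) can be illustrated with a rough sketch. This is not the C code from passt: the function name `drop_head`, its parameters and the block size are assumptions made for the example, and it cuts at a plain byte offset rather than at a log entry boundary.

```python
import os


def drop_head(path, cut, block=4096):
    """Discard the first `cut` bytes of `path` by moving the rest forward.

    Hypothetical helper: copies the tail of the file toward offset 0 one
    block at a time, then truncates. Returns the new file size.
    """
    with open(path, "r+b") as f:
        size = os.fstat(f.fileno()).st_size
        if cut >= size:
            f.truncate(0)
            return 0
        read_at, write_at = cut, 0
        while read_at < size:
            f.seek(read_at)
            chunk = f.read(min(block, size - read_at))
            if not chunk:
                break
            f.seek(write_at)
            f.write(chunk)
            read_at += len(chunk)
            write_at += len(chunk)
        f.flush()
        f.truncate(write_at)
        return write_at
```

Where range collapsing is available, a single fallocate() call would replace the copy loop, which is why the commit treats the loop only as a fallback.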
* README: Add missing parenthesis in Try It section | Stefano Brivio | 2022-09-24 | 1 | -1/+1
  Signed-off-by: Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* README: Drop excess whitespace in Try It section | Stefano Brivio | 2022-09-24 | 1 | -2/+2
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* README: Add legend for Features section | Stefano Brivio | 2022-09-24 | 1 | -0/+3
  As suggested by David: those emojis might not be entirely obvious.

  Suggested-by: David Gibson <david@gibson.dropbear.id.au>
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* README: Fix paragraph in Try It section of passt | Stefano Brivio | 2022-09-24 | 1 | -3/+4
  The qemu patch isn't mentioned there anymore: replace reference with a link.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* README: Fix indentation in "Try It" section | Stefano Brivio | 2022-09-24 | 1 | -3/+3
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* README: Point openSUSE links to Dario's OBS repository | Stefano Brivio | 2022-09-24 | 1 | -4/+4
  ...instead of my Copr. It's also not official yet, but surely more appropriate now.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* README: Fix misspellings of openSUSE | Stefano Brivio | 2022-09-24 | 1 | -4/+4
  For some reason, I used a capital O everywhere.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* README: Update Availability and Try It sections with new packages | Stefano Brivio | 2022-09-22 | 1 | -25/+32
  We now have official packages for Fedora, unofficial (Fedora Copr) for other common RPM-based distributions, and the existing packages with static builds for Debian, and for other RPM-based distributions.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* README: Add link to Copr repositories | Stefano Brivio | 2022-08-18 | 1 | -0/+8
  These have packages covering all recent versions of CentOS Stream, EPEL, Fedora, Mageia and OpenSUSE Tumbleweed.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* doc: Rewrite demo script | Stefano Brivio | 2022-08-18 | 1 | -16/+13
  The original demo script was written when pasta wasn't a thing yet, so it needed to run as root, set up a veth pair, and configure addresses and routes by itself.

  Now pasta can do all that for us, and become part of the demo as well. Further, extend it to start qemu, optionally preparing a basic demo image with mbuto (https://mbuto.sh), and execute one logical step at a time, for clarity.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* passt: Allow exit_group() system call in seccomp profiles | Stefano Brivio | 2022-07-14 | 1 | -1/+1
  We handle SIGQUIT and SIGTERM calling exit(), which is usually implemented with the exit_group() system call.

  If we don't allow exit_group(), we'll get a SIGSYS while handling SIGQUIT and SIGTERM, which means a misleading non-zero exit code.

  Reported-by: Wenli Quan <wquan@redhat.com>
  Link: https://bugzilla.redhat.com/show_bug.cgi?id=2101990
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* README: Fix links to static builds | Stefano Brivio | 2022-06-08 | 1 | -2/+2
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* README: Fix link to contrib/debian | Stefano Brivio | 2022-03-30 | 1 | -1/+1
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* README: Drop red notice about early development phase | Stefano Brivio | 2022-03-30 | 1 | -3/+1
  Last famous words: it should be tested enough by now.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* contrib: Add example of Debian package files | Stefano Brivio | 2022-03-30 | 1 | -1/+3
  ...using dh_apparmor to ship and apply AppArmor profiles. Tried on current Debian testing (Bookworm, 12).

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* tap: Allow ioctl() and openat() for tap_ns_tun() re-initialisation | Stefano Brivio | 2022-03-30 | 1 | -1/+1
  If the tun interface disappears, we'll call tap_ns_tun() after the seccomp profile is applied: add ioctl() and openat() to it.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* passt, pasta: Add examples of SELinux policy modules | Stefano Brivio | 2022-03-29 | 1 | -0/+2
  These should cover any reasonably common use case in distributions.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* treewide: Packet abstraction with mandatory boundary checks | Stefano Brivio | 2022-03-29 | 1 | -1/+1
  Implement a packet abstraction providing boundary and size checks based on packet descriptors: packets stored in a buffer can be queued into a pool (without storage of its own), and data can be retrieved referring to an index in the pool, specifying offset and length.

  Checks ensure data is not read outside the boundaries of buffer and descriptors, and that packets added to a pool are within the buffer range with valid offset and indices.

  This implies a wider rework: usage of the "queueing" part of the abstraction mostly affects tap_handler_{passt,pasta}() functions and their callees, while the "fetching" part affects all the guest or tap facing implementations: TCP, UDP, ICMP, ARP, NDP, DHCP and DHCPv6 handlers.

  Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
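The packet abstraction described in the commit above can be pictured with a small sketch: a pool holds only descriptors (offset, length) into a shared buffer, and both queueing and fetching are bounds-checked. The class and method names (`PacketPool`, `add`, `get`) and the decision to reject zero-length requests are assumptions for illustration, not the names or exact rules used in the passt sources.

```python
class PacketPool:
    """Descriptors (offset, length) into a shared buffer; no storage of its own."""

    def __init__(self, buf, capacity):
        self.buf = memoryview(buf)
        self.capacity = capacity
        self.desc = []

    def add(self, offset, length):
        """Queue a packet; refuse it unless it lies entirely inside the buffer."""
        if len(self.desc) >= self.capacity:
            return False
        if offset < 0 or length <= 0 or offset + length > len(self.buf):
            return False
        self.desc.append((offset, length))
        return True

    def get(self, index, offset, length):
        """Return `length` bytes at `offset` within packet `index`, or None."""
        if not 0 <= index < len(self.desc):
            return None
        start, size = self.desc[index]
        if offset < 0 or length <= 0 or offset + length > size:
            return None
        return self.buf[start + offset:start + offset + length]
```

A protocol handler would then call `get()` for each header it wants to parse and treat `None` as a malformed packet, so no handler does its own pointer arithmetic on the buffer.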
* README: Update Interfaces and Availability sections | Stefano Brivio | 2022-03-29 | 1 | -4/+9
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* README: Avoid "here" links | Stefano Brivio | 2022-03-29 | 1 | -20/+19
  They look a bit lame: rephrase sentences to avoid them.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* tcp: Rework timers to use timerfd instead of periodic bitmap scan | Stefano Brivio | 2022-03-29 | 1 | -3/+1
  With a lot of concurrent connections, the bitmap scan approach is not really sustainable.

  Switch to per-connection timerfd timers, set based on events and on two new flags, ACK_FROM_TAP_DUE and ACK_TO_TAP_DUE. Timers are added to the common epoll list, and implement the existing timeouts.

  While at it, drop the CONN_ prefix from flag names, otherwise they get quite long, and fix the logic to decide if a connection has a local, possibly unreachable endpoint: we shouldn't go through the rest of tcp_conn_from_tap() if we reset the connection due to a successful bind(2), and we'll get EACCES if the port number is low.

  Suggested by: Stefan Hajnoczi <stefanha@redhat.com>
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* README: Make it somewhat readable on mobile devices | Stefano Brivio | 2022-03-04 | 1 | -35/+84
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* hooks, README: gzipped js snippets, webp alternatives for png | Stefano Brivio | 2022-03-02 | 1 | -2/+10
  Upload gzipped js snippets for usage with gzip_static in nginx or equivalent. Convert png drawings to webp for smaller size, use them as alternatives in README.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* README: Don't preload CI recording, show poster from end of run | Stefano Brivio | 2022-03-01 | 1 | -1/+1
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* README: s/guest/namespace/ in pasta "Try it" section | Stefano Brivio | 2022-03-01 | 1 | -1/+1
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* Makefile, hooks: Static target precondition for pkgs, copy .avx2 builds | Stefano Brivio | 2022-03-01 | 1 | -2/+0
  Convenience packages are anyway built from static builds.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* passt, pasta: Run-time selection of AVX2 build | Stefano Brivio | 2022-02-28 | 1 | -17/+11
  Build-time selection of AVX2 flags and routines is not practical for distributions, but limiting AVX2 usage to checksum routines with specific run-time detection doesn't allow for easy performance gains from auto-vectorisation of batched packet handling routines.

  For x86_64, build non-AVX2 and AVX2 binaries, and implement a simple wrapper replacing the current executable with the AVX2 build if it's available, and if AVX2 is supported by the current CPU.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
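The wrapper logic in the commit above (use the AVX2 build only if it exists and the CPU supports AVX2) can be sketched as a pure decision function. Reading CPU flags from /proc/cpuinfo-style text is an assumption made for this example and is not necessarily how passt detects AVX2; the `.avx2` suffix follows the build naming mentioned in the preceding Makefile commit, and the function names are hypothetical.

```python
import os


def cpu_has_avx2(cpuinfo_text):
    """True if any "flags" line of /proc/cpuinfo-style text lists avx2."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and "avx2" in value.split():
            return True
    return False


def pick_build(path, cpuinfo_text, exists=os.path.exists):
    """Choose the AVX2 build next to `path` when both CPU and file allow it."""
    candidate = path + ".avx2"
    if cpu_has_avx2(cpuinfo_text) and exists(candidate):
        return candidate
    return path
```

A wrapper built on this would replace the running process with the chosen binary, for instance through `os.execv(pick_build(...), sys.argv)`, so that the rest of the program never needs per-routine feature checks.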
* README: Fix demo div grid layout | Stefano Brivio | 2022-02-23 | 1 | -17/+23
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* demo, ci: Switch to asciinema(1) for terminal recordings | Stefano Brivio | 2022-02-22 | 1 | -14/+38
  For demos, cool-retro-term(1) looked fancier, but several threads of that and ffmpeg(1) are just messing up with performance testing.

  The CI videos started getting really big as well, and they were difficult to read.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* test: Add demo for Podman with pasta | Stefano Brivio | 2022-02-22 | 1 | -3/+8
  ...showing setup steps, some peculiarities as --net option, and a general side-to-side comparison with slirp4netns(1), including "quick" TCP and UDP throughput and latency benchmarks.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* README, hooks: Build HTML man page on push, add a link | Stefano Brivio | 2022-02-21 | 1 | -0/+2
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* passt, pasta: Namespace-based sandboxing, defer seccomp policy application | Stefano Brivio | 2022-02-21 | 1 | -2/+3
  To reach (at least) a conceptually equivalent security level as implemented by --enable-sandbox in slirp4netns, we need to create a new mount namespace and pivot_root() into a new (empty) mountpoint, so that passt and pasta can't access any filesystem resource after initialisation.

  While at it, also detach IPC, PID (only for passt, to prevent vulnerabilities based on the knowledge of a target PID), and UTS namespaces.

  With this approach, if we apply the seccomp filters right after the configuration step, the number of allowed syscalls grows further. To prevent this, defer the application of seccomp policies after the initialisation phase, before the main loop, that's where we expect bad things to happen, potentially. This way, we get back to 22 allowed syscalls for passt and 34 for pasta, on x86_64.

  While at it, move #syscalls notes to specific code paths wherever it conceptually makes sense.

  We have to open all the file handles we'll ever need before sandboxing:

  - the packet capture file can only be opened once, drop instance numbers from the default path and use the (pre-sandbox) PID instead
  - /proc/net/tcp{,v6} and /proc/net/udp{,v6}, for automatic detection of bound ports in pasta mode, are now opened only once, before sandboxing, and their handles are stored in the execution context
  - the UNIX domain socket for passt is also bound only once, before sandboxing: to reject clients after the first one, instead of closing the listening socket, keep it open, accept and immediately discard new connection if we already have a valid one

  Clarify the (unchanged) behaviour for --netns-only in the man page.

  To actually make passt and pasta processes run in a separate PID namespace, we need to unshare(CLONE_NEWPID) before forking to background (if configured to do so). Introduce a small daemon() implementation, __daemon(), that additionally saves the PID file before forking. While running in foreground, the process itself can't move to a new PID namespace (a process can't change the notion of its own PID): mention that in the man page.

  For some reason, fork() in a detached PID namespace causes SIGTERM and SIGQUIT to be ignored, even if the handler is still reported as SIG_DFL: add a signal handler that just exits.

  We can now drop most of the pasta_child_handler() implementation, that took care of terminating all processes running in the same namespace, if pasta started a shell: the shell itself is now the init process in that namespace, and all children will terminate once the init process exits.

  Issuing 'echo $$' in a detached PID namespace won't return the actual namespace PID as seen from the init namespace: adapt demo and test setup scripts to reflect that.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
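One detail from the commit above, keeping the listening UNIX domain socket open and discarding extra clients instead of closing the listener, can be sketched as follows. The class `Listener` and its method `on_connection` are hypothetical names for this illustration; the real implementation is C code driven by epoll events.

```python
import socket


class Listener:
    """Listening UNIX domain socket that serves one client at a time."""

    def __init__(self, path):
        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.sock.bind(path)   # bound only once, before any sandboxing
        self.sock.listen(1)
        self.client = None

    def on_connection(self):
        """Accept a pending connection; keep it only if no client is active."""
        conn, _ = self.sock.accept()
        if self.client is not None:
            conn.close()       # discard right away, the listener stays open
            return False
        self.client = conn
        return True
```

Because the socket is never re-bound, this pattern keeps working after the process has lost access to the filesystem path it was bound to.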
* test: Add distribution tests for several architectures and kernel versions | Stefano Brivio | 2022-01-28 | 1 | -2/+2
  The new tests check build and a simple case with pasta sending a short message in both directions (namespace to init, init to namespace).

  Tests cover a mix of Debian, Fedora, OpenSUSE and Ubuntu combinations on aarch64, i386, ppc64, ppc64le, s390x, x86_64.

  Builds tested starting from approximately glibc 2.19, gcc 4.7, and actual functionality approximately from 4.4 kernels, glibc 2.25, gcc 4.8, all the way up to current glibc/gcc/kernel versions.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* README: Fix link to IGMP/MLD proxy ticket | Stefano Brivio | 2022-01-28 | 1 | -1/+1
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* README: Fix anchor for Performance section | Stefano Brivio | 2022-01-27 | 1 | -1/+1
  It shouldn't refer to the subsection under "Features".

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* seccomp: Add a number of alternate and per-arch syscalls | Stefano Brivio | 2022-01-26 | 1 | -1/+1
  Depending on the C library, but not necessarily in all the functions we use, statx() might be used instead of stat(), getdents() instead of getdents64(), readlinkat() instead of readlink(), openat() instead of open().

  On aarch64, it's clone() and not fork(), and dup3() instead of dup2() -- just allow the existing alternative instead of dealing with per-arch selections.

  Since glibc commit 9a7565403758 ("posix: Consolidate fork implementation"), we need to allow set_robust_list() for fork()/clone(), even in a single-threaded context.

  On some architectures, epoll_pwait() is provided instead of epoll_wait(), but never both. Same with newfstat() and fstat(), sigreturn() and rt_sigreturn(), getdents64() and getdents(), readlink() and readlinkat(), unlink() and unlinkat(), whereas pipe() might not be available, but pipe2() always is, exclusively or not.

  Seen on Fedora 34: newfstatat() is used on top of fstat().

  syslog() is an actual system call on some glibc/arch combinations, instead of a connect()/send() implementation.

  On ppc64 and ppc64le, _llseek(), recv(), send() and getuid() are used. For ppc64 only: ugetrlimit() for the getrlimit() implementation, plus sigreturn() and fcntl64().

  On s390x, additionally, we need to allow socketcall() (on top of socket()), and sigreturn() also for passt (not just for pasta).

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* README: Feature list, links to lists, bugs, chat | Stefano Brivio | 2021-10-23 | 1 | -10/+122
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* README, perf_report: Markdown and CSS fixes | Stefano Brivio | 2021-10-22 | 1 | -21/+22
  Updating md2html on the server needs a few adjustments.

  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* README: .. doesn't actually work for comments in Markdown | Stefano Brivio | 2021-10-20 | 1 | -3/+5
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* LICENSES: Add license text files, add missing notices, fix SPDX tags | Stefano Brivio | 2021-10-20 | 1 | -0/+4
  SPDX tags don't replace license files. Some notices were missing and some tags were not according to the SPDX specification, too.

  Now reuse --lint from the REUSE tool (https://reuse.software/) passes.

  Reported-by: Martin Hauke <mardnh@gmx.de>
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* README: Drop domain part in absolute links | Stefano Brivio | 2021-10-07 | 1 | -25/+25
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* README: Fix pasta anchor in Try it section | Stefano Brivio | 2021-09-28 | 1 | -1/+1
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* README: Add demo section | Stefano Brivio | 2021-09-27 | 1 | -0/+15
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* README: pasta mode, CI, performance, updated links, etc. | Stefano Brivio | 2021-09-27 | 1 | -66/+185
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* README: Source js | Stefano Brivio | 2021-09-18 | 1 | -0/+8
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
* README: Mention the -DDEBUG flag | Stefano Brivio | 2021-05-10 | 1 | -0/+5
  Signed-off-by: Stefano Brivio <sbrivio@redhat.com>