| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the TCP throughput tests, we adjust the guest's MTU in order to test
various packet sizes. Some of those are below 1280 which causes IPv6 to
be deconfigured on the guest interface. When we increase it above 1280
again, IPv6 is re-enabled and we get an address in the right prefix with
NDP, but we don't get exactly the expected address back - that's only
communicated with --config-net or DHCPv6.
With changes to how we handle NAT this can cause some of the IPv6 tests to
fail, because they don't use the address that passt/pasta expects, and the
guest doesn't initiate any traffic which allows us to learn what the new
address is.
Work around this by re-invoking dhclient -6 between adjusting the MTU and
running IPv6 test cases.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The @gw fields in the ip4_ctx and ip6_ctx give the (host's) default
gateway. We use this for two quite distinct things: advertising the
gateway that the guest should use (via DHCP, NDP and/or --config-net)
and for a limited form of NAT. So that the guest can access services
on the host, we map the gateway address within the guest to the
loopback address on the host.
Using the gateway address for this isn't necessarily the best choice
for this purpose, certainly not for all circumstances. So, start off
by splitting the notion of these into two different values: @guest_gw
which is the gateway address the guest should use and @nat_host_loopback,
which is the guest visible address to remap to the host's loopback.
Usually nat_host_loopback will have the same value as guest_gw. However
when --no-map-gw is specified we leave them unspecified instead. This
means when we use nat_host_loopback, we don't need to separately check
c->no_map_gw to see if it's relevant.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When sending frames to the guest over the tap link, we need a source MAC
address. Currently we take that from the MAC address of the main interface
on the host, but that doesn't actually make much sense:
* We can't preserve the real MAC address of packets from anywhere
external so there's no transparency case here
* In fact, it's confusingly different from how we handle IP addresses:
whereas we give the guest the same IP as the host, we're making the
host's MAC the one MAC that the guest *can't* use for itself.
* We already need a fallback case if the host doesn't have an Ethernet
like MAC (e.g. if it's connected via a point to point interface, such
as a wireguard VPN).
Change to just just use an arbitrary fixed MAC address - I've picked
9a:55:9a:55:9a:55. It's simpler and has the small advantage of making
the fact that passt/pasta is in use typically obvious from guest side
packet dumps. This can still, of course, be overridden with the -M option.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ip4.gw conflates 3 conceptually different things, which (for now) have the
same value:
1. The router/gateway address as seen by the guest
2. An address to NAT to the host with --no-map-gw isn't specified
3. An address to use as source when nothing else makes sense
Case 3 occurs in two situations:
a) for our DHCP responses - since they come from passt internally there's
no naturally meaningful address for them to come from
b) for forwarded connections coming from an address that isn't guest
accessible (localhost or the guest's own address).
(b) occurs even with --no-map-gw, and the expected behaviour of forwarding
local connections requires it.
For IPv6 role (3) is now taken by ip6.our_tap_ll (which usually has the
same value as ip6.gw). For future flexibility we may want to make this
"address of last resort" different from the gateway address, so split them
logically for IPv4 as well.
Specifically, add a new ip4.our_tap_addr field for the address with this
role, and initialise it to ip4.gw for now. Unlike IPv6 where we can always
get a link-local address, we might not be able to get a (non 0.0.0.0)
address here (e.g. if the host is disconnected or only has a point to point
link with no gateway address). In that case we have to disable forwarding
of inbound connections with guest-inaccessible source addresses.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We usually avoid NAT, but in a few cases we need to apply address
translations. For inbound connections that happens for addresses which
make sense to the host but are either inaccessible, or mean a different
location from the guest's point of view.
Add some helper functions to determine such addresses, and use them in
fwd_nat_from_host(). In doing so clarify some of the reasons for the
logic. We'll also have further use for these helpers in future.
While we're there fix one unneccessary inconsistency between IPv4 and IPv6.
We always translated the guest's observed address, but for IPv4 we didn't
translate the guest's assigned address, whereas for IPv6 we did. Change
this to translate both in all cases for consistency.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
| |
In every place we use our_tap_ll, we only use it as a fallback if the
IPv6 gateway address is not link-local. We can avoid that conditional at
use time by doing it at initialisation of our_tap_ll instead.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Some are guest visible addresses and may not be valid on the host, others
are host visible addresses and may not be valid on the guest. Rearrange
and comment the ip[46]_ctx definitions to make it clearer which is which.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
| |
c->ip6.addr_ll is not like c->ip6.addr. The latter is an address for the
guest, but the former is an address for our use on the tap link. Rename it
accordingly, to 'our_tap_ll'.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When binding an IPv6 socket in sock_l4() we need to supply a scope id
if the address is link-local. We check for this by comparing the
given address to c->ip6.addr_ll. This is correct only by accident:
while c->ip6.addr_ll is typically set to the host interface's link
local address, the actual purpose of it is to provide a link local
address for passt's private use on the tap interface.
Instead set the scope id for any link-local address we're binding to.
We're going to need something and this is what makes sense for sockets
on the host. It doesn't make sense for PIF_SPLICE sockets, but those
should always have loopback, not link-local addresses.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Despite the names, addr_ll_seen does not relate to addr_ll the same
way addr_seen relates to addr. addr_ll_seen is an observed address
from the guest, whereas addr_ll is *our* link-local address for use on
the tap link when we can't use an external endpoint address. It's
used both for passt provided services (DHCPv6, NDP) and in some cases
for connections from addresses the guest can't access.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Although it's not 100% explicit in the man page, addresses given to
the --dns option are intended to be addresses as seen by the guest.
This differs from addresses taken from the host's /etc/resolv.conf,
which must be translated to guest accessible versions in some cases.
Our implementation is currently inconsistent on this: when using
--dns-forward, you must usually also give --dns with the matching address,
which is meaningful only in the guest's address view. However if you give
--dns with a loopback addres, it will be translated like a host view
address.
Move the remapping logic for DNS addresses out of add_dns4() and add_dns6()
into add_dns_resolv() so that it is only applied for host nameserver
addresses, not for nameservers given explicitly with --dns.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
add_dns6() (but not add_dns4()) has a bug setting dns_match: it sets it to
the given address, rather than the gateway address. This is doubly wrong:
- We've just established the given address is a host loopback address
the guest can't access
- We've just set ip6.dns[] to tell the guest to use the gateway address,
so it won't use the dns_match address we're setting
Correct this to use the gateway address, like IPv4.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
| |
get_dns() is already quite deeply nested, and future changes I have in
mind will add more complexity. Prepare for this by splitting out the
adding of a single nameserver to the configuration into its own function.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Every time we call add_dns[46] we need to first check if there's space in
the c->ip[46].dns array for the new entry. We might as well make that
check in add_dns[46]() itself.
In fact it looks like the calls in get_dns() had an off by one error, not
allowing the last entry of the array to be filled. So, that bug is also
fixed by the change.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
get_dns() counts the number of guest DNS servers it adds, and gives an
error if it couldn't add any. However, this count ignores the fact that
add_dns[46]() may in some cases *not* add an entry. Use the array indices
we're already tracking to get an accurate count.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently add_dns[46]() take a somewhat awkward double pointer to the
entry in the c->ip[46].dns array to update. It turns out to be easier to
work with indices into that array instead.
This diff does add some lines, but it's comments, and will allow some
future code reductions.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
We rely on C11 already, so we can use clearer and more type-checkable
struct assignment instead of mempcy() for copying IP addresses around.
This exposes some "pointer could be const" warnings from cppcheck, so
address those too.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
c->mac isn't a great name, because it doesn't say whose mac address it is
and it's not necessarily obvious in all the contexts we use it. Since this
is specifically the address that we (passt/pasta) use on the tap interface,
rename it to "our_tap_mac". Rename the "mac_guest" field to "guest_mac"
to be grammatically consistent.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
| |
There are a couple of places where we somewhat messily open code formatting
an Ethernet like MAC address for display. Add an eth_ntop() helper for
this.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The term "forwarding address" to indicate the local-to-passt address was
well-intentioned, but ends up being kinda confusing. As discussed on a
recent call, let's try "our" instead.
(While we're there correct an error in flow_initiate_af()s comments where
we referred to parameters by the wrong name).
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
| |
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As soon as we the kernel notifier for IPv6 address configuration
(addrconf_notify()) sees that we bring the target interface up
(NETDEV_UP), it will schedule duplicate address detection, so, by
itself, setting the nodad flag later is useless, because that won't
stop a detection that's already in progress.
However, if we disable neighbour solicitations with IFF_NOARP (which
is a misnomer for IPv6 interfaces, but there's no possibility of
mixing things up), the notifier will not trigger DAD, because it can't
be done, of course, without neighbour solicitations.
Set IFF_NOARP as we bring up the device, and drop it after we had a
chance to set the nodad attribute on the link.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
|
|
|
|
|
| |
As soon as we bring up the interface, the Linux kernel will set up a
link-local address for it, so we can fetch it and start using right
away, if we need a link-local address to communicate to the container
before we see any traffic coming from it.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It makes no sense for a container or a guest to try and perform
duplicate address detection for their link-local address, as we'll
anyway not relay neighbour solicitations with an unspecified source
address.
While they perform duplicate address detection, the link-local address
is not usable, which prevents us from bringing up especially
containers and communicate with them right away via IPv6.
This is not enough to prevent DAD and reach the container right away:
we'll need a couple more patches.
As we send NLM_F_REPLACE requests right away, while we still have to
read out other addresses on the same socket, we can't use nl_do():
keep track of the last sequence we sent (last address we changed), and
deal with the answers to those NLM_F_REPLACE requests in a separate
loop, later.
Link: https://github.com/containers/podman/pull/23561#discussion_r1711639663
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
|
|
| |
In the next patches, we'll reuse it to set flags other than IFF_UP.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
|
|
|
|
| |
As we'll use nl_link_up() for more than just bringing up devices, it
will become awkward to carry empty MTU values around whenever we call
it.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
| |
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have a number of delays when we switch to new layouts that were
added to make the tests visually easier to follow, together with
blinking status bars. Shorten the delays and avoid blinking the
status bar if $FAST is set to 1 (no demo mode).
Shorten delays in busy loops to 10ms, instead of 100ms, and skip the
one-second fixed delay when we wait for the status of a command.
Cut the duration of throughput and latency tests to one second, down
from ten. Somewhat surprisingly, the results we get are rather
consistent, and not significantly different from what we'd get with
10 seconds.
This, together with Podman's commit 20f3e8909e3a ("test/system:
pasta_test_do add explicit port check"), cuts the time needed on my
setup for full test run from approximately 37 minutes to...:
$ time ./run
[exited]
PASS: 165, FAIL: 0
Log at /home/sbrivio/passt/test/test_logs/test.log
real 15m34.253s
user 0m0.011s
sys 0m0.011s
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Tested-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Using a zero port on TCP or UDP is dubious, and we can't really deal with
forwarding such a flow within the constraints of the socket API. Hence
we ASSERT()ed that we had non-zero ports in flow_hash().
The intention was to make sure that the protocol code sanitizes such ports
before completing a flow entry. Unfortunately, flow_hash() is also called
on new packets to see if they have an existing flow, so the unsanitized
guest packet can crash passt with the assert.
Correct this by moving the assert from flow_hash() to flow_sidx_hash()
which is only used on entries already in the table, not on unsanitized
data.
Reported-by: Matt Hamilton <matt@thmail.io>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
f6d5a5239264 moved handling of -D into a later loop. However as a side
effect it moved this from a switch block to an if block. I left a couple
of 'break' statements that don't make sense in the new context. They
should be 'continue' so that we go onto the next option, rather than
leaving the loop entirely.
Fixes: f6d5a5239264 ("conf: Delay handling -D option until after addresses are configured")
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Add structs for NA, RA, NS, MTU, prefix info, option header,
link-layer address, RDNSS, DNSSL and link-layer for RA message.
- Turn NA message from purely imperative, going byte by byte,
to declarative by filling it's struct.
- Turn part of RA message into declarative.
- Move packet_add() to be before the call of ndp() in tap6_handler()
if the protocol of the packet is ICMPv6.
- Add a pool of packets as an additional parameter to ndp().
- Check the size of NS packet with packet_get() before sending an NA
packet.
- Add documentation for the structs.
- Add an enum for NDP option types.
Link: https://bugs.passt.top/show_bug.cgi?id=21
Signed-off-by: AbdAlRahman Gad <abdobngad@gmail.com>
[sbrivio: Minor coding style fixes]
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
add_dns[46]() rely on the gateway address and c->no_map_gw being already
initialised, in order to properly handle DNS servers which need NAT to be
accessed from the guest.
Usually these are called from get_dns() which is well after the addresses
are configured, so that's fine. However, they can also be called earlier
if an explicit -D command line option is given. In this case no_map_gw
and/or c->ip[46].gw may not get be initialised properly, leading to this
doing the wrong thing.
Luckily we already have a second pass of option parsing for things which
need addresses to already be configured. Move handling of -D to there.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
These fields are described as being an address for an external, routable
interface. That's not necessarily the case when using -a. But, more
importantly, saying where the value comes from is not as useful as what
it's used for. The real purpose of this field is as the address which we
assign to the guest via DHCP or --config-net.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we prefix the second part of messages printed through
logmsg_perror() by the timestamp, on debug, we'll have two timestamps
and a weird separator in the result, such as this beauty:
0.0013: Failed to clone process with detached namespaces0.0013: : Operation not permitted
Add a parameter to logmsg() and vlogmsg() which indicates a message
continuation. If that's set, don't print the timestamp in vlogmsg().
Link: https://github.com/moby/moby/issues/48257#issuecomment-2282875092
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Given that pasta supports specifying a command to be executed on the
command line, even without the usual -- separator as long as there's
no ambiguity, we shouldn't eat up options that are not meant for us.
Paul reports, for instance, that with:
pasta --config-net ip -6 route
-6 is taken by pasta to mean --ipv6-only, and we execute 'ip route'.
That's because getopt_long(), by default, shuffles the argument list
to shift non-option arguments at the end.
Avoid that by adding '+' at the beginning of 'optstring'.
Reported-by: Paul Holzinger <pholzing@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a parent accidentally or due to implementation reasons leaks any
open file, we don't want to have access to them, except for the file
passed via --fd, if any.
This is the case for Podman when Podman's parent leaks files into
Podman: it's not practical for Podman to close unrelated files before
starting pasta, as reported by Paul.
Use close_range(2) to close all open files except for standard streams
and the one from --fd.
Given that parts of conf() depend on other files to be already opened,
such as the epoll file descriptor, we can't easily defer this to a
more convenient point, where --fd was already parsed. Introduce a
minimal, duplicate version of --fd parsing to keep this simple.
As we need to check that the passed --fd option doesn't exceed
INT_MAX, because we'll parse it with strtol() but file descriptor
indices are signed ints (regardless of the arguments close_range()
take), extend the existing check in the actual --fd parsing in conf(),
also rejecting file descriptors numbers that match standard streams,
while at it.
Suggested-by: Paul Holzinger <pholzing@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Paul Holzinger <pholzing@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Particularly in shell it's sometimes natural to save the pid from a process
run and later kill it. If doing this with nstool exec, however, it will
kill nstool itself, not the program it is running, which isn't usually what
you want or expect.
Address this by having nstool propagate SIGTERM to its child process. It
may make sense to propagate some other signals, but some introduce extra
complications, so we'll worry about them when and if it seems useful.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
| |
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We use logtime() to get a timestamp for the log in two places:
- in vlogmsg(), which is used only for debug_print messages
- in logfile_write() which is only used messages to the log file
These cases are mutually exclusive, so we don't ever print the same message
with different timestamps, but that's not particularly obvious to see.
It's possible future tweaks to logging logic could mean we log to two
different places with different timestamps, which would be confusing.
Refactor to have a single logtime() call in vlogmsg() and use it for all
the places we need it.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
clock_gettime() can, theoretically, fail, although it probably won't until
2038 on old 32-bit systems. Still, it's possible someone could run with
a wildly out of sync clock, or new errors could be added, or it could fail
due to a bug in libc or the kernel.
We don't handle this well. In the debug_print case in vlogmsg we'll just
ignore the failure, and print a timestamp based on uninitialised garbage.
In logfile_write() we exit early and won't log anything at all, which seems
like a good way to make an already weird situation undebuggable.
Add some helpers to instead handle this by using "<error>" in place of a
timestamp if something goes wrong with clock_gettime().
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
logtime_fmt_and_arg() is a rather odd macro, producing both a format
string and an argument, which can only be used in quite specific printf()
like formulations. It also has a significant bug: it tries to display 4
digits after the decimal point (so down to tenths of milliseconds) using
%04i. But the field width in printf() is always a *minimum* not maximum
field width, so this will not truncate the given value, but will redisplay
the entire tenth-of-milliseconds difference again after the decimal point.
Replace the macro with an snprintf() like function which will format the
timestamp, and use an explicit % to correct the display.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[sbrivio: Make logtime_fmt() static]
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The comment for timespec_diff_us() claims it will wrap after 2^64µs. This
is incorrect for two reasons:
* It returns a long long, which is probably 64-bits, but might not be
* It returns a signed value, so even if it is 64 bits it will wrap after
2^63µs
Correct the comment and use an explicitly 64-bit type to avoid that
imprecision.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
version only
Paul reports that setting IPv4 address and gateway manually, using
--address and --gateway, causes pasta to fail inserting IPv6 routes
in a setup where multiple, inter-dependent IPv6 routes are present
on the host.
That's because, currently, any -g option implies --no-copy-routes
altogether, and any -a implies --no-copy-addrs.
Limit this implication to the matching IP version, instead, by having
two copies of no_copy_routes and no_copy_addrs in the context
structure, separately for IPv4 and IPv6.
While at it, change them to 'bool': we had them as 'int' because
getopt_long() used to set them directly, but it hasn't been the case
for a while already.
Reported-by: Paul Holzinger <pholzing@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are two cases where we want to stop printing to stderr: if it's
closed, and if pasta spawned a shell (and --debug wasn't given).
But if passt is running in foreground, we currently stop to report any
message, even error messages, once we're ready, as reported by
Laurent, because we set the log_runtime flag, which we use to indicate
we're ready, regardless of whether we're running in foreground or not.
Turn that flag (back) to log_stderr, and set it only when we really
want to stop printing to stderr.
Reported-by: Laurent Vivier <lvivier@redhat.com>
Fixes: afd9cdc9bb48 ("log, passt: Always print to stderr before initialisation is complete")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the "from" (input) side for a given transfer is 0, and we can't
complete the write right away, what we need to be waiting for is for
output readiness on side 1, not 0, and the other way around as well.
This causes random transfer failures for local TCP connections,
depending if we ever need to wait for output readiness.
Reported-by: Paul Holzinger <pholzing@redhat.com>
Link: https://github.com/containers/podman/issues/23517
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Tested-by: Paul Holzinger <pholzing@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The "correct" type for the length of an IOV is unclear: writev() and
readv() use an int, but sendmsg() and recvmsg() use a size_t. Using the
unsigned size_t has some advantages, though, and it makes more sense for
the case of write_remainder. Using size_t throughout here means we don't
have a signed vs. unsigned comparison, and we don't have to deal with
the case of iov_skip_bytes() returning a value which becomes negative
when assigned to an integer.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
No code change.
They need to be exported to be available by the vhost-user version of
passt.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
To be used with the vhost-user version of udp.c, we need to export the
udp_flow functions. To avoid to export udp_meta_t too that is specific
to the socket version of udp.c, don't pass udp_meta_t to it,
but the only needed field, s_in.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
logfile_write() is not used outside log.c, nor should it be. It should
only be used externall via the general logging functions. Make it static
in log.c. To avoid forward declarations this requires moving a bunch of
functions earlier in the file.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ed reported this:
# Error: pasta failed with exit code 1:
# Couldn't drop cap 3 from bounding set
# : No child processes
in a Podman CI run with tests being run in parallel. The error message
itself, by the way, is fixed by commit 1cd773081f12 ("log: Drop
newlines in the middle of the perror()-like messages"), but how can we
possibly get ECHILD as failure code for prctl()?
Well, we don't, but if we exit early enough, pasta_child_handler()
might run before we're even done with isolation steps, and it calls
waitid(), which sets errno. We need to restore it before returning
from the signal handler (if we return after calling functions that
might set it), as signal-safety(7) also implies:
Fetching and setting the value of errno is async-signal-safe
provided that the signal handler saves errno on entry and
restores its value before returning.
Eventually, we'll probably need to switch to signalfd(2) the day we
want to implement multithreading, but this will do for the moment.
Reported-by: Ed Santiago <santiago@redhat.com>
Link: https://github.com/containers/podman/issues/23478
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Paul Holzinger <pholzing@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
|