| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we ignore all events other than EPOLLIN on UDP sockets. This
means that if we ever receive an EPOLLERR event, we'll enter an infinite
loop on epoll, because we'll never do anything to clear the error.
Luckily that doesn't seem to have happened in practice, but it's certainly
fragile. Furthermore changes in how we handle UDP sockets with the flow
table mean we will start receiving error events.
Add handling of EPOLLERR events. For now we just read the error from the
error queue (thereby clearing the error state) and print a debug message.
We can add more substantial handling of specific events in future if we
want to.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We abort the UDP socket handler if the no_udp flag is set. But if UDP
was disabled we should never have had a UDP socket to trigger the handler
in the first place. If we somehow did, ignoring it here isn't really going
to help because aborting without doing anything is likely to lead to an
epoll loop. The same is the case for the TCP socket and timer handlers and
the no_tcp flag.
Change these checks on the flag to ASSERT()s. Similarly add ASSERT()s to
several other entry points to the protocol specific code which should never
be called if the protocol is disabled.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
| |
Through an oversight this was previously declared as a public function
although it's only used in udp.c and there is no prototype in any header.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we receive datagrams on a socket, we need to split them into batches
depending on how they need to be forwarded (either via a specific splice
socket, or via tap). The logic to do this, is somewhat awkwardly split
between udp_buf_sock_handler() itself, udp_splice_send() and
udp_tap_send().
Move all the batching logic into udp_buf_sock_handler(), leaving
udp_splice_send() to just send the prepared batch. udp_tap_send() reduces
to just a call to tap_send_frames() so open-code that call in
udp_buf_sock_handler().
This will allow separating the batching logic from the rest of the datagram
forwarding logic, which we'll need for upcoming flow table support.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
udp_buf_sock_handler(), udp_splice_send() and udp_tap_send loosely, do four
things between them:
1. Receive some datagrams from a socket
2. Split those datagrams into batches depending on how they need to be
sent (via tap or via a specific splice socket)
3. Prepare buffers for each datagram to send it onwards
4. Actually send it onwards
Split (1) and (3) into specific helper functions. This isn't
immediately useful (udp_splice_prepare(), in particular, is trivial),
but it will make further reworks clearer.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since we split our packet frame buffers into different pieces, we have
a single buffer per IP version for the ethernet header, rather than one
per frame. This makes sense since our ethernet header is alwaus the same.
However we initialise those buffers udp[46]_eth_hdr inside a per frame
loop. Pull that outside the loop so we just initialise them once.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The only differences between these arrays are that udp4_l2_iov is
pre-initialised to point to the IPv4 ethernet header, and IPv4 per-frame
header and udp6_l2_iov points to the IPv6 versions.
We already have to set up a bunch of headers per-frame, including updating
udp[46]_l2_iov[i][UDP_IOV_PAYLOAD].iov_len. It makes more sense to adjust
the IOV entries to point at the correct headers for the frame than to have
two complete sets of iovecs.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have separate mmsghdr arrays for splicing IPv4 and IPv6 packets, where
the only difference is that they point to different sockaddr buffers for
the destination address.
Unify these by having the common array point at a sockaddr_inany as the
address. This does mean slightly more work when we're about to splice,
because we need to write the whole socket address, rather than just the
port. However it removes 32 mmsghdr structures and we're going to need
more flexibility constructing that target address for the flow table.
Because future changes might mean that the address isn't always loopback,
change the name of the common address from *_localname to udp_splicename.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Make the salient points about these various arrays clearer with renames:
* udp_l2_iov_sock and udp[46]_l2_mh_sock don't really have anything to do
with L2. They are, however, specific to receiving not sending. Rename
to udp_iov_recv and udp[46]_mh_recv.
* udp[46]_l2_iov_tap is redundant - "tap" implies L2 and vice versa.
Rename to udp[46]_l2_iov
* udp[46]_localname are (for now) pre-populated with the local address but
the more salient point is that these are the destination address for the
splice arrays. Rename to udp[46]_splice_to
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
udp_buf_sock_handler() takes the epoll reference from the receiving socket,
and passes the UDP relevant part on to several other functions. Future
changes are going to need several different epoll types for UDP, and to
pass that information through to some of those functions. To avoid extra
noise in the patches making the real changes, change those functions now
to take the full epoll reference, rather than just the UDP part.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
sock_l4() creates a socket of the given IP protocol number, and adds it to
the epoll state. Currently it determines the correct tag for the epoll
data based on the protocol. However, we have some future cases where we
might want different semantics, and therefore epoll types, for sockets of
the same protocol. So, change sock_l4() to take the epoll type as an
explicit parameter, and determine the protocol from that.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
cppcheck 2.14 warns that the scope of the rport variable could be
reduced: do that, as reverted commit c80fa6a6bb44 ("udp: Make rport
calculation more local") did, but keep the temporary variable of
in_port_t type, otherwise the sum gets promoted to int.
While at it, add a comment explaining why we calculate rport like
this instead of directly using the sum as array index.
Reported-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit c80fa6a6bb4415ad48f9e11424310875d0d99bc7, as it
reintroduces the issue fixed by commit 1e6f92b995a9 ("udp: Fix 16-bit
overflow in udp_invert_portmap()").
Reported-by: Laurent Jacquot <jk@lutty.net>
Link: https://bugs.passt.top/show_bug.cgi?id=80
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Mostly, udp_sock_handler() is independent of how the datagrams it processes
will be forwarded (tap or splice). However, it also updates the msg_name
fields for spliced sends, which doesn't really make sense here. Move it
into udp_splice_send() which is all about spliced sends. This does
potentially mean we'll update the field to the same value several times,
but we're going to need this in future anyway: with the extensions the
flow table allows, it might not be the same value each time after all.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
udp_sock_handler() takes a number of datagrams from sockets that depending
on their addresses could be forwarded either to the L2 interface ("tap")
or to another socket ("spliced"). In the latter case we can also only
send packets together if they have the same source port, and therefore
are sent via the same socket.
To reduce the total number of system calls we gather contiguous batches of
datagrams with the same destination interface and socket where applicable.
The determination of what the target is is made by udp_mmh_splice_port().
It returns the source port for splice packets and -1 for "tap" packets.
We find batches by looking ahead in our queue until we find a datagram
whose "splicefrom" port doesn't match the first in our current batch.
udp_mmh_splice_port() is moderately expensive, and unfortunately we
can call it twice on the same datagram: once as the (last + 1) entry
in one batch (to check it's not in that batch), then again as the
first entry in the next batch.
Avoid this by keeping track of the "splice port" in the metadata structure,
and filling it in one entry ahead of the one we're currently considering.
This is a bit subtle, but not that hard. It will also generalise better
when we have more complex possibilities based on the flow table.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
udp_mmh_splice_port() is used to determine if a UDP datagram can be
"spliced" (forwarded via a socket instead of tap). We only invoke it if
the origin socket has the 'splice' flag set.
Fold the checking of the flag into the helper itself, which makes the
caller simpler. It does mean we have a loop looking for a batch of
spliceable or non-spliceable packets even in the case where the flag is
clear. This shouldn't be that expensive though, since each call to
udp_mmh_splice_port() will return without accessing memory in that case.
In any case we're going to need a similar loop in more cases with upcoming
flow table work.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
As we are going to introduce the MODE_VU that will act like
the mode MODE_PASST, compare to MODE_PASTA rather than to add
a comparison to MODE_VU when we check for MODE_PASST.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
| |
We are going to introduce a variant of the function to use
vhost-user buffers rather than passt internal buffers.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit refactors the udp_update_hdr4() and udp_update_hdr6() functions
to improve code portability by replacing the udp_meta_t parameter with
more specific parameters for the IPv4 and IPv6 headers (iphdr/ipv6hdr)
and the source socket address (sockaddr_in/sockaddr_in6).
It also moves the tap_hdr_update() function call inside the udp_tap_send()
function not to have to pass the TAP header to udp_update_hdr4() and
udp_update_hdr6()
This refactor reduces complexity by making the functions more modular and
ensuring that each function operates on more narrowly scoped data structures.
This will facilitate future backend introduction like vhost-user.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
| |
cppcheck 2.14.1 complains about the rport variable not being in as small
as scope as it could be. It's also only used once, so we might as well
just open code the calculation for it.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we have separate arrays for IPv4 and IPv6 which contain the
headers for guest-bound packets, and also the originating socket address.
We can combine these into a single array of "metadata" structures with
space for both pre-cooked IPv4 and IPv6 headers, as well as shared space
for the tap specific header and socket address (using sockaddr_inany).
Because we're using IOVs to separately address the pieces of each frame,
these structures don't need to be packed to keep the headers contiguous
so we can more naturally arrange for the alignment we want.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Currently each tap-bound frame buffer has room for its own ethernet header.
However the ethernet header is always the same for such frames, so now
that we're indirectly referencing the ethernet header via iov, we can use
a single buffer for all of them.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently the IPv4 and IPv6 paths unnecessarily use different buffers for
the UDP payload. Now that we're handling the various pieces of the UDP
packets with an iov, we can split the payload part of the buffers off into
its own array shared between IPv4 and IPv6. As well as saving a little
memory, this allows the payload buffers to be neatly page aligned.
With the buffers merged, udp[46]_l2_iov_sock contain exactly the same thing
as each other and can also be merged. Likewise udp[46]_iov_splice can be
merged together.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For IPv4, UDP checksums are optional and can just be set to 0.
udp_update_hdr4() ignores the checksum field entirely. Since these are set
to 0 during startup, this works as intended for now.
However, we'd like to share payload and UDP header buffers betweem IPv4 and
IPv6, which does calculate UDP checksums. Therefore, for robustness, we
should explicitly set the checksum field to 0 for guest-bound UDP packets.
In the tap_udp4_send() slow path, however, we do allow IPv4 UDP checksums
to be calculated as a compile time option. For consistency, use the same
thing in the udp_update_hdr4() path, which will typically initialize to 0,
but calculate a real checksum if configured to do so.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
We're going to introduce more sharing between the IPv4 and IPv6 buffer
structures. Prepare for this by combinng the initialisation functions.
While we're at it remove the misleading "sock" from the name since these
initialise both tap side and sock side structures.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When sending to the tap device, currently we assemble the headers and
payload into a single contiguous buffer. Those are described by a single
struct iovec, then a batch of frames is sent to the device with
tap_send_frames().
In order to better integrate the IPv4 and IPv6 paths, we want the IP
header in a different buffer that might not be contiguous with the
payload. To prepare for that, split the UDP packet into an iovec of
buffers. We use the same split that Laurent recently introduced for
TCP for convenience.
This removes the last use of tap_hdr_len_(), tap_frame_base() and
tap_frame_len(), so remove those too.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Laurent's recent changes mean we use IO vectors much more heavily in the
TCP code. In many of those cases, and few others around the code base,
individual iovs of these vectors are constructed to exactly cover existing
variables or fields. We can make initializing such iovs shorter and
clearer with a macro for the purpose.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
At various points we need to track the lengths of a packet including or
excluding various different sets of headers. We don't always use the same
variable names for doing so. Worse in some places we use the same name
for different things: e.g. tcp_fill_headers[46]() use ip_len for the
length including the IP headers, but then tcp_send_flag() which calls it
uses it to mean the IP payload length only.
To improve clarity, standardise on these names:
dlen: L4 protocol payload length ("data length")
l4len: plen + length of L4 protocol header
l3len: l4len + length of IPv4/IPv6 header
l2len: l3len + length of L2 (ethernet) header
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
csum_ip4_header() takes the packet length as a network endian value. In
general it's very error-prone to pass non-native-endian values as a raw
integer. It's particularly bad here because this differs from other
checksum functions (e.g. proto_ipv4_header_psum()) which take host native
lengths.
It turns out all the callers have easy access to the native endian value,
so switch it to use host order like everything else.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In some places (well, actually only UDP now) we use struct tap_hdr to
represent both tap backend specific and L2 ethernet headers. Handling
these together seemed like a good idea at the time, but Laurent's changes
in the TCP code working towards vhost-user support suggest that treating
them separately is more useful, more often.
Alter struct tap_hdr to represent only the TAP backend specific headers.
Updated related helpers and the UDP code to match.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit bb9bf0bb ("tcp, udp: Don't precompute port remappings in epoll
references") changed the epoll reference for UDP sockets to include the
bound port as seen by the socket itself, rather than the bound port it
would be translated to on the guest side. As a side effect, it also means
that udp_tap_map[] is indexed by the bound port on the host side, rather
than on the guest side. This is consistent and a good idea, however we
forgot to account for it when finding the correct outgoing socket for
packets originating in the guest. This means that if forwarding UDP
inbound with a port number change, reply packets would be misdirected.
Fix this by applying the reverse mapping before looking up the socket in
udp_tap_handler(). While we're at it, use 'port' directly instead of
'uref.port' in udp_sock_init(). Those now always have the same value -
failing to realise that is the same error as above.
Reported-by: Laurent Jacquot <jk@lutty.net>
Link: https://bugs.passt.top/show_bug.cgi?id=87
Fixes: bb9bf0bb8f57 ("tcp, udp: Don't precompute port remappings in epoll references")
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Paul reports that if pasta is configured with --dns-forward, and the
container queries a resolver which is configured on the host directly,
without using the address given for --dns-forward, we'll translate
the source address of the response pretending it's coming from the
address passed as --dns-forward, and the client will discard the
reply.
That is,
$ cat /etc/resolv.conf
198.51.100.1
$ pasta --config-net --dns-forward 192.0.2.1 nslookup passt.top
will not work, because we change the source address of the reply from
198.51.100.1 to 192.0.2.1. But the client contacted 198.51.100.1, and
it's from that address that it expects an answer.
Add a PORT_DNS_FWD flag for tap-facing ports, which is triggered by
activity in the opposite direction as the other flags. If the
tap-facing port was seen sending a DNS query that was remapped, we'll
remap the source address of the response, otherwise we'll leave it
unaffected.
Reported-by: Paul Holzinger <pholzing@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
|
|
|
|
|
|
|
|
|
|
|
| |
These two functions are typically used to calculate values to go into the
iov_base and iov_len fields of a struct iovec. They don't have to be used
for that, though. Rename them in terms of what they actually do: calculate
the base address and total length of the complete frame, including both L2
and tap specific headers.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
tap_send_frames() takes a vector of buffers and requires exactly one frame
per buffer. We have future plans where we want to have multiple buffers
per frame in some circumstances, so extend tap_send_frames() to take the
number of buffers per frame as a parameter.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[sbrivio: Improve comment to rembufs calculation]
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we open code the calculation of the UDP checksum in
udp_update_hdr6(). We calling a helper to handle the IPv6 pseudo-header,
and preset the checksum field to 0 so an uninitialised value doesn't get
folded in. We already have a helper to do this: csum_udp6() which we use
in some slow paths. Use it here as well.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
| |
We carry around the source address as a pointer to a constant struct
in_addr. But it's silly to carry around a 4 or 8 byte pointer to a 4 byte
IPv4 address. Just copy the IPv4 address around by value.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The order of things in these functions is a bit odd for historical reasons.
We initialise some IP header fields early, the more later after making
some tests. Likewise we declare some variables without initialisation,
but then unconditionally set them to values we could calculate at the
start of the function.
Previous cleanups have removed the reasons for some of these choices, so
reorder for clarity, and where possible move the first assignment into an
initialiser.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
These functions take an index to the L2 buffer whose header information to
update. They use that for two things: to locate the buffer pointer itself,
and to retrieve the length of the received message from the paralllel
udp[46]_l2_mh_sock array. The latter is arguably a failure to separate
concerns. Change these functions to explicitly take a buffer pointer and
payload length as parameters.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
| |
In these functions we have 'dstport' for the destination port, but
'src_port' for the source port. Change the latter to 'srcport' for
consistency.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Each of these functions have 3 essentially identical loops in a row.
Merge the loops into a single common udp_sock_iov_init() function, calling
udp_sock[46]_iov_init_one() helpers to initialize each "slot" in the
various parallel arrays. This is slightly neater now, and more naturally
allows changes we want to make where more initialization will become common
between IPv4 and IPv6.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Use ethhdr rather than tap_hdr.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-ID: <20240303135114.1023026-9-lvivier@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The TCP and UDP checksums are computed using the data in the TCP/UDP
payload but also some informations in the IP header (protocol,
length, source and destination addresses).
We add two functions, proto_ipv4_header_psum() and
proto_ipv6_header_psum(), to compute the checksum of the IP
header part.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-ID: <20240303135114.1023026-8-lvivier@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We can find the same function to compute the IPv4 header
checksum in tcp.c, udp.c and tap.c
Use the function defined for tap.c, csum_ip4_header(), but
with the code used in tcp.c and udp.c as it doesn't need a fully
initialiazed IPv4 header, only protocol, tot_len, saddr and daddr.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-ID: <20240303135114.1023026-7-lvivier@redhat.com>
[dwg: Fix weird cppcheck regression; it appears to be a problem
in pre-existing code, but somehow this patch is exposing it]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
in udp_update_hdr4():
Assign the source address to src, either b->s_in.sin_addr,
c->ip4.dns_match or c->ip4.gw and then set b->iph.saddr to src->s_addr.
in udp_update_hdr6():
Assign the source address to src, either b->s_in6.sin6_addr,
c->ip6.dns_match, c->ip6.gw or c->ip6.addr_ll.
Assign the destination to dst, either c->ip6.addr_seen or
&c->ip6.addr_ll_seen.
Then set dst to b->ip6h.daddr and src to b->ip6h.saddr.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-ID: <20240303135114.1023026-6-lvivier@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce ip.[ch] file to encapsulate IP protocol handling functions and
structures. Modify various files to include the new header ip.h when
it's needed.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-ID: <20240303135114.1023026-5-lvivier@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently port_fwd.[ch] contains helpers related to port forwarding,
particular automatic port forwarding. We're planning to allow much more
flexible sorts of forwarding, including both port translation and NAT based
on the flow table. This will subsume the existing port forwarding logic,
so rename port_fwd.[ch] to fwd.[ch] with matching updates to all the names
within.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The epoll references for both TCP listening sockets and UDP sockets
includes a port number. This gives the destination port that traffic
to that socket will be sent to on the other side. That will usually
be the same as the socket's bound port, but might not if the -t, -u,
-T or -U options are given with different original and forwarded port
numbers.
As we move towards a more flexible forwarding model for passt, it's
going to become possible for that destination port to vary depending
on more things (for example the source or destination address). So,
it will no longer make sense to have a fixed value for a listening
socket.
Change to simpler semantics where this field in the reference gives
the bound port of the socket. We apply the translations to the
correct destination port later on, when we're actually forwarding.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Our inany_addr type is used in some places to represent either IPv4 or
IPv6 addresses, and we plan to use it more widely. We don't yet
provide constants of this type for special addresses (loopback and
"any"). Add some of these, both the IPv4 and IPv6 variants of those
addresses, but typed as union inany_addr.
To avoid actually adding more things to .data we can use some macros and
casting to overlay the IPv6 versions of these with the standard library's
in6addr_loopback and in6addr_any. For the IPv4 versions we need to create
new constant globals.
For complicated historical reasons, the standard library doesn't
provide constants for IPv4 loopback and any addresses as struct
in_addr. It just has macros of type in_addr_t == uint32_t, which has
some gotchas w.r.t. endianness. We can use some more macros to
address this lack, using macros to effectively create these IPv4
constants as pieces of the inany constants above.
We use this last to avoid some awkward temporary variables just used
to get an address of an IPv4 loopback address.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
If the configured output address is unspecified, we don't set the bind
address to it when creating a new socket in udp_tap_handler(). That sounds
sensible, but what we're leaving the bind address as is, exactly, the
unspecified address, so this test makes no difference. Remove it.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
When forwarding IPv4 packets in udp_tap_handler(), we incorrectly use an
IPv6 address test on our IPv4 address (which could cause an out of bounds
access), and possibly set our bind interface to the IPv6 interface based on
it. Adjust to correctly look at the IPv4 address and IPv4 interface.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
|