<feed xmlns='http://www.w3.org/2005/Atom'>
<title>passt, branch 2024_11_27.c0fbc7e</title>
<subtitle>Plug A Simple Socket Transport</subtitle>
<link rel='alternate' type='text/html' href='https://passt.top/passt/'/>
<entry>
<title>dhcp: Honour broadcast flag (RFC 2131, 4.1)</title>
<updated>2024-11-27T04:37:28+00:00</updated>
<author>
<name>Stefano Brivio</name>
<email>sbrivio@redhat.com</email>
</author>
<published>2024-11-24T23:52:57+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=c0fbc7ef2ae2919bf6162b4149d341f448289836'/>
<id>c0fbc7ef2ae2919bf6162b4149d341f448289836</id>
<content type='text'>
It's widely considered a legacy option nowadays, and I haven't seen
clients setting it since Windows 95, but it's convenient for a minimal
DHCP client not using raw IP sockets such as what I'm playing with for
muvm.
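
As a rough illustration of the rule in RFC 2131, section 4.1 (not the
passt code), the choice of destination for a reply could be sketched in
Python as below. The name reply_destination() and its parameters are
hypothetical, and the sketch assumes giaddr and ciaddr are both zero.

```python
LIMITED_BROADCAST = "255.255.255.255"
BROADCAST_BIT = 0x8000  # leftmost bit of the 16-bit 'flags' field


def reply_destination(flags, yiaddr):
    """Pick the IPv4 destination for a DHCPOFFER or DHCPACK.

    Hypothetical helper: assumes giaddr and ciaddr are both zero.
    """
    # Integer arithmetic test of bit 15: 1 if the client asked for broadcast
    wants_broadcast = flags // BROADCAST_BIT % 2 == 1
    return LIMITED_BROADCAST if wants_broadcast else yiaddr
```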

Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Reviewed-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It's widely considered a legacy option nowadays, and I haven't seen
clients setting it since Windows 95, but it's convenient for a minimal
DHCP client not using raw IP sockets such as what I'm playing with for
muvm.

Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Reviewed-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dhcp: Introduce support for Rapid Commit (option 80, RFC 4039)</title>
<updated>2024-11-27T04:37:28+00:00</updated>
<author>
<name>Stefano Brivio</name>
<email>sbrivio@redhat.com</email>
</author>
<published>2024-11-15T17:18:22+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=9da2038485c9334d28df34d2ebd5ba04a3c7662d'/>
<id>9da2038485c9334d28df34d2ebd5ba04a3c7662d</id>
<content type='text'>
I'm trying to speed up and simplify IP address acquisition in muvm.

Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Reviewed-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
I'm trying to speed up and simplify IP address acquisition in muvm.

Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Reviewed-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dhcp: Use -1 as "missing option" length instead of 0</title>
<updated>2024-11-27T04:37:28+00:00</updated>
<author>
<name>Stefano Brivio</name>
<email>sbrivio@redhat.com</email>
</author>
<published>2024-11-15T17:13:17+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=d6e9e2486f092901207e6565f5eee3817cf4e11a'/>
<id>d6e9e2486f092901207e6565f5eee3817cf4e11a</id>
<content type='text'>
We want to add support for option 80 (Rapid Commit, RFC 4039), whose
length is 0.
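
To show why 0 can't double as the "missing" marker, here is a minimal
Python sketch of an option lookup. It is not the passt parser:
option_length() and the constant names are made up for illustration.

```python
PAD, END = 0, 255
MISSING = -1  # sentinel: option not present at all


def option_length(options, code):
    """Return the length of option `code`, or MISSING if it is absent."""
    it = iter(options)
    for opt in it:
        if opt == END:
            break
        if opt == PAD:
            continue  # pad bytes carry no length field
        length = next(it, 0)
        for _ in range(length):
            next(it, None)  # skip the option payload
        if opt == code:
            return length
    return MISSING
```

With options 53 (length 1) and 80 (length 0) present, looking up 80
returns 0, while looking up an absent option such as 50 returns -1.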

Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Reviewed-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We want to add support for option 80 (Rapid Commit, RFC 4039), whose
length is 0.

Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Reviewed-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>treewide: Introduce 'local mode' for disconnected setups</title>
<updated>2024-11-27T04:16:38+00:00</updated>
<author>
<name>Stefano Brivio</name>
<email>sbrivio@redhat.com</email>
</author>
<published>2024-11-22T06:57:43+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=14b84a7f077ecb734bb0e724f70bafeaa6d35a61'/>
<id>14b84a7f077ecb734bb0e724f70bafeaa6d35a61</id>
<content type='text'>
There are setups where no host interface is available or configured
at all, intentionally or not, temporarily or not, but users expect
(Podman) containers to run in any case as they did with slirp4netns,
and we're now getting reports that we broke such setups at a rather
alarming rate.

To this end, if we don't find any usable host interface, instead of
exiting:

- for IPv4, use 169.254.2.1 as guest/container address and 169.254.2.2
  as default gateway

- for IPv6, don't assign any address (forcibly disable DHCPv6), and
  use the *first* link-local address we observe to represent the
  guest/container. Advertise fe80::1 as default gateway

- use 'tap0' as default interface name for pasta

Change ifi4 and ifi6 in struct ctx to int and accept a special -1
value meaning that no host interface was selected, but the IP family
is enabled. The fact that the kernel uses unsigned int values for
those is not an issue as 1. one can't create so many interfaces
anyway and 2. we otherwise handle those values transparently.
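
The fallback values above can be summarised with a small Python sketch.
This is only an illustration of the defaults listed in this message, not
the actual code: FamilyConfig, NO_HOST_IFI and local_mode_defaults() are
hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

NO_HOST_IFI = -1  # family enabled, but no host interface selected


@dataclass
class FamilyConfig:
    ifi: int                   # host interface index, or NO_HOST_IFI
    guest_addr: Optional[str]  # None: no address is assigned
    gateway: str


def local_mode_defaults():
    """Values used when no usable host interface is found."""
    return {
        "ifname": "tap0",  # default interface name, pasta only
        "dhcpv6": False,   # forcibly disabled in local mode
        "v4": FamilyConfig(NO_HOST_IFI, "169.254.2.1", "169.254.2.2"),
        # For IPv6 the guest is represented by the first link-local
        # address observed at runtime, so nothing is assigned here
        "v6": FamilyConfig(NO_HOST_IFI, None, "fe80::1"),
    }
```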

Fix a botched conditional in conf_print() to actually skip printing
DHCPv6 information if DHCPv6 is disabled (and skip printing NDP
information if NDP is disabled).

Link: https://github.com/containers/podman/issues/24614
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There are setups where no host interface is available or configured
at all, intentionally or not, temporarily or not, but users expect
(Podman) containers to run in any case as they did with slirp4netns,
and we're now getting reports that we broke such setups at a rather
alarming rate.

To this end, if we don't find any usable host interface, instead of
exiting:

- for IPv4, use 169.254.2.1 as guest/container address and 169.254.2.2
  as default gateway

- for IPv6, don't assign any address (forcibly disable DHCPv6), and
  use the *first* link-local address we observe to represent the
  guest/container. Advertise fe80::1 as default gateway

- use 'tap0' as default interface name for pasta

Change ifi4 and ifi6 in struct ctx to int and accept a special -1
value meaning that no host interface was selected, but the IP family
is enabled. The fact that the kernel uses unsigned int values for
those is not an issue as 1. one can't create so many interfaces
anyway and 2. we otherwise handle those values transparently.

Fix a botched conditional in conf_print() to actually skip printing
DHCPv6 information if DHCPv6 is disabled (and skip printing NDP
information if NDP is disabled).

Link: https://github.com/containers/podman/issues/24614
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>test: Improve logic for waiting for SLAAC &amp; DAD to complete in NDP tests</title>
<updated>2024-11-26T07:30:18+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2024-11-26T03:27:27+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=c6e61064139ba94a763097144d1a84bd4fbafade'/>
<id>c6e61064139ba94a763097144d1a84bd4fbafade</id>
<content type='text'>
Since 9a0e544f05bf the NDP tests attempt to explicitly wait for DAD to
complete, rather than just having a hard-coded sleep.  However, the
conditions we use are a bit sloppy and allow for a number of possible cases
where it might not work correctly.  Stefano seems to be hitting one of
these (though I'm not sure which) with some later patches.

 - We wait for *lack* of a tentative address, so if the first check occurs
   before we have even a tentative address it will bypass the delay
 - It's not entirely clear if the permanent address will always appear
   as soon as the tentative address disappears
 - We weren't filtering on interface
 - We were doing the filtering with ip-address options rather than in jq.
   However, in at least some circumstances this seems to result in an
   empty .addr_info field, rather than omitting it entirely, which could
   cause us to get the wrong result

So, instead, explicitly wait for the address we need to be present: an
RA-provided address on the external interface.  While we're here we remove
the requirement that it have global scope: the "kernel_ra" check is already
sufficient to make sure this address comes from an NDP RA, not something
else.  If it's not the global scope address we expect, better to check it
and fail, rather than keep waiting.
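
The new condition can be sketched in Python as follows. The tests
themselves do this with jq, so this is only an equivalent illustration:
ra_addresses() is a hypothetical name, and the JSON layout (ifname,
addr_info, protocol, local) is assumed to match what "ip -j -6 addr show"
prints.

```python
import json


def ra_addresses(ip_json, ifname):
    """Return addresses on ifname whose protocol is "kernel_ra"."""
    found = []
    for link in json.loads(ip_json):
        if link.get("ifname") != ifname:
            continue  # filter on interface
        for info in link.get("addr_info", []):
            # Empty addr_info entries simply don't match
            if info.get("protocol") == "kernel_ra":
                found.append(info["local"])
    return found
```

A test would poll this until the returned list is non-empty, then check
the address it got.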

Fixes: 9a0e544f05bf ("test: Improve test for NDP assigned prefix")
Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since 9a0e544f05bf the NDP tests attempt to explicitly wait for DAD to
complete, rather than just having a hard-coded sleep.  However, the
conditions we use are a bit sloppy and allow for a number of possible cases
where it might not work correctly.  Stefano seems to be hitting one of
these (though I'm not sure which) with some later patches.

 - We wait for *lack* of a tentative address, so if the first check occurs
   before we have even a tentative address it will bypass the delay
 - It's not entirely clear if the permanent address will always appear
   as soon as the tentative address disappears
 - We weren't filtering on interface
 - We were doing the filtering with ip-address options rather than in jq.
   However, in at least some circumstances this seems to result in an
   empty .addr_info field, rather than omitting it entirely, which could
   cause us to get the wrong result

So, instead, explicitly wait for the address we need to be present: an
RA-provided address on the external interface.  While we're here we remove
the requirement that it have global scope: the "kernel_ra" check is already
sufficient to make sure this address comes from an NDP RA, not something
else.  If it's not the global scope address we expect, better to check it
and fail, rather than keep waiting.

Fixes: 9a0e544f05bf ("test: Improve test for NDP assigned prefix")
Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ndp: Don't send first periodic router advertisement right after guest connects</title>
<updated>2024-11-26T07:30:18+00:00</updated>
<author>
<name>Stefano Brivio</name>
<email>sbrivio@redhat.com</email>
</author>
<published>2024-11-25T07:50:39+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=cda7f160f091515770a103765d50bac0f136faef'/>
<id>cda7f160f091515770a103765d50bac0f136faef</id>
<content type='text'>
This is very visible with muvm, but it also happens with QEMU: we're
sending the first unsolicited router advertisement milliseconds after
the guest connects.

That's usually pointless because, when the hypervisor connects, the
guest is typically not ready yet to process anything of that sort:
it's still booting. And if we happen to send it late enough (still
milliseconds), with muvm, while the message is discarded, it
sometimes (slightly) delays the response to the first solicited
router advertisement, which is the one we need to have coming fast.

Skip sending the unsolicited advertisement on the first timer run,
just calculate the next delay. Keep it simple by observing that we're
probably not trying to reach the 1970s with IPv6.

Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Reviewed-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is very visible with muvm, but it also happens with QEMU: we're
sending the first unsolicited router advertisement milliseconds after
the guest connects.

That's usually pointless because, when the hypervisor connects, the
guest is typically not ready yet to process anything of that sort:
it's still booting. And if we happen to send it late enough (still
milliseconds), with muvm, while the message is discarded, it
sometimes (slightly) delays the response to the first solicited
router advertisement, which is the one we need to have coming fast.

Skip sending the unsolicited advertisement on the first timer run,
just calculate the next delay. Keep it simple by observing that we're
probably not trying to reach the 1970s with IPv6.

Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Reviewed-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>test/perf: Select a single IPv6 namespace address in pasta tests</title>
<updated>2024-11-26T07:30:18+00:00</updated>
<author>
<name>Stefano Brivio</name>
<email>sbrivio@redhat.com</email>
</author>
<published>2024-11-25T10:53:10+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=2bf8ffcf078c5933e6a31dbffbfb4dc31bfd7bc5'/>
<id>2bf8ffcf078c5933e6a31dbffbfb4dc31bfd7bc5</id>
<content type='text'>
By dropping the filter on prefix length, commit 910f4f910301
("test: Don't require 64-bit prefixes in perf tests") broke tests on
setups where two global unicast IPv6 addresses are available, which
is the typical case when the "host" is a VM running under passt with
addresses from SLAAC and DHCPv6, because two addresses will be
returned.

Pick the first one instead. We don't really care about the prefix
length, any of these addresses will work.

Fixes: 910f4f910301 ("test: Don't require 64-bit prefixes in perf tests")
Link: https://archives.passt.top/passt-dev/20241119214344.6b4a5b3a@elisabeth/
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Reviewed-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
By dropping the filter on prefix length, commit 910f4f910301
("test: Don't require 64-bit prefixes in perf tests") broke tests on
setups where two global unicast IPv6 addresses are available, which
is the typical case when the "host" is a VM running under passt with
addresses from SLAAC and DHCPv6, because two addresses will be
returned.

Pick the first one instead. We don't really care about the prefix
length, any of these addresses will work.

Fixes: 910f4f910301 ("test: Don't require 64-bit prefixes in perf tests")
Link: https://archives.passt.top/passt-dev/20241119214344.6b4a5b3a@elisabeth/
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Reviewed-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>conf, passt.1: Update --mac-addr default in usage() and man page</title>
<updated>2024-11-26T07:30:18+00:00</updated>
<author>
<name>Stefano Brivio</name>
<email>sbrivio@redhat.com</email>
</author>
<published>2024-11-25T10:46:33+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=6819b2e1020411661dc0487ee3614f012d45b049'/>
<id>6819b2e1020411661dc0487ee3614f012d45b049</id>
<content type='text'>
Fixes: 90e83d50a9bd ("Don't take "our" MAC address from the host")
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Reviewed-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fixes: 90e83d50a9bd ("Don't take "our" MAC address from the host")
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Reviewed-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>passt.1: Fix "default" note about --map-guest-addr</title>
<updated>2024-11-26T07:30:18+00:00</updated>
<author>
<name>Stefano Brivio</name>
<email>sbrivio@redhat.com</email>
</author>
<published>2024-11-25T10:40:53+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=b61be8468a804f5660cebcfdc10aa94b7ecac7a3'/>
<id>b61be8468a804f5660cebcfdc10aa94b7ecac7a3</id>
<content type='text'>
It's not true that there's no mapping by default: there's no mapping
in the --map-guest-addr sense, by default, but in that case
the default --map-host-loopback behaviour prevails.

While at it, fix a typo.

Fixes: 57b7bd2a48a1 ("fwd, conf: Allow NAT of the guest's assigned address")
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Reviewed-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It's not true that there's no mapping by default: there's no mapping
in the --map-guest-addr sense, by default, but in that case
the default --map-host-loopback behaviour prevails.

While at it, fix a typo.

Fixes: 57b7bd2a48a1 ("fwd, conf: Allow NAT of the guest's assigned address")
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Reviewed-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tcp: Acknowledge keep-alive segments, ignore them for the rest</title>
<updated>2024-11-21T05:52:36+00:00</updated>
<author>
<name>Stefano Brivio</name>
<email>sbrivio@redhat.com</email>
</author>
<published>2024-11-19T19:53:44+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=238c69f9af458e41dea5ad8c988dbf65b05b5172'/>
<id>238c69f9af458e41dea5ad8c988dbf65b05b5172</id>
<content type='text'>
RFC 9293, 3.8.4 says:

   Implementers MAY include "keep-alives" in their TCP implementations
   (MAY-5), although this practice is not universally accepted.  Some
   TCP implementations, however, have included a keep-alive mechanism.
   To confirm that an idle connection is still active, these
   implementations send a probe segment designed to elicit a response
   from the TCP peer.  Such a segment generally contains SEG.SEQ =
   SND.NXT-1 and may or may not contain one garbage octet of data.  If
   keep-alives are included, the application MUST be able to turn them
   on or off for each TCP connection (MUST-24), and they MUST default to
   off (MUST-25).

but currently, tcp_data_from_tap() is not aware of this and will
schedule a fast re-transmit on the second keep-alive (because it's
also a duplicate ACK), ignoring the fact that the sequence number was
rewound to SND.NXT-1.

ACK these keep-alive segments, reset the activity timeout, and ignore
them for the rest.
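
As a sketch of the check, assuming we track the next sequence number
expected from the peer, a keep-alive probe could be recognised like this
in Python. This is not the code in tcp_data_from_tap(): is_keepalive()
and its parameters are hypothetical.

```python
SEQ_MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits


def is_keepalive(seq, payload_len, expected_seq):
    """True if the segment looks like a keep-alive probe.

    Per RFC 9293, 3.8.4, a probe generally has SEG.SEQ one below the
    sender's SND.NXT (our expected_seq) and carries zero or one
    garbage octets.
    """
    one_behind = (seq + 1) % SEQ_MOD == expected_seq
    return one_behind and payload_len in (0, 1)
```

A segment matching this would be acknowledged and would reset the
activity timeout, without counting as a duplicate ACK.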

At some point, we could think of implementing an approximation of
keep-alive segments on outbound sockets, for example by setting
TCP_KEEPIDLE to 1, and a large TCP_KEEPINTVL, so that we send a single
keep-alive segment at approximately the same time, and never reset the
connection. That's beyond the scope of this fix, though.

Reported-by: Tim Besard &lt;tim.besard@gmail.com&gt;
Link: https://github.com/containers/podman/discussions/24572
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Reviewed-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
RFC 9293, 3.8.4 says:

   Implementers MAY include "keep-alives" in their TCP implementations
   (MAY-5), although this practice is not universally accepted.  Some
   TCP implementations, however, have included a keep-alive mechanism.
   To confirm that an idle connection is still active, these
   implementations send a probe segment designed to elicit a response
   from the TCP peer.  Such a segment generally contains SEG.SEQ =
   SND.NXT-1 and may or may not contain one garbage octet of data.  If
   keep-alives are included, the application MUST be able to turn them
   on or off for each TCP connection (MUST-24), and they MUST default to
   off (MUST-25).

but currently, tcp_data_from_tap() is not aware of this and will
schedule a fast re-transmit on the second keep-alive (because it's
also a duplicate ACK), ignoring the fact that the sequence number was
rewound to SND.NXT-1.

ACK these keep-alive segments, reset the activity timeout, and ignore
them for the rest.

At some point, we could think of implementing an approximation of
keep-alive segments on outbound sockets, for example by setting
TCP_KEEPIDLE to 1, and a large TCP_KEEPINTVL, so that we send a single
keep-alive segment at approximately the same time, and never reset the
connection. That's beyond the scope of this fix, though.

Reported-by: Tim Besard &lt;tim.besard@gmail.com&gt;
Link: https://github.com/containers/podman/discussions/24572
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Reviewed-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
</pre>
</div>
</content>
</entry>
</feed>
