<feed xmlns='http://www.w3.org/2005/Atom'>
<title>passt/tcp.h, branch ndebug</title>
<subtitle>Plug A Simple Socket Transport</subtitle>
<link rel='alternate' type='text/html' href='https://passt.top/passt/'/>
<entry>
<title>fwd: Unify TCP and UDP forwarding tables</title>
<updated>2026-03-11T21:11:30+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2026-03-11T12:03:11+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=d460ca3236bafa724686a5ad7f585d70962f7373'/>
<id>d460ca3236bafa724686a5ad7f585d70962f7373</id>
<content type='text'>
Currently TCP and UDP each have their own forwarding tables.  This is
awkward in a few places, where we need switch statements to select the
correct table.  More importantly, it would make things awkward and messy to
extend to other protocols in future, which we're likely to want to do.

Merge the TCP and UDP tables into a single table per (source) pif, with the
protocol given in each rule entry.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently TCP and UDP each have their own forwarding tables.  This is
awkward in a few places, where we need switch statements to select the
correct table.  More importantly, it would make things awkward and messy to
extend to other protocols in future, which we're likely to want to do.

Merge the TCP and UDP tables into a single table per (source) pif, with the
protocol given in each rule entry.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fwd: Split forwarding table from port scanning state</title>
<updated>2026-03-11T21:11:30+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2026-03-11T12:03:10+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=bb2e4dda0f7c9b92195ab84920430659425afbc0'/>
<id>bb2e4dda0f7c9b92195ab84920430659425afbc0</id>
<content type='text'>
For historical reasons, struct fwd_ports contained both the new forwarding
table and some older state related to port scanning / auto-forwarding
detection.  They are related, but keeping them together prevents some
future reworks we want to do.

Separate them into struct fwd_table (for the table) and struct fwd_scan
(for the scanning state).  Adjusting all the users makes for a logically
straightforward, but fairly extensive patch.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
For historical reasons, struct fwd_ports contained both the new forwarding
table and some older state related to port scanning / auto-forwarding
detection.  They are related, but keeping them together prevents some
future reworks we want to do.

Separate them into struct fwd_table (for the table) and struct fwd_scan
(for the scanning state).  Adjusting all the users makes for a logically
straightforward, but fairly extensive patch.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tcp: Remove stale description of port_to_tap field</title>
<updated>2026-03-11T21:11:30+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2026-03-11T12:03:07+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=d2438efb69558877a0b306247dbcba5ecbd5b794'/>
<id>d2438efb69558877a0b306247dbcba5ecbd5b794</id>
<content type='text'>
This field was removed in 163dc5f18899 ("Consolidate port forwarding
configuration into a common structure"), but the corresponding comment
describing it was not.  Fix the oversight.

Fixes: 163dc5f18899 ("Consolidate port forwarding configuration into a common structure")
Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This field was removed in 163dc5f18899 ("Consolidate port forwarding
configuration into a common structure"), but the corresponding comment
describing it was not.  Fix the oversight.

Fixes: 163dc5f18899 ("Consolidate port forwarding configuration into a common structure")
Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Add missing includes to headers</title>
<updated>2026-03-04T16:39:57+00:00</updated>
<author>
<name>Peter Foley</name>
<email>pefoley@google.com</email>
</author>
<published>2026-02-23T18:11:19+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=adbf5c135f19db5b6751393b7f5cbf516031bde8'/>
<id>adbf5c135f19db5b6751393b7f5cbf516031bde8</id>
<content type='text'>
Support build systems like bazel that check that headers are
self-contained.

Also update includes so that clang-include-cleaner succeeds.

Tested with:
clang-include-cleaner-19 --extra-arg=-D_GNU_SOURCE --extra-arg=-DPAGE_SIZE=4096 --extra-arg=-DVERSION=\"git\" --extra-arg=-DHAS_GETRANDOM *.h *.c

Signed-off-by: Peter Foley &lt;pefoley@google.com&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Support build systems like bazel that check that headers are
self-contained.

Also update includes so that clang-include-cleaner succeeds.

Tested with:
clang-include-cleaner-19 --extra-arg=-D_GNU_SOURCE --extra-arg=-DPAGE_SIZE=4096 --extra-arg=-DVERSION=\"git\" --extra-arg=-DHAS_GETRANDOM *.h *.c

Signed-off-by: Peter Foley &lt;pefoley@google.com&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tcp: Send TCP keepalive segments after a period of tap-side inactivity</title>
<updated>2026-02-24T23:17:45+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2026-02-04T11:41:37+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=d2f7c21cfb949f2b1587b9475917efdd6ac549fd'/>
<id>d2f7c21cfb949f2b1587b9475917efdd6ac549fd</id>
<content type='text'>
There are several circumstances in which a live, but idle TCP connection
can be forgotten by a guest, with no "on the wire" indication that this has
happened.  The most obvious is if the guest abruptly reboots.  A more
subtle case can happen with a half-closed connection, specifically one
in FIN_WAIT_2 state on the guest.  A connection can, legitimately, remain
in this state indefinitely.  If, however, a socket in this state is closed
by userspace, Linux at least will remove the kernel socket after 60s
(or as configured in the net.ipv4.tcp_fin_timeout sysctl).

Because there's no "on the wire" indication in these cases, passt will
pointlessly retain the connection in its flow table, at least until it is
removed by the inactivity timeout after several hours.

To avoid keeping connections around for so long in this state, add
functionality to periodically send TCP keepalive segments to the guest if
we've seen no activity on the tap interface.  If the guest is no longer
aware of the connection, it should respond with an RST which will let
passt remove the stale entry.

To do this we use a method similar to the inactivity timeout - a 1-bit
page replacement / clock algorithm, but with a shorter interval, and only
checking for tap side activity.  Currently we use a 300s interval, meaning
we'll send a keepalive after 5-10 minutes of (tap side) inactivity.
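
As an illustration only (not the passt implementation), the 5-10 minute
figure follows from the two-pass nature of the clock scheme: the first
timer tick after the last tap-side activity only sets the flag, and the
keepalive goes out on the tick after that.  A minimal Python sketch, where
KEEPALIVE_INTERVAL and idle_before_keepalive() are hypothetical names and
ticks are assumed to fire at exact multiples of the interval:

```python
KEEPALIVE_INTERVAL = 300  # seconds; value from the text, name is hypothetical


def idle_before_keepalive(last_activity, interval=KEEPALIVE_INTERVAL):
    """Seconds between the last tap-side activity and the keepalive.

    Assumes timer ticks at exact multiples of `interval`, and that
    `last_activity` is a whole number of seconds since the timer started.
    """
    # First tick strictly after the activity: the flag is only set here
    marking_tick = (last_activity // interval + 1) * interval
    # Following tick: the flag is still set, so the keepalive is sent
    return marking_tick + interval - last_activity


print(idle_before_keepalive(1))    # activity just after a tick: 599
print(idle_before_keepalive(299))  # activity just before a tick: 301
```

Activity just after a tick waits almost two full intervals, activity just
before a tick waits barely more than one, hence the 5-10 minute range.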

Link: https://bugs.passt.top/show_bug.cgi?id=179
Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There are several circumstances in which a live, but idle TCP connection
can be forgotten by a guest, with no "on the wire" indication that this has
happened.  The most obvious is if the guest abruptly reboots.  A more
subtle case can happen with a half-closed connection, specifically one
in FIN_WAIT_2 state on the guest.  A connection can, legitimately, remain
in this state indefinitely.  If, however, a socket in this state is closed
by userspace, Linux at least will remove the kernel socket after 60s
(or as configured in the net.ipv4.tcp_fin_timeout sysctl).

Because there's no "on the wire" indication in these cases, passt will
pointlessly retain the connection in its flow table, at least until it is
removed by the inactivity timeout after several hours.

To avoid keeping connections around for so long in this state, add
functionality to periodically send TCP keepalive segments to the guest if
we've seen no activity on the tap interface.  If the guest is no longer
aware of the connection, it should respond with an RST which will let
passt remove the stale entry.

To do this we use a method similar to the inactivity timeout - a 1-bit
page replacement / clock algorithm, but with a shorter interval, and only
checking for tap side activity.  Currently we use a 300s interval, meaning
we'll send a keepalive after 5-10 minutes of (tap side) inactivity.

Link: https://bugs.passt.top/show_bug.cgi?id=179
Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tcp: Re-introduce inactivity timeouts based on a clock algorithm</title>
<updated>2026-02-24T23:17:38+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2026-02-04T11:41:35+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=1820103fbbf13df98257a3f5c3ba625de624b0b3'/>
<id>1820103fbbf13df98257a3f5c3ba625de624b0b3</id>
<content type='text'>
We previously had a mechanism to remove TCP connections which were
inactive for 2 hours.  That was broken for a long time, due to poor
interactions with the timerfd handling, so we removed it.

Adding this long scale timer onto the timerfd handling, which mostly
handles much shorter timeouts, is tricky to reason about.  However, for the
inactivity timeouts, we don't require precision.  Instead, we can use
a 1-bit page replacement / "clock" algorithm.  Every INACTIVITY_INTERVAL
(2 hours), a global timer marks every TCP connection as tentatively
inactive.  That flag is cleared if we get any events, either tap side or
socket side.

If the inactive flag is still set when the next INACTIVITY_INTERVAL expires
then the connection has been inactive for an extended period and we reset
and close it.  In practice this means that connections will be removed
after 2-4 hours of inactivity.
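
The scheme above can be sketched roughly as follows.  This is a simplified
Python model for illustration, not the C code in passt: the Conn class and
the on_event() and interval_tick() helpers are hypothetical names, and
"closing" is reduced to setting a flag.

```python
class Conn:
    """Toy stand-in for a TCP connection entry (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.inactive = False  # the single "clock" bit
        self.closed = False


def on_event(conn):
    """Any tap-side or socket-side event clears the inactive flag."""
    conn.inactive = False


def interval_tick(conns):
    """Run once per INACTIVITY_INTERVAL by the global timer."""
    for conn in conns:
        if conn.closed:
            continue
        if conn.inactive:
            # Flag survived a whole interval: reset and close
            conn.closed = True
        else:
            # Mark as tentatively inactive until the next tick
            conn.inactive = True


busy, idle = Conn("busy"), Conn("idle")
conns = [busy, idle]
interval_tick(conns)  # both connections marked tentatively inactive
on_event(busy)        # activity on one of them clears its flag
interval_tick(conns)  # idle is closed, busy is merely re-marked
print(idle.closed, busy.closed)  # True False
```

Only one bit of state per connection is needed, and no per-connection
timer has to be armed or rearmed on activity.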

This is not a true fix for bug 179, but it does mitigate the damage, by
limiting the time that inactive connections will remain around,

Link: https://bugs.passt.top/show_bug.cgi?id=179
Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We previously had a mechanism to remove TCP connections which were
inactive for 2 hours.  That was broken for a long time, due to poor
interactions with the timerfd handling, so we removed it.

Adding this long scale timer onto the timerfd handling, which mostly
handles much shorter timeouts, is tricky to reason about.  However, for the
inactivity timeouts, we don't require precision.  Instead, we can use
a 1-bit page replacement / "clock" algorithm.  Every INACTIVITY_INTERVAL
(2 hours), a global timer marks every TCP connection as tentatively
inactive.  That flag is cleared if we get any events, either tap side or
socket side.

If the inactive flag is still set when the next INACTIVITY_INTERVAL expires
then the connection has been inactive for an extended period and we reset
and close it.  In practice this means that connections will be removed
after 2-4 hours of inactivity.

This is not a true fix for bug 179, but it does mitigate the damage, by
limiting the time that inactive connections will remain around.

Link: https://bugs.passt.top/show_bug.cgi?id=179
Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fwd, tcp, udp: Add forwarding rule to listening socket epoll references</title>
<updated>2026-01-18T11:48:06+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2026-01-16T00:59:25+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=fe37028466d3d29d74ebf53e9c53c9f139fbc74e'/>
<id>fe37028466d3d29d74ebf53e9c53c9f139fbc74e</id>
<content type='text'>
Now that we have a table of all our forwarding rules, every listening
socket can be associated with a specific rule.  Add an index allowing us to
locate that rule from the socket's epoll reference.  We don't use it yet,
but we'll use it to optimise rule lookup when forwarding new flows.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Now that we have a table of all our forwarding rules, every listening
socket can be associated with a specific rule.  Add an index allowing us to
locate that rule from the socket's epoll reference.  We don't use it yet,
but we'll use it to optimise rule lookup when forwarding new flows.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fwd, tcp, udp: Set up listening sockets based on forward table</title>
<updated>2026-01-18T11:47:47+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2026-01-16T00:59:19+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=b223bec48213060304c09882ce5b3055b15b7e07'/>
<id>b223bec48213060304c09882ce5b3055b15b7e07</id>
<content type='text'>
Previously we created inbound listening sockets as we parsed the forwarding
options (-t, -u) whereas outbound listening sockets were created during
{tcp,udp}_init().  Now that we have a data structure recording the full
details of the listening options we can move all listening socket creation
to {tcp,udp}_init().  This means that errors for either direction are
detected and reported the same way.

Introduce fwd_listen_sync() which synchronizes the state of listening
sockets to the forwarding rules table, both for fixed and automatic
forwards.

This does cause a change in semantics for "exclude only" port
specifications.  Previously an option like -t ~6000 wouldn't cause a
fatal error, as long as we could bind at least one port.  Now, it
requires at least one port for each generated rule; that is for each
of the contiguous blocks of ports the specification resolves to.  With
typical ephemeral port settings that's one port each in 1..5999,
6001..32767 and 61000..65535.
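
For illustration, the way an "exclude only" specification breaks up into
contiguous blocks can be reproduced with a short Python sketch.  It is not
taken from the passt sources: contiguous_blocks() is a hypothetical helper,
and the ephemeral range 32768..60999 is an assumption (the usual Linux
default) chosen to match the blocks listed above.

```python
def contiguous_blocks(excluded, first=1, last=65535):
    """Return (start, end) pairs of contiguous ports not in `excluded`."""
    blocks = []
    start = None
    # Iterate one past `last` so that a block still open at the end is closed
    for port in range(first, last + 2):
        usable = port != last + 1 and port not in excluded
        if usable and start is None:
            start = port
        elif not usable and start is not None:
            blocks.append((start, port - 1))
            start = None
    return blocks


# -t ~6000, with ephemeral ports 32768..60999 (assumed) also left out
excluded = {6000} | set(range(32768, 61000))
blocks = contiguous_blocks(excluded)
print(blocks)  # [(1, 5999), (6001, 32767), (61000, 65535)]
```

Under the new semantics each of these three blocks becomes its own rule,
and each rule needs at least one port that can be bound.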

Preserving the exact behaviour for this case would require a considerably
more complex data structure, so I'm hoping this is a sufficiently niche
case for the change to be acceptable.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Previously we created inbound listening sockets as we parsed the forwarding
options (-t, -u) whereas outbound listening sockets were created during
{tcp,udp}_init().  Now that we have a data structure recording the full
details of the listening options we can move all listening socket creation
to {tcp,udp}_init().  This means that errors for either direction are
detected and reported the same way.

Introduce fwd_listen_sync() which synchronizes the state of listening
sockets to the forwarding rules table, both for fixed and automatic
forwards.

This does cause a change in semantics for "exclude only" port
specifications.  Previously an option like -t ~6000 wouldn't cause a
fatal error, as long as we could bind at least one port.  Now, it
requires at least one port for each generated rule; that is for each
of the contiguous blocks of ports the specification resolves to.  With
typical ephemeral port settings that's one port each in 1..5999,
6001..32767 and 61000..65535.

Preserving the exact behaviour for this case would require a considerably
more complex data structure, so I'm hoping this is a sufficiently niche
case for the change to be acceptable.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fwd, tcp, udp: Consolidate epoll refs for listening sockets</title>
<updated>2026-01-10T19:54:13+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2026-01-08T02:14:50+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=9ea9dde5b5f64562d5fb0385dfee967a8cfec0f3'/>
<id>9ea9dde5b5f64562d5fb0385dfee967a8cfec0f3</id>
<content type='text'>
The epoll references we use for TCP listening sockets and UDP "listening"
sockets have identical information.  Combine them into a single structure.
Note that, despite the name, epoll_ref.udp was only ever used for
"listening" sockets, not flow sockets.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Reviewed-by: Laurent Vivier &lt;lvivier@redhat.com&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The epoll references we use for TCP listening sockets and UDP "listening"
sockets have identical information.  Combine them into a single structure.
Note that, despite the name, epoll_ref.udp was only ever used for
"listening" sockets, not flow sockets.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Reviewed-by: Laurent Vivier &lt;lvivier@redhat.com&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tcp: Remove unused tcp_epoll_ref</title>
<updated>2026-01-10T19:54:13+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2026-01-08T02:14:48+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=ad5670a980d0e44399f7d982f377d3744365a039'/>
<id>ad5670a980d0e44399f7d982f377d3744365a039</id>
<content type='text'>
This union has been unused for some time.  Remove it.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Reviewed-by: Laurent Vivier &lt;lvivier@redhat.com&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This union has been unused for some time.  Remove it.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Reviewed-by: Laurent Vivier &lt;lvivier@redhat.com&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
