passt, branch podman23739

Check behaviour of connect() with multicast destinations

2026-06-09T04:16:25+00:00

Work in progress debugging patch.

flow, treewide: Promote priority of selected flow-linked messages

2026-06-09T02:28:20+00:00

Most of our flow-specific log messages are debug level for fear of flooding
the logs, even when they report real error conditions that might be of
significance.

Now that we have the mechanisms for log message rate limiting, we can do
better.  Promote many flow related messages to warning or error level, with
rate limiting.  While we're there, add rate limiting to a handful of
existing warning or error level messages.

The general heuristic is to promote messages that report a failure which
is not something that should be triggered by the guest doing something
weird.  This mostly means failures from socket operations we expect to be
legitimate.

Adding the ratelimiting means plumbing the 'now' timestamp through much
more of the code, hence the large churn.
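
As an illustration of why rate limiting needs the 'now' timestamp, here is
a minimal sketch of a fixed-window limiter.  The struct, the function name
and the RL_INTERVAL / RL_BURST values are assumptions made for this
example, not the actual passt mechanism.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

#define RL_INTERVAL	1	/* Window length in seconds (assumed value) */
#define RL_BURST	5	/* Messages allowed per window (assumed value) */

/* Hypothetical per-message limiter state, not passt's real structure */
struct ratelimit {
	time_t window_start;	/* Start of the current window, in seconds */
	unsigned count;		/* Calls seen in the current window */
};

/* Return true if a message may be logged at time @now */
static bool ratelimit_ok(struct ratelimit *rl, const struct timespec *now)
{
	assert(rl && now);

	/* Start a new window once the interval has elapsed */
	if (now->tv_sec - rl->window_start >= RL_INTERVAL) {
		rl->window_start = now->tv_sec;
		rl->count = 0;
	}

	return rl->count++ < RL_BURST;
}
```

Every call site that might log has to be able to supply @now, which is
what forces the timestamp to be passed down through the callers.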

Signed-off-by: David Gibson

flow: Safer errno handling in flowside_connect() callers

2026-06-09T02:18:40+00:00

flowside_connect() behaves much like connect(2) itself, returning -1 on
error with errno set to the error code.  One of the callers, in
udp_flow_sock(), uses the errno code with flow_dbg_perror() *after* it's
called epoll_del() and close(), either of which could clobber errno.

Change flowside_connect() to use the more regular convention for internal
functions: return a negative errno code on error, rather than just -1.
Save it in the callers and use that rather than raw errno to print the
message.
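
The convention can be sketched as follows.  Both function names are
hypothetical stand-ins used only for this example; they are not the real
flowside_connect() or its callers.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical stand-in for the callee: 0 on success, negative errno on
 * failure, so the error code travels in the return value.
 */
static int connect_neg_errno(int s, const struct sockaddr *sa, socklen_t sl)
{
	if (connect(s, sa, sl) < 0)
		return -errno;

	return 0;
}

/* Hypothetical caller: save the code, clean up, then report it */
static int try_connect(int s, const struct sockaddr *sa, socklen_t sl)
{
	int rc = connect_neg_errno(s, sa, sl);

	if (rc < 0) {
		close(s);	/* May overwrite errno; rc is unaffected */
		return rc;	/* Report with strerror(-rc), not errno */
	}

	return 0;
}
```

Because the caller only ever looks at rc, it no longer matters what the
cleanup calls do to errno.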

Signed-off-by: David Gibson

flow: Include flow details with higher priority log messages

2026-06-09T02:18:40+00:00

Currently flow_log() and related functions / macros have a 'details'
parameter which indicates whether to add extra messages with details of the
flow's addresses.  This is still a bit awkward to invoke, and only used in
a few places.  Change the logic to automatically include the details if
and only if the log priority is higher (more severe) than LOG_DEBUG.

Rationale:

If at debug log level, there are already a bunch of debug messages tracking
the flow life cycle, which include those details (we make sure to retain
those).  It's usually pretty easy to cross reference a specific flow debug
message with the flow's history including the details.

If at higher log level, and we generate a flow-connected error or warning,
we don't have those life cycle messages.  So, just giving the flow index
doesn't really tell you anything about which flow tripped the error.
Adding the address details makes the error message significantly more
useful.
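
The rule above could be expressed as a one-line test.  This is a sketch
only: the helper name is hypothetical, and it assumes the comparison is
meant in terms of severity, where syslog priorities are numerically
smaller the more severe they are.

```c
#include <assert.h>
#include <stdbool.h>
#include <syslog.h>

/* Hypothetical helper: should a message at priority @pri carry the
 * flow's address details?  More severe priorities have smaller values,
 * so "higher priority than LOG_DEBUG" is a numerical less-than.
 */
static bool flow_log_wants_details(int pri)
{
	assert(pri >= LOG_EMERG && pri <= LOG_DEBUG);

	return pri < LOG_DEBUG;
}
```

With this, warnings and errors get the details automatically, while debug
messages rely on the existing life cycle messages.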

Signed-off-by: David Gibson

flow: Regularise flow specific logging helpers

2026-06-09T02:18:40+00:00

flow.h has a collection of logging helpers that automatically include
information about a specific flow.  Which variants are present is a bit
ad hoc, based on what we happened to want to use (e.g. there are no
LOG_WARNING level versions, at present).  There's also a rather awkward
and only occasionally used flow_log_details_() helper to print additional
log messages with more details of the flow (basically its addresses).
It's particularly awkward to try to combine that with ratelimiting.

Re-organise this to be based around a flow_log__() internal helper, which
has bool parameters to include strerror() / perror information and/or
the extra details.  Add wrapper macros for all combinations of perror,
ratelimiting and DEBUG/WARNING/ERR priorities.

Be a little more consistent about parameter order between the various
functions / macros / wrappers while we're at it.

Signed-off-by: David Gibson

tcp_splice: Improve EOF and read stall exit conditions

2026-06-05T07:46:52+00:00

At the end of our loop we have a conditional 'break' that exits if we're
at EOF on the read side and have nothing left in the pipe.  This makes
sense: at EOF there's nothing left to do read-side and with nothing in the
pipe there's nothing to do write side either.

The same is true if the read side hit an EAGAIN and the pipe is empty:
there's nothing we can do (for now) read side, and with an empty pipe
nothing write side either.  So, generalise the condition to exit on either
EOF or EAGAIN read side.

Furthermore, if the read side is at EOF or EAGAIN and there's already
nothing in the pipe before the write-side splice(), then that write side
splice() can't accomplish anything, so exit the loop early in that case
avoiding a harmless but unnecessary write-splice().
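
The generalised exit condition can be summarised as a small predicate.
The enum and function names here are hypothetical, chosen for this sketch
rather than taken from tcp_splice.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the last read-side splice() result */
enum read_state {
	READ_PROGRESS,	/* Read some data */
	READ_EOF,	/* splice() returned 0 */
	READ_EAGAIN,	/* splice() failed with EAGAIN */
};

/* True if the forwarding loop has nothing useful left to do for now:
 * the read side can't make progress and the pipe holds no data.
 */
static bool splice_loop_done(enum read_state rs, size_t pending)
{
	return (rs == READ_EOF || rs == READ_EAGAIN) && pending == 0;
}
```

Evaluating the same predicate before the write-side splice() gives the
early exit, and evaluating it at the end of the loop body gives the
generalised final break.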

Signed-off-by: David Gibson 
[sbrivio: Minor comment fix]
Signed-off-by: Stefano Brivio

passt, tcp: Inline CALL_PROTO_HANDLER() and merge tcp_timer()

2026-06-04T04:45:09+00:00

Since 260075bde769 ("tcp, udp, fwd: Run all port scanning from a
single timer"), CALL_PROTO_HANDLER() has only one user (tcp), so
inline it at the call site and remove the macro.

Merge tcp_timer() into tcp_defer_handler(), moving the timer interval
check there, matching the pattern used by flow_defer_handler() and
fwd_scan_ports_timer().

The weak declaration and null check for tcp_defer_handler are also
dropped as the function is always defined.
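
A minimal sketch of the "interval check inside the handler" pattern
follows.  The helper names and the TIMER_INTERVAL_MS value are assumptions
for this example and do not reflect the actual passt definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

#define TIMER_INTERVAL_MS	1000	/* Assumed value for illustration */

/* Difference @a - @b in milliseconds */
static long long ts_diff_ms(const struct timespec *a,
			    const struct timespec *b)
{
	return (a->tv_sec - b->tv_sec) * 1000LL +
	       (a->tv_nsec - b->tv_nsec) / 1000000;
}

/* Hypothetical check run at the top of a deferred handler: returns true,
 * and records @now in @last, only once the interval has elapsed.
 */
static bool defer_timer_due(struct timespec *last,
			    const struct timespec *now)
{
	assert(last && now);

	if (ts_diff_ms(now, last) < TIMER_INTERVAL_MS)
		return false;

	*last = *now;
	return true;
}
```

The caller invokes the handler unconditionally on every pass, and the
handler itself decides whether the periodic part of its work is due.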

Signed-off-by: Laurent Vivier 
Reviewed-by: David Gibson 
Signed-off-by: Stefano Brivio

tcp_splice: Remove questionable "optimisation" of pending bytes tracking

2026-06-04T04:35:29+00:00

We have a special path that avoids updating conn->pending when the amounts
read and written are equal.  This has a conceptual complexity cost, in
particular, it means that conn->pending[] is not accurate to its normal
meaning for a section of the loop body.

conn->pending[] shares a cacheline with conn->pipe[] and conn->s[], so it's
almost certainly cache-hot.  It's questionable that avoiding the update
of pending even outweighs the extra conditional branch, let alone saves
anything of significance.  Remove it.

This allows us to move the updates to conn->pending closer to the actual
splice() calls, making it easier to reason about its value.  It also lets
us move the conn->pending updates so they can piggy back on existing tests
rather than needing a conditional expression to avoid clobbering it when
splice() returns -1 (EAGAIN).

Signed-off-by: David Gibson 
Signed-off-by: Stefano Brivio

tcp_splice: Simplify / correct OUT_WAIT flag handling

2026-06-04T04:35:26+00:00

We set the OUT_WAIT flag if we stop forwarding due to EAGAIN, but there's
still data in the pipe.  That ensures we wake up when the output socket has
room to drain the pipe into.

We clear the OUT_WAIT flag when we complete forwarding on an EPOLLOUT
event, but that's not quite right.  Even though it's called on an EPOLLOUT,
tcp_splice_forward() could, in principle, empty the pipe, but also read
enough new data from the other side to fill it again.  That would set
OUT_WAIT internally, but it would be cleared after returning, meaning
we could miss a necessary wakeup.

The condition on whether we need write side wakeups is actually fairly
simple: we need them if and only if we return to the main loop with data
in the pipe.  Maintain that in a single place - right after we exit the
forwarding loop in tcp_splice_forward().
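
The "single place" rule amounts to deriving the flag from the pipe state
on exit from the loop.  In this sketch the bit value of OUT_WAIT and the
helper name are assumptions; only the rule itself comes from the
description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OUT_WAIT	(1U << 0)	/* Hypothetical bit value */

/* Return @flags with OUT_WAIT set if and only if @pending bytes remain
 * in the pipe, leaving all other bits untouched.
 */
static uint8_t out_wait_update(uint8_t flags, size_t pending)
{
	if (pending)
		return (uint8_t)(flags | OUT_WAIT);

	return (uint8_t)(flags & ~OUT_WAIT);
}
```

Since the result depends only on the pipe contents at that point, it
doesn't matter which event triggered the forwarding pass.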

Signed-off-by: David Gibson 
Signed-off-by: Stefano Brivio

tcp_splice: Simplify shutdown(2) handling

2026-06-04T04:35:23+00:00

At the end of tcp_splice_forward(), we check for half-closed connections
in either direction and propagate the FIN to the other side with a
shutdown(2).

However, it's unnecessary to check both directions: a FIN from side X will
cause an EPOLLRDHUP on side X's socket, which will trigger
tcp_splice_forward() from side X to side !X.  Likewise for the other side.
So we only need to check for "forward" FIN propagation.

Signed-off-by: David Gibson 
Signed-off-by: Stefano Brivio