passt, branch 2023_03_29.b10b983

fedora: Adjust path for SELinux policy and interface file to latest guidelines

2023-03-29T20:11:07+00:00

Forget about:
  https://fedoraproject.org/wiki/SELinux_Policy_Modules_Packaging_Draft

and:
  https://fedoraproject.org/wiki/PackagingDrafts/SELinux_Independent_Policy

The guidelines to follow are:
  https://fedoraproject.org/wiki/SELinux/IndependentPolicy

Start from fixing the most pressing issue, that is, a path conflict
with policy-selinux-devel about passt.if, and, while at it, adjust
the installation paths for policy files too.

Reported-by: Xose Vazquez Perez 
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2182476
Signed-off-by: Stefano Brivio

fedora: Don't install useless SELinux interface file for pasta

2023-03-29T11:48:12+00:00

That was meant to be an example, and I just dropped it in the
previous commit -- passt.if should be more than enough as a possible
example.

Reported-by: Carl G. 
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2182145
Signed-off-by: Stefano Brivio

selinux: Drop useless interface file for pasta

2023-03-29T11:48:12+00:00

This was meant to be an example, but I managed to add syntax errors
to it. Drop it altogether.

Reported-by: Carl G. 
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2182145
Signed-off-by: Stefano Brivio

conf: Allow binding to ports on an interface without a specific address

2023-03-29T11:48:12+00:00

Somebody might want to bind listening sockets to a specific
interface, but not a specific address, and there isn't really a
reason to prevent that. For example:

  -t %eth0/2022

Alternatively, we support options such as -t 0.0.0.0%eth0/2022 and
-t ::%eth0/2022, but not together, for the same port.

Enable this kind of syntax and add examples to the man page.

Reported-by: Paul Holzinger 
Link: https://github.com/containers/podman/issues/14425#issuecomment-1485192195
Signed-off-by: Stefano Brivio

tcp: Clear ACK_FROM_TAP_DUE also on unchanged ACK sequence from peer

2023-03-29T11:47:17+00:00

Since commit cc6d8286d104 ("tcp: Reset ACK_FROM_TAP_DUE flag only as
needed, update timer"), we don't clear ACK_FROM_TAP_DUE whenever we
process an ACK segment, but, more correctly, only if we're really not
waiting for a further ACK segment, that is, only if the acknowledged
sequence matches what we sent.

In the new function implementing this, tcp_update_seqack_from_tap(),
we also reset the retransmission counter and store the updated ACK
sequence. Both should be done iff forward progress is acknowledged,
implied by the fact that the new ACK sequence is greater than the
one we previously stored.

At that point, it looked natural to also include the statements that
clear and set the ACK_FROM_TAP_DUE flag inside the same conditional
block: if we're not making forward progress, the need for an ACK, or
lack thereof, should remain unchanged.

There might be cases where this isn't true, though: without the
previous commit 4e73e9bd655c ("tcp: Don't special case the handling
of the ack of a syn"), this would happen if a tap-side client
initiated a connection, and the server didn't send any data.

At that point we would never, in the established state of the
connection, call tcp_update_seqack_from_tap() with reported forward
progress.

That issue itself is fixed by the previous commit, now, but clearing
ACK_FROM_TAP_DUE only on ACK sequence progress doesn't really follow
any logic.

Clear the ACK_FROM_TAP_DUE flag regardless of reported forward
progress. If we clear it when it's already unset, conn_flag() will do
nothing with it.

This doesn't fix any known functional issue, rather a conceptual one.

Fixes: cc6d8286d104 ("tcp: Reset ACK_FROM_TAP_DUE flag only as needed, update timer")
Reported-by: David Gibson 
Analysed-by: David Gibson 
Signed-off-by: Stefano Brivio

tcp: Don't special case the handling of the ack of a syn

2023-03-29T11:47:07+00:00

TCP treats the SYN packets as though they occupied 1 byte in the logical
data stream described by the sequence numbers.  That is, the very first ACK
(or SYN-ACK) each side sends should acknowledge a sequence number one
greater than the initial sequence number given in the SYN or SYN-ACK it's
responding to.

In passt we were tracking that by advancing conn->seq_to_tap by one when
we send a SYN or SYN-ACK (in tcp_send_flag()).  However, we also
initialized conn->seq_ack_from_tap, representing the acks we've already
seen from the tap side, to ISN+1, meaning we treated it has having
acknowledged the SYN before it actually did.

There were apparently reasons for this in earlier versions, but it causes
problems now.  Because of this when we actually did receive the initial ACK
or SYN-ACK, we wouldn't see the acknoweldged serial number as advancing,
and so wouldn't clear the ACK_FROM_TAP_DUE flag.

In most cases we'd get away because subsequent packets would clear the
flag.  However if one (or both) sides didn't send any data, the other side
would (correctly) keep sending ISN+1 as the acknowledged sequence number,
meaning we would never clear the ACK_FROM_TAP_DUE flag.  That would mean
we'd treat the connection as if we needed to retransmit (although we had
0 bytes to retransmit), and eventaully (after around 30s) reset the
connection due to too many retransmits.  Specifically this could cause the
iperf3 throughput tests in the testsuite to fail if set for a long enough
test period.

Correct this by initializing conn->seq_ack_from_tap to the ISN and only
advancing it when we actually get the first ACK (or SYN-ACK).

Signed-off-by: David Gibson 
Signed-off-by: Stefano Brivio

tcp: Clarify allowed state for tcp_data_from_tap()

2023-03-29T11:46:58+00:00

Comments suggest that this should only be called for an ESTABLISHED
connection.  However, it's non-trivial to ascertain that from the actual
control flow in the caller.  Add an ASSERT() to make it very clear that
this is only called in ESTABLISHED state.

In fact, there were some circumstances where it could be called on a CLOSED
connection.  In a sense that is "established", but with that assert this
does require specific (trivial) handling to avoid a spurious abort().

Signed-off-by: David Gibson 
Signed-off-by: Stefano Brivio

tcp: Don't reset ACK_TO_TAP_DUE on any ACK, reschedule timer as needed

2023-03-21T22:14:58+00:00

This is mostly symmetric with commit cc6d8286d104 ("tcp: Reset
ACK_FROM_TAP_DUE flag only as needed, update timer"): we shouldn't
reset the ACK_TO_TAP_DUE flag on any inbound ACK segment, but only
once we acknowledge everything we received from the guest or the
container.

If we don't, a client might unnecessarily hold off further data,
especially during slow start, and in general we won't converge to the
usable bandwidth.

This is very visible especially with traffic tests on links with
non-negligible latency, such as in the reported issue. There, a
public iperf3 server sometimes aborts the test due do what appears to
be a low iperf3's --rcv-timeout (probably less than a second). Even
if this doesn't happen, the throughput will converge to a fraction of
the usable bandwidth.

Clear ACK_TO_TAP_DUE if we acknowledged everything, set it if we
didn't, and reschedule the timer in case the flag is still set as the
timer expires.

While at it, decrease the ACK timer interval to 10ms.

A 50ms interval is short enough for any bandwidth-delay product I had
in mind (local connections, or non-local connections with limited
bandwidth), but here I am, testing 1gbps transfers to a peer with
100ms RTT.

Indeed, we could eventually make the timer interval dependent on the
current window and estimated bandwidth-delay product, but at least
for the moment being, 10ms should be long enough to avoid any
measurable syscall overhead, yet usable for any real-world
application.

Reported-by: Lukas Mrtvy 
Link: https://bugs.passt.top/show_bug.cgi?id=44
Signed-off-by: Stefano Brivio

tcp: When a connection flag it set, don't negate it for debug print

2023-03-21T18:39:55+00:00

Fix a copy and paste typo I added in commit 5474bc5485d8 ("tcp,
tcp_splice: Get rid of false positive CWE-394 Coverity warning from
fls()") and --debug altogether.

Fixes: 5474bc5485d8 ("tcp, tcp_splice: Get rid of false positive CWE-394 Coverity warning from fls()")
Signed-off-by: Stefano Brivio

Fix false positive if cppcheck doesn't give a false positive

2023-03-21T15:38:44+00:00

da46fdac "tcp: Suppress knownConditionTrueFalse cppcheck false positive"
introduced a suppression to work around a cppcheck bug causing a false
positive warning.  However, the suppression will itself cause a spurious
unmatchedSuppression warning if used with a version of cppcheck from before
the bug was introduced.  That includes the packaged version of cppcheck in
Fedora.

Suppress the unmatchedSuppression as well.

Fixes: da46fdac3605 ("tcp: Suppress knownConditionTrueFalse cppcheck false positive")
Signed-off-by: David Gibson 
Signed-off-by: Stefano Brivio