From b145441913eef6f8885b6b84531e944ff593790c Mon Sep 17 00:00:00 2001
From: Stefano Brivio
Date: Thu, 2 Oct 2025 00:41:54 +0200
Subject: tcp: Don't consider FIN flags with mismatching sequence
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

If a guest or container sends us a FIN segment but its sequence number
doesn't match the highest sequence of data we *accepted* (not
necessarily the highest sequence we received), that is,
conn->seq_from_tap, plus any data we're accepting in the current
batch, we should discard the flag (not necessarily the segment),
because there's still data we need to receive (again) before the end
of the stream.

If we consider those FIN flags as such, we'll end up in the
situation described below.

Here, 192.168.10.102 is a HTTP server in a Podman container, and
192.168.10.44 is a client fetching approximately 121 KB of data from
it:

  82  2.026811  192.168.10.102 → 192.168.10.44  54  TCP  55414 → 44992 [FIN, ACK] Seq=121441 Ack=143 Win=65536 Len=0

the server is done sending

  83  2.026898  192.168.10.44 → 192.168.10.102  54  TCP  44992 → 55414 [ACK] Seq=143 Ack=114394 Win=216192 Len=0

pasta (client) acknowledges a previous sequence, because of
a short sendmsg()

  84  2.027324  192.168.10.44 → 192.168.10.102  54  TCP  44992 → 55414 [FIN, ACK] Seq=143 Ack=114394 Win=216192 Len=0

pasta (client) sends FIN, ACK as the client has no more data to
send (a single GET request), while still acknowledging a previous
sequence, because the retransmission didn't happen yet

  85  2.027349  192.168.10.102 → 192.168.10.44  54  TCP  55414 → 44992 [ACK] Seq=121442 Ack=144 Win=65536 Len=0

the server acknowledges the FIN, ACK

  86  2.224125  192.168.10.102 → 192.168.10.44  4150  TCP  [TCP Retransmission] 55414 → 44992 [ACK] Seq=114394 Ack=144 Win=65536 Len=4096 [TCP segment of a reassembled PDU]

and finally a retransmission comes, but as we wrongly switched to
the CLOSE-WAIT state,

  87  2.224202  192.168.10.44 → 192.168.10.102  54  TCP  44992 → 55414 [RST] Seq=144 Win=0 Len=0

we consider frame #86 as an acknowledgement for the FIN segment we
sent, and close the connection, while we still had to re-receive
(and finally send) the missing data segment, instead.

Link: https://github.com/containers/podman/issues/27179
Signed-off-by: Stefano Brivio
---
 tcp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tcp.c b/tcp.c
index a648174..0cca3de 100644
--- a/tcp.c
+++ b/tcp.c
@@ -1774,7 +1774,7 @@ static int tcp_data_from_tap(const struct ctx *c, struct tcp_tap_conn *conn,
 			}
 		}
 
-		if (th->fin)
+		if (th->fin && seq == seq_from_tap)
 			fin = 1;
 
 		if (!len)
-- 
cgit v1.2.3
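
Illustration (not part of the patch): a minimal, standalone sketch of the
check this change introduces, replayed against frames #82 and #86 from the
trace. The names struct seg, batch_accepts_fin() and fin_check_example()
are hypothetical and don't exist in tcp.c. The model is also simplified:
it accepts payload only when a segment starts exactly at the next expected
sequence, whereas the real tcp_data_from_tap() handles more cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one TCP segment coming from the guest */
struct seg {
	uint32_t seq;	/* sequence number of the first payload byte */
	uint32_t len;	/* payload length in bytes */
	bool fin;	/* FIN flag set in the header */
};

/*
 * Process a batch of segments. *seq_from_tap is the highest sequence we
 * accepted so far, and it advances as in-order payload is accepted.
 * Returns true only if a FIN flag was honoured.
 */
static bool batch_accepts_fin(const struct seg *segs, size_t n,
			      uint32_t *seq_from_tap)
{
	bool fin = false;
	size_t i;

	for (i = 0; i < n; i++) {
		const struct seg *s = &segs[i];

		/* Honour FIN only if no data is missing before this segment */
		if (s->fin && s->seq == *seq_from_tap)
			fin = true;

		/* Simplification: accept payload only if exactly in order */
		if (s->seq == *seq_from_tap)
			*seq_from_tap += s->len;
	}

	return fin;
}

/* Replays frames #82 (early FIN) and #86 (retransmission) from the trace */
static void fin_check_example(void)
{
	uint32_t seq_from_tap = 114394;
	struct seg early_fin = { .seq = 121441, .len = 0, .fin = true };
	struct seg retrans = { .seq = 114394, .len = 4096, .fin = false };

	/* FIN is ahead of what we accepted: the flag is discarded */
	assert(!batch_accepts_fin(&early_fin, 1, &seq_from_tap));
	assert(seq_from_tap == 114394);

	/* The retransmitted data is then accepted as usual */
	assert(!batch_accepts_fin(&retrans, 1, &seq_from_tap));
	assert(seq_from_tap == 114394 + 4096);
}
```

With the old check (th->fin alone), the first call in fin_check_example()
would report a FIN while 7047 bytes (121441 - 114394) were still missing,
which is the premature switch to CLOSE-WAIT described above.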