| author | Stefano Brivio <sbrivio@redhat.com> | 2025-12-13 14:19:13 +0100 |
|---|---|---|
| committer | Stefano Brivio <sbrivio@redhat.com> | 2025-12-15 08:11:54 +0100 |
| commit | b40f5cd8c8e16c6eceb1f26eb895527fda84068b | |
| tree | 5997fab5dbbacf13e304a7dbe8395c17cda988a5 | |
| parent | 35fa86a7871767d6a382b13e71c429abf47f88ab | |
tcp: Use less-than-MSS window on no queued data, or no data sent recently
We limit the window advertised to guests and containers to the
available length of the sending buffer and, since commit cf1925fb7b77
("tcp: Don't limit window to less-than-MSS values, use zero instead"),
if that limit is less than the MSS, we approximate it to zero.
This way, we'll trigger a window update as soon as we realise that we
can advertise a larger value, just like we do in all other cases where
we advertise a zero-sized window.
By doing that, we don't wait for the peer to send us data before we
update the window. This matters because the guest or container might
be trying to aggregate more data and won't send us anything at all if
the advertised window is too small.
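The clamping behaviour described above can be sketched as follows. This is an illustrative reconstruction, not the actual passt code; the function name and parameters are hypothetical:

```c
#include <stdint.h>

/* Hypothetical sketch of the pre-fix behaviour: clamp the advertised
 * window to the available send buffer, and approximate any
 * less-than-MSS value to zero, so that a window update is triggered
 * as soon as a larger value can be advertised.
 */
static uint32_t wnd_advertise(uint32_t sndbuf_avail, uint32_t mss)
{
	if (sndbuf_avail < mss)
		return 0;	/* advertise zero, send update later */

	return sndbuf_avail;
}
```

With the numbers from the report below (46080 bytes of send buffer against a 65480-byte MSS), this sketch always yields a zero window.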
However, this might be problematic in two situations:
1. one, reported by Tyler, where the remote (receiving) peer
advertises a window that's smaller than what we usually get and
very close to the MSS, causing the kernel to give us a starting
size of the buffer that's less than the MSS we advertise to the
guest or container.
If this happens, we'll never advertise a non-zero window after
the handshake, and the container or guest will never send us any
data at all.
With a simple 'curl https://cloudflare.com/', we get, with default
TCP memory parameters, a 65535-byte window from the peer, and 46080
bytes of initial sending buffer from the kernel. But we advertised
a 65480-byte MSS, and we'll never actually receive the client
request.
This seems to be specific to Cloudflare for some reason, probably
deriving from a particular tuning of TCP parameters on their
servers.
2. another one, hypothesised by David, where the peer might only be
willing to process (and acknowledge) data in batches.
We might have queued outbound data which is, at the same time, not
enough to fill one of these batches and be acknowledged and removed
from the sending queue, but enough to make our available buffer
smaller than the MSS, and the connection will hang.
Take care of both cases by:
a. not approximating the sending buffer to zero if we have no outbound
queued data at all, because in that case we don't expect the
available buffer to increase if we don't send any data, so there's
no point in waiting for it to grow larger than the MSS.
This fixes problem 1. above.
b. also using the full sending buffer size if we haven't sent data to
the socket for a while (reported by tcpi_last_data_sent). This part
was already suggested by David in:
https://archives.passt.top/passt-dev/aTZzgtcKWLb28zrf@zatzit/
and I'm now picking ten times the RTT as a somewhat arbitrary
threshold.
This is meant to take care of potential problem 2. above, but it
also happens to fix 1.
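The two conditions a. and b. can be sketched as a single predicate. The function and parameter names are illustrative, not the actual passt identifiers; the field semantics follow the kernel's struct tcp_info, where tcpi_last_data_sent is in milliseconds and tcpi_rtt in microseconds:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sketch: return true if we should keep the full
 * buffer-limited window instead of approximating it to zero.
 */
static bool wnd_keep_full(uint32_t queued_bytes,
			  uint32_t tcpi_last_data_sent,	/* ms */
			  uint32_t tcpi_rtt)		/* us */
{
	/* a. no queued outbound data: the buffer won't grow on its
	 * own, so there's no point waiting for it to exceed the MSS
	 */
	if (!queued_bytes)
		return true;

	/* b. no data sent for more than ten times the RTT (convert
	 * last_data_sent from ms to us before comparing)
	 */
	return (uint64_t)tcpi_last_data_sent * 1000 >
	       10 * (uint64_t)tcpi_rtt;
}
```

In case 1., the send queue stays empty after the handshake, so condition a. alone already keeps the window open; in case 2., the idle-time check b. eventually lets the queued batch through.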
Reported-by: Tyler Cloud <tcloud@redhat.com>
Link: https://bugs.passt.top/show_bug.cgi?id=183
Suggested-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
