<feed xmlns='http://www.w3.org/2005/Atom'>
<title>passt, branch 2023_11_10.5ec3634</title>
<subtitle>Plug A Simple Socket Transport</subtitle>
<link rel='alternate' type='text/html' href='https://passt.top/passt/'/>
<entry>
<title>tap, pasta: Handle short writes to /dev/tap</title>
<updated>2023-11-10T15:51:33+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2023-11-08T03:17:54+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=5ec3634b07215337c2e69d88f9b1d74711897d7d'/>
<id>5ec3634b07215337c2e69d88f9b1d74711897d7d</id>
<content type='text'>
tap_send_frames_pasta() sends frames to the namespace by sending them to
our the /dev/tap device.  If that write() returns an error, we already
handle it.  However we don't handle the case where the write() returns
short, meaning we haven't successfully transmitted the whole frame.

I don't know if this can ever happen with the kernel tap device, but we
should at least report the case so we don't get a cryptic failure.  For
the purposes of the return value for tap_send_frames_pasta() we treat this
case as though it was an error (on the grounds that a partial frame is no
use to the namespace).

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
tap_send_frames_pasta() sends frames to the namespace by sending them to
our the /dev/tap device.  If that write() returns an error, we already
handle it.  However we don't handle the case where the write() returns
short, meaning we haven't successfully transmitted the whole frame.

I don't know if this can ever happen with the kernel tap device, but we
should at least report the case so we don't get a cryptic failure.  For
the purposes of the return value for tap_send_frames_pasta() we treat this
case as though it was an error (on the grounds that a partial frame is no
use to the namespace).

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tap, pasta: Handle incomplete tap sends for pasta too</title>
<updated>2023-11-10T15:51:33+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2023-11-08T03:17:53+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=f0776eac07cfae76a51b5f55d5f95c8a5c62640f'/>
<id>f0776eac07cfae76a51b5f55d5f95c8a5c62640f</id>
<content type='text'>
Since a469fc39 ("tcp, tap: Don't increase tap-side sequence counter for
dropped frames") we've handled more gracefully the case where we get data
from the socket side, but are temporarily unable to send it all to the tap
side (e.g. due to full buffers).

That code relies on tap_send_frames() returning the number of frames it
successfully sent, which in turn gets it from tap_send_frames_passt() or
tap_send_frames_pasta().

While tap_send_frames_passt() has returned that information since b62ed9ca
("tap: Don't pcap frames that didn't get sent"), tap_send_frames_pasta()
always returns as though it succesfully sent every frame.  However there
certainly are cases where it will return early without sending all frames.
Update it report that properly, so that the calling functions can handle it
properly.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since a469fc39 ("tcp, tap: Don't increase tap-side sequence counter for
dropped frames") we've handled more gracefully the case where we get data
from the socket side, but are temporarily unable to send it all to the tap
side (e.g. due to full buffers).

That code relies on tap_send_frames() returning the number of frames it
successfully sent, which in turn gets it from tap_send_frames_passt() or
tap_send_frames_pasta().

While tap_send_frames_passt() has returned that information since b62ed9ca
("tap: Don't pcap frames that didn't get sent"), tap_send_frames_pasta()
always returns as though it succesfully sent every frame.  However there
certainly are cases where it will return early without sending all frames.
Update it report that properly, so that the calling functions can handle it
properly.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tcp: Don't use TCP_WINDOW_CLAMP</title>
<updated>2023-11-10T05:42:19+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2023-11-09T09:54:00+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=cf3eeba6c0d7c7b33824b6e1c53f14dcb90437ae'/>
<id>cf3eeba6c0d7c7b33824b6e1c53f14dcb90437ae</id>
<content type='text'>
On the L2 tap side, we see TCP headers and know the TCP window that the
ultimate receiver is advertising.  In order to avoid unnecessary buffering
within passt/pasta (or by the kernel on passt/pasta's behalf) we attempt
to advertise that window back to the original sock-side sender using
TCP_WINDOW_CLAMP.

However, TCP_WINDOW_CLAMP just doesn't work like this.  Prior to kernel
commit 3aa7857fe1d7 ("tcp: enable mid stream window clamp"), it simply
had no effect on established sockets.  After that commit, it does affect
established sockets but doesn't behave the way we need:
  * It appears to be designed only to shrink the window, not to allow it to
    re-expand.
  * More importantly, that commit has a serious bug where if the
    setsockopt() is made when the existing kernel advertised window for the
    socket happens to be zero, it will now become locked at zero, stopping
    any further data from being received on the socket.

Since this has never worked as intended, simply remove it.  It might be
possible to re-implement the intended behaviour by manipulating SO_RCVBUF,
so we leave a comment to that effect.

This kernel bug is the underlying cause of both the linked passt bug and
the linked podman bug.  We attempted to fix this before with passt commit
d3192f67 ("tcp: Force TCP_WINDOW_CLAMP before resetting STALLED flag").
However while that commit masked the bug for some cases, it didn't really
address the problem.

Fixes: d3192f67c492 ("tcp: Force TCP_WINDOW_CLAMP before resetting STALLED flag")
Link: https://github.com/containers/podman/issues/20170
Link: https://bugs.passt.top/show_bug.cgi?id=74
Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
On the L2 tap side, we see TCP headers and know the TCP window that the
ultimate receiver is advertising.  In order to avoid unnecessary buffering
within passt/pasta (or by the kernel on passt/pasta's behalf) we attempt
to advertise that window back to the original sock-side sender using
TCP_WINDOW_CLAMP.

However, TCP_WINDOW_CLAMP just doesn't work like this.  Prior to kernel
commit 3aa7857fe1d7 ("tcp: enable mid stream window clamp"), it simply
had no effect on established sockets.  After that commit, it does affect
established sockets but doesn't behave the way we need:
  * It appears to be designed only to shrink the window, not to allow it to
    re-expand.
  * More importantly, that commit has a serious bug where if the
    setsockopt() is made when the existing kernel advertised window for the
    socket happens to be zero, it will now become locked at zero, stopping
    any further data from being received on the socket.

Since this has never worked as intended, simply remove it.  It might be
possible to re-implement the intended behaviour by manipulating SO_RCVBUF,
so we leave a comment to that effect.

This kernel bug is the underlying cause of both the linked passt bug and
the linked podman bug.  We attempted to fix this before with passt commit
d3192f67 ("tcp: Force TCP_WINDOW_CLAMP before resetting STALLED flag").
However while that commit masked the bug for some cases, it didn't really
address the problem.

Fixes: d3192f67c492 ("tcp: Force TCP_WINDOW_CLAMP before resetting STALLED flag")
Link: https://github.com/containers/podman/issues/20170
Link: https://bugs.passt.top/show_bug.cgi?id=74
Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tcp: Rename and small cleanup to tcp_clamp_window()</title>
<updated>2023-11-10T05:42:10+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2023-11-09T09:53:59+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=930bc3b0f2d2078747172e960807c84a8cd97495'/>
<id>930bc3b0f2d2078747172e960807c84a8cd97495</id>
<content type='text'>
tcp_clamp_window() is _mostly_ about using TCP_WINDOW_CLAMP to control the
sock side advertised window, but it is also responsible for actually
updating the conn-&gt;wnd_from_tap value.

Rename to tcp_tap_window_update() to reflect that broader purpose, and pull
the logic that's not TCP_WINDOW_CLAMP related out to the front.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
tcp_clamp_window() is _mostly_ about using TCP_WINDOW_CLAMP to control the
sock side advertised window, but it is also responsible for actually
updating the conn-&gt;wnd_from_tap value.

Rename to tcp_tap_window_update() to reflect that broader purpose, and pull
the logic that's not TCP_WINDOW_CLAMP related out to the front.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>test/lib/perf_report: Fix up table highlight for pasta's local flows</title>
<updated>2023-11-10T05:35:54+00:00</updated>
<author>
<name>Stefano Brivio</name>
<email>sbrivio@redhat.com</email>
</author>
<published>2023-11-10T05:35:54+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=2c1554c994e56316ff9272f7204f09b478a9b607'/>
<id>2c1554c994e56316ff9272f7204f09b478a9b607</id>
<content type='text'>
As commit 29269705239f ("test/perf: Small MTUs for spliced TCP
aren't interesting") drops all columns for TCP test MTUs except for
one, in throughput test for pasta's local flows, the first column we
need to highlight in that table is now the second one.

Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
As commit 29269705239f ("test/perf: Small MTUs for spliced TCP
aren't interesting") drops all columns for TCP test MTUs except for
one, in throughput test for pasta's local flows, the first column we
need to highlight in that table is now the second one.

Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Revert "selinux: Drop user_namespace class rules for Fedora 37"</title>
<updated>2023-11-07T13:58:02+00:00</updated>
<author>
<name>Stefano Brivio</name>
<email>sbrivio@redhat.com</email>
</author>
<published>2023-11-07T13:58:02+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=56d9f6d588306301aed332ca926da91a816bafd1'/>
<id>56d9f6d588306301aed332ca926da91a816bafd1</id>
<content type='text'>
This reverts commit 3fb3f0f7a59498bdea1d199eecfdbae6c608f78f: it was
meant as a patch for Fedora 37 (and no later versions), not something
I should have merged upstream.

Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This reverts commit 3fb3f0f7a59498bdea1d199eecfdbae6c608f78f: it was
meant as a patch for Fedora 37 (and no later versions), not something
I should have merged upstream.

Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>selinux: Allow passt to talk over unconfined_t UNIX domain socket for --fd</title>
<updated>2023-11-07T11:28:27+00:00</updated>
<author>
<name>Stefano Brivio</name>
<email>sbrivio@redhat.com</email>
</author>
<published>2023-11-07T11:28:27+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=74e6f48038e64bbdfa5fa265db330f95ce68c182'/>
<id>74e6f48038e64bbdfa5fa265db330f95ce68c182</id>
<content type='text'>
If passt is started with --fd to talk over a pre-opened UNIX domain
socket, we don't really know what label might be associated to it,
but at least for an unconfined_t socket, this bit of policy wouldn't
belong to anywhere else: enable that here.

This is rather loose, of course, but on the other hand passt will
sandbox itself into an empty filesystem, so we're not really adding
much to the attack surface except for what --fd is supposed to do.

Reported-by: Matej Hrica &lt;mhrica@redhat.com&gt;
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2247221
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If passt is started with --fd to talk over a pre-opened UNIX domain
socket, we don't really know what label might be associated to it,
but at least for an unconfined_t socket, this bit of policy wouldn't
belong to anywhere else: enable that here.

This is rather loose, of course, but on the other hand passt will
sandbox itself into an empty filesystem, so we're not really adding
much to the attack surface except for what --fd is supposed to do.

Reported-by: Matej Hrica &lt;mhrica@redhat.com&gt;
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2247221
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>log: Match implicit va_start() with va_end() in vlogmsg()</title>
<updated>2023-11-07T11:24:27+00:00</updated>
<author>
<name>Stefano Brivio</name>
<email>sbrivio@redhat.com</email>
</author>
<published>2023-11-07T11:17:07+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=50bc25a23cfb2c9f3708cfdb3e2787ddf3d5ab34'/>
<id>50bc25a23cfb2c9f3708cfdb3e2787ddf3d5ab34</id>
<content type='text'>
According to C99, 7.15.1:

  Each invocation of the va_start and va_copy macros shall be matched
  by a corresponding invocation of the va_end macro in the same
  function

and the same applies to C11. I still have to come across a platform
where va_end() actually does something, but thus spake the standard.
This would be reported by Coverity as "Missing varargs init or
cleanup" (CWE-573).

Fixes: c0426ff10bc9 ("log: Add vlogmsg()")
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
According to C99, 7.15.1:

  Each invocation of the va_start and va_copy macros shall be matched
  by a corresponding invocation of the va_end macro in the same
  function

and the same applies to C11. I still have to come across a platform
where va_end() actually does something, but thus spake the standard.
This would be reported by Coverity as "Missing varargs init or
cleanup" (CWE-573).

Fixes: c0426ff10bc9 ("log: Add vlogmsg()")
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>port_fwd: Don't try to read bound ports from invalid file handles</title>
<updated>2023-11-07T11:24:27+00:00</updated>
<author>
<name>Stefano Brivio</name>
<email>sbrivio@redhat.com</email>
</author>
<published>2023-11-07T11:04:33+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=9494a51a4e079f4aead3e07a6bdf1c43b4516133'/>
<id>9494a51a4e079f4aead3e07a6bdf1c43b4516133</id>
<content type='text'>
This is a minimal fix for what would be reported by Coverity as
"Improper use of negative value" (CWE-394): port_fwd_init() doesn't
guarantee that all the pre-opened file handles are actually valid.

We should probably warn on failing open() and open_in_ns() in
port_fwd_init(), too, but that's outside the scope of this minimal
fix.

Before commit 5a0485425bc9 ("port_fwd: Pre-open /proc/net/* files
rather than on-demand"), we used to have a single open() call and
a check after it.

Fixes: 5a0485425bc9 ("port_fwd: Pre-open /proc/net/* files rather than on-demand")
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is a minimal fix for what would be reported by Coverity as
"Improper use of negative value" (CWE-394): port_fwd_init() doesn't
guarantee that all the pre-opened file handles are actually valid.

We should probably warn on failing open() and open_in_ns() in
port_fwd_init(), too, but that's outside the scope of this minimal
fix.

Before commit 5a0485425bc9 ("port_fwd: Pre-open /proc/net/* files
rather than on-demand"), we used to have a single open() call and
a check after it.

Fixes: 5a0485425bc9 ("port_fwd: Pre-open /proc/net/* files rather than on-demand")
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netlink: Sequence numbers are actually 32 bits wide</title>
<updated>2023-11-07T11:22:13+00:00</updated>
<author>
<name>Stefano Brivio</name>
<email>sbrivio@redhat.com</email>
</author>
<published>2023-11-07T10:13:05+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=b94462296937f59e3750e1c35b80b69a67a535af'/>
<id>b94462296937f59e3750e1c35b80b69a67a535af</id>
<content type='text'>
Harmless, as we use sequence numbers monotonically anyway, but now
clang-tidy reports:

/home/sbrivio/passt/netlink.c:155:7: error: format specifies type 'unsigned short' but the argument has type '__u32' (aka 'unsigned int') [clang-diagnostic-format,-warnings-as-errors]
                    nh-&gt;nlmsg_seq, seq);
                    ^
/home/sbrivio/passt/log.h:26:7: note: expanded from macro 'die'
                err(__VA_ARGS__);                                       \
                    ^~~~~~~~~~~
/home/sbrivio/passt/log.h:19:34: note: expanded from macro 'err'
                                        ^~~~~~~~~~~
Suppressed 222820 warnings (222816 in non-user code, 4 NOLINT).
Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
1 warning treated as error
make: *** [Makefile:255: clang-tidy] Error 1

Fixes: 9d4ab98d538f ("netlink: Add nl_do() helper for simple operations with error checking")
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Harmless, as we use sequence numbers monotonically anyway, but now
clang-tidy reports:

/home/sbrivio/passt/netlink.c:155:7: error: format specifies type 'unsigned short' but the argument has type '__u32' (aka 'unsigned int') [clang-diagnostic-format,-warnings-as-errors]
                    nh-&gt;nlmsg_seq, seq);
                    ^
/home/sbrivio/passt/log.h:26:7: note: expanded from macro 'die'
                err(__VA_ARGS__);                                       \
                    ^~~~~~~~~~~
/home/sbrivio/passt/log.h:19:34: note: expanded from macro 'err'
                                        ^~~~~~~~~~~
Suppressed 222820 warnings (222816 in non-user code, 4 NOLINT).
Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
1 warning treated as error
make: *** [Makefile:255: clang-tidy] Error 1

Fixes: 9d4ab98d538f ("netlink: Add nl_do() helper for simple operations with error checking")
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
