<feed xmlns='http://www.w3.org/2005/Atom'>
<title>passt/vu_common.c, branch 2025_06_11.0293c6f</title>
<subtitle>Plug A Simple Socket Transport</subtitle>
<link rel='alternate' type='text/html' href='https://passt.top/passt/'/>
<entry>
<title>tap: Make size of pool_tap[46] purely a tuning parameter</title>
<updated>2025-03-20T19:33:09+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2025-03-17T09:24:16+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=a41d6d125eca5ac8c54bed8157098be141557b03'/>
<id>a41d6d125eca5ac8c54bed8157098be141557b03</id>
<content type='text'>
Currently we attempt to size pool_tap[46] so they have room for the maximum
possible number of packets that could fit in pkt_buf (TAP_MSGS).  However,
the calculation isn't quite correct: TAP_MSGS is based on ETH_ZLEN (60) as
the minimum possible L2 frame size.  But ETH_ZLEN is based on physical
constraints of Ethernet, which don't apply to our virtual devices.  It is
possible to generate a legitimate frame smaller than this, for example an
empty payload UDP/IPv4 frame on the 'pasta' backend is only 42 bytes long.

Further more, the same limit applies for vhost-user, which is not limited
by the size of pkt_buf like the other backends.  In that case we don't even
have full control of the maximum buffer size, so we can't really calculate
how many packets could fit in there.

If we exceed do TAP_MSGS we'll drop packets, not just use more batches,
which is moderately bad.  The fact that this needs to be sized just so for
correctness not merely for tuning is a fairly non-obvious coupling between
different parts of the code.

To make this more robust, alter the tap code so it doesn't rely on
everything fitting in a single batch of TAP_MSGS packets, instead breaking
into multiple batches as necessary.  This leaves TAP_MSGS as purely a
tuning parameter, which we can freely adjust based on performance measures.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently we attempt to size pool_tap[46] so they have room for the maximum
possible number of packets that could fit in pkt_buf (TAP_MSGS).  However,
the calculation isn't quite correct: TAP_MSGS is based on ETH_ZLEN (60) as
the minimum possible L2 frame size.  But ETH_ZLEN is based on physical
constraints of Ethernet, which don't apply to our virtual devices.  It is
possible to generate a legitimate frame smaller than this, for example an
empty payload UDP/IPv4 frame on the 'pasta' backend is only 42 bytes long.

Further more, the same limit applies for vhost-user, which is not limited
by the size of pkt_buf like the other backends.  In that case we don't even
have full control of the maximum buffer size, so we can't really calculate
how many packets could fit in there.

If we exceed do TAP_MSGS we'll drop packets, not just use more batches,
which is moderately bad.  The fact that this needs to be sized just so for
correctness not merely for tuning is a fairly non-obvious coupling between
different parts of the code.

To make this more robust, alter the tap code so it doesn't rely on
everything fitting in a single batch of TAP_MSGS packets, instead breaking
into multiple batches as necessary.  This leaves TAP_MSGS as purely a
tuning parameter, which we can freely adjust based on performance measures.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>packet: More cautious checks to avoid pointer arithmetic UB</title>
<updated>2025-03-20T19:33:06+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2025-03-17T09:24:15+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=e43e00719d7701301e4bc4fb179dc7adff175409'/>
<id>e43e00719d7701301e4bc4fb179dc7adff175409</id>
<content type='text'>
packet_check_range and vu_packet_check_range() verify that the packet or
section of packet we're interested in lies in the packet buffer pool we
expect it to.  However, in doing so it doesn't avoid the possibility of
an integer overflow while performing pointer arithmetic, with is UB.  In
fact, AFAICT it's UB even to use arbitrary pointer arithmetic to construct
a pointer outside of a known valid buffer.

To do this safely, we can't calculate the end of a memory region with
pointer addition when then the length as untrusted.  Instead we must work
out the offset of one memory region within another using pointer
subtraction, then do integer checks against the length of the outer region.
We then need to be careful about the order of checks so that those integer
checks can't themselves overflow.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
packet_check_range and vu_packet_check_range() verify that the packet or
section of packet we're interested in lies in the packet buffer pool we
expect it to.  However, in doing so it doesn't avoid the possibility of
an integer overflow while performing pointer arithmetic, with is UB.  In
fact, AFAICT it's UB even to use arbitrary pointer arithmetic to construct
a pointer outside of a known valid buffer.

To do this safely, we can't calculate the end of a memory region with
pointer addition when then the length as untrusted.  Instead we must work
out the offset of one memory region within another using pointer
subtraction, then do integer checks against the length of the outer region.
We then need to be careful about the order of checks so that those integer
checks can't themselves overflow.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vu_common: Tighten vu_packet_check_range()</title>
<updated>2025-03-20T19:32:50+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2025-03-17T09:24:14+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=4592719a744bcb47db2ff5680be4b8f6362a97ce'/>
<id>4592719a744bcb47db2ff5680be4b8f6362a97ce</id>
<content type='text'>
This function verifies that the given packet is within the mmap()ed memory
region of the vhost-user device.  We can do better, however.  The packet
should be not only within the mmap()ed range, but specifically in the
subsection of that range set aside for shared buffers, which starts at
dev_region-&gt;mmap_offset within there.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This function verifies that the given packet is within the mmap()ed memory
region of the vhost-user device.  We can do better, however.  The packet
should be not only within the mmap()ed range, but specifically in the
subsection of that range set aside for shared buffers, which starts at
dev_region-&gt;mmap_offset within there.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>packet: Don't pass start and offset separately to packet_check_range()</title>
<updated>2025-02-18T07:43:12+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2025-02-18T02:07:18+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=354bc0bab1cb6095592288674d375511443427fd'/>
<id>354bc0bab1cb6095592288674d375511443427fd</id>
<content type='text'>
Fundamentally what packet_check_range() does is to check whether a given
memory range is within the allowed / expected memory set aside for packets
from a particular pool.  That range could represent a whole packet (from
packet_add_do()) or part of a packet (from packet_get_do()), but it doesn't
really matter which.

However, we pass the start of the range as two parameters: @start which is
the start of the packet, and @offset which is the offset within the packet
of the range we're interested in.  We never use these separately, only as
(start + offset).  Simplify the interface of packet_check_range() and
vu_packet_check_range() to directly take the start of the relevant range.
This will allow some additional future improvements.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fundamentally what packet_check_range() does is to check whether a given
memory range is within the allowed / expected memory set aside for packets
from a particular pool.  That range could represent a whole packet (from
packet_add_do()) or part of a packet (from packet_get_do()), but it doesn't
really matter which.

However, we pass the start of the range as two parameters: @start which is
the start of the packet, and @offset which is the offset within the packet
of the range we're interested in.  We never use these separately, only as
(start + offset).  Simplify the interface of packet_check_range() and
vu_packet_check_range() to directly take the start of the relevant range.
This will allow some additional future improvements.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>migrate: Skeleton of live migration logic</title>
<updated>2025-02-12T18:47:07+00:00</updated>
<author>
<name>Stefano Brivio</name>
<email>sbrivio@redhat.com</email>
</author>
<published>2025-02-12T07:07:13+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=5911e08c0f53e46547e7eeb1dd824c8ab96e512e'/>
<id>5911e08c0f53e46547e7eeb1dd824c8ab96e512e</id>
<content type='text'>
Introduce facilities for guest migration on top of vhost-user
infrastructure.  Add migration facilities based on top of the current
vhost-user infrastructure, moving vu_migrate() and related functions
to migrate.c.

Versioned migration stages define function pointers to be called on
source or target, or data sections that need to be transferred.

The migration header consists of a magic number, a version number for the
encoding, and a "compat_version" which represents the oldest version which
is compatible with the current one.  We don't use it yet, but that allows
for the future possibility of backwards compatible protocol extensions.

Co-authored-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Introduce facilities for guest migration on top of vhost-user
infrastructure.  Add migration facilities based on top of the current
vhost-user infrastructure, moving vu_migrate() and related functions
to migrate.c.

Versioned migration stages define function pointers to be called on
source or target, or data sections that need to be transferred.

The migration header consists of a magic number, a version number for the
encoding, and a "compat_version" which represents the oldest version which
is compatible with the current one.  We don't use it yet, but that allows
for the future possibility of backwards compatible protocol extensions.

Co-authored-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vhost_user: Turn some vhost-user message reports to trace()</title>
<updated>2025-02-04T00:28:04+00:00</updated>
<author>
<name>Stefano Brivio</name>
<email>sbrivio@redhat.com</email>
</author>
<published>2025-01-31T10:41:51+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=e894d9ae8212c49dc44e52ad583954ed24e6905b'/>
<id>e894d9ae8212c49dc44e52ad583954ed24e6905b</id>
<content type='text'>
Having every vhost-user message printed as part of debug output makes
debugging anything else a bit complicated.

Change per-packet debug() messages in vu_kick_cb() and
vu_send_single() to trace()

[dgibson: switch different messages to trace()]
Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Having every vhost-user message printed as part of debug output makes
debugging anything else a bit complicated.

Change per-packet debug() messages in vu_kick_cb() and
vu_send_single() to trace()

[dgibson: switch different messages to trace()]
Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>util: Rename and make global vu_remove_watch()</title>
<updated>2025-02-03T06:32:51+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2025-01-30T06:52:11+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=0349cf637f64a5128846c79d9537849e1ed3e1cc'/>
<id>0349cf637f64a5128846c79d9537849e1ed3e1cc</id>
<content type='text'>
vu_remove_watch() is used in vhost_user.c to remove an fd from the global
epoll set.  There's nothing really vhost user specific about it though,
so rename, move to util.c and use it in a bunch of places outside
vhost_user.c where it makes things marginally more readable.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
vu_remove_watch() is used in vhost_user.c to remove an fd from the global
epoll set.  There's nothing really vhost user specific about it though,
so rename, move to util.c and use it in a bunch of places outside
vhost_user.c where it makes things marginally more readable.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vhost_user: Drop packet with unsupported iovec array</title>
<updated>2025-01-21T13:30:42+00:00</updated>
<author>
<name>Laurent Vivier</name>
<email>lvivier@redhat.com</email>
</author>
<published>2025-01-21T13:16:02+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=4f2c8e79130ef3d6132e34c49746e397745f9d73'/>
<id>4f2c8e79130ef3d6132e34c49746e397745f9d73</id>
<content type='text'>
If the iovec array cannot be managed, drop it rather than
passing the second entry to tap_add_packet().

Signed-off-by: Laurent Vivier &lt;lvivier@redhat.com&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If the iovec array cannot be managed, drop it rather than
passing the second entry to tap_add_packet().

Signed-off-by: Laurent Vivier &lt;lvivier@redhat.com&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vhost_user: remove ASSERT() on iovec number</title>
<updated>2025-01-20T18:51:24+00:00</updated>
<author>
<name>Laurent Vivier</name>
<email>lvivier@redhat.com</email>
</author>
<published>2025-01-20T13:15:22+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=c96a88d550fcda3f1972aee395fcfda19905d0a4'/>
<id>c96a88d550fcda3f1972aee395fcfda19905d0a4</id>
<content type='text'>
Replace ASSERT() on the number of iovec in the element and on
the first entry length by a debug() message.

Signed-off-by: Laurent Vivier &lt;lvivier@redhat.com&gt;
[sbrivio: Fix typo in failure message]
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Replace ASSERT() on the number of iovec in the element and on
the first entry length by a debug() message.

Signed-off-by: Laurent Vivier &lt;lvivier@redhat.com&gt;
[sbrivio: Fix typo in failure message]
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vhost-user: add VHOST_USER_SET_DEVICE_STATE_FD command</title>
<updated>2025-01-20T18:51:24+00:00</updated>
<author>
<name>Laurent Vivier</name>
<email>lvivier@redhat.com</email>
</author>
<published>2024-12-19T11:13:59+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=31d70024beda1e49131d7b68dd7554bee16c79f3'/>
<id>31d70024beda1e49131d7b68dd7554bee16c79f3</id>
<content type='text'>
Set the file descriptor to use to transfer the
backend device state during migration.

Signed-off-by: Laurent Vivier &lt;lvivier@redhat.com&gt;
[sbrivio: Fixed nits and coding style here and there]
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Set the file descriptor to use to transfer the
backend device state during migration.

Signed-off-by: Laurent Vivier &lt;lvivier@redhat.com&gt;
[sbrivio: Fixed nits and coding style here and there]
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
