<feed xmlns='http://www.w3.org/2005/Atom'>
<title>passt/tcp_conn.h, branch vhost-user-migration</title>
<subtitle>Plug A Simple Socket Transport</subtitle>
<link rel='alternate' type='text/html' href='https://passt.top/passt/'/>
<entry>
<title>tcp_splice: Eliminate SPLICE_V6 flag</title>
<updated>2024-07-19T16:32:53+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2024-07-18T05:26:33+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=f19a8f71f9df0fb01e81af56cf30dd207958a591'/>
<id>f19a8f71f9df0fb01e81af56cf30dd207958a591</id>
<content type='text'>
Since we're now constructing socket addresses based on information in the
flowside, we no longer need an explicit flag to tell if we're dealing with
an IPv4 or IPv6 connection.  Hence, drop the now unused SPLICE_V6 flag.

As well as just simplifying the code, this allows for possible future
extensions where we could splice an IPv4 connection to an IPv6 connection
or vice versa.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since we're now constructing socket addresses based on information in the
flowside, we no longer need an explicit flag to tell if we're dealing with
an IPv4 or IPv6 connection.  Hence, drop the now unused SPLICE_V6 flag.

As well as just simplifying the code, this allows for possible future
extensions where we could splice an IPv4 connection to an IPv6 connection
or vice versa.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tcp, flow: Remove redundant information, repack connection structures</title>
<updated>2024-07-19T16:32:41+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2024-07-18T05:26:29+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=f9fe212b1f2bc105939bd2603991dbcd0e6a3b5f'/>
<id>f9fe212b1f2bc105939bd2603991dbcd0e6a3b5f</id>
<content type='text'>
Some information we explicitly store in the TCP connection is now
duplicated in the common flow structure.  Access it from there instead, and
remove it from the TCP specific structure.   With that done we can reorder
both the "tap" and "splice" TCP structures a bit to get better packing for
the new combined flow table entries.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Some information we explicitly store in the TCP connection is now
duplicated in the common flow structure.  Access it from there instead, and
remove it from the TCP specific structure.   With that done we can reorder
both the "tap" and "splice" TCP structures a bit to get better packing for
the new combined flow table entries.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tcp_splice: Use parameterised macros for per-side event/flag bits</title>
<updated>2024-07-17T13:30:11+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2024-07-17T04:52:21+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=66a02c9f7cd5c7a643d9ac5dad5a7209a6e1c467'/>
<id>66a02c9f7cd5c7a643d9ac5dad5a7209a6e1c467</id>
<content type='text'>
Both the events and flags fields in tcp_splice_conn have several bits
which are per-side, e.g. OUT_WAIT_0 for side 0 and OUT_WAIT_1 for side 1.
This necessitates some rather awkward ternary expressions when we need
to get the relevant bit for a particular side.

Simplify this by using a parameterised macro for the bit values.  This
needs a ternary expression inside the macros, but makes the places we use
it substantially clearer.

That simplification in turn allows us to use a loop across each side to
implement several things which are currently open coded to do equivalent
things for each side in turn.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Both the events and flags fields in tcp_splice_conn have several bits
which are per-side, e.g. OUT_WAIT_0 for side 0 and OUT_WAIT_1 for side 1.
This necessitates some rather awkward ternary expressions when we need
to get the relevant bit for a particular side.

Simplify this by using a parameterised macro for the bit values.  This
needs a ternary expression inside the macros, but makes the places we use
it substantially clearer.

That simplification in turn allows us to use a loop across each side to
implement several things which are currently open coded to do equivalent
things for each side in turn.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tcp: Remove interim 'tapside' field from connection</title>
<updated>2024-05-22T21:21:06+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2024-05-21T05:57:08+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=cc801fb38f46157b83060b96ee99fa1669a6f42d'/>
<id>cc801fb38f46157b83060b96ee99fa1669a6f42d</id>
<content type='text'>
We recently introduced this field to keep track of which side of a TCP flow
is the guest/tap facing one.  Now that we generically record which pif each
side of each flow is connected to, we can easily derive that, and no longer
need to keep track of it explicitly.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We recently introduced this field to keep track of which side of a TCP flow
is the guest/tap facing one.  Now that we generically record which pif each
side of each flow is connected to, we can easily derive that, and no longer
need to keep track of it explicitly.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>flow: Make side 0 always be the initiating side</title>
<updated>2024-05-22T21:21:01+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2024-05-21T05:57:06+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=43571852e62dae90fd0404b8a7c73e727930718e'/>
<id>43571852e62dae90fd0404b8a7c73e727930718e</id>
<content type='text'>
Each flow in the flow table has two sides, 0 and 1, representing the
two interfaces between which passt/pasta will forward data for that flow.
Which side is which is currently up to the protocol specific code:  TCP
uses side 0 for the host/"sock" side and 1 for the guest/"tap" side, except
for spliced connections where it uses 0 for the initiating side and 1 for
the target side.  ICMP also uses 0 for the host/"sock" side and 1 for the
guest/"tap" side, but in its case the latter is always also the initiating
side.

Make this generically consistent by always using side 0 for the initiating
side and 1 for the target side.  This doesn't simplify a lot for now, and
arguably makes TCP slightly more complex, since we add an extra field to
the connection structure to record which is the guest facing side. This is
an interim change, which we'll be able to remove later.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;q
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Each flow in the flow table has two sides, 0 and 1, representing the
two interfaces between which passt/pasta will forward data for that flow.
Which side is which is currently up to the protocol specific code:  TCP
uses side 0 for the host/"sock" side and 1 for the guest/"tap" side, except
for spliced connections where it uses 0 for the initiating side and 1 for
the target side.  ICMP also uses 0 for the host/"sock" side and 1 for the
guest/"tap" side, but in its case the latter is always also the initiating
side.

Make this generically consistent by always using side 0 for the initiating
side and 1 for the target side.  This doesn't simplify a lot for now, and
arguably makes TCP slightly more complex, since we add an extra field to
the connection structure to record which is the guest facing side. This is
an interim change, which we'll be able to remove later.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;q
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>flow: Properly type callbacks to protocol specific handlers</title>
<updated>2024-05-22T21:20:52+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2024-05-21T05:57:03+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=7a832a8a0edbdd5a705c1fe03fca0790a535ab11'/>
<id>7a832a8a0edbdd5a705c1fe03fca0790a535ab11</id>
<content type='text'>
The flow dispatches deferred and timer handling for flows centrally, but
needs to call into protocol specific code for the handling of individual
flows.  Currently this passes a general union flow *.  It makes more sense
to pass the specific relevant flow type structure.  That brings the check
on the flow type adjacent to casting to the union variant which it tags.

Arguably, this is a slight abstraction violation since it involves the
generic flow code using protocol specific types.  It's already calling into
protocol specific functions, so I don't think this really makes any
difference.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The flow dispatches deferred and timer handling for flows centrally, but
needs to call into protocol specific code for the handling of individual
flows.  Currently this passes a general union flow *.  It makes more sense
to pass the specific relevant flow type structure.  That brings the check
on the flow type adjacent to casting to the union variant which it tags.

Arguably, this is a slight abstraction violation since it involves the
generic flow code using protocol specific types.  It's already calling into
protocol specific functions, so I don't think this really makes any
difference.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tcp, tcp_splice: Helpers for getting sockets from the pools</title>
<updated>2024-02-27T11:52:46+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2024-02-19T07:56:50+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=fe27ebce5c59c7fc684c5affa6ce27fdc32d362d'/>
<id>fe27ebce5c59c7fc684c5affa6ce27fdc32d362d</id>
<content type='text'>
We maintain pools of ready-to-connect sockets in both the original and
(for pasta) guest namespace to reduce latency when starting new TCP
connections.  If we exhaust those pools we have to take a higher
latency path to get a new socket.

Currently we open-code that fallback in the places we need it.  To improve
clarity encapsulate that into helper functions.  While we're at it, give
those helpers clearer error reporting.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We maintain pools of ready-to-connect sockets in both the original and
(for pasta) guest namespace to reduce latency when starting new TCP
connections.  If we exhaust those pools we have to take a higher
latency path to get a new socket.

Currently we open-code that fallback in the places we need it.  To improve
clarity encapsulate that into helper functions.  While we're at it, give
those helpers clearer error reporting.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tcp, tcp_splice: Issue warnings if unable to refill socket pool</title>
<updated>2024-02-27T11:52:44+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2024-02-19T07:56:49+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=fbe81decbdcdfed4b4ff336fcec5fe6ad0dfbe65'/>
<id>fbe81decbdcdfed4b4ff336fcec5fe6ad0dfbe65</id>
<content type='text'>
Currently if tcp_sock_refill_pool() is unable to fill all the slots in the
pool, it will silently exit.  This might lead to a later attempt to get
fds from the pool to fail at which point it will be harder to tell what
originally went wrong.

Instead add warnings if we're unable to refill any of the socket pools when
requested.  We have tcp_sock_refill_pool() return an error and report it
in the callers, because those callers have more context allowing for a
more useful message.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently if tcp_sock_refill_pool() is unable to fill all the slots in the
pool, it will silently exit.  This might lead to a later attempt to get
fds from the pool to fail at which point it will be harder to tell what
originally went wrong.

Instead add warnings if we're unable to refill any of the socket pools when
requested.  We have tcp_sock_refill_pool() return an error and report it
in the callers, because those callers have more context allowing for a
more useful message.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>treewide: Use sa_family_t for address family variables</title>
<updated>2024-02-27T11:52:02+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2024-02-19T07:56:46+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=4e08d9b9c6289ee00687203ce7a08106e9d45dc6'/>
<id>4e08d9b9c6289ee00687203ce7a08106e9d45dc6</id>
<content type='text'>
Sometimes we use sa_family_t for variables and parameters containing a
socket address family, other times we use a plain int.  Since sa_family_t
is what's actually used in struct sockaddr and friends, standardise on
that.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Sometimes we use sa_family_t for variables and parameters containing a
socket address family, other times we use a plain int.  Since sa_family_t
is what's actually used in struct sockaddr and friends, standardise on
that.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>flow: Avoid moving flow entries to compact table</title>
<updated>2024-01-22T22:35:37+00:00</updated>
<author>
<name>David Gibson</name>
<email>david@gibson.dropbear.id.au</email>
</author>
<published>2024-01-16T00:50:43+00:00</published>
<link rel='alternate' type='text/html' href='https://passt.top/passt/commit/?id=8981a720aac4ab22beb3375cd77062a8aed693e6'/>
<id>8981a720aac4ab22beb3375cd77062a8aed693e6</id>
<content type='text'>
Currently we always keep the flow table maximally compact: that is all the
active entries are contiguous at the start of the table.  Doing this
sometimes requires moving an entry when one is freed.  That's kind of
fiddly, and potentially expensive: it requires updating the hash table for
the new location, and depending on flow type, it may require EPOLL_CTL_MOD,
system calls to update epoll tags with the new location too.

Implement a new way of managing the flow table that doesn't ever move
entries.  It attempts to maintain some compactness by always using the
first free slot for a new connection, and mitigates the effect of non
compactness by cheaply skipping over contiguous blocks of free entries.
See the "theory of operation" comment in flow.c for details.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;b
[sbrivio: additional ASSERT(flow_first_free &lt;= FLOW_MAX - 2) to avoid
 Coverity Scan false positive]
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently we always keep the flow table maximally compact: that is all the
active entries are contiguous at the start of the table.  Doing this
sometimes requires moving an entry when one is freed.  That's kind of
fiddly, and potentially expensive: it requires updating the hash table for
the new location, and depending on flow type, it may require EPOLL_CTL_MOD,
system calls to update epoll tags with the new location too.

Implement a new way of managing the flow table that doesn't ever move
entries.  It attempts to maintain some compactness by always using the
first free slot for a new connection, and mitigates the effect of non
compactness by cheaply skipping over contiguous blocks of free entries.
See the "theory of operation" comment in flow.c for details.

Signed-off-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;b
[sbrivio: additional ASSERT(flow_first_free &lt;= FLOW_MAX - 2) to avoid
 Coverity Scan false positive]
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
