author Stefano Brivio <sbrivio@redhat.com> 2022-06-16 15:00:06 +0200
committer Stefano Brivio <sbrivio@redhat.com> 2022-06-18 09:06:00 +0200
commit fca5e11773d06ed4e083a5f0b6b8ba1b81c487be (patch)
tree 244b12955bb66843b4957f1694ed0fd8f9de4e13
parent 721fa1bf5dc01775de89c2622d927588d7c7d018 (diff)
qrap: Add probe retry on connection reset from passt for KubeVirt integration
KubeVirt uses libvirt to start qrap in its current draft integration (https://github.com/kubevirt/kubevirt/pull/7849/), and libvirtd starts qrap three times every time a new virtual machine is created: once on domain creation, and twice on domain start (once for "probing", and once to finally start it for real).

Very often, a subsequent invocation of qrap happens before the previously running instance of qemu terminates, which means that passt will refuse the new connection as the old one is still active.

Introduce a single retry with a 100ms delay to work around this.

This should be checked again once native libvirt support is there, and at that point qrap will have no reason to exist anymore.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
-rw-r--r-- qrap.c 34
1 files changed, 33 insertions, 1 deletions
diff --git a/qrap.c b/qrap.c
index 17cc472..303e981 100644
--- a/qrap.c
+++ b/qrap.c
@@ -25,6 +25,7 @@
 #include <net/if_arp.h>
 #include <netinet/in.h>
 #include <netinet/if_ether.h>
+#include <time.h>
 #include "util.h"
 #include "passt.h"
@@ -112,7 +113,7 @@ void usage(const char *name)
 int main(int argc, char **argv)
 {
 	struct timeval tv = { .tv_sec = 0, .tv_usec = (long)(500 * 1000) };
-	int i, s, qemu_argc = 0, addr_map = 0, has_dev = 0;
+	int i, s, qemu_argc = 0, addr_map = 0, has_dev = 0, retry_on_reset;
 	char *qemu_argv[ARG_MAX], dev_str[ARG_MAX];
 	struct sockaddr_un addr = {
 		.sun_family = AF_UNIX,
@@ -233,6 +234,9 @@ int main(int argc, char **argv)
 valid_args:
 	for (i = 1; i < UNIX_SOCK_MAX; i++) {
+		retry_on_reset = 1;
+
+retry:
 		s = socket(AF_UNIX, SOCK_STREAM, 0);
 		if (s < 0) {
 			perror("socket");
@@ -254,6 +258,34 @@ valid_args:
 		else
 			break;
+		/* FIXME: in a KubeVirt environment, libvirtd invokes qrap three
+		 * times in a strict sequence when a virtual machine needs to
+		 * be started, namely, when:
+		 * - the domain XML is saved
+		 * - the domain is started (for "probing")
+		 * - the virtual machine is started for real
+		 * and it often happens that the qemu process is still running
+		 * when qrap is invoked again, so passt will refuse the new
+		 * connection because the previous one is still active. This
+		 * overlap seems to be anywhere between 0 and 3ms.
+		 *
+		 * If we get a connection reset, retry, just once, after 100ms,
+		 * to allow for the previous qemu instance to terminate and, in
+		 * turn, for the connection to passt to be closed.
+		 *
+		 * This should be fixed in libvirt instead. It probably makes
+		 * sense to check this behaviour once native libvirt support is
+		 * there, and this implies native qemu support too, so at that
+		 * point qrap will have no reason to exist anymore -- that is,
+		 * this FIXME will probably remain until the tool itself is
+		 * obsoleted.
+		 */
+		if (retry_on_reset && errno == ECONNRESET) {
+			retry_on_reset = 0;
+			usleep(100 * 1000);
+			goto retry;
+		}
+
 		fprintf(stderr, "Probe of %s failed\n", addr.sun_path);
 		close(s);
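
The retry-once-on-reset logic added in the last hunk can be read in isolation as the rough sketch below. It is not code from qrap.c: `probe_fn`, `probe_retry_once()`, `sleep_ms()`, `struct fake_state` and `fake_probe()` are hypothetical names introduced only for illustration, and the delay uses `nanosleep()` where the commit itself calls `usleep(100 * 1000)`. The fake probe stands in for the connect-and-probe step against the passt socket.

```c
#define _POSIX_C_SOURCE 200809L	/* for nanosleep() under strict -std modes */

#include <errno.h>
#include <time.h>

/* Hypothetical callback type: returns 0 on success, -1 with errno set on
 * failure. It stands in for the connect-and-probe step in qrap.c.
 */
typedef int (*probe_fn)(void *ctx);

/* Sleep for the given number of milliseconds. The sleep is not resumed if
 * a signal interrupts it.
 */
void sleep_ms(long ms)
{
	struct timespec ts = { .tv_sec = ms / 1000,
			       .tv_nsec = (ms % 1000) * 1000000L };

	nanosleep(&ts, NULL);
}

/* Run the probe. If it fails with ECONNRESET, wait delay_ms and try exactly
 * once more. Any other error, or a second failure, is reported as -1.
 */
int probe_retry_once(probe_fn probe, void *ctx, long delay_ms)
{
	int retry_on_reset = 1;

	for (;;) {
		if (probe(ctx) == 0)
			return 0;

		/* errno is checked before sleeping, so the sleep can't
		 * clobber the value set by the probe.
		 */
		if (!retry_on_reset || errno != ECONNRESET)
			return -1;

		retry_on_reset = 0;
		sleep_ms(delay_ms);
	}
}

/* Illustrative test double: fails with 'err' for the first 'failures'
 * calls, then succeeds, counting how many times it was called.
 */
struct fake_state {
	int failures;	/* remaining failures to simulate */
	int err;	/* errno value reported on failure */
	int calls;	/* number of times the probe ran */
};

int fake_probe(void *ctx)
{
	struct fake_state *st = ctx;

	st->calls++;
	if (st->failures > 0) {
		st->failures--;
		errno = st->err;
		return -1;
	}

	return 0;
}
```

With a fake probe that resets once, `probe_retry_once(fake_probe, &state, 100)` succeeds on the second attempt. Two consecutive resets, or a first failure with any other error such as `ECONNREFUSED`, are reported as failures without further retries, which matches the "just once" wording of the FIXME comment.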