From 7c33b1208632a9581d0ee7aabd1e0584a5d1fb20 Mon Sep 17 00:00:00 2001
From: David Gibson
Date: Sat, 15 Feb 2025 00:08:41 +1100
Subject: vhost_user: Clear ring address on GET_VRING_BASE

GET_VRING_BASE stops the queue, clearing the call and kick fds. However,
we don't clear vring.avail. That means that if vu_queue_notify() is called
it won't realise the queue isn't ready and will die with an EBADFD.

We get this during migration, because for some reason, qemu reconfigures
the vhost-user device when a migration is triggered. There's a window
between the GET_VRING_BASE and re-establishing the call fd where the
notify function can be called, causing a crash.

Signed-off-by: David Gibson
Signed-off-by: Stefano Brivio
---
 vhost_user.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/vhost_user.c b/vhost_user.c
index 7ab1377..be1aa94 100644
--- a/vhost_user.c
+++ b/vhost_user.c
@@ -732,6 +732,7 @@ static bool vu_get_vring_base_exec(struct vu_dev *vdev,
 	msg->hdr.size = sizeof(msg->payload.state);
 
 	vdev->vq[idx].started = false;
+	vdev->vq[idx].vring.avail = 0;
 
 	if (vdev->vq[idx].call_fd != -1) {
 		close(vdev->vq[idx].call_fd);
-- 
cgit v1.2.3