Generalize virtualization stack to different libvirt hypervisor-drivers #259
Conversation
Thanks for this proposal.
To me, the main areas that need a look are:
- What complexity is being added to the code?
- What functionality will be impacted, and how can we set user expectations right?
- Why do we need this, and how will we organize ownership (we are doubling the test matrix)? Who is "paying" for it?
> Although KubeVirt currently relies on libvirt to create and manage virtual machine instances (VMIs), it relies specifically on the QEMU/KVM virtualization stack (VMM and hypervisor) to host the VMI. This limits KubeVirt from being used in settings where a different VMM or hypervisor is used.
>
> In fact, libvirt itself is flexible enough to support a diverse set of VMMs and hypervisors. The libvirt API delegates its implementation to one or more internal drivers, depending on the [connection URI](https://libvirt.org/uri.html) passed when initializing the library. The list of currently supported hypervisor drivers in libvirt is:
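The driver list itself is elided in the excerpt above; purely as an illustration, here are a few libvirt connection URIs and the drivers they select (a sketch assuming the URI forms documented by libvirt; `qemu:///system` is the one KubeVirt's stack uses today):

```yaml
# Illustrative mapping of hypervisor names (as a VMI spec might use them)
# to the libvirt connection URIs that select the corresponding driver.
qemu-kvm: "qemu:///system"         # QEMU/KVM driver, KubeVirt's current stack
cloud-hypervisor: "ch:///system"   # Cloud Hypervisor driver
xen: "xen:///system"               # Xen driver
```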
Correct, libvirt supports many hypervisors.
However, the feature set supported across all hypervisors (that is, the common subset of features) is actually much smaller.
This is why KubeVirt intentionally focused on KVM only, in order to not have to consider the special cases of different hypervisors.
I am in touch with the cloud-hypervisor community and they are actively working on achieving parity with qemu-kvm in terms of the VMI features offered by KubeVirt.
> ## Goals
>
> KubeVirt should be able to offer a choice to its users over which libvirt hypervisor-driver they want to use to create their VMI.
Why?
This section is important: please provide a justification of how this will help KubeVirt users, or KubeVirt itself.
Microsoft is a vendor of KubeVirt, as it leverages KubeVirt in its Azure Operator Nexus product as a VM orchestrator. The hypervisor currently used in the Nexus product is qemu-kvm; however, in the future, Microsoft is looking at alternative hypervisors such as cloud-hypervisor.
To continue using KubeVirt for this product, it would make sense to make it hypervisor-agnostic.
There is also another project, Virtink, which was created to add K8s-based orchestration support for cloud-hypervisor-based VMs:
https://github.com/smartxworks/virtink
This shows that there is a need for K8s-based orchestration of cloud-hypervisor VMs, and KubeVirt already interfaces with libvirt, which provides a cloud-hypervisor driver.
> ## API Changes
>
> Addition of a `vmi.hypervisor` field. Example of how a user could request a specific hypervisor as the underlying libvirt hypervisor-driver through the VMI spec:
The VMI seems to be quite fine-granular for this. If so, shouldn't it be a cluster-level setting?
I am considering a scenario in which different cluster nodes could have a different virtualization stack. In KubeVirt, virt-handlers on different cluster nodes are independent, so IMO there is no reason not to set the hypervisor at this fine granularity.
What is the reason for having different hypervisors in a single cluster?
There is also a cluster-level impact, i.e. the overhead calculation.
No specific reason. Based on my understanding of the KubeVirt code, the overhead calculation is for the virt-launcher pod alone, so one could (in theory) have different virt-launcher pods with different hypervisors running on the same cluster. However, I could be wrong, so please correct me.
I don't have a specific scenario in mind right now that would require multiple hypervisors on the same cluster, but IMO it is a more flexible design choice to limit the hypervisor-specific logic to the components that are tied to specific nodes (i.e., virt-handler and virt-launcher).
IMO, it makes sense to allow multiple hypervisors in the same cluster. Different hypervisors could fit different use cases, and we would still have unified management.
```yaml
spec:
  hypervisor: cloud-hypervisor
```
I think here it is a good fit for something similar to the Kubernetes runtime classes. If we need additional configuration specific to the hypervisor, it could go into a new CRD.
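To make the runtime-class analogy concrete, here is a minimal sketch of what such a CRD instance could look like. Everything below is hypothetical: the `HypervisorRuntime` kind, API group, and field names are invented for illustration and are not part of KubeVirt's API.

```yaml
# Hypothetical CRD instance, loosely modeled on the Kubernetes RuntimeClass idea.
apiVersion: hypervisor.kubevirt.io/v1alpha1   # invented API group/version
kind: HypervisorRuntime
metadata:
  name: cloud-hypervisor
spec:
  libvirtURI: ch:///system                # libvirt connection URI selecting the driver
  launcherImage: virt-launcher-ch         # hypervisor-specific virt-launcher image
  deviceResource: devices.kubevirt.io/kvm # node device resource the launcher pod requests
```

A VMI would then select it by name, e.g. via its `spec.hypervisor` field or a runtime-class-like reference.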
> - `virt-launcher` pod image should be specific to the `vmi.hypervisor`.
> - Hypervisor resource needed by the `virt-launcher` pod. For instance, for a VMI with `hypervisor=qemu-kvm`, the corresponding virt-launcher pod requires the resource `devices.kubevirt.io/kvm: 1`.
Please take emulation into account as well.
Can you please expand on your comment?
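Returning to the quoted requirement above: as a concrete illustration, the virt-launcher pod for a `hypervisor=qemu-kvm` VMI would carry a device-plugin resource request roughly like the fragment below. `devices.kubevirt.io/kvm` is the existing KubeVirt resource name; the surrounding fragment is illustrative only.

```yaml
# Fragment of a virt-launcher pod spec (illustrative).
containers:
  - name: compute
    resources:
      limits:
        devices.kubevirt.io/kvm: "1"   # /dev/kvm, exposed by KubeVirt's device plugin
```

A different hypervisor driver would request a different device resource, and an emulation mode (as raised above) might request none at all.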
> - The resource requirements of the virt-launcher pod should be adjusted (w.r.t. the resource spec of the VMI) to take into account the resources consumed by the requested VMM daemon running in the `virt-launcher` pod. Currently, the field `VirtqemudOverhead` holds the memory overhead of the `virtqemud` process.
Could this go in the new Hypervisor Runtime CRD?
That is a good idea.
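For instance, instead of a hard-coded constant like `VirtqemudOverhead`, the per-hypervisor overhead could be declared in the hypothetical `HypervisorRuntime` CRD sketched earlier. Field names and values below are placeholders, not measured numbers.

```yaml
# Hypothetical extension of the HypervisorRuntime sketch above.
spec:
  overhead:
    vmmDaemonMemory: 40Mi   # memory consumed by the hypervisor's VMM daemon
    perVCPUMemory: 8Mi      # extra overhead scaling with the VMI's vCPU count
```

virt-controller could then read these values when rendering the virt-launcher pod's resource requests, keeping the overhead calculation itself hypervisor-agnostic.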
@harshitgupta1337 I think what we need here is an infrastructure for hypervisor plugins. Putting everything into the KubeVirt code would make the code base very large and hard to maintain.
I think we also need a list of the features supported by each specific hypervisor. From the old proposal for Cloud Hypervisor integration, not all the features supported by QEMU are available in CH. We should have a similar section here with the features, as in the old proposal. These options should then be advertised in the new CRD, indicating whether or not they are supported.
@alicefr You're referring to this feature in Golang, isn't it?
No, I'm referring more to an API where we can plug in different hypervisors, like CRI for the container runtimes.
/cc
This proposal is not focused specifically on cloud-hypervisor, which I think is a step forward.
I think that it will definitely not be easy to design and implement something like this, especially since KubeVirt currently relies heavily on libvirt-specific features/APIs. That said, if you're willing to invest the effort to achieve it, I think it would be valuable and I will be happy to promote it.
@harshitgupta1337 are you still interested in this?
@iholder101 Yes, I am working on addressing the suggestions mentioned earlier. I should be able to come up with an updated design by the end of this month.
Great to hear that!
Hi @alicefr, I am curious why a CRI-like interface is necessary in this scenario, given that libvirt already supports multiple hypervisor drivers.
@harshitgupta1337 In the code (virt-handler's, to be specific) you can find that we directly configure QEMU as well. So libvirt as such does support multiple "runtimes", but since libvirt is not privileged in KubeVirt, it can't do everything.
Open question: how do we manage features that are supported on one virt stack and not on another?
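One possible answer, building on the CRD idea discussed above: each hypervisor runtime could advertise the features it supports, and the VMI admission webhook could reject VMIs requesting anything outside that set. A sketch with invented feature names and fields:

```yaml
# Hypothetical status section of a HypervisorRuntime object.
status:
  supportedFeatures:
    liveMigration: false   # e.g. not offered by this driver yet
    diskHotplug: true
    uefiBoot: true
```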
@xpivarc @fabiand @alicefr @iholder101
Sorry to chime in late on this one. @harshitgupta1337 I appreciate your initiative and fundamentally support it. However, being honest here, a huge amount of work is probably required in order for this to actually get in. The biggest challenges here, as I see them, are:
And this is just a very partial list :)
This KubeVirt design proposal discusses how KubeVirt can be used to create libvirt virtual machines that are backed by diverse hypervisor drivers, such as QEMU/KVM, Xen, VirtualBox, etc. The aim of this proposal is to enumerate the design and implementation choices for enabling this multi-hypervisor support in KubeVirt.

Which issue(s) this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged):
Fixes #