
Add support for VFs within a VM #121

Closed
wants to merge 2 commits into from

Conversation

@cswindle commented Apr 2, 2020

Currently, a request for a VF requires that the PF be present on the system. In the case of VMs, however, the PF is present on the host while the VF is present in the VM, so the SR-IOV plugin does not work. This change allows a VF to be used when it has no associated PF, as long as no configuration is requested that would require the PF to set it up.
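
A rough Go sketch of the intended behaviour follows (the struct fields and error message are illustrative, not the exact code from this PR): if no PF is visible, the VF is still accepted, but any setting that can only be applied through the PF is rejected.

```go
package main

import "fmt"

// NetConf loosely mirrors the fields discussed in this PR; the real sriov-cni
// type has more fields and different parsing.
type NetConf struct {
	Master   string // PF netdevice name; empty inside a VM that only sees the VF
	DeviceID string // VF PCI address
	Vlan     int
	MAC      string
	SpoofChk string
}

// validateNoPF accepts a PF-less VF only if nothing was requested that would
// have to be applied through the PF.
func validateNoPF(conf *NetConf) error {
	if conf.Master != "" {
		return nil // PF visible: the normal configuration path applies
	}
	if conf.Vlan != 0 || conf.MAC != "" || conf.SpoofChk != "" {
		return fmt.Errorf("VF %s: vlan/mac/spoofchk requested but no PF is available to apply them", conf.DeviceID)
	}
	return nil
}

func main() {
	// Example: a VLAN was requested but no PF (Master) is present, so this is rejected.
	err := validateNoPF(&NetConf{DeviceID: "0000:03:00.2", Vlan: 100})
	fmt.Println(err)
}
```

The actual guard proposed by the PR can be seen in the diff excerpt quoted later in this thread.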

@cswindle closed this Apr 2, 2020
@cswindle reopened this Apr 2, 2020
@cswindle
Author

cswindle commented Apr 8, 2020

Could someone please review my change? I would like it to be merged.

@killianmuldoon
Collaborator

Can you go into more detail on the use case for this? How are you deploying the VM? Is Kubernetes running on top of a VM?

@cswindle
Author

cswindle commented Apr 8, 2020

Yes, this scenario deploys Kubernetes on top of VMware VMs that use SR-IOV for their interfaces. Here the VMware host has the PF and the VM has the VF, so we would like to be able to move the VF into a Kubernetes pod, which is what this change allows.

@ahalimx86
Collaborator

@cswindle how is the VF interface added to the VM? As a regular NIC with a kernel-mode driver?

@adrianchiris
Contributor

adrianchiris commented Apr 8, 2020

I wonder if a better direction is to just use the host-device CNI.

Once the VF is passed through to the VM, you are limited to configurations applied on the VF's netdevice.

@cswindle
Author

cswindle commented Apr 8, 2020

The VF that gets pulled into the VM looks identical to one running on bare metal; we then bind it to the UIO driver and pull it in using the SR-IOV plugin.

@cswindle
Author

cswindle commented Apr 8, 2020

@adrianchiris, we are using the SR-IOV device plugin to define a pool of SR-IOV devices and then having the SR-IOV CNI plugin move the VF into the pod. This gives a Kubernetes deployment much more flexibility than using the host-device plugin.

@adrianchiris
Contributor

@cswindle, to my knowledge Multus can work with the host-device CNI, as it adds the PCI device associated with the network via the pciBusID attribute.

Other than the name change, doesn't this patch essentially reduce sriov-cni to host-device functionality?

@ahalimx86
Collaborator

ahalimx86 commented Apr 8, 2020

I agree with @adrianchiris. The host-device CNI would be the perfect choice of plugin where we don't need to apply any VF configuration. But if you are binding the VF to a uio driver in the VM, then I don't see why you would need a CNI plugin to move the VF into the Pod. A CNI plugin would be needed if the VF were a regular kernel network interface. With a DPDK driver (uio/vfio), the sriov-cni plugin is only used for applying VF configs, not for moving an interface into the Pod. So for your use case (a VF with a uio driver and no VF config to apply) I don't see why you need a CNI plugin.

@JanScheurich

@ahalim-intel: The reason we would like to use a CNI plugin in this case as well is to provide CNFs with a uniform way of consuming secondary Multus network attachments, i.e. adding network annotations to their pods and leaving it to the Network Resources Injector webhook to inject any required resources. The applications should not need to worry about the device pool names for this special case.

@ahalimx86
Collaborator

OK, so for the Pod resource injector you need to add a net-attach-def CR and a CNI plugin. Are there any other reasons to support this mode of operation for the sriov-cni, where either there is no VF config to apply (VF as a kernel interface in the VM, which is not possible without a PF) or there is no net interface to move into the container (VF as a DPDK interface in the VM)? For the latter scenario, there is nothing for sriov-cni to do in terms of moving an interface or configuring it.
We would just need a plugin that does nothing but serve as a reference for the net-attach-def CRD. Just so that we are clear on the use case here: is there anything we expect the sriov-cni plugin to do for a VF in a VM as a DPDK interface?

@JanScheurich

Exactly. In this particular use case we'd like the SR-IOV CNI to do nothing except serve as a CNI plugin placeholder, so that SR-IOV VF network attachments within a VM are consumed uniformly. There is no configuration of the Intel VF when it is bound to igb_uio for use with DPDK.
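
For illustration, a placeholder plugin of that kind can be very small. The sketch below is plain Go (it is not the actual sriov-cni code and skips proper error handling); it only answers the CNI calling convention and configures nothing:

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"os"
)

// A "do nothing" CNI plugin sketch: it satisfies the CNI protocol so a
// NetworkAttachmentDefinition (and the resources injector) can reference it,
// but it performs no VF configuration at all.
func main() {
	switch os.Getenv("CNI_COMMAND") {
	case "ADD":
		// Echo back the cniVersion from the network config on stdin with an empty result,
		// so the runtime treats the attachment as successful.
		var conf struct {
			CNIVersion string `json:"cniVersion"`
		}
		stdin, err := io.ReadAll(os.Stdin)
		if err == nil {
			_ = json.Unmarshal(stdin, &conf)
		}
		result := map[string]interface{}{
			"cniVersion": conf.CNIVersion,
			"interfaces": []interface{}{},
		}
		_ = json.NewEncoder(os.Stdout).Encode(result)
	case "DEL", "CHECK":
		// Nothing was configured on ADD, so there is nothing to undo or verify.
	case "VERSION":
		fmt.Println(`{"cniVersion":"1.0.0","supportedVersions":["0.3.0","0.3.1","0.4.0","1.0.0"]}`)
	default:
		fmt.Fprintln(os.Stderr, "unsupported CNI_COMMAND")
		os.Exit(1)
	}
}
```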

@amorenoz
Contributor

@JanScheurich is it really a VF from the guest's point of view? Or is it a plain PCI device (i.e. a PF) backed by your VF on the host?
For instance, in qemu/kvm, even if you pass through a VF, the guest sees a PF (SR-IOV is not emulated).

@JanScheurich

From the guest's point of view it is a VF PCI device without a PF as its parent. It is clearly not a PF: the guest OS loads e.g. the iavf driver. From our use case's perspective it doesn't matter much whether the VF PCI device is "injected" through a NAD CR of type sriov or host-device, but currently neither works for a VF bound to igb_uio in a VM. The sriov CNI plugin requires the presence of a PF, while the host-device plugin requires the VF to be bound to a kernel VF driver.

@zshi-redhat
Collaborator

@ahalim-intel Just a thought: the PF (conf.Master) is used for applying/resetting the VF config in the SR-IOV CNI. Would it make sense to check whether conf.Master exists and, if not, skip the VF config with a warning in the VM? This would allow the SR-IOV CNI to work in VMs for both kernel mode and DPDK mode.

@zshi-redhat
Collaborator

> @ahalim-intel Just a thought: the PF (conf.Master) is used for applying/resetting the VF config in the SR-IOV CNI. Would it make sense to check whether conf.Master exists and, if not, skip the VF config with a warning in the VM? This would allow the SR-IOV CNI to work in VMs for both kernel mode and DPDK mode.

Please ignore; this PR already does that.

@adrianchiris
Contributor

Are we OK with closing this PR?

I see host-device is adding support for DPDK devices when vfio-pci is bound to the VF instead of a networking driver: containernetworking/plugins#490

So for VMs with VF passthrough you would use the host-device CNI for both "normal" and DPDK workloads.

@JanScheurich

@adrianchiris If the use case is fully covered in the host-device CNI, I am OK with closing this issue.

@martinkennelly
Member

@adrianchiris Can we wait to see if the functionality is merged into the host-device CNI before closing? It's still in review.

@adrianchiris
Contributor

sure

@lynic

lynic commented Dec 10, 2020

It would be good to see this patch merged into sriov-cni, so that we could also leverage its IPAM implementation for DPDK interfaces.

What do you think?
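
For context, "leveraging the IPAM implementation" means having the CNI plugin delegate address allocation to a separate IPAM plugin. A rough Go sketch of that delegation, assuming the containernetworking ipam helper package and a placeholder host-local config (this is not sriov-cni's actual code):

```go
package main

import (
	"fmt"
	"os"

	"github.com/containernetworking/plugins/pkg/ipam"
)

// allocateWithIPAM delegates address allocation to an IPAM plugin, the way CNI
// plugins normally do. Even for a DPDK-bound VF with no kernel netdevice to
// configure, a plugin could still run IPAM and report the address in its result.
func allocateWithIPAM(ipamType string, stdinData []byte) error {
	res, err := ipam.ExecAdd(ipamType, stdinData)
	if err != nil {
		return fmt.Errorf("IPAM ADD failed: %w", err)
	}
	return res.Print() // a real plugin would merge this into its own CNI result
}

func main() {
	// Placeholder network config; in a real plugin this is the stdin passed by the
	// runtime, and the program itself is invoked as a CNI plugin (CNI_COMMAND=ADD,
	// CNI_PATH pointing at the IPAM plugin binaries).
	conf := []byte(`{"cniVersion":"0.4.0","name":"sriov-vm-net","type":"sriov","ipam":{"type":"host-local","subnet":"10.56.217.0/24"}}`)
	if err := allocateWithIPAM("host-local", conf); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```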

@adrianchiris
Contributor

There have been several discussions around this in our bi-weekly meeting (see the Contributing doc for more info).
At the moment we think this use case should not be part of the SR-IOV CNI, as much of the logic this CNI performs is actually applied on the PF of the respective VF.

There is greater commonality with the host-device CNI with regard to setting up the VF in a VM environment.
Maybe once containernetworking/plugins#490 is merged, a follow-up PR to trigger IPAM for DPDK interfaces is all that is needed to support the use case.

Please feel free to drop in to one of our meetings and raise this issue if you feel there is a need for the SR-IOV CNI to support it.
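
For reference, the PF-side logic mentioned above boils down to netlink calls made on the PF's netdevice, which is exactly what a VM that only sees the VF cannot do. A rough Go sketch using the vishvananda/netlink library (the PF name and VF index are invented):

```go
package main

import (
	"fmt"
	"net"

	"github.com/vishvananda/netlink"
)

// configureVFViaPF shows why the PF matters: VF attributes such as VLAN, MAC
// and spoof-check are set through netlink calls on the PF's netdevice. Inside
// a VM that only sees the VF there is no such netdevice, so these calls are
// not possible.
func configureVFViaPF() error {
	pf, err := netlink.LinkByName("enp3s0f0") // invented PF interface name
	if err != nil {
		return fmt.Errorf("PF not found: %w", err)
	}
	const vfIndex = 2 // invented VF index on that PF
	if err := netlink.LinkSetVfVlan(pf, vfIndex, 100); err != nil {
		return err
	}
	mac, _ := net.ParseMAC("52:54:00:12:34:56")
	if err := netlink.LinkSetVfHardwareAddr(pf, vfIndex, mac); err != nil {
		return err
	}
	return netlink.LinkSetVfSpoofchk(pf, vfIndex, true)
}

func main() {
	if err := configureVFViaPF(); err != nil {
		fmt.Println(err)
	}
}
```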

@martinkennelly
Member

martinkennelly commented Dec 21, 2020

We should discuss whether we need to define a maximum waiting time before reconsidering this patch if host-device doesn't want this functionality. I will raise this topic at the community meeting.

@martinkennelly
Member

At this week's community meeting, we agreed to wait for a maximum of two months before opening this discussion again: Feb 21st 2021.

@adpempx

adpempx commented Jan 15, 2021

Guys,

could you please merge this PR?

This is a mandatory use case for DPDK users running in an SR-IOV environment in VMs.

Thanks

@adrianchiris
Contributor

@adpempx we think that the host-device CNI should accommodate this use case:

containernetworking/plugins#490

I see you are familiar with that PR.

@adpempx

adpempx commented Jan 17, 2021

@adrianchiris

Yes, they have approved the host-device CNI pull request.

I have verified that change in my environment (DPDK on SR-IOV in a VM) and it works.

Thanks

@martinkennelly
Member

The host-device DPDK patch containernetworking/plugins#490 has been merged. Are we ok to close this now?

@JanScheurich

Yes, we can close the issue here. Thanks for your support!

@adpempx

adpempx commented Jan 20, 2021

Thanks guys!

@adrianchiris
Contributor

The needed functionality was merged into the host-device CNI with containernetworking/plugins#490.

Closing.

@wjun

wjun commented Jul 22, 2021

Many users, when moving from bare metal to a virtual environment, still prefer the sriov-cni over host-device. It would be very helpful if this commit could be merged into sriov-cni.

@martinkennelly
Member

@wjun May I ask why you would prefer the SR-IOV CNI over host-device in a virtual environment?

@cswindle
Author

@martinkennelly, if you have the same setup for virtual and bare-metal environments, it makes life easier for application developers using DPDK, as they need fewer environment-specific changes.

```go
// It is not possible to setup VF config if there is no PF associated with the VF, so check if there is any
// config and reject if the master has not been set.
if conf.Master == "" {
	if conf.Vlan != 0 || conf.MAC != "" || conf.MinTxRate != nil || conf.MaxTxRate != nil || conf.SpoofChk != "" || conf.LinkState != "" {
```

@cswindle Is it not possible to configure the MAC address or VLAN on a VF with no access to its PF?

@ormergi

ormergi commented Jul 25, 2021

@martinkennelly @cswindle hey, we run KubeVirt VMs with SR-IOV devices on a containerized k8s cluster (nodes are KinD containers) in our CI.
We are looking into creating a similar setup on a virtualized cluster (nodes are VMs).

Using host-device drastically increases maintenance, as we need to create a NetworkAttachmentDefinition
for each VF and basically manage the allocation ourselves.
Being able to use sriov-cni instead would be great.

Having said that, I noticed that with this PR's changes we won't be able to configure the VF MAC address and VLAN,
which are crucial for our use case.
Is it possible to configure additional VF settings (e.g. MAC address, VLAN) when the PF is not accessible?

@adpempx

adpempx commented Sep 14, 2021

Hi,

do you have any news?

Please let us know.
Thanks

@jingczhang

Hi,

We use Multus + the host-device CNI to pool SR-IOV VFs inside a VM.

We echo the trouble that the host-device CNI is missing the capability to configure a VLAN.

Considering that the host-device CNI is "thin" and is meant to stay "thin", we are also looking to sriov-cni to support VFs inside a VM.

Thanks

@adpempx

adpempx commented Sep 15, 2021

Same for us.

host-device is OK but not convenient, as it requires manual configuration of interface names/PCI addresses.

Thanks

@amorenoz
Contributor

> @martinkennelly @cswindle hey, we run KubeVirt VMs with SR-IOV devices on a containerized k8s cluster (nodes are KinD containers) in our CI.
> We are looking into creating a similar setup on a virtualized cluster (nodes are VMs).
>
> Using host-device drastically increases maintenance, as we need to create a NetworkAttachmentDefinition for each VF and basically manage the allocation ourselves.
> Being able to use sriov-cni instead would be great.
>
> Having said that, I noticed that with this PR's changes we won't be able to configure the VF MAC address and VLAN, which are crucial for our use case.
> Is it possible to configure additional VF settings (e.g. MAC address, VLAN) when the PF is not accessible?

Note this is not possible today AFAIK. However, if it's for testing purposes, you could check out
https://github.com/hammerg/qemu/tree/igb_sriov_dev where @hammerg and @marcel-apf have emulated an SR-IOV-capable NIC in qemu.
