Unfortunately it's a bit more complicated than that, which is why I hate that SR-IOV gets so tied up in this :/
When they say "virtio-net" there, they mean virtio-net inside qemu with vhost servicing the queues on the host side (note: we don't use vhost in GCE -- our device models live in a proprietary hypervisor that's designed to play nicely with Google's production infrastructure). One could just as easily expose what looked like an Intel VF to the guest and service it in the same manner (although there are good reasons not to).
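For anyone unfamiliar with that split, here's a rough sketch of the host-side handoff with the standard /dev/vhost-net interface (not how GCE does it, per the above; feature negotiation, the memory table, kick/call eventfds, and error handling are all omitted, and the fds are placeholders):

    /* Sketch: the guest sees a virtio-net PCI device emulated by the VMM,
     * but the actual TX/RX queue processing happens in the host kernel in
     * vhost-net.  The VMM just tells vhost where the rings live and what
     * backend to attach. */
    #include <fcntl.h>
    #include <sys/ioctl.h>
    #include <linux/vhost.h>

    static void attach_queue_to_vhost(int tap_fd, struct vhost_vring_addr *addr)
    {
        int vhost_fd = open("/dev/vhost-net", O_RDWR);

        ioctl(vhost_fd, VHOST_SET_OWNER);

        /* Where the descriptor/avail/used rings sit in the VMM's address
         * space (these map onto guest physical memory). */
        ioctl(vhost_fd, VHOST_SET_VRING_ADDR, addr);

        /* Point the queue at a backend -- here, a tap device. */
        struct vhost_vring_file backend = { .index = addr->index, .fd = tap_fd };
        ioctl(vhost_fd, VHOST_NET_SET_BACKEND, &backend);
    }

From the guest's point of view nothing changes whether the queues are serviced by qemu, by vhost, or by something proprietary -- which is exactly why the "virtio-net" label on a chart doesn't tell you much about the host side.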
One could also build a physical NIC that exposed virtual functions offering register layouts equivalent to VIRTIO's PCI BARs and used the VIRTIO queue format. If you assigned those into a guest, you'd be doing SR-IOV, but with a virtio-net NIC as seen by the guest. It also likely wouldn't perform as well as a software implementation (in its current form VIRTIO has a lot of serializing dependent loads, which make it inefficient to implement over PCIe). There's some ongoing work upstream aimed at a more efficient queue format.
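To make the "serializing dependent loads" point concrete, consuming one buffer from a split virtqueue looks roughly like this (struct layouts simplified from the virtio spec). Each load can only be issued after the previous one returns, which is cheap out of host DRAM but turns into back-to-back round trips if the device has to fetch the rings over PCIe:

    #include <stdint.h>

    struct vring_desc  { uint64_t addr; uint32_t len; uint16_t flags; uint16_t next; };
    struct vring_avail { uint16_t flags; uint16_t idx; uint16_t ring[]; };

    /* Sketch of the device side consuming one buffer from a split virtqueue. */
    uint64_t fetch_next_buffer(struct vring_desc *desc_table,
                               struct vring_avail *avail,
                               uint16_t last_seen, uint16_t qsize)
    {
        uint16_t avail_idx = avail->idx;                /* load 1: producer index      */
        if (avail_idx == last_seen)
            return 0;                                   /* nothing new                 */

        uint16_t head = avail->ring[last_seen % qsize]; /* load 2: only valid after 1  */
        struct vring_desc d = desc_table[head];         /* load 3: address depends on 2 */
        /* ...and if d.flags says the descriptor chains, following d.next is a
         * fourth dependent load, and so on down the chain. */
        return d.addr;                                  /* finally, the buffer address */
    }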
So, yeah, "it depends" is about the best you can do. SR-IOV really just says you're taking advantage of some features of PCI that allow differentiated address spaces in the IOMMU for a single device and (on modern CPUs) interrupt routing to actively running VCPUs without requiring a hop through the host kernel. The former is handy if you want the NIC to be able to cheaply use guest physical addresses (although the IOMMU isn't free either); the latter doesn't matter if the guest is running a poll-mode driver that masks interrupts, nor does it matter if the target VCPU isn't actively running.
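On the "differentiated address spaces" part: when you hand a VF to a guest on Linux, the host programs the IOMMU so that the addresses the guest writes into the device's rings (guest physical addresses) resolve to the right host memory. A minimal sketch of what that looks like through VFIO, with the group/container plumbing and error handling mostly omitted and the group number made up:

    #include <stdint.h>
    #include <fcntl.h>
    #include <sys/ioctl.h>
    #include <linux/vfio.h>

    /* Map the guest's RAM into the IOMMU so the VF can DMA using guest
     * physical addresses directly.  "guest_ram" is the host-virtual mapping
     * backing guest RAM; iova 0 corresponds to guest physical address 0. */
    void map_guest_ram(void *guest_ram, uint64_t ram_size)
    {
        int container = open("/dev/vfio/vfio", O_RDWR);
        int group = open("/dev/vfio/42", O_RDWR);   /* group number is a placeholder */

        ioctl(group, VFIO_GROUP_SET_CONTAINER, &container);
        ioctl(container, VFIO_SET_IOMMU, VFIO_TYPE1_IOMMU);

        struct vfio_iommu_type1_dma_map map = {
            .argsz = sizeof(map),
            .flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE,
            .vaddr = (uint64_t)(uintptr_t)guest_ram,
            .iova  = 0,                 /* what the guest (and thus the device) sees */
            .size  = ram_size,
        };
        ioctl(container, VFIO_IOMMU_MAP_DMA, &map);
    }

That mapping isn't free -- IOTLB misses and map/unmap churn have their own costs -- which is the "the IOMMU isn't free either" caveat above.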
Agreed. As a policy, we don't sign off on supporting things until we feel we have sufficient regression testing for them in our automated qual matrix. DPDK is not there yet.