GPU Virtualization Guide

This guide describes two types of GPU virtualization setup:

  • GPU Passthrough Virtualization
  • GPU SR-IOV Virtualization

The first one, GPU Passthrough Virtualization, is a legacy virtualization technology which gives a Virtual Machine (VM) exclusive access to the GPU. GPU SR-IOV Virtualization is a newer technology available in modern Intel GPUs such as the Intel® Data Center GPU Flex Series.

Intel GPU                                      GPU Passthrough  GPU SR-IOV
gen8-gen11 (such as BDW, SKL, KBL, CFL, etc.)  Yes              No
TGL                                            Yes              No
DG1                                            Yes              No
Alchemist                                      Yes              No
ATS-M                                          Yes              Yes

In this article we will provide VM setup instructions assuming the following:

  • Host OS: Ubuntu 20.04
  • Guest (VM) OS: Ubuntu 20.04
  • GPU: Intel® Data Center GPU Flex Series (ATS-M), device ID 56C0 in the examples

Virtualization support might need to be enabled in the host BIOS. Below we provide the key settings to enable and tune virtualization. Mind that the available options and their placement depend on the BIOS manufacturer.

For Intel® Server M50CYP Family:

Advanced -> Processor Configuration -> Intel(R) Virtualization Technology: Enabled
Advanced -> Integrated IO Configuration -> Intel(R) VT for Directed I/O : Enabled
Advanced -> PCI Configuration -> SR-IOV Support : Enabled
Advanced -> PCI Configuration -> Memory Mapped I/O above 4GB : Enabled
Advanced -> PCI Configuration -> MMIO High Base : 56T
Advanced -> PCI Configuration -> Memory Mapped I/O Size : 1024G
Advanced -> System Acoustic and Performance Configuration -> Set Fan Profile: Performance
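
After enabling these BIOS options you can cross-check from the host OS that CPU virtualization extensions and the IOMMU are actually active. This is a generic sketch; the exact dmesg output varies by platform and kernel version:

# Count of CPU virtualization flags (should be non-zero):
grep -cw -e vmx -e svm /proc/cpuinfo
# IOMMU / VT-d initialization messages:
sudo dmesg | grep -i -e DMAR -e IOMMU | head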

GPU Passthrough Virtualization

Virtual Machine (VM) setup with GPU passthrough gives a single VM exclusive access to the GPU. Mind that with this setup only the one VM to which the GPU was explicitly assigned will be able to use it; other VMs, and even the host, will not have access to the GPU.

One advantage of this setup is that the requirements on the host kernel are much lower: it does not need to support all the GPU features (i.e. the respective GPU kernel mode driver, such as i915, does not need to support the GPU). Basically, the host kernel only needs to recognize the GPU device and support virtualization for it. Actual GPU support is pushed to the VM kernel, which of course needs a kernel mode driver and a user space stack capable of working with the device.
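
Since vfio-pci passthrough relies on the IOMMU, it can also be useful to verify that the host exposes IOMMU groups and that the GPU sits in its own group. The loop below is a generic sketch (run it after enabling intel_iommu=on as described below); group numbers differ from system to system:

# List IOMMU groups and the PCI devices they contain:
for g in /sys/kernel/iommu_groups/*; do
  echo "IOMMU group $(basename $g):"
  for d in "$g"/devices/*; do
    echo "  $(lspci -nns "${d##*/}")"
  done
done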

  • Install Ubuntu 20.04 on Host

  • As noted above, there are limited requirements on the host kernel to support legacy GPU Passthrough Virtualization. For ATS-M these instructions were validated with the following kernels:

    • 5.14.0-1045-oem
    • 5.15.0-46-generic
  • Check that the desired GPU is detected and find its device ID and PCI slot (in the example below, 56C0 and 4d:00.0 respectively):

    $ lspci -nnk | grep -A 3 -E "VGA|Display"
    02:00.0 VGA compatible controller [0300]: ASPEED Technology, Inc. ASPEED Graphics Family [1a03:2000] (rev 41)
            DeviceName: ASPEED AST2500
            Subsystem: ASPEED Technology, Inc. ASPEED Graphics Family [1a03:2000]
            Kernel driver in use: ast
    --
    4d:00.0 Display controller [0380]: Intel Corporation Device [8086:56c0] (rev 08)
            Subsystem: Intel Corporation Device [8086:4905]
    
    $ DEVID=56C0
    $ PCISLOT=4d:00.0
    
  • Bind the desired GPU device to the vfio-pci driver by modifying the kernel boot command line:

    # This will add the following options to Linux cmdline:
    #   intel_iommu=on iommu=pt vfio-pci.ids=8086:56C0 pcie_ports=native
    #
    if ! grep "intel_iommu=on" /etc/default/grub | grep -iq "8086:56C0"; then
    sudo sed -ine \
      's,^GRUB_CMDLINE_LINUX_DEFAULT="\([^"]*\)",GRUB_CMDLINE_LINUX_DEFAULT="\1 intel_iommu=on iommu=pt vfio-pci.ids=8086:56C0 pcie_ports=native",g' \
      /etc/default/grub
    fi
    grep GRUB_CMDLINE_LINUX_DEFAULT /etc/default/grub
    
  • Update grub and reboot:

    sudo update-grub && sudo reboot
    
  • After reboot verify that the GPU device was bound to the vfio-pci driver:

    $ lspci -nnk | grep -A 3 -i 56C0
    4d:00.0 Display controller [0380]: Intel Corporation Device [8086:56c0] (rev 08)
            Subsystem: Intel Corporation Device [8086:4905]
            Kernel driver in use: vfio-pci
            Kernel modules: i915, intel_pmt
    
  • Install virtualization environment:

    sudo apt-get update
    sudo apt-get install qemu-kvm libvirt-daemon-system libvirt-clients bridge-utils virtinst ovmf
    

Now you should be ready to create and use a VM with GPU Passthrough Virtualization.
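
Optionally, before creating the VM, you can check that vfio created a character device for the GPU's IOMMU group. This is a quick sketch; the group number will differ on your system:

# IOMMU group the GPU belongs to (PCI slot 4d:00.0 in our example):
readlink /sys/bus/pci/devices/0000:4d:00.0/iommu_group
# A matching character device should be present under /dev/vfio:
ls -l /dev/vfio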

  • Download Ubuntu 20.04 ISO image to the host folder:

    sudo mkdir -p /opt/vmimage
    sudo chown -R $(id -u):$(id -g) /opt/vmimage
    wget -P /opt/vmimage https://releases.ubuntu.com/20.04.5/ubuntu-20.04.5-live-server-amd64.iso
    
  • Create a disk image file for your VM (set the size according to your needs; we will use 50G as an example):

    HDD_NAME="ubuntu-hdd"
    qemu-img create -f qcow2 /opt/vmimage/$HDD_NAME.qcow2 50G
    
  • Run the VM and install Ubuntu 20.04 in it:

    sudo su
    
    VM_IMAGE=/opt/vmimage/ubuntu-hdd.qcow2
    HOST_IP=$(hostname -I | cut -f1 -d ' ')
    VNC_PORT=40
    qemu-system-x86_64 -enable-kvm -drive file=$VM_IMAGE \
      -cpu host -smp cores=8 -m 64G -serial mon:stdio \
      -device vfio-pci,host=4d:00.0 \
      -net nic -net user,hostfwd=tcp::10022-:22,hostfwd=tcp::8080-:8080 \
      -vnc $HOST_IP:$VNC_PORT \
      -cdrom /opt/vmimage/ubuntu-20.04.5-live-server-amd64.iso
    

    Upon execution you should be able to connect to the VM via VNC using $HOST_IP:$VNC_PORT. Under VNC, proceed with a typical Ubuntu installation. To enable SSH access to the VM, don't forget to install openssh-server. SSH access should then be possible from the host as follows:

    ssh -p 10022 localhost
    

    Mind that we also forward port 8080, which is required for the Media Delivery demo to run.

  • Once installation is complete, shut down the VM and restart it without the installation media:

    sudo su
    
    VM_IMAGE=/opt/vmimage/ubuntu-hdd.qcow2
    HOST_IP=$(hostname -I | cut -f1 -d ' ')
    VNC_PORT=40
    qemu-system-x86_64 -enable-kvm -drive file=$VM_IMAGE \
      -cpu host -smp cores=8 -m 64G -serial mon:stdio \
      -device vfio-pci,host=4d:00.0 \
      -net nic -net user,hostfwd=tcp::10022-:22,hostfwd=tcp::8080-:8080 \
      -vnc $HOST_IP:$VNC_PORT
    

At this point you should have a running VM with the GPU attached in passthrough mode. You can check that the GPU is actually available by looking at the lspci output:

$ lspci -nnk | grep -A 3 -i 56C0
00:04.0 Display controller [0380]: Intel Corporation Device [8086:56c0] (rev 08)
        Subsystem: Intel Corporation Device [8086:4905]

To be able to use the GPU device you might need to install additional software following the bare metal setup instructions. For example, to set up the Intel® Data Center GPU Flex Series (products formerly Arctic Sound), refer to this guide.
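
Assuming you have installed the GPU driver and media stack inside the VM following that guide, a quick sanity check that the passed-through GPU is usable might look like the sketch below (device paths can differ in your VM):

# Inside the VM, after installing the GPU user space stack:
ls /dev/dri                                        # expect card*/renderD* nodes for the GPU
vainfo --display drm --device /dev/dri/renderD128  # should list supported VA-API profiles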

GPU SR-IOV Virtualization

Virtual Machine (VM) setup with GPU SR-IOV Virtualization allows non-exclusive, time-sliced access to the GPU from a VM. GPU SR-IOV Virtualization can be used to set up multiple VMs (and the host) with access to the same GPU. It is possible to assign GPU resource limitations to each VM.

This variant of GPU virtualization setup requires the host kernel to fully support the underlying GPU.
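
A quick way to confirm that the host kernel and GPU expose SR-IOV at all is to query the standard PCI sysfs attributes of the physical function. This is a sketch; the path assumes the PCI slot 4d:00.0 used in the examples below:

# Maximum and currently enabled number of virtual functions (VFs):
cat /sys/bus/pci/devices/0000:4d:00.0/sriov_totalvfs
cat /sys/bus/pci/devices/0000:4d:00.0/sriov_numvfs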

  • Install Ubuntu 20.04 on Host

  • Follow this guide to enable Intel® Data Center GPU Flex Series (products formerly Arctic Sound) on the host.

  • Check that the desired GPU is detected and find its device ID and PCI slot (in the example below, 56C0 and 4d:00.0 respectively):

    $ lspci -nnk | grep -A 3 -E "VGA|Display"
    02:00.0 VGA compatible controller [0300]: ASPEED Technology, Inc. ASPEED Graphics Family [1a03:2000] (rev 41)
            DeviceName: ASPEED AST2500
            Subsystem: ASPEED Technology, Inc. ASPEED Graphics Family [1a03:2000]
            Kernel driver in use: ast
    --
    4d:00.0 Display controller [0380]: Intel Corporation Device [8086:56c0] (rev 08)
            Subsystem: Intel Corporation Device [8086:4905]
            Kernel driver in use: i915
            Kernel modules: i915, intel_pmt
    
    $ DEVID=56C0
    $ PCISLOT=4d:00.0
    
  • Enable SR-IOV support (mind the special option currently needed for the i915 driver: i915.enable_guc=7):

    # This will add the following options to Linux cmdline:
    #   intel_iommu=on iommu=pt i915.enable_guc=7
    #
    if ! grep "GRUB_CMDLINE_LINUX_DEFAULT" /etc/default/grub | grep -q "i915.enable_guc"; then
    sudo sed -ine \
      's,^GRUB_CMDLINE_LINUX_DEFAULT="\([^"]*\)",GRUB_CMDLINE_LINUX_DEFAULT="\1 intel_iommu=on iommu=pt i915.enable_guc=7",g' \
      /etc/default/grub
    fi
    grep GRUB_CMDLINE_LINUX_DEFAULT /etc/default/grub
    
  • Update grub and reboot:

    sudo update-grub && sudo reboot
    
  • Verify that the i915 driver was loaded with SR-IOV support:

    $ dmesg | grep i915 | grep PF
    [   21.116941] i915 0000:4d:00.0: Running in SR-IOV PF mode
    [   21.509331] i915 0000:4d:00.0: 31 VFs could be associated with this PF
    

    From this output you can also see how many VFs, and hence how many VMs, can be configured (31 in total here).

Now you should be ready to create and use VMs with GPU SR-IOV Virtualization.
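
Before allocating resources you can check that the i915 PF exposes the per-VF configuration tree in sysfs that the steps below rely on. This is a sketch; the card index may differ on your system (see the first step below):

# The PF and per-VF resource controls used in the following steps:
ls /sys/class/drm/card1/iov/
ls /sys/class/drm/card1/iov/pf/gt/available/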

The essential part of SR-IOV setup is resource allocation for each VM. We will describe the trivial case of creating one VM and maximizing its resources; a sketch for splitting the GPU between several VFs follows below. Mind that such a resource allocation will make the GPU basically unusable from the host.
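
If instead you want to share the GPU between several VMs, a hypothetical variation of the steps below is to split the available quotas evenly across N VFs. This is only a sketch based on the sysfs layout used in this guide; the card index, VF count and scheduling parameters (exec quantum, preemption timeout) need to be adjusted for your system:

sudo su

CARD=/sys/class/drm/card1   # same card path as in the steps below
NUM_VFS=2                   # number of VFs/VMs to create

echo 0 > $CARD/device/sriov_drivers_autoprobe
for i in $(seq 1 $NUM_VFS); do
  for res in doorbells contexts ggtt lmem; do
    # Give each VF an equal share of the maximum available quota:
    max=$(cat $CARD/iov/pf/gt/available/${res}_max_quota)
    echo $((max / NUM_VFS)) > $CARD/iov/vf$i/gt/${res}_quota
  done
done
echo $NUM_VFS > $CARD/iov/pf/device/sriov_numvfs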

  • Check the card number assigned to the GPU device:

    $ ls -l /dev/dri/by-path/ | grep -o pci-0000:4d:00.0-.*
    pci-0000:4d:00.0-card -> ../card1
    pci-0000:4d:00.0-render -> ../renderD128
    
  • Allocate doorbells, contexts, GGTT and local memory for the VM (here we give all available resources to a single VF):

    sudo su
    
    CARD=/sys/class/drm/card1
    
    echo 0 > $CARD/device/sriov_drivers_autoprobe
    cat $CARD/iov/pf/gt/available/doorbells_max_quota > $CARD/iov/vf1/gt/doorbells_quota
    cat $CARD/iov/pf/gt/available/contexts_max_quota > $CARD/iov/vf1/gt/contexts_quota
    cat $CARD/iov/pf/gt/available/ggtt_max_quota > $CARD/iov/vf1/gt/ggtt_quota
    cat $CARD/iov/pf/gt/available/lmem_max_quota > $CARD/iov/vf1/gt/lmem_quota
    echo 0 > $CARD/iov/vf1/gt/exec_quantum_ms
    echo 0 > $CARD/iov/vf1/gt/preempt_timeout_us
    echo 1 > $CARD/iov/pf/device/sriov_numvfs
    
  • Bind the VF to the vfio-pci driver by running the commands below (adjust the card path as appropriate for the location of the GPU card in your system):

    sudo su
    
    CARD=/sys/class/drm/card1
    DEVICE=$(basename $(readlink -f $CARD/device/virtfn0))
    
    modprobe vfio-pci
    echo vfio-pci > /sys/bus/pci/devices/$DEVICE/driver_override
    echo $DEVICE > /sys/bus/pci/drivers_probe
    
  • Verify that "new" SR-IOV GPU device has appeared (4d:00.1) and was binded with vfio-pci driver:

    $ lspci -nnk | grep -A 3 -i 56C0
    4d:00.0 Display controller [0380]: Intel Corporation Device [8086:56c0] (rev 08)
            Subsystem: Intel Corporation Device [8086:4905]
            Kernel driver in use: i915
            Kernel modules: i915, intel_pmt
    4d:00.1 Display controller [0380]: Intel Corporation Device [8086:56c0] (rev 08)
            Subsystem: Intel Corporation Device [8086:4905]
            Kernel driver in use: vfio-pci
            Kernel modules: i915, intel_pmt
    
  • Download Ubuntu 20.04 ISO image to the host folder:

    sudo mkdir -p /opt/vmimage
    sudo chown -R $(id -u):$(id -g) /opt/vmimage
    wget -P /opt/vmimage https://releases.ubuntu.com/20.04.5/ubuntu-20.04.5-live-server-amd64.iso
    
  • Create a disk image file for your VM (set the size according to your needs; we will use 50G as an example):

    HDD_NAME="ubuntu-hdd"
    qemu-img create -f qcow2 /opt/vmimage/$HDD_NAME.qcow2 50G
    
  • Run the VM and install Ubuntu 20.04 in it (mind the SR-IOV device 4d:00.1 we set up in the previous steps):

    sudo su
    
    VM_IMAGE=/opt/vmimage/ubuntu-hdd.qcow2
    HOST_IP=$(hostname -I | cut -f1 -d ' ')
    VNC_PORT=40
    qemu-system-x86_64 -enable-kvm -drive file=$VM_IMAGE \
      -cpu host -smp cores=8 -m 64G -serial mon:stdio \
      -device vfio-pci,host=4d:00.1 \
      -net nic -net user,hostfwd=tcp::10022-:22,hostfwd=tcp::8080-:8080 \
      -vnc $HOST_IP:$VNC_PORT \
      -cdrom /opt/vmimage/ubuntu-20.04.5-live-server-amd64.iso
    

    Upon execution you should be able to connect to the VM via VNC using $HOST_IP:$VNC_PORT. Under VNC, proceed with a typical Ubuntu installation. To enable SSH access to the VM, don't forget to install openssh-server. SSH access should then be possible from the host as follows:

    ssh -p 10022 localhost
    

    Mind that we also forward port 8080, which is required for the Media Delivery demo to run.

  • Once installation is complete, shut down the VM and restart it without the installation media:

    sudo su
    
    VM_IMAGE=/opt/vmimage/ubuntu-hdd.qcow2
    HOST_IP=$(hostname -I | cut -f1 -d ' ')
    VNC_PORT=40
    qemu-system-x86_64 -enable-kvm -drive file=$VM_IMAGE \
      -cpu host -smp cores=8 -m 64G -serial mon:stdio \
      -device vfio-pci,host=4d:00.1 \
      -net nic -net user,hostfwd=tcp::10022-:22,hostfwd=tcp::8080-:8080 \
      -vnc $HOST_IP:$VNC_PORT
    

At this point you should have a running VM with the GPU attached in SR-IOV mode. You can check that the GPU is actually available by looking at the lspci output:

$ lspci -nnk | grep -A 3 -i 56C0
00:03.0 Display controller [0380]: Intel Corporation Device [8086:56c0] (rev 08)
        Subsystem: Intel Corporation Device [8086:4905]

To be able to use the GPU device you might need to install additional software following the bare metal setup instructions. For example, to set up the Intel® Data Center GPU Flex Series (products formerly Arctic Sound), refer to this guide.
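
When you are done with the SR-IOV VMs, you can remove the VFs on the host and return the GPU to normal use. A minimal sketch, assuming the same card path as above and that all VMs using the VFs are shut down:

sudo su

CARD=/sys/class/drm/card1
echo 0 > $CARD/iov/pf/device/sriov_numvfs       # destroy all VFs
echo 1 > $CARD/device/sriov_drivers_autoprobe   # restore default driver autoprobe behavior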

Tips and Tricks

  • You can validate whether you properly enabled virtualization (in the BIOS and in your operating system) by running virt-host-validate. You should see output similar to the following:

    $ sudo virt-host-validate | grep QEMU
      QEMU: Checking for hardware virtualization                                 : PASS
      QEMU: Checking if device /dev/kvm exists                                   : PASS
      QEMU: Checking if device /dev/kvm is accessible                            : PASS
      QEMU: Checking if device /dev/vhost-net exists                             : PASS
      QEMU: Checking if device /dev/net/tun exists                               : PASS
      QEMU: Checking for cgroup 'cpu' controller support                         : PASS
      QEMU: Checking for cgroup 'cpuacct' controller support                     : PASS
      QEMU: Checking for cgroup 'cpuset' controller support                      : PASS
      QEMU: Checking for cgroup 'memory' controller support                      : PASS
      QEMU: Checking for cgroup 'devices' controller support                     : PASS
      QEMU: Checking for cgroup 'blkio' controller support                       : PASS
      QEMU: Checking for device assignment IOMMU support                         : PASS
      QEMU: Checking if IOMMU is enabled by kernel                               : PASS
      QEMU: Checking for secure guest support                                    : WARN (Unknown if this platform has Secure Guest support)
    
  • If you would like to monitor the VM bootup process, or if you can't connect to the VM with VNC or SSH, a serial console might be very useful. To enable it:

    • Make sure to start the VM with the -serial mon:stdio option (we have it in the qemu-system-x86_64 command lines above)

    • Enable the serial console inside the VM by modifying the Linux kernel cmdline:

      # This will add the following options to Linux cmdline:
      #   console=ttyS0,115200n8
      #
      if ! grep "GRUB_CMDLINE_LINUX_DEFAULT" /etc/default/grub | grep -q "console=ttyS0"; then
      sudo sed -ine \
          's,^GRUB_CMDLINE_LINUX_DEFAULT="\([^"]*\)",GRUB_CMDLINE_LINUX_DEFAULT="\1 console=ttyS0\,115200n8",g' \
          /etc/default/grub
      fi
      grep GRUB_CMDLINE_LINUX_DEFAULT /etc/default/grub
      
    • Update grub and reboot the VM. You should see the bootup process followed by a serial console terminal prompt:

    sudo update-grub && sudo reboot
    
  • You might consider running the VM in headless mode without VNC:

    qemu-system-x86_64 -enable-kvm -drive file=$VM_IMAGE \
      -cpu host -smp cores=8 -m 64G -serial mon:stdio \
      -vga none -nographic \
      -net nic -net user,hostfwd=tcp::10022-:22,hostfwd=tcp::8080-:8080 \
      -device vfio-pci,host=4d:00.0
    

    In this case you may find that the network is not available. This happens because the network interface name depends on the QEMU device configuration and differs between the -vnc and headless command lines (ens2 vs. ens3 in our example). To diagnose this, verify which interface is available:

    $ ip a
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
        inet6 ::1/128 scope host
           valid_lft forever preferred_lft forever
    2: ens3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
        link/ether 52:54:00:12:34:56 brd ff:ff:ff:ff:ff:ff
        altname enp0s3
    3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
        link/ether 02:42:04:4c:f4:f1 brd ff:ff:ff:ff:ff:ff
        inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
           valid_lft forever preferred_lft forever
    

    Then make sure that this interface is actually listed in the following file and adjust it accordingly if needed. After a reboot the network should be functional. In the example below, the configuration needs to be changed from ens2 to ens3:

    $ cat /etc/netplan/00-installer-config.yaml
    # This is the network config written by 'subiquity'
    network:
      ethernets:
        ens2:
          dhcp4: true
      version: 2
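
    After editing the file you can apply the change without a full reboot. A short sketch, assuming netplan (the default on Ubuntu 20.04 server):

    # Rename ens2 to ens3 in /etc/netplan/00-installer-config.yaml, then:
    sudo netplan apply
    ip a show ens3   # should now have an IPv4 address from DHCP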