
PVE 8 GPU passthrough

Enable virtualization in BIOS

The BIOS option is usually named Virtualization Technology.

Enable IOMMU in BIOS

The BIOS option is usually named Intel VT-d or AMD IOMMU.

IOMMU (Input-Output Memory Management Unit) is an MMU component that connects a DMA-capable I/O bus to system memory. It maps device-visible virtual addresses to physical addresses, making it useful in virtualization.
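As a quick pre-check that the kernel sees an IOMMU at all (the detailed verification comes after enabling it below; message wording varies by vendor and kernel version):

dmesg | grep -i -e DMAR -e IOMMU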

Install PVE 8

I have installed PVE 8.1.4, kernel version 6.5.13-1-pve.

Add kernel command line parameters to Grub to enable IOMMU

Run:

vi /etc/default/grub

Edit:

# Intel CPU
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"

# AMD CPU
GRUB_CMDLINE_LINUX_DEFAULT="quiet iommu=pt"
  • intel_iommu=on turns on IOMMU for Intel CPUs.
    Posts on the web suggest adding amd_iommu=on for AMD CPUs, but amd_iommu=on is not a valid value for that parameter and IOMMU is enabled by default for AMD CPUs.
  • iommu=pt puts the IOMMU in passthrough mode: DMA translation is applied only to devices configured for passthrough, letting the host skip it for host-only devices.

Update Grub config.
Run:

update-grub
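After update-grub and the reboot later in this guide, you can confirm the parameters took effect by inspecting the running kernel's command line:

cat /proc/cmdline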

Find vfio kernel modules

VFIO (Virtual Function I/O) implements PCI passthrough by isolating a PCI device from the rest of the host and putting it under the control of a user process.

Run:

find /lib/modules -name 'vfio*.ko'

Result:

/lib/modules/6.5.13-1-pve/kernel/drivers/vfio/pci/vfio-pci.ko
/lib/modules/6.5.13-1-pve/kernel/drivers/vfio/pci/vfio-pci-core.ko
/lib/modules/6.5.13-1-pve/kernel/drivers/vfio/vfio.ko
/lib/modules/6.5.13-1-pve/kernel/drivers/vfio/vfio_iommu_type1.ko
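Optionally, you can inspect one of these modules' metadata with modinfo, e.g.:

modinfo vfio_pci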

Configure to load vfio kernel modules

Run:

echo vfio >> /etc/modules
echo vfio_iommu_type1 >> /etc/modules
echo vfio_pci >> /etc/modules
  • Posts on the web for older PVE versions suggest running echo vfio_virqfd >> /etc/modules. This is not needed in PVE 8: as the find output above shows, the kernel no longer ships a separate vfio_virqfd module.
  • Posts on the web for older PVE versions suggest setting options vfio-pci ids=_PCI_ID_ in /etc/modprobe.d/vfio.conf. This is not needed in PVE 8: once you allocate a PCI device to a VM via the PVE server's web UI, the vfio-pci configuration is done for you automatically.
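After the reboot in the next step, you can verify that the modules were loaded:

lsmod | grep vfio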

Reboot for the IOMMU settings to take effect

Run:

reboot

Check if IOMMU is enabled

Run:

dmesg | grep 'remapping'

Result for enabled:

# Intel CPU
DMAR-IR: Enabled IRQ remapping in x2apic mode

# AMD CPU
AMD-Vi: Interrupt remapping enabled

Result for disabled:

x2apic: IRQ remapping doesn't support X2APIC mode

List IOMMU device files

Run:

find /sys/kernel/iommu_groups/ -type l

Result (example):

/sys/kernel/iommu_groups/7/devices/0000:00:16.0
/sys/kernel/iommu_groups/15/devices/0000:03:00.0
/sys/kernel/iommu_groups/5/devices/0000:00:14.2
/sys/kernel/iommu_groups/5/devices/0000:00:14.0
/sys/kernel/iommu_groups/13/devices/0000:01:00.0
/sys/kernel/iommu_groups/3/devices/0000:00:08.0
/sys/kernel/iommu_groups/11/devices/0000:00:1d.1
/sys/kernel/iommu_groups/1/devices/0000:00:00.0
/sys/kernel/iommu_groups/8/devices/0000:00:17.0
/sys/kernel/iommu_groups/6/devices/0000:00:15.0
/sys/kernel/iommu_groups/14/devices/0000:02:00.0
/sys/kernel/iommu_groups/4/devices/0000:00:12.0
/sys/kernel/iommu_groups/12/devices/0000:00:1f.0
/sys/kernel/iommu_groups/12/devices/0000:00:1f.5
/sys/kernel/iommu_groups/12/devices/0000:00:1f.3
/sys/kernel/iommu_groups/12/devices/0000:00:1f.4
/sys/kernel/iommu_groups/2/devices/0000:00:04.0
/sys/kernel/iommu_groups/10/devices/0000:00:1d.0
/sys/kernel/iommu_groups/0/devices/0000:00:02.0
/sys/kernel/iommu_groups/9/devices/0000:00:1c.0

List IOMMU device information

Run:

shopt -s nullglob
for g in $(find /sys/kernel/iommu_groups/* -maxdepth 0 -type d | sort -V); do
    echo "IOMMU Group ${g##*/}:"
    for d in $g/devices/*; do
        echo -e "\t$(lspci -nns ${d##*/})"
    done;
done;

Result (example):

IOMMU Group 0:
        00:00.0 Host bridge [0600]: Intel Corporation 8th/9th Gen Core 8-core Desktop Processor Host Bridge/DRAM Registers [Coffee Lake S] [8086:3e30] (rev 0a)
IOMMU Group 1:
        00:01.0 PCI bridge [0604]: Intel Corporation 6th-10th Gen Core Processor PCIe Controller (x16) [8086:1901] (rev 0a)
        00:01.1 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x8) [8086:1905] (rev 0a)
...

List the PCI devices of the graphics card

Run one of the commands:

lspci -k | grep -A10 -i vga

lspci -k | grep -A10 -i amd

lspci -k | grep -A10 -i nvidia

Result for AMD Radeon RX 580:

01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] (rev e7)
        Subsystem: ASUSTeK Computer Inc. Ellesmere [Radeon RX 470/480/570/570X/580/580X/590]
        Kernel driver in use: amdgpu
        Kernel modules: amdgpu
01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590]
        Subsystem: ASUSTeK Computer Inc. Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590]
        Kernel driver in use: snd_hda_intel
        Kernel modules: snd_hda_intel

Result for NVIDIA GeForce GTX 970:

01:00.0 VGA compatible controller: NVIDIA Corporation GM204 [GeForce GTX 970] (rev a1)
        Subsystem: ASUSTeK Computer Inc. GM204 [GeForce GTX 970]
        Kernel driver in use: nouveau
        Kernel modules: nvidiafb, nouveau
01:00.1 Audio device: NVIDIA Corporation GM204 High Definition Audio Controller (rev a1)
        Subsystem: ASUSTeK Computer Inc. GM204 High Definition Audio Controller
        Kernel driver in use: snd_hda_intel
        Kernel modules: snd_hda_intel

A VGA compatible controller is simply the video device of the graphics card.

Besides the video device, a graphics card may also have an audio device for its audio capability (e.g. HDMI audio).
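To list every function of the graphics card in one command (a sketch assuming the card sits at bus 01:00, as in the outputs above):

lspci -nn -s 01:00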

Disable kernel modules for the graphics card

The key to successful passthrough is to disable all kernel modules (i.e. drivers) for the PCI devices of the graphics card on the host OS before allocating these PCI devices to a VM.

Make sure you are able to SSH to the PVE server from another host: after all kernel modules for the graphics card are disabled, the PVE server will show a black screen if it has only one graphics card.

Run:

# AMD graphics card
cat <<'ZZZ' > /etc/modprobe.d/pve-blacklist.conf
blacklist amdgpu
blacklist snd_hda_intel
ZZZ

# NVIDIA graphics card
cat <<'ZZZ' > /etc/modprobe.d/pve-blacklist.conf
blacklist nouveau
blacklist nvidiafb
blacklist nvidia
blacklist snd_hda_intel
ZZZ
  • Each line needs the blacklist keyword; modprobe ignores bare module names in this file.
  • Note that blacklisting snd_hda_intel also disables the host's onboard audio.
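Module blacklists and /etc/modules entries may be baked into the initramfs. To make sure the changes are picked up at boot, you can refresh it (a standard Debian step; it may be unnecessary if the modules are not in the initramfs):

update-initramfs -u -k all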

Reboot for the kernel module blacklist to take effect

Run:

reboot
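After the reboot, you can confirm that no driver is bound to the graphics card. A sketch assuming the video device is at 01:00.0; the Kernel driver in use line should be gone (it will show vfio-pci later, once the device is allocated to a running VM):

lspci -k -s 01:00.0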

Export the VBIOS rom of the graphics card

Run:

echo 1 > /sys/bus/pci/devices/_VGA_DEVICE_PCI_ID_/rom

cat /sys/bus/pci/devices/_VGA_DEVICE_PCI_ID_/rom > vga.rom

echo 0 > /sys/bus/pci/devices/_VGA_DEVICE_PCI_ID_/rom
  • _VGA_DEVICE_PCI_ID_ can be found via the lspci command in the "List the PCI devices of the graphics card" section above.
    The value is in a format like 0000:01:00.0 (note the 0000: domain prefix, which lspci omits by default).

If you get this error:

cat: '/sys/bus/pci/devices/_VGA_DEVICE_PCI_ID_/rom': Input/output error

it means exporting this way is not supported for your card. Use GPU-Z to export instead, as described in the next section.

I hit this error and had success with GPU-Z.

Export the VBIOS rom of the graphics card using GPU-Z

Boot up a Windows PE via USB.

Run GPU-Z (downloadable from the TechPowerUp website).

Click the arrow icon in the BIOS Version line.

Click Save to file....

Save the rom file as vga.rom.

Install rom-parser to verify the VBIOS rom

Run:

wget -O rom-parser-master.zip https://codeload.github.com/awilliam/rom-parser/zip/refs/heads/master

unzip rom-parser-master.zip

cd rom-parser-master

make

./rom-parser vga.rom

Result for AMD Radeon RX 580:

Valid ROM signature found @0h, PCIR offset 250h
        PCIR: type 0 (x86 PC-AT), vendor: 1002, device: 67df, class: 030000
        PCIR: revision 0, vendor revision: f32
Valid ROM signature found @e800h, PCIR offset 1ch
        PCIR: type 3 (EFI), vendor: 1002, device: 67df, class: 030000
        PCIR: revision 0, vendor revision: 0
                EFI: Signature Valid, Subsystem: Boot, Machine: X64
        Last image

Result for NVIDIA GeForce GTX 970:

Valid ROM signature found @800h, PCIR offset 1a0h
        PCIR: type 0 (x86 PC-AT), vendor: 10de, device: 13c2, class: 030000
        PCIR: revision 0, vendor revision: 1
Valid ROM signature found @fe00h, PCIR offset 1ch
        PCIR: type 3 (EFI), vendor: 10de, device: 13c2, class: 030000
        PCIR: revision 3, vendor revision: 0
                EFI: Signature Valid, Subsystem: Boot, Machine: X64
        Last image

Put the VBIOS rom to the PVE server

Run:

mv vga.rom /usr/share/kvm/vga.rom

Upload an OS installation image to the PVE server

Open the PVE server's web UI.
The default port is 8006.
The login password is the root password set during PVE installation.

In the left panel, click Datacenter -- _NODE_ID_ -- local (_NODE_ID_).

In the left sub panel, click ISO Images.

Click Upload.

Click Select File to select the ISO image on local disk.

Click Upload to upload it to the PVE server.
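Alternatively, the ISO can be copied to the node directly. For the default local storage, ISO images live under /var/lib/vz/template/iso/ (a sketch; _ISO_NAME_ and _SERVER_IP_ are placeholders):

scp _ISO_NAME_.iso root@_SERVER_IP_:/var/lib/vz/template/iso/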

Create a VM

Open the PVE server's web UI.

Click Create VM on the top bar.

Set the options like this:

General:
  Node: _NODE_ID_
  VM ID: _VM_ID_
  Name: _VM_NAME_
  Resource Pool:

OS:
  Use CD/DVD disk image file (iso): (check)
    Storage: local
    ISO image: (select)
  Guest OS:
    Type: Linux
    Version: 6.x - 2.6 Kernel

System:
  Graphics card: Default
  Machine: q35
  BIOS: OVMF (UEFI)
  Add EFI Disk: (check)
  EFI Storage: local-lvm
  Pre-Enroll keys: (check)
  SCSI Controller: VirtIO SCSI single
  QEMU Agent: (uncheck)
  Add TPM: (uncheck)

Disk:
  scsi0:
    Disk:
      Bus/Device: SCSI, 0
      Storage: local-lvm
      Disk size (GiB): 15
      Cache: Default (No cache)
      Discard: (uncheck)
      IO thread: (check)

CPU:
  Sockets: _REAL_CPU_SOCKETS_COUNT_ (usually 1)
  Cores: _REAL_CPU_CORES_COUNT_
  Type: host

Memory (MiB): 4096
  Advanced (near the "Back" button): (check)
  Ballooning Device: (uncheck)

Network:
  No network device: (uncheck)
  Bridge: vmbr0
  VLAN tag: no VLAN
  Firewall: (check)
  Model: VirtIO (paravirtualized)
  MAC address: (auto)

Click Finish.
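Roughly the same VM can be created from the PVE host shell with qm. A hedged sketch, not an exact equivalent of the web UI settings above (same placeholders; adjust cores and storage names to your host):

qm create _VM_ID_ --name _VM_NAME_ --ostype l26 \
  --machine q35 --bios ovmf --efidisk0 local-lvm:1,efitype=4m,pre-enrolled-keys=1 \
  --scsihw virtio-scsi-single --scsi0 local-lvm:15,iothread=1 \
  --ide2 local:iso/_ISO_NAME_.iso,media=cdrom \
  --sockets 1 --cores _REAL_CPU_CORES_COUNT_ --cpu host \
  --memory 4096 --balloon 0 \
  --net0 virtio,bridge=vmbr0,firewall=1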

Allocate the graphics card to the VM

Open the PVE server's web UI.

In the left panel, click Datacenter -- _NODE_ID_ -- _VM_ID_.

In the left sub panel, click Hardware.

In the right panel, Click Add. Click PCI Device.

Set the options like this:

Advanced (near the "Add" button): (check)
Raw Device:
  Device: (select the graphics card's PCI video device)
  All Functions: (check)
Primary GPU: (uncheck)
ROM-Bar: (check)
PCI-Express: (check)
Vendor ID: (From Device)
Device ID: (From Device)
Sub-Vendor ID: (From Device)
Sub-Device ID: (From Device)

Click Add.
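The same allocation can be done from the host shell (a sketch assuming the card is at 0000:01:00; omitting the function suffix passes all functions, like All Functions in the UI):

qm set _VM_ID_ --hostpci0 0000:01:00,pcie=1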

Edit the VM's config file to load the VBIOS rom

Run:

vi /etc/pve/qemu-server/_VM_ID_.conf

Edit:

hostpci0: _PCI_ID_,pcie=1,romfile=vga.rom
  • _PCI_ID_ is in a format like 0000:01:00, as written by the web UI. Do not change it.
  • The value of romfile must be a path relative to /usr/share/kvm/. An absolute path does not work.
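For reference, a minimal excerpt of the relevant lines in _VM_ID_.conf after this edit may look like this (values are illustrative):

bios: ovmf
machine: q35
hostpci0: 0000:01:00,pcie=1,romfile=vga.rom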

After you edit the config file, the PCI Device line on the VM's Hardware page in the web UI immediately shows the updated value.

Boot up the VM

Open the PVE server's web UI.

In the left panel, click Datacenter -- _NODE_ID_ -- _VM_ID_.

In the left sub panel, click Console.

In the right panel, click Start Now.
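Alternatively, start the VM from the host shell:

qm start _VM_ID_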

Check if the graphics card has been identified in the VM

In the VM, run:

lspci -k | grep -A10 -i vga
  • If a Kernel driver in use line is present, a kernel module (i.e. driver) has been loaded for the graphics card. Note that the device with Kernel driver in use: bochs-drm is the virtual graphics card, not the passed-through one.

Check if the VBIOS rom has been correctly loaded in the VM

In the VM, run:

dmesg | grep -i 'Invalid PCI ROM header signature'

If you see this error:

Invalid PCI ROM header signature: expecting 0xaa55, got 0xffff

it means the VBIOS rom has not been correctly loaded.

In my case, the VBIOS rom is correctly loaded for the AMD Radeon RX 580 but not for the NVIDIA GeForce GTX 970. However, hardware acceleration somehow still works for the GTX 970.

Check for graphics card driver error messages in the VM

In the VM, run:

dmesg | grep -i amd

dmesg | grep -i nvidia

Result for AMD Radeon RX 580:

kfd kfd: amdgpu: skipped device 1002:67df, PCI rejects atomics 730<0
  • This does not appear to be a fatal error, as hardware acceleration still works.