PVE 8 GPU passthrough
Enable virtualization in BIOS
The BIOS option is usually named Virtualization Technology.
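As an optional check, any running Linux system can report whether the CPU advertises the virtualization extensions (vmx for Intel, svm for AMD). A non-zero count means the CPU supports them, though the BIOS option still has to be enabled:
egrep -c '(vmx|svm)' /proc/cpuinfo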
Enable IOMMU in BIOS
The BIOS option is usually named Intel VT-d or AMD IOMMU.
IOMMU (Input-Output Memory Management Unit) is an MMU component that connects a DMA-capable I/O bus to system memory. It maps device-visible virtual addresses to physical addresses, making it useful in virtualization.
Install PVE 8
I have installed PVE 8.1.4, kernel version 6.5.13-1-pve.
Add kernel command line parameters to Grub to enable IOMMU
Run:
vi /etc/default/grub
Edit:
# Intel CPU
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"
# AMD CPU
GRUB_CMDLINE_LINUX_DEFAULT="quiet iommu=pt"
- intel_iommu=on turns on IOMMU for an Intel CPU.
- Posts on the web suggest adding amd_iommu=on for an AMD CPU, but this post says amd_iommu=on is invalid and IOMMU is enabled by default for an AMD CPU.
- iommu=pt turns on IOMMU tagging only for devices configured for passthrough, allowing the host to ignore it for local host-only devices.
Update Grub config
Run:
update-grub
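Note that the above applies to hosts booting via Grub. If PVE boots via systemd-boot (typically ZFS-on-root installs), the kernel parameters usually live in /etc/kernel/cmdline instead and are applied with proxmox-boot-tool; roughly:
vi /etc/kernel/cmdline
proxmox-boot-tool refresh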
Find vfio kernel modules
VFIO (Virtual Function I/O) implements PCI passthrough by isolating a PCI device from the rest of the host and putting it under the control of a user process.
Run:
find /lib/modules -name 'vfio*.ko'
Result:
/lib/modules/6.5.13-1-pve/kernel/drivers/vfio/pci/vfio-pci.ko
/lib/modules/6.5.13-1-pve/kernel/drivers/vfio/pci/vfio-pci-core.ko
/lib/modules/6.5.13-1-pve/kernel/drivers/vfio/vfio.ko
/lib/modules/6.5.13-1-pve/kernel/drivers/vfio/vfio_iommu_type1.ko
Configure to load vfio kernel modules
Run:
echo vfio >> /etc/modules
echo vfio_iommu_type1 >> /etc/modules
echo vfio_pci >> /etc/modules
- Posts on the web for older PVE versions suggest running echo vfio_virqfd >> /etc/modules. This is not needed in PVE 8, as explained in this post.
- Posts on the web for older PVE versions suggest setting options vfio-pci ids=_PCI_ID_ in /etc/modprobe.d/vfio.conf. This is not needed in PVE 8. Once you allocate a PCI device to a VM via the PVE server's web UI, the vfio-pci configuration is done for you automatically.
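After the reboot in the next step, you can confirm that the modules were actually loaded; the output should list vfio, vfio_iommu_type1 and vfio_pci:
lsmod | grep vfio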
Reboot for IOMMU to take effect
Run:
reboot
Check if IOMMU is enabled
Run:
dmesg | grep 'remapping'
Result for enabled:
# Intel CPU
DMAR-IR: Enabled IRQ remapping in x2apic mode
# AMD CPU
AMD-Vi: Interrupt remapping enabled
Result for disabled:
x2apic: IRQ remapping doesn't support X2APIC mode
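You can also confirm that the parameters added to Grub actually reached the kernel:
cat /proc/cmdline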
List IOMMU device files
Run:
find /sys/kernel/iommu_groups/ -type l
Result (example):
/sys/kernel/iommu_groups/7/devices/0000:00:16.0
/sys/kernel/iommu_groups/15/devices/0000:03:00.0
/sys/kernel/iommu_groups/5/devices/0000:00:14.2
/sys/kernel/iommu_groups/5/devices/0000:00:14.0
/sys/kernel/iommu_groups/13/devices/0000:01:00.0
/sys/kernel/iommu_groups/3/devices/0000:00:08.0
/sys/kernel/iommu_groups/11/devices/0000:00:1d.1
/sys/kernel/iommu_groups/1/devices/0000:00:00.0
/sys/kernel/iommu_groups/8/devices/0000:00:17.0
/sys/kernel/iommu_groups/6/devices/0000:00:15.0
/sys/kernel/iommu_groups/14/devices/0000:02:00.0
/sys/kernel/iommu_groups/4/devices/0000:00:12.0
/sys/kernel/iommu_groups/12/devices/0000:00:1f.0
/sys/kernel/iommu_groups/12/devices/0000:00:1f.5
/sys/kernel/iommu_groups/12/devices/0000:00:1f.3
/sys/kernel/iommu_groups/12/devices/0000:00:1f.4
/sys/kernel/iommu_groups/2/devices/0000:00:04.0
/sys/kernel/iommu_groups/10/devices/0000:00:1d.0
/sys/kernel/iommu_groups/0/devices/0000:00:02.0
/sys/kernel/iommu_groups/9/devices/0000:00:1c.0
List IOMMU device information
Run:
shopt -s nullglob
for g in $(find /sys/kernel/iommu_groups/* -maxdepth 0 -type d | sort -V); do
    echo "IOMMU Group ${g##*/}:"
    for d in $g/devices/*; do
        echo -e "\t$(lspci -nns ${d##*/})"
    done
done
Result (example):
IOMMU Group 0:
00:00.0 Host bridge [0600]: Intel Corporation 8th/9th Gen Core 8-core Desktop Processor Host Bridge/DRAM Registers [Coffee Lake S] [8086:3e30] (rev 0a)
IOMMU Group 1:
00:01.0 PCI bridge [0604]: Intel Corporation 6th-10th Gen Core Processor PCIe Controller (x16) [8086:1901] (rev 0a)
00:01.1 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x8) [8086:1905] (rev 0a)
...
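For clean passthrough, the graphics card's video and audio functions should ideally sit in an IOMMU group of their own, not shared with unrelated devices. Assuming the card is at PCI address 01:00 (as in the examples below), you can show just its group entries with:
find /sys/kernel/iommu_groups/ -type l | grep '01:00'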
List the PCI devices of the graphics card
Run one of the commands:
lspci -k | grep -A10 -i vga
lspci -k | grep -A10 -i amd
lspci -k | grep -A10 -i nvidia
Result for AMD Radeon RX 580:
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] (rev e7)
Subsystem: ASUSTeK Computer Inc. Ellesmere [Radeon RX 470/480/570/570X/580/580X/590]
Kernel driver in use: amdgpu
Kernel modules: amdgpu
01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590]
Subsystem: ASUSTeK Computer Inc. Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590]
Kernel driver in use: snd_hda_intel
Kernel modules: snd_hda_intel
Result for NVIDIA GeForce GTX 970:
01:00.0 VGA compatible controller: NVIDIA Corporation GM204 [GeForce GTX 970] (rev a1)
Subsystem: ASUSTeK Computer Inc. GM204 [GeForce GTX 970]
Kernel driver in use: nouveau
Kernel modules: nvidiafb, nouveau
01:00.1 Audio device: NVIDIA Corporation GM204 High Definition Audio Controller (rev a1)
Subsystem: ASUSTeK Computer Inc. GM204 High Definition Audio Controller
Kernel driver in use: snd_hda_intel
Kernel modules: snd_hda_intel
A VGA compatible controller is simply the video device of the graphics card.
Besides the video device, a graphics card may also have an audio device for its audio capability (e.g. HDMI audio).
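If you also want the numeric vendor and device IDs (handy for cross-checking the rom-parser output later), lspci can print them; 01:00 is the card's PCI address from the output above:
lspci -nn -s 01:00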
Disable kernel modules for the graphics card
The key to successful passthrough is to disable all kernel modules (i.e. drivers) for the PCI devices of the graphics card on the host OS before allocating these PCI devices to a VM.
Make sure you can ssh to the PVE server from another host: after the kernel modules for the graphics card are disabled, the PVE server will show only a black screen if it has just one graphics card.
Run:
# AMD graphics card
cat <<'ZZZ' > /etc/modprobe.d/pve-blacklist.conf
blacklist amdgpu
blacklist snd_hda_intel
ZZZ
# NVIDIA graphics card
cat <<'ZZZ' > /etc/modprobe.d/pve-blacklist.conf
blacklist nouveau
blacklist nvidiafb
blacklist nvidia
blacklist snd_hda_intel
ZZZ
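If the driver being blacklisted is also packed into the initramfs, the blacklist may not apply at early boot. Regenerating the initramfs before rebooting is a commonly recommended (and harmless) extra step:
update-initramfs -u -k all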
Reboot for the disabled kernel modules to take effect
Run:
reboot
Export the VBIOS rom of the graphics card
Run:
echo 1 > /sys/bus/pci/devices/_VGA_DEVICE_PCI_ID_/rom
cat /sys/bus/pci/devices/_VGA_DEVICE_PCI_ID_/rom > vga.rom
echo 0 > /sys/bus/pci/devices/_VGA_DEVICE_PCI_ID_/rom
- _VGA_DEVICE_PCI_ID_ can be found via the command shown in this section. The value is in a format like 0000:01:00.0.
- If you meet this error:
cat: '/sys/bus/pci/devices/_VGA_DEVICE_PCI_ID_/rom': Input/output error
it means exporting this way is somehow not supported. Use GPU-Z to export instead. See this section.
- I met this error and have had success with GPU-Z.
Export the VBIOS rom of the graphics card using GPU-Z
Boot up a Windows PE via USB.
Run GPU-Z (downloaded from here).
Click the arrow icon in the BIOS Version line.
Click Save to file...
Save the rom file as vga.rom.
Install rom-parser to verify the VBIOS rom
Run:
wget -O rom-parser-master.zip https://codeload.github.com/awilliam/rom-parser/zip/refs/heads/master
unzip rom-parser-master.zip
cd rom-parser-master
make
./rom-parser vga.rom
Result for AMD Radeon RX 580:
Valid ROM signature found @0h, PCIR offset 250h
PCIR: type 0 (x86 PC-AT), vendor: 1002, device: 67df, class: 030000
PCIR: revision 0, vendor revision: f32
Valid ROM signature found @e800h, PCIR offset 1ch
PCIR: type 3 (EFI), vendor: 1002, device: 67df, class: 030000
PCIR: revision 0, vendor revision: 0
EFI: Signature Valid, Subsystem: Boot, Machine: X64
Last image
Result for NVIDIA GeForce GTX 970:
Valid ROM signature found @800h, PCIR offset 1a0h
PCIR: type 0 (x86 PC-AT), vendor: 10de, device: 13c2, class: 030000
PCIR: revision 0, vendor revision: 1
Valid ROM signature found @fe00h, PCIR offset 1ch
PCIR: type 3 (EFI), vendor: 10de, device: 13c2, class: 030000
PCIR: revision 3, vendor revision: 0
EFI: Signature Valid, Subsystem: Boot, Machine: X64
Last image
Put the VBIOS rom on the PVE server
Run:
mv vga.rom /usr/share/kvm/vga.rom
Upload an OS installation image to the PVE server
Open the PVE server's web UI.
The default port is 8006.
The login password is the root password set during PVE installation.
In the left panel, click Datacenter -- _NODE_ID_ -- local (_NODE_ID_).
In the left sub panel, click ISO Images.
Click Upload.
Click Select File to select the ISO image on local disk.
Click Upload to upload it to the PVE server.
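Alternatively, the ISO can be copied to the PVE server from the command line; with the stock storage layout, the local storage keeps ISO images under /var/lib/vz/template/iso/ (the _ISO_FILE_ and _PVE_HOST_ placeholders below are yours to fill in):
scp _ISO_FILE_.iso root@_PVE_HOST_:/var/lib/vz/template/iso/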
Create a VM
Open the PVE server's web UI.
Click Create VM on the top bar.
Set the options like this:
General:
Node: _NODE_ID_
VM ID: _VM_ID_
Name: _VM_NAME_
Resource Pool:
OS:
Use CD/DVD disk image file (iso): (check)
Storage: local
ISO image: (select)
Guest OS:
Type: Linux
Version: 6.x - 2.6 Kernel
System:
Graphics card: Default
Machine: q35
BIOS: OVMF (UEFI)
Add EFI Disk: (check)
EFI Storage: local-lvm
Pre-Enroll keys: (check)
SCSI Controller: VirtIO SCSI single
Qemu Agent: (uncheck)
Add TPM: (uncheck)
Disk:
scsi0:
Disk:
Bus/Device: SCSI, 0
Storage: local-lvm
Disk size (GiB): 15
Cache: Default (No cache)
Discard: (uncheck)
IO thread: (check)
CPU:
Sockets: _REAL_CPU_SOCKETS_COUNT_ (usually 1)
Cores: _REAL_CPU_CORES_COUNT_
Type: host
Memory (MiB): 4096
Advanced (near the "Back" button): (checked)
Ballooning Device: (uncheck)
Network:
No network device: (uncheck)
Bridge: vmbr0
VLAN tag: no VLAN
Firewall: (check)
Model: VirtIO (paravirtualized)
MAC address: (auto)
Click Finish.
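For reference, roughly the same VM can be created from the PVE shell with qm. This is only a sketch of the web UI settings above, with placeholder values to adjust for your hardware and ISO file name:
qm create _VM_ID_ --name _VM_NAME_ --ostype l26 \
  --machine q35 --bios ovmf --efidisk0 local-lvm:1,pre-enrolled-keys=1 \
  --scsihw virtio-scsi-single --scsi0 local-lvm:15,iothread=1 \
  --sockets 1 --cores 4 --cpu host --memory 4096 --balloon 0 \
  --net0 virtio,bridge=vmbr0,firewall=1 \
  --ide2 local:iso/_ISO_FILE_.iso,media=cdrom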
Allocate the graphics card to the VM
Open the PVE server's web UI.
In the left panel, click Datacenter -- _NODE_ID_ -- _VM_ID_.
In the left sub panel, click Hardware.
In the right panel, click Add. Click PCI Device.
Set the options like this:
Advanced (near the "Add" button): (checked)
Raw Device:
Device: (select the graphics card's PCI video device)
All Functions: (check)
Primary GPU: (uncheck)
ROM-Bar: (check)
PCI-Express: (check)
Vendor ID: (From Device)
Device ID: (From Device)
Sub-Vendor ID: (From Device)
Sub-Device ID: (From Device)
Click Add.
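The same PCI device can also be attached from the PVE shell; 01:00 is the card's PCI address without the function number, so all functions (video and audio) are passed through, matching the All Functions checkbox:
qm set _VM_ID_ --hostpci0 01:00,pcie=1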
Edit the VM's config file to load the VBIOS rom
Run:
vi /etc/pve/qemu-server/_VM_ID_.conf
Edit:
hostpci0: _PCI_ID_,pcie=1,romfile=vga.rom
- _PCI_ID_ is in a format like 0000:01:00. Do not change it.
- The value for romfile must be a path relative to /usr/share/kvm/. An absolute path does not work.
After editing the config file, the PCI Device line on the VM's Hardware page in the PVE server's web UI immediately shows the updated value.
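The same change can be made (and verified) with qm instead of editing the file directly:
qm set _VM_ID_ --hostpci0 _PCI_ID_,pcie=1,romfile=vga.rom
qm config _VM_ID_ | grep hostpci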
Boot up the VM
Open the PVE server's web UI.
In the left panel, click Datacenter -- _NODE_ID_ -- _VM_ID_.
In the left sub panel, click Console.
In the right panel, click Start Now.
Check if the graphics card has been identified in the VM
In the VM, run:
lspci -k | grep -A10 -i vga
- If a Kernel driver in use line is present, a kernel module (i.e. driver) has been loaded for the graphics card. Note that the device with Kernel driver in use: bochs-drm is the virtual graphics card, not the passed-through one.
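Another quick check, assuming a DRM driver such as amdgpu or nouveau has bound to the passed-through card, is to list the DRM device nodes; the passed-through GPU typically shows up as an extra card*/renderD* entry alongside the virtual display device:
ls -l /dev/dri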
Check if the VBIOS rom has been correctly loaded in the VM
In the VM, run:
dmesg | grep -i 'Invalid PCI ROM header signature'
If you meet this error:
Invalid PCI ROM header signature: expecting 0xaa55, got 0xffff
it means the VBIOS rom has not been correctly loaded.
In my case, the VBIOS rom is correctly loaded for AMD Radeon RX 580, but not for NVIDIA GeForce GTX 970. However, the hardware acceleration still works for NVIDIA GeForce GTX 970 somehow.
Check for graphics card driver error messages in the VM
In the VM, run:
dmesg | grep -i amd
dmesg | grep -i nvidia
Result for AMD Radeon RX 580:
kfd kfd: amdgpu: skipped device 1002:67df, PCI rejects atomics 730<0
- This does not seem to be a fatal error, as hardware acceleration still works.
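To double-check hardware acceleration from inside a graphical session in the VM, one common option (it requires the mesa-utils package) is to ask OpenGL which renderer is in use; it should name the passed-through GPU rather than a software renderer like llvmpipe:
glxinfo -B | grep 'OpenGL renderer'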