GPU Passthrough on Proxmox for Local AI: IOMMU, VFIO, and the Gotchas

This is the point where many homelab projects meet: a Proxmox host, a GPU sitting idle, and Ollama or another inference engine running inside a VM, waiting for a graphics card that the hypervisor still owns. Getting the GPU across the hypervisor boundary and into the VM where your LLM can see it is straightforward on paper and genuinely painful in practice. The kernel machinery is not broken; the pain comes from motherboard firmware design, BIOS defaults, and IOMMU grouping, which vary wildly across consumer boards, and from failure messages that do not tell you why.

This guide walks you through the actual setup, names the real failure modes before you hit them, and tells you honestly when a simpler architecture (bare metal, or a different VM strategy) is the better answer than persisting with passthrough.

When to passthrough, when to give up

Before you start, ask: does the GPU actually need to be virtualized?

Use passthrough (GPU through Proxmox into a VM) when:

  • Your Proxmox host runs other workloads that need the flexibility of VMs (web services, storage, dev containers).
  • You want to upgrade or replace the GPU without rebuilding the host OS.
  • You are learning Proxmox or plan to run multiple inference VMs and want to dynamically allocate GPUs.
  • You have multiple GPUs and want to assign them to different VMs.

Bare metal is simpler when:

  • This is a dedicated inference box and you have no other workloads on the host.
  • You want the lowest latency and no VM overhead (single-digit percentage, but worth measuring for your model size and batch size).
  • IOMMU grouping on your motherboard is a mess (see “Checking your motherboard” below).
  • You have experienced passthrough failures and are out of patience.

For a single RTX 3090 running Ollama on a homelab, a bare metal install of Debian + Ollama honestly ships faster than debugging IOMMU and VFIO. If you have already committed to Proxmox for other reasons, push through the steps below. If Proxmox is a hypothetical, a single-purpose inference box might not need it.

The architecture: IOMMU, VFIO, and Proxmox

Three things have to happen for GPU passthrough to work:

  1. IOMMU (Intel VT-d or AMD IOMMU) is enabled in the motherboard BIOS. The IOMMU remaps and isolates device DMA, and the kernel sorts devices into IOMMU groups, so each device (or group of devices) can be assigned to a VM without that VM reaching memory or devices that do not belong to it.

  2. VFIO (Virtual Function I/O) kernel driver claims the GPU. Instead of the normal NVIDIA driver running on the host, the VFIO module binds to the GPU and holds it inert, ready to hand off to a VM.

  3. Proxmox (via QEMU/KVM underneath it) creates a VM with PCI device passthrough configured. The VM boots, loads the normal NVIDIA driver, and sees the GPU as if it were installed on bare metal.

The catch: IOMMU groups are determined by your motherboard’s PCIe topology and BIOS settings. Some boards put every device in its own group (ideal). Many consumer boards group the GPU with other devices (a storage controller, a network card, a USB hub) because of firmware design choices or disabled ACS. When that happens, you cannot pass through the GPU alone; you have to pass the entire group, which only works if the host can live without those other devices.

Checking your motherboard and BIOS settings

Before you touch the Proxmox host, verify that:

  1. Your motherboard supports IOMMU. Check the manual for “VT-d” (Intel) or “IOMMU” (AMD) in the feature list. Most X570, B550, Z790, and newer boards support it; some B450 and Z690 boards do too, but you have to check.

  2. BIOS has IOMMU/VT-d enabled. Power off the host, enter BIOS, and look for settings like:

    • Intel: “Intel VT-d” or “VT for Direct I/O” (usually under Advanced → System Agent Configuration or similar).
    • AMD: “IOMMU” or “AMD-Vi” (usually under Advanced → Chipset Configuration or CPU Features).
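    • Enable it. If the BIOS also exposes an “ACS” (Access Control Services) option, check the manual before changing it: it affects how devices are split into IOMMU groups, and many boards do not implement it correctly.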
  3. Your GPU is in a clean IOMMU group. After you boot Proxmox with IOMMU enabled (see next section), run:

    dmesg | grep -i iommu
    

    Look for lines that report the IOMMU being enabled (prefixed DMAR on Intel, AMD-Vi on AMD) and devices being added to IOMMU groups. Then run:

    for d in /sys/kernel/iommu_groups/*/devices/*; do g=${d#/sys/kernel/iommu_groups/}; echo "IOMMU Group ${g%%/*}: $(lspci -nns "${d##*/}")"; done | sort -V

    This lists every PCI device by IOMMU group. Find the group that contains your NVIDIA GPU and check what else shares it; ideally that is only the GPU and its own audio function.

If the GPU shares a group with a storage controller or USB device, your options narrow:

  • Pass the entire group (if you do not use the other devices).
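  • Look for BIOS options to separate devices (some boards expose ACS or per-port IOMMU settings; check your manual). The Proxmox kernel also accepts a pcie_acs_override= boot parameter that splits groups in software, at the cost of weaker isolation between the devices involved.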
  • Consider bare metal, because the BIOS may not support clean separation.
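
To see exactly what shares the GPU’s group, a small helper along these lines can be useful. It is a sketch, not a Proxmox tool: the function name iommu_group_members is made up for illustration, and its optional second argument (an alternate root directory) exists only so the function can be tried against a fake directory tree.

```shell
#!/usr/bin/env bash
# Print the IOMMU group of a PCI device and every address in that group.
# $1: full PCI address, e.g. 0000:81:00.0
# $2: (hypothetical, for testing) root of the iommu_groups tree;
#     defaults to the real sysfs location.
iommu_group_members() {
    local addr="$1" root="${2:-/sys/kernel/iommu_groups}"
    local dev group
    for dev in "$root"/*/devices/"$addr"; do
        # An unmatched glob stays literal, so skip anything that does not exist.
        [ -e "$dev" ] || [ -L "$dev" ] || continue
        group=$(basename "$(dirname "$(dirname "$dev")")")
        echo "group $group:"
        ls "$root/$group/devices"
        return 0
    done
    echo "no IOMMU group found for $addr" >&2
    return 1
}
```

On the host, iommu_group_members 0000:81:00.0 prints the group number followed by every address in that group; pass each address to lspci -nns to see what the device is.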

Proxmox host setup: enabling IOMMU and VFIO

These steps assume Proxmox VE 8.x. If you are on 7.x, package names and paths are slightly different; check the Proxmox wiki for your version.

Step 1: Enable IOMMU in the kernel

Edit /etc/default/grub:

nano /etc/default/grub

Find the line starting with GRUB_CMDLINE_LINUX_DEFAULT=. It currently looks something like:

GRUB_CMDLINE_LINUX_DEFAULT="quiet"

Add IOMMU flags to the end:

  • Intel: GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"
  • AMD: GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt"

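The iommu=pt flag puts the IOMMU in passthrough mode: devices that stay with the host skip DMA translation, which avoids a performance penalty on the host. It does not change how devices are grouped. On AMD, and on Intel with recent kernels, the IOMMU is enabled by default once the BIOS setting is on, so the explicit amd_iommu=on or intel_iommu=on flag is usually redundant but harmless.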

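Save the file and update GRUB (hosts that boot with systemd-boot, typically ZFS-on-root UEFI installs, keep the kernel command line in /etc/kernel/cmdline and apply it with proxmox-boot-tool refresh instead):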

update-grub
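
If you would rather script the edit than open an editor, the change can be sketched as below. The function name add_grub_flags is hypothetical, it assumes GNU sed (as shipped on Proxmox), and it only handles a single-line, double-quoted GRUB_CMDLINE_LINUX_DEFAULT entry.

```shell
#!/usr/bin/env bash
# Append kernel flags to GRUB_CMDLINE_LINUX_DEFAULT, skipping flags already present.
# $1: path to the grub defaults file; remaining arguments: flags to add.
add_grub_flags() {
    local file="$1" flag current
    shift
    for flag in "$@"; do
        # Extract the current value between the double quotes.
        current=$(grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file" | sed 's/^[^"]*"//; s/"$//')
        case " $current " in
            *" $flag "*) ;;  # already present, nothing to do
            *) sed -i "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"|\1 $flag\"|" "$file" ;;
        esac
    done
}
```

A typical call would be add_grub_flags /etc/default/grub intel_iommu=on iommu=pt followed by update-grub. Running it twice leaves the file unchanged the second time.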

Step 2: Load VFIO kernel modules

Edit /etc/modules:

nano /etc/modules

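Add these lines at the end:

vfio
vfio_iommu_type1
vfio_pci

On Proxmox VE 8.x (6.x kernel), vfio_virqfd no longer exists as a separate module; its functionality was merged into vfio. Only add vfio_virqfd on 7.x with a 5.x kernel.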

Save and reboot:

reboot

Step 3: Verify IOMMU is working

After reboot, check:

dmesg | grep -i iommu | head -5

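Look for a line such as DMAR: IOMMU enabled (Intel) or lines prefixed AMD-Vi (AMD). If you see nothing or “IOMMU: disabled”, first confirm the boot flags took effect with cat /proc/cmdline, then go back to BIOS and verify that VT-d or IOMMU is enabled. If both look right and the kernel still reports no IOMMU, your board or CPU may not support it despite what the manual claims; consider bare metal.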
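
The dmesg wording differs between kernels and vendors, so a check that does not depend on message text can be handy. The sketch below assumes the standard sysfs and procfs paths; the function names and the optional path arguments are made up for illustration.

```shell
#!/usr/bin/env bash
# Succeeds if the kernel has created at least one IOMMU group.
# $1: (optional, for testing) alternate root instead of the real sysfs path.
iommu_active() {
    local root="${1:-/sys/kernel/iommu_groups}" entry
    for entry in "$root"/*; do
        if [ -d "$entry" ]; then
            return 0
        fi
    done
    return 1
}

# Succeeds if the booted kernel command line contains the given flag as a whole word.
# $1: flag, e.g. iommu=pt; $2: (optional, for testing) alternate cmdline file.
cmdline_has() {
    local flag="$1" file="${2:-/proc/cmdline}"
    grep -qwF -- "$flag" "$file"
}
```

An empty /sys/kernel/iommu_groups together with a missing flag points at the bootloader step; an empty directory with the flag present points at the BIOS setting.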

Binding the GPU to VFIO

Once IOMMU is working, you need to stop the normal NVIDIA driver from claiming the GPU and let VFIO claim it instead.

Find the GPU’s PCI ID

Run:

lspci -nn | grep NVIDIA

Output looks like:

81:00.0 VGA compatible controller: NVIDIA Corporation GA102 [GeForce RTX 3090] [10de:2204]

Note the ID in the square brackets: 10de:2204 (vendor:device). The vendor 10de is always NVIDIA for NVIDIA GPUs; the device code (2204 here for RTX 3090) varies by model.

Bind the GPU to VFIO at boot

Create a file /etc/modprobe.d/vfio-pci.conf:

echo "options vfio-pci ids=10de:2204" > /etc/modprobe.d/vfio-pci.conf

Replace 10de:2204 with your GPU’s ID.

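If your GPU has multiple device functions (GPU + audio, common on modern cards), list them all. The audio function shows up as a second NVIDIA line in the lspci -nn output; on an RTX 3090 it is typically 10de:1aef, but use whatever your own output shows:

echo "options vfio-pci ids=10de:2204,10de:1aef" > /etc/modprobe.d/vfio-pci.conf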
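
Rather than copying IDs by hand, the options line can be derived from the lspci output. This is a minimal sketch: vfio_ids_line is a hypothetical helper that reads lspci -nn text on stdin and assumes every line mentioning NVIDIA with a 10de vendor ID belongs to a card you want to pass through, which is not true if the host keeps a second NVIDIA GPU for itself.

```shell
#!/usr/bin/env bash
# Read `lspci -nn` output on stdin and print a vfio-pci options line
# covering every NVIDIA function found (vendor ID 10de).
vfio_ids_line() {
    local ids
    ids=$(grep -i 'nvidia' \
        | grep -o '\[10de:[0-9a-f]\{4\}]' \
        | sed 's/^\[//; s/]$//' \
        | sort -u \
        | paste -sd, -)
    # No NVIDIA function found: report failure instead of writing an empty list.
    [ -n "$ids" ] || return 1
    echo "options vfio-pci ids=$ids"
}
```

On the host this would be used as lspci -nn | vfio_ids_line > /etc/modprobe.d/vfio-pci.conf. Review the printed line before redirecting it into the file.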

Rebuild the initramfs so the new module options apply at early boot:

update-initramfs -u -k all

Reboot:

reboot

Verify VFIO bound the GPU

After reboot, run:

lspci -k | grep -A 2 NVIDIA

Look for a line like Kernel driver in use: vfio-pci. If it says nvidia or nouveau, the VFIO binding failed because the normal driver got there first. Common fixes:

  • Blacklist the competing drivers in /etc/modprobe.d/blacklist-nvidia.conf (create it with printf 'blacklist nouveau\nblacklist nvidia\n' > /etc/modprobe.d/blacklist-nvidia.conf), run update-initramfs -u -k all, then reboot.
  • Ensure the VFIO entry in /etc/modprobe.d/vfio-pci.conf is correct (right PCI ID, right syntax).
  • Check the load order: vfio-pci must load before other GPU drivers. A common way to enforce this is a softdep line in /etc/modprobe.d/vfio-pci.conf, for example softdep nouveau pre: vfio-pci (and the same for nvidia). If the GPU driver still loads first, check whether it is listed in /etc/modules or /etc/modules-load.d/.
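
The binding check is easy to get wrong by eye when the card has several functions, so it can be scripted. The sketch below is an illustration only: check_vfio_binding is a hypothetical name, and it expects the output of lspci -nnk -d 10de: (all devices with NVIDIA’s vendor ID) on stdin.

```shell
#!/usr/bin/env bash
# Read `lspci -nnk -d 10de:` output on stdin. Print "<address> <driver>" for
# each function that has a driver bound, and fail if any of those drivers
# is not vfio-pci. Functions with no driver bound print nothing.
check_vfio_binding() {
    awk '
        /^[0-9a-f]/             { addr = $1 }
        /Kernel driver in use:/ { print addr, $NF; if ($NF != "vfio-pci") bad = 1 }
        END                     { exit bad + 0 }
    '
}
```

Run it as lspci -nnk -d 10de: | check_vfio_binding. A non-zero exit status means at least one function is still held by nouveau or nvidia.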

Creating a Proxmox VM with GPU passthrough

Once VFIO has the GPU, Proxmox can assign it to a VM.

Via the Proxmox web UI:

  1. Create a new VM (Datacenter → Create VM) with the OS of choice (Ubuntu 24.04 LTS is common for Ollama).
  2. Give it 4+ cores and 8+ GB RAM for a 7B model, 16+ GB for a 13B model.
  3. After creating the VM, click it in the sidebar and go to Hardware → Add → PCI Device.
  4. In the “Device” dropdown, select the NVIDIA GPU (it appears by name if VFIO is working).
  5. Check “All Functions” if it is a GPU with audio output (most modern cards).
  6. Check “Primary GPU” if you want the VM’s display to route through this GPU (optional; if unchecked, the GPU is compute-only and the VM uses emulated video).
  7. Click Add.

Via the terminal (advanced):

Edit /etc/pve/nodes/proxmox-node-name/qemu-server/VM-ID.conf and add:

hostpci0: 81:00,x-vga=on

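Where 81:00 is the GPU’s PCIe address (from lspci); change it to your GPU’s address. Leaving off the function number (the .0) passes all functions of the device, including the audio function. The x-vga=on flag tells KVM to map the GPU as the VM’s primary display device; for a compute-only inference VM you can omit it.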
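
The mapping from an lspci address to a config entry can be sketched as a small function. The name hostpci_line is hypothetical, the slot is fixed at hostpci0, and the 0000 domain prefix is an assumption that holds on typical single-domain consumer systems.

```shell
#!/usr/bin/env bash
# Build a hostpci0 config line from an lspci-style address.
# $1: address as printed by lspci (e.g. 81:00.0 or 0000:81:00.0)
# $2: pass "primary" to add x-vga=on; omit it for a compute-only VM.
hostpci_line() {
    local addr="${1%.*}" primary="${2:-}"   # strip the function number
    case "$addr" in
        *:*:*) ;;                   # already has a PCI domain
        *) addr="0000:$addr" ;;     # assume domain 0000
    esac
    if [ "$primary" = "primary" ]; then
        echo "hostpci0: $addr,x-vga=on"
    else
        echo "hostpci0: $addr"
    fi
}
```

For example, hostpci_line 81:00.0 prints the compute-only form, and hostpci_line 81:00.0 primary prints the form used in the example above with the domain prefix added.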

Boot the VM and install drivers

  1. Start the VM.

  2. Install the NVIDIA driver inside the VM:

    sudo apt update
    sudo apt install -y nvidia-driver-550
    

    (Check NVIDIA’s website for the current driver branch; 550 and later support most modern cards.) Reboot the VM after the install so the new kernel module loads.

  3. Verify the GPU is visible:

    nvidia-smi
    

    Should show the GPU and its VRAM.

  4. Install Ollama:

    curl -fsSL https://ollama.ai/install.sh | sh
    
  5. Test:

    ollama pull llama2:7b
    ollama run llama2:7b
    

While the model is loaded, the Ollama process should appear in the process list at the bottom of the nvidia-smi output.
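
That last check can be scripted with nvidia-smi’s CSV query mode. The sketch below assumes the comma-and-space separated output of nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv,noheader; the function name gpu_processes_matching is made up, and the exact process name Ollama reports may differ between versions, so the match is a case-insensitive substring.

```shell
#!/usr/bin/env bash
# Read nvidia-smi compute-app CSV rows ("pid, process_name, used_memory") on
# stdin and print "<pid> <used_memory>" for rows whose process name matches.
# Fails if nothing matches. $1: case-insensitive pattern, e.g. ollama
gpu_processes_matching() {
    local pattern="$1"
    awk -F', ' -v pat="$pattern" '
        tolower($2) ~ tolower(pat) { print $1, $3; found = 1 }
        END                        { exit !found }
    '
}
```

Inside the VM, with a model loaded: nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv,noheader | gpu_processes_matching ollama. No output and a non-zero exit status suggest the model is running on the CPU.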

LXC containers and why they do not work for GPU passthrough

LXC containers are lighter than full VMs: they have no bootloader and no kernel of their own, and they share the host kernel. But LXC does not support VFIO device passthrough. A container cannot claim a PCI device or IOMMU group and run its own driver against it. You can bind /dev/nvidia* device nodes into an LXC container, but that requires the NVIDIA driver to be loaded on the host, with a matching user-space driver version inside the container, which weakens isolation and still requires the host to have GPU support.

Use a full KVM VM for any GPU workload on Proxmox, not LXC.

If you want LXC’s lightweight properties, the honest path is bare metal (no hypervisor at all) or a systemd-nspawn container on a bare metal Linux box.

Real failure modes and troubleshooting

Failure Mode | Symptom | Root Cause | Solution
VFIO module not loaded | Error “module not found” after reboot | Stale initramfs or wrong load order | Run update-initramfs -u -k all and reboot
GPU shows as unassigned | No GPU in Proxmox PCI device dropdown | NVIDIA driver still bound to GPU instead of VFIO | Blacklist nvidia driver in /etc/modprobe.d/blacklist-nvidia.conf, update initramfs, reboot
IOMMU not detected | Filtering dmesg for iommu shows nothing or “disabled” | IOMMU disabled in BIOS or not supported by motherboard | Check BIOS for VT-d (Intel) or IOMMU (AMD) setting; if already enabled, motherboard may not support it
GPU grouped with other devices | Cannot isolate GPU in its own IOMMU group | Motherboard firmware design or ACS disabled | Check BIOS for ACS or per-port IOMMU settings; if unavailable, pass entire group or use bare metal
VM boots but no GPU detected | nvidia-smi shows “no GPU found” inside VM | NVIDIA driver not installed or wrong version inside VM | Run driver installation again; check lspci inside the VM for an NVIDIA entry to confirm passthrough worked
Kernel panic on VM boot | VM hangs or crashes immediately | VFIO reset broken on this GPU/driver combination | Add rombar=0 to hostpci config in /etc/pve/nodes/.../qemu-server/VM-ID.conf; consider bare metal

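“vfio-pci: module not found” after reboot

The VFIO modules ship with the kernel, so this error usually means the initramfs is stale, the load order is wrong, or /etc/modules names a module the running kernel does not have (vfio_virqfd, for example, no longer exists as a separate module on 6.x kernels). Run:

update-initramfs -u -k all

And reboot. If it persists, check that /etc/modules contains the vfio lines from Step 2 with no misspellings, and that you ran update-initramfs after the last change.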
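
To find out which entry in /etc/modules is the problem, each name can be probed in turn. This is a sketch under assumptions: missing_modules is a hypothetical helper, it uses modinfo as the default probe, and the second argument (an alternate probe command) is there only so the logic can be exercised without a real kernel.

```shell
#!/usr/bin/env bash
# Print every module named in a modules file that the probe command cannot find.
# $1: modules file (one module name per line, # starts a comment)
# $2: (optional, for testing) probe command; defaults to modinfo
missing_modules() {
    local file="$1" probe="${2:-modinfo}" name rest
    while read -r name rest; do
        case "$name" in
            ''|'#'*) continue ;;   # skip blank lines and comments
        esac
        "$probe" "$name" >/dev/null 2>&1 || echo "$name"
    done < "$file"
}
```

On the host, missing_modules /etc/modules prints nothing when every listed module exists for the running kernel. Any name it does print should be removed or corrected before rebuilding the initramfs.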

GPU shows as “unassigned” in Proxmox after VFIO bind

Proxmox web UI shows no GPU in the PCI Device dropdown. Likely cause: the normal NVIDIA driver still has the GPU. Run:

lspci -k | grep -A 2 NVIDIA

If the Kernel driver in use line shows nvidia or nouveau, blacklist both:

echo "blacklist nvidia-drm
blacklist nvidia
blacklist nouveau" > /etc/modprobe.d/blacklist-nvidia.conf

update-initramfs -u -k all
reboot

IOMMU group contains storage or network devices

Your motherboard puts the GPU in a group with other devices. Three options:

  1. Check BIOS for ACS or per-port IOMMU settings. Some Z790/X570 boards let you change grouping per PCIe slot. Consult your manual.
  2. If the host does not need the other devices: pass the entire group (e.g., hostpci0: 81:00 for the GPU and its audio function, plus one further hostpci entry for each other device in the group). Every device passed this way disappears from the host.
  3. Switch to bare metal. Persistent IOMMU grouping issues are often a sign that the motherboard was not designed for reliable passthrough. Bare metal avoids the problem.

VM boots but nvidia-smi shows “no GPU found”

Either the passthrough did not reach the VM, or the NVIDIA driver is not loaded. To tell which, run inside the VM:

lspci | grep NVIDIA

If you see the GPU, Proxmox passed it through successfully and the NVIDIA driver install failed or is the wrong version. Re-run the driver installation and reboot the VM. If you see nothing, the passthrough failed; check the host logs (dmesg | grep -i -e iommu -e vfio on the host) for conflicts.

Kernel panic when VM starts

VFIO reset is broken on some GPUs or driver combinations (especially older consumer cards). This usually manifests as the VM hanging on boot or crashing immediately. Workaround: in the Proxmox VM config (/etc/pve/nodes/.../qemu-server/VM-ID.conf), add:

hostpci0: 81:00,x-vga=on,rombar=0

The rombar=0 flag hides the GPU’s option ROM from the guest, which can help with reset stability. If it still fails, consider bare metal.

Honest bottom line

GPU passthrough on Proxmox works. The Proxmox documentation is good. The kernel machinery is solid. But the failure modes are scattered across BIOS settings, motherboard firmware design, and IOMMU grouping quirks that are invisible until you hit them.

When passthrough makes sense: You are already running Proxmox for other workloads, you have a motherboard with clean IOMMU isolation, and you want flexibility in allocating hardware.

When bare metal is simpler: This is a single-purpose inference box. Bare metal (Debian + NVIDIA driver + Ollama) installs in about 30 minutes, needs far less troubleshooting, and avoids the single-digit-percent VM overhead. Spend the IOMMU and VFIO debugging time on something else.

Whichever path you choose, validate it early — boot the VM or bare metal system, run nvidia-smi, fire up Ollama, and confirm your model loads and generates tokens at expected speed before you commit to it as your long-term inference box. The GPU works. The question is whether the hypervisor layer makes that work harder or easier in your specific case.
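
To put a number on “expected speed”, compare the eval rate that ollama run --verbose reports on each setup. The arithmetic is simple enough to sketch; overhead_pct is a hypothetical helper, and the rates in the usage note are placeholders, not measurements.

```shell
#!/usr/bin/env bash
# Percentage slowdown of the VM relative to bare metal.
# $1: bare-metal eval rate in tokens/s; $2: VM eval rate in tokens/s.
overhead_pct() {
    awk -v bare="$1" -v vm="$2" 'BEGIN {
        if (bare <= 0) exit 1          # avoid dividing by zero
        printf "%.1f\n", (bare - vm) / bare * 100
    }'
}
```

With placeholder rates of 100 and 95 tokens/s, overhead_pct 100 95 reports a 5.0 percent slowdown. Use the same model, prompt, and context length on both systems, and average several runs before drawing conclusions.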

For context on GPU selection for this workload, see best GPU for local LLM. For how to structure your homelab build around Proxmox, the dual-RTX 3090 build guide walks through the full hardware picture. And once you have the GPU working, how to run LLMs locally covers the inference engine options and tuning.

Sources

  • Proxmox VE documentation: pve.proxmox.com/wiki/IOMMU_setup (version-agnostic, with 7.x–8.x notes)
  • VFIO kernel module documentation: kernel.org (PCIe passthrough mechanics)
  • Linux kernel IOMMU group scanning: github.com/iommu groups, community troubleshooting (2024–2025)
  • Community reports: r/Proxmox, r/homelab passthrough failures and BIOS settings patterns (2024–2025)