SOLVED: Nvidia kernel modules loaded before thunderbolt 3 egpu modules

  • Which OS (Operating System)?

Arch Linux amd64, kernel 6.13.7-hardened (also reproduced on 6.12.x-lts)

  • Which Framework product (Laptop 13, Laptop 16, Laptop 12 or Desktop) and which generation (Intel 11th gen, Intel 12th gen , Intel 13th gen, Chromebook, AMD 7040 Series, AMD AI 300 Series, AMD Ai Max 300 Series)

FWL13 i5-13th

Not a Framework specific question, but I need to lean on the brain trust here as I’ve used all my mental spoons on other projects. Before Framework RMA’d out my i5-13th mainboard, I had a working eGPU setup. It’s a Sonnet 550 eGPU box, with an Nvidia 1050 Ti card, using nvidia-dkms drivers. I tore apart that setup trying to troubleshoot the old FWL mainboard (turned out to be a hardware failure, RMA fixed it 100%). Now trying to get the eGPU working again. But in my attempt to rebuild the setup, I am running into a new issue I have not seen before: the nvidia drivers get loaded before the thunderbolt service, so nvidia gives up looking for the card.

> journalctl -b | grep -i 'thunder\|nvid'
Apr 09 17:56:42 pluto kernel: ACPI: bus type thunderbolt registered
Apr 09 17:56:42 pluto kernel: nvidia: loading out-of-tree module taints kernel.
Apr 09 17:56:42 pluto kernel: nvidia: module license 'NVIDIA' taints kernel.
Apr 09 17:56:42 pluto kernel: nvidia: module verification failed: signature and/or required key missing - tainting kernel
Apr 09 17:56:42 pluto kernel: nvidia: module license taints kernel.
Apr 09 17:56:42 pluto kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 235
Apr 09 17:56:42 pluto kernel: NVRM: No NVIDIA GPU found.
Apr 09 17:56:42 pluto kernel: nvidia-nvlink: Unregistered Nvlink Core, major device number 235
Apr 09 17:56:42 pluto systemd-modules-load[405]: Failed to insert module 'nvidia_uvm': No such device
Apr 09 17:56:43 pluto kernel: thunderbolt 0-0:1.1: new retimer found, vendor=0x8087 device=0x15ee
Apr 09 17:56:43 pluto systemd[1]: Starting Thunderbolt system service...
Apr 09 17:56:43 pluto systemd[1]: Started Thunderbolt system service.
Apr 09 17:56:44 pluto kernel: thunderbolt 0-1: new device found, vendor=0x8 device=0x38
Apr 09 17:56:44 pluto kernel: thunderbolt 0-1: Sonnet Technologies, Inc. eGFX Breakaway Box 550

Is there any good way to force the thunderbolt drivers / service to load earlier, or to make nvidia wait until TB is ready? I have searched the Arch wiki, DDG, etc. for every combination of mkinitcpio, thunderbolt/tb, eGPU, and so on, and have not found any answers that address this exactly.

Thanks in advance for any assistance.

Additional info that might help:

I have tried several variations on mkinitcpio.conf, but here is the latest (still does not work) iteration:

MODULES=(btrfs nvidia nvidia_modeset nvidia_uvm nvidia_drm)
BINARIES=(/usr/bin/btrfs)
FILES=()
HOOKS=(base autodetect btrfs microcode keyboard keymap modconf block filesystems fsck udev kms)

Output of boltctl after boot is complete (IDs fudged):

> boltctl
 ● Sonnet Technologies, Inc. eGFX Breakaway Box 550
   ├─ type:          peripheral
   ├─ name:          eGFX Breakaway Box 550
   ├─ vendor:        Sonnet Technologies, Inc.
   ├─ uuid:          e4030000-XXXX-YYYY-ZZZZ-a30ac434391f
   ├─ generation:    Thunderbolt 3
   ├─ status:        authorized
   │  ├─ domain:     d3fb8780-AAAA-BBBB-ffff-ffffffffffff
   │  ├─ rx speed:   40 Gb/s = 2 lanes * 20 Gb/s
   │  ├─ tx speed:   40 Gb/s = 2 lanes * 20 Gb/s
   │  └─ authflags:  none
   ├─ authorized:    Thu 10 Apr 2025 12:30:33 AM UTC
   ├─ connected:     Thu 10 Apr 2025 12:30:30 AM UTC
   └─ stored:        Sun 30 Jun 2024 01:09:35 PM UTC
      ├─ policy:     iommu
      └─ key:        no

nvidia card does NOT show up in lspci after boot.

Let me know what else might help to troubleshoot.

On Arch, I believe you want to add thunderbolt to the MODULES array in mkinitcpio.conf so that it loads early.
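A minimal sketch of that change, assuming the MODULES line posted earlier in this thread. The only addition is thunderbolt, placed ahead of the nvidia modules; nothing here confirms that the ordering alone is enough.

```shell
# /etc/mkinitcpio.conf (fragment; mkinitcpio sources this file as bash)
# Assumption: every entry except "thunderbolt" is copied from the config
# posted above, and listing thunderbolt first is meant to get it loaded
# before the nvidia modules.
MODULES=(btrfs thunderbolt nvidia nvidia_modeset nvidia_uvm nvidia_drm)

# After editing, regenerate the initramfs for all presets (run as root):
#   mkinitcpio -P
```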

You might create a udev rule to modprobe -r nvidia && modprobe nvidia when the thunderbolt dock is added, looking for the PCI ID of your 1050 Ti.
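A rough sketch of the reload step as a standalone script, which a udev rule could call or which could be run by hand once the enclosure is authorized. The function names and the "run" argument are hypothetical, and the check matches on the NVIDIA PCI vendor ID (10de) rather than on the 1050 Ti's specific device ID.

```shell
#!/bin/sh
# Hypothetical helper: reload the nvidia module once the eGPU is visible
# on the PCI bus. Function names and the "run" argument are illustrative.

# Succeeds if the `lspci -n` style text on stdin lists a device whose
# PCI vendor ID is 10de (NVIDIA).
has_nvidia_gpu() {
    grep -Eqi '^[^ ]+ [0-9a-f]{4}: 10de:[0-9a-f]{4}'
}

# Unload and reload the driver, but only if the card is actually enumerated.
reload_nvidia() {
    if lspci -n | has_nvidia_gpu; then
        # Ignore a failed unload (for example, if the module never loaded).
        modprobe -r nvidia 2>/dev/null || true
        modprobe nvidia
    else
        echo "No NVIDIA device on the PCI bus; reloading the driver will not help." >&2
        return 1
    fi
}

# Only touch kernel modules when explicitly asked to (run as root).
if [ "${1:-}" = "run" ]; then
    reload_nvidia
fi
```

If the script reports that no NVIDIA device is present, the problem is likely below the driver level (PCIe tunneling, cabling, or the card's seating) rather than module load order.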

I don’t have a good idea what changed to influence this sequence.

K3n.

Thanks for the suggestions. I’ve been too busy to try these yet but I will when I can.

Thanks for the suggestions. None of those software fixes worked, but pulling and re-seating the card in the eGPU’s PCIe slot seems to have gotten things going.
