Thunderbolt NVIDIA eGPU on GNOME Wayland on Debian 12 on Framework 13 AMD Ryzen 7 7840U

I recommend reading the entire post before taking any action.

First of all, I would like to explain the purpose of this post. I have managed to get several things working that should be useful in most cases, so following my steps should get you to a functional setup. On the other hand, I ran into some problems, and maybe someone can help me solve them.

My setup consists of the following elements:

Now for what needs to be done to make the GPU work. The first step is to install the Nvidia drivers. For this I followed this tutorial: NvidiaGraphicsDrivers - Debian Wiki
The commands are as follows:

sudo apt install -y linux-headers-amd64
sudo apt install -y -t bookworm-backports linux-image-amd64
sudo apt install -y nvidia-driver firmware-misc-nonfree
sudo apt install nvidia-cuda-dev nvidia-cuda-toolkit

To make it work in Wayland, run the following commands:

echo 'GRUB_CMDLINE_LINUX="$GRUB_CMDLINE_LINUX nvidia-drm.modeset=1"' | sudo tee /etc/default/grub.d/nvidia-modeset.cfg
sudo update-grub
sudo apt install -y nvidia-suspend-common
sudo systemctl enable nvidia-suspend.service
sudo systemctl enable nvidia-hibernate.service
sudo systemctl enable nvidia-resume.service
echo 'options nvidia NVreg_PreserveVideoMemoryAllocations=1' | sudo tee /etc/modprobe.d/nvidia-power-management.conf
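After a reboot with the card attached, it can help to check that the modeset option actually took effect. This is only a minimal sketch: the function name is made up, and the optional path argument exists just so the check can be tried against a sample file. As far as I know, the kernel exposes the value as Y or N under /sys/module/nvidia_drm/parameters/modeset once nvidia_drm is loaded.

```shell
# Hypothetical helper: succeeds when nvidia_drm reports modeset as enabled.
# $1 (optional) overrides the sysfs path; it is only meant for testing.
nvidia_modeset_enabled() {
    param=${1:-/sys/module/nvidia_drm/parameters/modeset}
    # The file is missing when nvidia_drm is not loaded at all.
    [ -r "$param" ] && [ "$(cat "$param")" = "Y" ]
}

# Typical use:
# if nvidia_modeset_enabled; then
#     echo "modeset is on"
# else
#     echo "modeset is off, or nvidia_drm is not loaded"
# fi
```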

After this, the drivers must be signed so that the kernel accepts them. This is only necessary if Secure Boot is enabled. For this, I used the following tutorial: SecureBoot - Debian Wiki
The commands are as follows:

sudo mkdir -p /var/lib/shim-signed/mok/
cd /var/lib/shim-signed/mok/
sudo openssl req -nodes -new -x509 -newkey rsa:2048 -keyout MOK.priv -outform DER -out MOK.der -days 36500 -subj "/CN=Your Name/"
sudo openssl x509 -inform der -in MOK.der -out MOK.pem
sudo mokutil --import /var/lib/shim-signed/mok/MOK.der
echo -e "mok_signing_key=\"/var/lib/shim-signed/mok/MOK.priv\"\nmok_certificate=\"/var/lib/shim-signed/mok/MOK.der\"" | sudo tee /etc/dkms/framework.conf
sudo mokutil --import /var/lib/dkms/mok.pub
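Since signing only matters with Secure Boot enabled, a quick check up front can save the whole step. The sketch below assumes that `mokutil --sb-state` prints a line starting with "SecureBoot enabled" or "SecureBoot disabled"; the function name is hypothetical.

```shell
# Hypothetical helper: reads the output of `mokutil --sb-state` on stdin and
# succeeds only if it reports Secure Boot as enabled.
secure_boot_enabled() {
    grep -q '^SecureBoot enabled'
}

# Typical use:
# if mokutil --sb-state | secure_boot_enabled; then
#     echo "Secure Boot is on: the Nvidia modules must be signed"
# fi
#
# After `mokutil --import`, the pending request can be listed with:
# mokutil --list-new
```

The import only queues the key. It is enrolled on the next reboot, when the MOK manager screen asks for the password you set.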

It is normal to see an error when you turn the computer on without the graphics card connected. This is because the Nvidia drivers simply refuse to load if there is no hardware that requires them. Once the graphics card is connected, it becomes usable. Remember that not all laptop ports support this: only those closest to the screen are compatible. The card simply connects when you need it and disconnects after shutdown. Computations are done on the external graphics card and the result is sent to the internal graphics card.

Now the problems begin. If you connect a video output to the GPU, it will not work. This is because that part of the drivers has not been loaded into the kernel. Fixing this is simple: just run the following command every time you connect the eGPU:

sudo modprobe nvidia_drm
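To avoid running modprobe blindly, you can first look at /proc/modules. A small sketch, with a made-up function name and an optional second argument that is only there for trying it on sample data:

```shell
# Hypothetical helper: succeeds if module $1 is listed in a
# /proc/modules-style file ($2, default /proc/modules).
module_loaded() {
    # Each line starts with the module name followed by a space.
    grep -q "^$1 " "${2:-/proc/modules}"
}

# Load nvidia_drm only when it is missing (needs root):
# module_loaded nvidia_drm || sudo modprobe nvidia_drm
```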

If you want, you can tell the operating system to do it automatically using the following command:

echo 'ACTION=="add", ATTRS{unique_id}=="c4010000-0062-640e-8388-23e40c045208", RUN+="/sbin/modprobe nvidia_drm"' | sudo tee -a /etc/udev/rules.d/10-thunderbolt.rules

You have to replace c4010000-0062-640e-8388-23e40c045208 with the id of your PCIe to Thunderbolt adapter that you get from this command after plugging in the board:

cat /sys/bus/thunderbolt/devices/*/unique_id
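If you prefer not to edit the rule by hand, something like the following could generate it from the id. It is only a sketch: the function name is hypothetical, and it assumes the unique_id always has the usual UUID shape (8-4-4-4-12 hexadecimal digits), which is what I see on my adapter.

```shell
# Hypothetical helper: prints the udev rule for a given Thunderbolt unique_id.
thunderbolt_modprobe_rule() {
    id=$1
    # Assumption: unique_id looks like a UUID; anything else is rejected.
    if ! printf '%s\n' "$id" | grep -Eq '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'; then
        echo "not a valid unique_id: $id" >&2
        return 1
    fi
    printf 'ACTION=="add", ATTRS{unique_id}=="%s", RUN+="/sbin/modprobe nvidia_drm"\n' "$id"
}

# Typical use (replace the id with your own), then reload the rules:
# thunderbolt_modprobe_rule c4010000-0062-640e-8388-23e40c045208 | sudo tee /etc/udev/rules.d/10-thunderbolt.rules
# sudo udevadm control --reload-rules
```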

This fixes the problem but creates another one. The performance is terrible, and I think this is due to the data flow, which I believe is as follows:
CPU->eGPU->iGPU->eGPU->Video output.

Restarting gdm3 or starting the computer with the graphics card already connected (to try to choose the eGPU as the main card) makes it impossible to start a session using Wayland as the option is no longer available. Therefore, it only allows X11. This would not be a problem if it were not for the lack of support for fractional scaling.

To make it work correctly with X11, I followed this tutorial: External GPU - ArchWiki
Run the following command:

echo '
Section "Device"
    Identifier "Device0"
    Driver     "nvidia"
    BusID      "PCI:26:16:3"                 # Edit according to lspci, translate from hex to decimal.
    Option     "AllowExternalGpus" "True"    # Required for proprietary NVIDIA driver.
EndSection

Section "Module"
    # Load modesetting module for the iGPU, which should show up in XrandR 1.4 as a provider.
    Load "modesetting"
EndSection
' | sudo tee /etc/X11/xorg.conf.d/80-egpu-primary-igpu-offload.conf
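The BusID translation is easy to get wrong, because lspci prints bus:device.function in hexadecimal while Xorg expects decimal values separated by colons. A small sketch of the conversion (the function name is made up):

```shell
# Hypothetical helper: converts an lspci address (hex "bus:device.function",
# optionally prefixed with the "0000:" domain) to Xorg's decimal BusID form.
pci_to_xorg_busid() {
    addr=${1#0000:}      # drop an optional "0000:" PCI domain prefix
    bus=${addr%%:*}      # text before the first ':'
    rest=${addr#*:}      # "device.function"
    dev=${rest%%.*}
    fn=${rest#*.}
    # printf accepts 0x-prefixed hexadecimal and prints it as decimal
    printf 'PCI:%d:%d:%d\n' "0x$bus" "0x$dev" "0x$fn"
}

pci_to_xorg_busid 82:00.0    # prints PCI:130:0:0
```

The address itself comes from lspci, for example `lspci | grep -i nvidia`.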

Warning: this can prevent X11 from starting when the graphics card is not connected.

This leaves me with 3 different branches with 3 different problems:

  • I use Wayland and the video output of the laptop, that is, 2 cables that I don’t want
  • I use Wayland and the video output of the graphics card, that is, poor performance
  • I use X11 and I am left without fractional scaling

My preferred solution, and the one most likely to work, is to go with the second option and get the Nvidia drivers to allow the use of Wayland.

Does anyone have any ideas? Any suggestions or thoughts?
If I manage to solve it, I will write a complete tutorial.

Sorry all I can offer is moral support, I’m trying to get a nvidia eGPU setup working on the 7640u mainboard with no success. Currently on Bazzite-nvidia. Best of luck, and if you do write that tutorial I’ll definitely be reading it!

Thank you very much for the moral support, I really appreciate it. I thought this topic was dead. I am not familiar with either Bazzite or Fedora, but maybe I can help if you share a bit about your situation.

This is where I’m at atm: Stuttering on external display - #8 by SinisterCatFancier

Any updates on this situation? What did you end up doing?

Sorry for the delay in responding, I’ve been very busy lately. Recently, support for Wayland sessions with an NVIDIA GPU has been added in Debian. I filed a bug report about the issue: #1074481 - gdm3: There is no option to log in with Wayland when an Nvidia eGPU is connected - Debian Bug report logs. So now, when I connect the GPU and restart the graphical interface, everything works perfectly. Unfortunately, this causes any open applications to be lost, but it’s manageable.

The big drawback is that when disconnecting the GPU, I haven’t been able to shut down the graphical interface properly, which results in a kernel panic and a massive crash. The behavior of Windows in this situation is much better, to be honest. It allows hot plug and unplug without even closing the tabs. I would like to spend more time investigating the problem in detail, because I am quite sure it is technically possible, since Linux supports hotplugging of PCIe devices. If I find a complete solution, I will create a post with a full tutorial.

So, to recap, I have a hot-pluggable eGPU with Wayland, fractional scaling, and very good performance. Only the disconnection fails.

If you’re still experiencing slowdowns, Mutter may be auto-selecting the integrated graphics as its primary GPU. Check sudo journalctl -b --no-pager | grep "primary" | less to confirm. That tells you which DRI device is being used (and if it gives you a driver name, you have your answer already). To find the GPU corresponding to a DRI card, run udevadm info --query=all --name /dev/dri/cardX (X being 0 or 1, depending on device enumeration order).
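As an alternative to reading the udevadm output, the PCI vendor of each card can be read from sysfs. A rough sketch, assuming the usual /sys/class/drm layout; the function name and the optional root argument are made up:

```shell
# Hypothetical helper: prints "cardN vendor_id" for each DRM card.
# $1 (optional) overrides the sysfs root; it is only meant for testing.
# 0x10de is NVIDIA and 0x1002 is AMD.
list_drm_cards() {
    root=${1:-/sys/class/drm}
    for dir in "$root"/card*; do
        name=${dir##*/}
        case $name in *-*) continue ;; esac   # skip connectors such as card0-eDP-1
        [ -r "$dir/device/vendor" ] || continue
        printf '%s %s\n' "$name" "$(cat "$dir/device/vendor")"
    done
}

# Typical use:
# list_drm_cards
```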

If it’s the wrong card, find the right card’s vendor and device ID with lspci -nn | grep NVIDIA, for example mine returns 82:00.0 VGA compatible controller [0300]: NVIDIA Corporation GA104 [GeForce RTX 3060 Ti Lite Hash Rate] [10de:2489] (rev a1).

Then, create a udev rule SUBSYSTEM=="drm", ENV{DEVTYPE}=="drm_minor", ENV{DEVNAME}=="/dev/dri/card[0-1]", SUBSYSTEMS=="pci", ATTRS{vendor}=="0x10de", ATTRS{device}=="0x2489", TAG+="mutter-device-preferred-primary", replacing the vendor and device values with your own. I put it in /etc/udev/rules.d/61-mutter-gpu.rules.
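To avoid copying the IDs by hand, the rule can be built from the lspci -nn line. This is just a sketch: the function name is hypothetical, and it assumes the line contains exactly one [vendor:device] pair in lowercase hex, as in my output above.

```shell
# Hypothetical helper: builds the Mutter udev rule from one `lspci -nn` line.
mutter_primary_rule() {
    # Extract "vendor device" from the trailing [xxxx:xxxx] pair.
    ids=$(printf '%s\n' "$1" | sed -n 's/.*\[\([0-9a-f]\{4\}\):\([0-9a-f]\{4\}\)\].*/\1 \2/p')
    if [ -z "$ids" ]; then
        echo "no [vendor:device] pair found" >&2
        return 1
    fi
    vendor=${ids% *}
    device=${ids#* }
    printf 'SUBSYSTEM=="drm", ENV{DEVTYPE}=="drm_minor", ENV{DEVNAME}=="/dev/dri/card[0-1]", SUBSYSTEMS=="pci", ATTRS{vendor}=="0x%s", ATTRS{device}=="0x%s", TAG+="mutter-device-preferred-primary"\n' "$vendor" "$device"
}

# Typical use:
# mutter_primary_rule "$(lspci -nn | grep NVIDIA | grep VGA)" | sudo tee /etc/udev/rules.d/61-mutter-gpu.rules
```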

Once you reboot to enable the udev rules, you can confirm that Mutter selected the correct gpu by checking journalctl and checking that the DRI card listed as primary is indeed your nvidia card. Hope that helps, I also had the weird copy-mode pipeline performance issues that you seem to be experiencing.