Help wanted - eGPU bootloop issues

This is not a DEG2 specific problem: so far it seems to affect every TB5 eGPU adapter: DEG2, EG02, AG03, TH5P4. There have been exactly zero successful AMD + TB5 + Linux reports on egpu.io.

Up until a week ago I was using an AOOSTAR AG02 over USB-C with my Minisforum UM870. I only switched to the Minisforum DEG2 because I managed to kill the PSU in the AG02 while attempting to install a Noctua fan. In that time I’ve also upgraded to kernel 7.x.

I have not once been able to get this working on either my Minisforum UM870 or my Framework Desktop. On my UM870 I get an instant reboot; on my Framework Desktop it fails to initialize, and I don’t have the error message. It did get as far as being recognized in nvidia-smi one time during my setup attempts, but then it failed to initialize and never worked again.

I’m starting to wonder if the issue is this specific dock or kernel 7.x.

Neither. It happens on every Linux kernel in combination with PCIe tunneling over any TB5 dock. Basically Linux+TB5=kaBoom :wink:
Now, depending on the specifics, the kaBoom may be more or less spectacular: AMD host + Nvidia GPU is the most explosive (an abrupt reboot if GSP is loaded). In other cases, according to reports on egpu.io, it will just be very unstable and disconnect shortly afterwards.

There have been some Thunderbolt bug fixes in Linux kernel 7.0.10.
Please try kernel 7.0.10 and see if it helps.

Re-read my post. My AG02 was working fine until I broke it by modifying the PSU.
I was using it on Fedora 43 in VR with no issues. After killing it (my fault, nothing to do with Linux) I bought the DEG2, upgraded to Fedora 44 and kernel 7.x, and haven’t been able to get it working since. I was using the AG02 just fine, so clearly the issue isn’t as clear-cut as “Linux+TB5=kaBoom”.

Edit: I forgot to mention I’m using a 4070 Super with the proprietary drivers.

The AG02 uses an older ASM2464PD that does work with Linux over USB4. The DEG2 uses the same chip as the AG03, the Intel JHL9480, which does not work with Linux over USB4, only OcuLink. Honestly, I would be pessimistic about this getting fixed, since Intel seems to be pulling back on open source in general due to financial troubles.

Which is such a shame, since the mins seem to have vastly improved compared to the ASM2464PD. I was considering an AG03 as an eGPU enclosure until I read this thread: now I’m thinking of just getting one of the older ASM2464PD enclosures and/or holding on for the Framework 16 OcuLink Dev Kit.

I wish I could get in on the testing for the latter, but I’m not important enough for that, and I wouldn’t get a suitable GPU for it until July. Guess I’m only going to be able to enjoy the eGPU life after I go back to school.

Thank you for that information. I’m still in the Amazon return window, so I’m going to ship this back and get another AG02, hopefully this time with a quieter fan.

Yeah, that would probably be best. That being said, I did just see this post on Reddit that contradicts everything said here, and the 50 series is known to cause problems for Thunderbolt ports on Windows, as said here. I would not be surprised if it spread to AMD as well on Linux.

Can anyone reproduce using older hardware?

FAKE EDIT: Turns out they are a fellow user on the forums, post is here.

REAL EDIT: Just messaged them, we’ll see…

Re-read AG02’s specs: it’s not a TB5 adapter, it’s USB4v1.

Could you try again with the directions here? They somehow managed to get the exact same dock working on Linux by disabling the GSP and blacklisting nouveau, so I’m starting to wonder if it’s an NVidia issue once again.
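For readers who have not opened the linked directions: the two changes named above are usually made through a modprobe.d drop-in. The following is only a minimal sketch, not the linked poster's exact steps. It assumes the proprietary NVIDIA kernel module and its `NVreg_EnableGpuFirmware` parameter; neither the parameter nor any file path is given in this thread, and `render_modprobe_conf` is a hypothetical helper name. The open NVIDIA kernel modules need GSP, so the last line would not apply to them.

```python
def render_modprobe_conf(disable_gsp: bool = True,
                         blacklist_nouveau: bool = True) -> str:
    """Return the text of a modprobe.d drop-in (hypothetical helper)."""
    lines = []
    if blacklist_nouveau:
        # Keep the in-kernel nouveau driver from binding to the GPU.
        lines.append("blacklist nouveau")
        lines.append("options nouveau modeset=0")
    if disable_gsp:
        # Assumption: proprietary nvidia module; 0 asks it not to use GSP firmware.
        lines.append("options nvidia NVreg_EnableGpuFirmware=0")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the fragment; where to save it (and whether the initramfs needs
    # regenerating afterwards) depends on the distro.
    print(render_modprobe_conf(), end="")
```

The script only prints the fragment, so nothing changes on the system until the text is saved by hand.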

As explained just a few posts above, in the kernel.org Bugzilla, and on egpu.io back in February, disabling GSP prevents reboots, but the connection still remains unstable, especially under load. Furthermore, it can reduce performance by as much as 50% in some scenarios.
UPDATE: …and of course it does not work for Blackwell.

The link got broken; I believe this is the one it was supposed to be.
It’s 3 months old and does not contain any details or benchmarks. Probably GSP was disabled, and the person ran some workload for just a few minutes and posted a success without checking further.
There hasn’t been even a single successful report of Linux+TB5 on egpu.io. A few people initially reported success on Intel hosts, but later retracted it, saying they experienced disconnects under load.

Helpful, thank you.

The 7840U/HX370 doesn’t support TB5/USB4v2, so there wouldn’t be any mention of support for that with AMD. It would fall back to TB4/USB4v1 from what I understand. Perhaps I’m missing something from the conversation?

Please don’t make assumptions just because I didn’t provide benchmarks or the like; I simply focused on getting it working at the time… No one asked me any questions or even bothered to try to help. Performance doesn’t seem drastically reduced for my workloads (mostly ML). Disconnects under load sound more like a bad cable to me, which I’ve seen people mention in the past for other setups.

Can confirm I’ve been using the FW13 7840U with the 4070 Ti eGPU on the Minisforum DEG2 over USB4v1 without any major issue since. I haven’t tried Fedora or other flavours again, though, as finding a way to get it to behave took me more hours than I would like to admit, with Ubuntu being the first to play ball. Even more annoying, it worked flawlessly on Win11: installed the driver and boom.

I am quite drunk and it’s just gone 1AM here in the UK, so any more in-depth stuff on my side will have to wait a little if there’s anything you want me to look at on my system later?

I am amazed you got it working at all.

Took more hours (days really) than I would like to admit :joy: Every log I checked gave absolutely no clue as to what was happening, so I started messing about with kernel parameters, open vs. proprietary drivers, different distros, trying things in a different order, then it suddenly worked… so I took the approach of “it ain’t broke right now, so I ain’t gonna try to fix it anymore” haha. Total trial by fire, seeing how each variable affected things and then adjusting based on that. It was torture.

Edit: I did try getting it working on an HP ZBook 10th gen Intel laptop I have, which should support an eGPU over TB3 given its specs, but that refused to work on both Windows and Linux distros, so I gave up on that one. I do have a 13th gen Asus Strix as well; I might try that at some point. I’ve got a few uni exams in the next few weeks though, so I’ll have to wait on wasting time with that.

I’ve just tried it (together with Mario’s ASPM patch), but sadly no changes: still a reboot :frowning:

The speed is reduced to the host’s capabilities (so to USB4v1’s ~30Gbps in the case of the few recent AMD generations), but TB5 peripheral devices do behave somewhat differently. For example, on Windows, GPUs connected via TB5 adapters to AMD Strix hosts are natively recognized as removable GPUs (unlike with the ASM2464PD(X), where you need the error-43 fixer from Nando). …And then of course there’s this rebooting on GSP loading that happens only with TB5 adapters.

As explained in the kernel.org bug report and in the egpu.io build, exactly the same physical setup works flawlessly on Windows, so it’s not a cable issue. On Windows I can even daisy-chain a 2nd eGPU to it and it still works flawlessly.

That’s a very interesting piece of information: thanks!
@Mario_Limonciello, so it seems that older AMD APUs, like Phoenix (7xxx), are less affected (though it is still a reboot if GSP is loaded, if I understand the linked post correctly: “Ubuntu just crashes and boot loops”).

What point release of Ubuntu are you using? If we get that, then we can look for differences between that and the current kernel and narrow stuff down.

EDIT: uname -r would also be very good.
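If it helps anyone answering the two questions above in one go, here is a small sketch that prints the distro point release next to the kernel release. It assumes a distro that ships `/etc/os-release` with a `PRETTY_NAME` field; `parse_os_release` and `report` are hypothetical helper names. `platform.release()` returns the same string as `uname -r`.

```python
import platform


def parse_os_release(text: str) -> dict:
    """Parse KEY=value lines in the os-release format into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"')
    return fields


def report(os_release_text: str) -> str:
    """One line with the distro name/point release and the running kernel."""
    info = parse_os_release(os_release_text)
    distro = info.get("PRETTY_NAME", "unknown distro")
    return f"{distro}, kernel {platform.release()}"


if __name__ == "__main__":
    try:
        with open("/etc/os-release", encoding="utf-8") as fh:
            text = fh.read()
    except OSError:
        text = ""  # file missing: report() falls back to "unknown distro"
    print(report(text))
```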

FYI: I’ve tried virtually all kernels from 6.16.1 to 7.0.10 (on earlier versions TB/USB4 was completely unusable on Strix (AI3xx)). I strongly believe that the difference is because of the host platform (Phoenix-7xxx vs Hawk-8xxx & Strix-AI3xx).

In the logs on the kernel bug tracker, there are AER errors from the eGPU.

Comparing the lspci -xxxx output, the Windows driver is suppressing AER using the following:
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop-
MaxPayload 128 bytes, MaxReadReq 512 bytes

The Linux driver enables it using:
DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
MaxPayload 128 bytes, MaxReadReq 512 bytes
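To make the difference between the two dumps explicit, here is a minimal sketch that parses both DevCtl blocks and lists the flags that differ. The two dumps use different lspci naming styles for the error-reporting bits, so the `ALIASES` table maps the long names onto the short ones; that mapping and the helper names are my assumptions, not something from the bug report.

```python
# Long names from the "Report errors:" style mapped onto the short names.
ALIASES = {
    "Correctable": "CorrErr",
    "Non-Fatal": "NonFatalErr",
    "Fatal": "FatalErr",
    "Unsupported": "UnsupReq",
}

WINDOWS = """DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop-
MaxPayload 128 bytes, MaxReadReq 512 bytes"""

LINUX = """DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
MaxPayload 128 bytes, MaxReadReq 512 bytes"""


def parse_flags(text: str) -> dict:
    """Collect every whitespace-separated token ending in + or - as a flag."""
    flags = {}
    for token in text.split():
        if len(token) > 1 and token[-1] in "+-":
            name = token[:-1]
            flags[ALIASES.get(name, name)] = token[-1] == "+"
    return flags


def differing_flags(a: str, b: str) -> dict:
    """Map flag name -> (value in a, value in b) for flags that differ."""
    fa, fb = parse_flags(a), parse_flags(b)
    return {n: (fa[n], fb[n]) for n in fa if n in fb and fa[n] != fb[n]}


if __name__ == "__main__":
    for name, (win, lin) in sorted(differing_flags(WINDOWS, LINUX).items()):
        print(f"{name}: Windows={'+' if win else '-'} Linux={'+' if lin else '-'}")
```

Besides the four error-reporting enables, this also flags NoSnoop, which is off in the Windows dump and on in the Linux one.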

I don’t know how to disable it in DevCtl on Linux.
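One possible experiment, offered as an untested sketch and not as a fix: the four error-reporting enables are bits 0-3 of the PCIe Device Control register, which sits at offset 8 of the PCI Express capability, and `setpci` accepts a `value:mask` write. The code below only computes the masked value and prints a candidate command; it does not touch hardware. The device address `64:00.0` is a placeholder, the helper names are hypothetical, and the kernel's AER handling may well set the bits again afterwards.

```python
# Error-reporting enable bits in the PCIe Device Control register.
CORR_ERR = 1 << 0      # Correctable Error Reporting Enable
NONFATAL_ERR = 1 << 1  # Non-Fatal Error Reporting Enable
FATAL_ERR = 1 << 2     # Fatal Error Reporting Enable
UNSUP_REQ = 1 << 3     # Unsupported Request Reporting Enable
ERR_MASK = CORR_ERR | NONFATAL_ERR | FATAL_ERR | UNSUP_REQ  # 0x000f


def clear_error_reporting(devctl: int) -> int:
    """Return the 16-bit DevCtl value with the four enables cleared."""
    return devctl & ~ERR_MASK & 0xFFFF


def setpci_command(bdf: str) -> str:
    """Build (not run) a setpci write that clears only the masked bits."""
    # value:mask form: bits outside the mask are left as they are.
    return f"setpci -s {bdf} CAP_EXP+8.w=0000:{ERR_MASK:04x}"


if __name__ == "__main__":
    # 0x291f encodes the Linux flags quoted above, assuming the standard
    # DevCtl layout (RlxdOrd, ExtTag, NoSnoop set; MaxReadReq 512 bytes).
    print(hex(clear_error_reporting(0x291F)))
    # Placeholder address; substitute the device under test.
    print(setpci_command("64:00.0"))
```

The masked form is used so that the payload size, read request size, and the other flags stay untouched.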

Another difference is that Linux allocates I/O ports over PCIe, but Windows does not.

Linux also seems to enable the IOMMU, but Windows might not use it.

Linux also configures the DEG2 PCIe bridge differently.

I think the best way forward is to contact Intel for support regarding problems with the Intel chip in the DEG2.

My guess is that there is a bug in the chip hardware that they are working around in the Windows driver, but Intel may not have applied the same workaround to the Linux driver.


Attached some command outputs from my system in case that’s useful.

Nvidia SMI output:

Sun May 24 12:31:43 2026       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.126.09             Driver Version: 580.126.09     CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4070 Ti     Off |   00000000:64:00.0 Off |                  N/A |
|  0%   26C    P0             25W /  285W |       8MiB /  12282MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A            1431      G   /usr/lib/xorg/Xorg                        3MiB |
+-----------------------------------------------------------------------------------------+


Kernel: 6.17.0-14-generic #14~24.04.1-Ubuntu

LSHW:

H/W path                      Device           Class          Description
=========================================================================
                                               system         Laptop 13 (AMD Ryzen 7040Series) (FRANMDCP07)
/0                                             bus            FRANMDCP07
/0/0                                           memory         128KiB BIOS
/0/4                                           processor      AMD Ryzen 7 7840U w/ Radeon  780M Graphics
/0/4/5                                         memory         512KiB L1 cache
/0/4/6                                         memory         8MiB L2 cache
/0/4/7                                         memory         16MiB L3 cache
/0/11                                          memory         16GiB System Memory
/0/11/0                                        memory         8GiB SODIMM Synchronous Unbuffered (Unregistered) 5600 MHz (0.2 ns)
/0/11/1                                        memory         8GiB SODIMM Synchronous Unbuffered (Unregistered) 5600 MHz (0.2 ns)
/0/100                                         bridge         Advanced Micro Devices, Inc. [AMD]
/0/100/0.2                                     generic        Advanced Micro Devices, Inc. [AMD]
/0/100/2.2                                     bridge         Advanced Micro Devices, Inc. [AMD]
/0/100/2.2/0                  wlp1s0           network        MT7922 802.11ax PCI Express Wireless Network Adapter
/0/100/2.4                                     bridge         Advanced Micro Devices, Inc. [AMD]
/0/100/2.4/0                  /dev/nvme0       storage        WD_BLACK SN770 1TB
/0/100/2.4/0/0                hwmon3           disk           NVMe disk
/0/100/2.4/0/2                /dev/ng0n1       disk           NVMe disk
/0/100/2.4/0/1                /dev/nvme0n1     disk           1TB NVMe disk
/0/100/2.4/0/1/1              /dev/nvme0n1p1   volume         299MiB Windows FAT volume
/0/100/2.4/0/1/2              /dev/nvme0n1p2   volume         931GiB EXT4 volume
/0/100/3.1                                     bridge         Family 19h USB4/Thunderbolt PCIe tunnel
/0/100/4.1                                     bridge         Family 19h USB4/Thunderbolt PCIe tunnel
/0/100/4.1/0                                   bridge         Thunderbolt 80/120G Bridge [Barlow Ridge Hub 80G 2023]
/0/100/4.1/0/0                                 bridge         Thunderbolt 80/120G Bridge [Barlow Ridge Hub 80G 2023]
/0/100/4.1/0/0/0                               display        AD104 [GeForce RTX 4070 Ti]
/0/100/4.1/0/0/0.1                             multimedia     NVIDIA Corporation
/0/100/4.1/0/1                                 bridge         Thunderbolt 80/120G Bridge [Barlow Ridge Hub 80G 2023]
/0/100/4.1/0/2                                 bridge         Thunderbolt 80/120G Bridge [Barlow Ridge Hub 80G 2023]
/0/100/4.1/0/3                                 bridge         Thunderbolt 80/120G Bridge [Barlow Ridge Hub 80G 2023]
/0/100/8.1                                     bridge         Advanced Micro Devices, Inc. [AMD]
/0/100/8.1/0                  /dev/fb0         display        Phoenix1
/0/100/8.1/0.1                card0            multimedia     Rembrandt Radeon High Definition Audio Controller
/0/100/8.1/0.1/0              input10          input          HD-Audio Generic HDMI/DP,pcm=3
/0/100/8.1/0.1/1              input11          input          HD-Audio Generic HDMI/DP,pcm=7
/0/100/8.1/0.1/2              input12          input          HD-Audio Generic HDMI/DP,pcm=8
/0/100/8.1/0.2                                 generic        Family 19h (Model 74h) CCP/PSP 3.0 Device
/0/100/8.1/0.3                                 bus            Advanced Micro Devices, Inc. [AMD]
/0/100/8.1/0.3/0              usb1             bus            xHCI Host Controller
/0/100/8.1/0.3/0/2                             input          HDMI Expansion Card
/0/100/8.1/0.3/0/4                             generic        Goodix Fingerprint USB Device
/0/100/8.1/0.3/0/5                             communication  Wireless_Device
/0/100/8.1/0.3/1              usb2             bus            xHCI Host Controller
/0/100/8.1/0.4                                 bus            Advanced Micro Devices, Inc. [AMD]
/0/100/8.1/0.4/0              usb3             bus            xHCI Host Controller
/0/100/8.1/0.4/0/1                             multimedia     Laptop Camera
/0/100/8.1/0.4/1              usb4             bus            xHCI Host Controller
/0/100/8.1/0.5                                 multimedia     ACP/ACP3X/ACP6x Audio Coprocessor
/0/100/8.1/0.6                card1            multimedia     Family 17h/19h HD Audio Controller
/0/100/8.1/0.6/0              input13          input          HD-Audio Generic Headphone
/0/100/8.2                                     bridge         Advanced Micro Devices, Inc. [AMD]
/0/100/8.2/0                                   generic        Advanced Micro Devices, Inc. [AMD]
/0/100/8.2/0.1                                 generic        AMD IPU Device
/0/100/8.3                                     bridge         Advanced Micro Devices, Inc. [AMD]
/0/100/8.3/0                                   generic        Advanced Micro Devices, Inc. [AMD]
/0/100/8.3/0.3                                 bus            Advanced Micro Devices, Inc. [AMD]
/0/100/8.3/0.3/0              usb5             bus            xHCI Host Controller
/0/100/8.3/0.3/1              usb6             bus            xHCI Host Controller
/0/100/8.3/0.4                                 bus            Advanced Micro Devices, Inc. [AMD]
/0/100/8.3/0.4/0              usb7             bus            xHCI Host Controller
/0/100/8.3/0.4/0/1                             bus            USB2.0 Hub
/0/100/8.3/0.4/0/1/2                           bus            4-Port USB 2.0 Hub
/0/100/8.3/0.4/0/1/4                           generic        Thunderbolt 5 Docking Station
/0/100/8.3/0.4/1              usb8             bus            xHCI Host Controller
/0/100/8.3/0.4/1/1                             bus            USB3 HUB
/0/100/8.3/0.4/1/1/4                           bus            4-Port USB 3.0 Hub
/0/100/8.3/0.4/1/1/4/3                         generic        USB 10/100/1G/2.5G LAN
/0/100/8.3/0.4/1/1/4/4        scsi0            storage        USB to PCIE Bridge
/0/100/8.3/0.4/1/1/4/4/0.0.0  /dev/sda         disk           Generic
/0/100/8.3/0.5                                 bus            Pink Sardine USB4/Thunderbolt NHI controller #1
/0/100/8.3/0.6                                 bus            Pink Sardine USB4/Thunderbolt NHI controller #2
/0/100/14                                      bus            FCH SMBus Controller
/0/100/14.3                                    bridge         FCH LPC Bridge
/0/100/14.3/0                                  system         PnP device PNP0c02
/0/100/14.3/1                                  system         PnP device PNP0b00
/0/100/14.3/2                                  system         PnP device PNP0c02
/0/100/14.3/3                                  system         PnP device PNP0c01
/0/100/14.3/4                                  input          PnP device PNP0303
/0/101                                         bridge         Advanced Micro Devices, Inc. [AMD]
/0/102                                         bridge         Advanced Micro Devices, Inc. [AMD]
/0/103                                         bridge         Advanced Micro Devices, Inc. [AMD]
/0/104                                         bridge         Advanced Micro Devices, Inc. [AMD]
/0/105                                         bridge         Advanced Micro Devices, Inc. [AMD]
/0/106                                         bridge         Advanced Micro Devices, Inc. [AMD]
/0/107                                         bridge         Advanced Micro Devices, Inc. [AMD]
/0/108                                         bridge         Advanced Micro Devices, Inc. [AMD]
/0/109                                         bridge         Advanced Micro Devices, Inc. [AMD]
/0/10a                                         bridge         Advanced Micro Devices, Inc. [AMD]
/0/10b                                         bridge         Advanced Micro Devices, Inc. [AMD]
/0/10c                                         bridge         Advanced Micro Devices, Inc. [AMD]
/0/10d                                         bridge         Advanced Micro Devices, Inc. [AMD]
/1                                             power          Lithium Ion Battery
/2                                             power          UNKNOWN
/3                            input0           input          Lid Switch
/4                            input1           input          Power Button
/5                            input2           input          AT Translated Set 2 keyboard
/6                            input3           input          Video Bus
/7                            input4           input          FRMW0004:00 32AC:0006 Wireless Radio Control
/8                            input5           input          FRMW0004:00 32AC:0006 Consumer Control
/9                            input8           input          PIXA3854:00 093A:0274 Mouse
/a                            input9           input          PIXA3854:00 093A:0274 Touchpad
/b                            enx38052534cebc  network        Ethernet interface

dmidecode system:

# dmidecode 3.5
Getting SMBIOS data from sysfs.
SMBIOS 3.3.0 present.

Handle 0x0001, DMI type 1, 27 bytes
System Information
        Manufacturer: Framework
        Product Name: Laptop 13 (AMD Ryzen 7040Series)
        Version: A7
        Serial Number: xxxxxxxxxxxxxx
        UUID: xxxxxxxxxxxxx
        Wake-up Type: Power Switch
        SKU Number: xxxxxxxxxxxxxx
        Family: Laptop

Handle 0x000F, DMI type 12, 5 bytes
System Configuration Options
        Option 1: String1 for Type12 Equipment Manufacturer
        Option 2: String2 for Type12 Equipment Manufacturer
        Option 3: String3 for Type12 Equipment Manufacturer
        Option 4: String4 for Type12 Equipment Manufacturer

Handle 0x0026, DMI type 32, 20 bytes
System Boot Information
        Status: No errors detected

boltctl:

 ● Micro Computer (HK) Tech. Ltd. TBGAA
   ├─ type:          peripheral
   ├─ name:          TBGAA
   ├─ vendor:        Micro Computer (HK) Tech. Ltd.
   ├─ uuid:          34158780-0022-ec01-ffff-ffffffffffff
   ├─ generation:    USB4
   ├─ status:        authorized
   │  ├─ domain:     b7f13804-f11c-0729-ffff-ffffffffffff
   │  ├─ rx speed:   40 Gb/s = 2 lanes * 20 Gb/s
   │  ├─ tx speed:   40 Gb/s = 2 lanes * 20 Gb/s
   │  └─ authflags:  none
   ├─ authorized:    Sun 24 May 2026 11:31:29 UTC
   ├─ connected:     Sun 24 May 2026 11:22:50 UTC
   └─ stored:        Thu 05 Feb 2026 15:11:52 UTC
      ├─ policy:     iommu
      └─ key:        no

dmesg thunderbolt:

[    0.954695] ACPI: bus type thunderbolt registered
[   35.126346] thunderbolt 1-2: new device found, vendor=0x41f device=0xd002
[   35.126355] thunderbolt 1-2: Micro Computer (HK) Tech. Ltd. TBGAA
[   35.357107] thunderbolt 1-0:2.1: new retimer found, vendor=0x7fea device=0x1032
[   37.000789] usb 7-1.4: Product: Thunderbolt 5 Docking Station

dmesg USB4:

[    0.300522] ACPI: USB4 _OSC: OS supports USB3+ DisplayPort+ PCIe+ XDomain+
[    0.300525] ACPI: USB4 _OSC: OS controls USB3+ DisplayPort+ PCIe+ XDomain+
[    0.545478] usb usb4: We don't know the algorithms for LPM for this host, disabling LPM.
[    0.545503] usb usb4: New USB device found, idVendor=1d6b, idProduct=0003, bcdDevice= 6.17
[    0.545506] usb usb4: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[    0.545508] usb usb4: Product: xHCI Host Controller
[    0.545509] usb usb4: Manufacturer: Linux 6.17.0-14-generic xhci-hcd
[    0.545511] usb usb4: SerialNumber: 0000:c1:00.4
