What PCIE network card for 40+ GB/s

Hi guys, I’m looking for suggestions on how I can cluster multiple boards for 40+ Gbps network speed. I’ve heard of InfiniBand and older network cards, but I see they are PCIe 3.0 x16. Are they compatible?

What do you think is a good solution? I would appreciate some models/links

Thanks

Something like this: https://www.fs.com/fr/products/147578.html
(there are some others…)
It is a dual 25Gb SFP28 card that has a PCIe 4.0 interface…

but:

  • it needs aggregation across the 2 ports to get more than 40+ Gbps
  • we need to find a way to connect it to the PCIe x4 slot (I don’t know if the available power is sufficient; it is not for most GPU cards)
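As a sanity check on the lane math (my own back-of-the-envelope numbers, not from a datasheet): PCIe 4.0 runs 16 GT/s per lane with 128b/130b encoding, so a x4 link tops out around 63 Gbps, which should in theory be enough for a dual 25G card at full aggregate rate:

```shell
# PCIe 4.0: 16 GT/s per lane, 128b/130b encoding
lanes=4
gt_per_lane=16
usable=$(( lanes * gt_per_lane * 128 / 130 ))   # usable Gbps after encoding overhead
echo "PCIe 4.0 x${lanes}: ~${usable} Gbps usable"   # dual 25G needs 50 Gbps aggregate
```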

If you are looking for a 40 Gbps NIC you will need to look at options from Mellanox (now Nvidia), Xilinx, or Intel. I am partial to the old Mellanox ConnectX brand as I used them in old high-performance compute clusters. I can’t imagine how expensive they would be these days. Not sure if there are less expensive brands.

I keep looking around for specifics but nobody seems to have tried something that works. Generalities do not help. So, does anyone have a working solution for at least 20+ Gbps?

Let’s look for a concrete solution. I know the generalities myself; I’ve looked at all the YouTube videos and forums.

Looking forward to someone that knows :).

Doing more research, I found it confusing at least:

  1. Can we connect the USB-C ports with 40-gig-compatible USB-C cables? –> no network card, just the normal USB-C ports, and configure Linux to make it work?

  2. Buy a PCIe 4.0 card like this one for both motherboards in the cluster? USB4 PCIe Gen4 Card|Motherboards|ASUS Global
    (1 for each motherboard in the cluster, connect through USB-C and then get 20+ Gbps? –> is this necessary if we already have the USB-C ports?)

  3. An M.2-to-Thunderbolt adapter for both motherboards –> then connect them together?

I’m amazed no one has shared a cluster yet (I ordered 2 motherboards and am waiting for them, so I am preparing to cluster them for 20+ Gbps speeds, but no one clearly shows or explains how to do it).

Come on guys :smiley: . Let’s come up with something.

Yes, you can. You need a Thunderbolt 4/5 cable, but Linux has support for Thunderbolt networking.
You don’t need a separate adapter. The USB4 ports on the back also let you use this networking stack. In theory you shouldn’t need to do anything besides connecting the two systems with a Thunderbolt cable (at least for me it was plug and play).
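If you want to confirm Thunderbolt networking is actually up before going further, this is roughly what I’d check (module and interface names are the usual ones on recent kernels, but may differ on your distro):

```shell
# The driver is usually auto-loaded when the cable is plugged in
sudo modprobe thunderbolt-net

# A thunderbolt0 interface should appear once both ends are connected
ip link show | grep -i thunderbolt

# If boltctl is installed, it lists the connected peer and its authorization state
boltctl list
```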

I did a lot of research on this. I think the main restrictions (not to say they can’t be overcome) are

  1. Cable quality – you need an ACTIVE cable
  2. Network instability – you need to be REALLY careful, since TB can cut the power and doesn’t work well coming out of suspend
  3. It’s kind of oldish (it’s a feature of TB3)

Having said that, once I have another machine that can do it, I think it might be fun. I would recommend using IPv6. There was a person who did it with Proxmox VE and found that IPv6 was the most stable.
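Before any bonding or static addressing, a quick first connectivity test over the Thunderbolt link can use IPv6 link-local addresses, which need zero configuration (the interface name thunderbolt0 is assumed here):

```shell
# Ping the all-nodes multicast group on the interface; the peer should answer
ping -6 -c 3 ff02::1%thunderbolt0

# Show our own link-local address (the peer can ping it back with the %<iface> syntax)
ip -6 addr show dev thunderbolt0 scope link
```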

Thanks. Finally something actionable.

Got 2 boards with 600 W Platinum PSUs.

Do you have a recommendation for a specific active thunderbolt cable?

Network wise, I got a 1gbps internet connection.

Looking forward as I want to do a proper cluster.

I am rather lucky since I work at a place that has TONS of old CalDigit docks. I use the 3-foot cable from CalDigit. I suspect any old TB3 cables from Apple (not sure about this) might work too. DO NOT use a charging cable or generic ones. The important thing is to see the Thunderbolt “bolt” logo and have something reasonable (Intel certified). Please note that I haven’t actually done this yet! Here is a product Google recommended (based on a query I did). I suspect a search on eBay for old active TB4 (not sure about TB3!) cables might be worth a try if you are strapped for cash (which I suspect you probably aren’t since you have a Framework!). The logo is the most important thing; it means it’s certified (keep it short too, long = instability). People forget that 40 Gbps is a PCIe interface in a cable!


I use this for connecting my Minisforum PCs as well as Framework Desktop:

And yes, @Thomas_Munn is right, I didn’t mention that it should be an active cable in my original response, my bad. I kind of forgot that charging-only cables exist, to be honest

Though, from my experience so far, using llama.cpp + RPC + Thunderbolt networking didn’t really give me much of an improvement over a gigabit Ethernet connection.

For now I have not seen Thunderbolt speed on the AI Max above 10 Gb/s… I don’t know if we can really get 40 Gb/s for data (or is that only available for display?)

Also, llama.cpp does not need a high-speed network for RPC. If you have 2 PCs, the layers are split across the two machines, resulting in minimal data transfer. There’s no tensor-parallel processing, only a serial connection of the layers. The biggest transfer, if I understand correctly, is sending the graph to be calculated.

To truly leverage both machines, the calculations need to be parallelized, not just distributed across the layers.
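For reference, the layer-split RPC setup being discussed looks roughly like this with llama.cpp’s rpc-server and the --rpc flag (binary names and flags depend on your build; the addresses and model path are placeholders):

```shell
# On machine B (the worker): expose its compute backend over TCP
./rpc-server --host 0.0.0.0 --port 50052

# On machine A: run inference, offloading layers to the remote backend too
./llama-cli -m model.gguf --rpc 192.168.1.2:50052 -ngl 99 -p "Hello"
```

Since mostly layer activations cross the wire at each step, link bandwidth matters much less here than latency, which would match the earlier observation that Thunderbolt barely beat gigabit Ethernet.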

Now for a high-speed network, the best I can find is to use a dual 25 Gb/s network (i.e. 50 Gb/s with aggregation).

The question is whether it can work.
Dual 25 Gb/s NICs exist from Intel (E810) or NVIDIA. They use a PCIe 4.0 x8 link, but it looks like they can run at full speed on PCIe 3.0 x8 or PCIe 4.0 x4… To use one we would need a PCIe x4 to PCIe x8 riser cable… and test whether it works. Some tests with 16x GPU setups suggest it did not work because of too little power from the motherboard (if I understand correctly)… Now, a standard PCIe x4 slot has a 25 W spec, and the Intel E810 looks like it draws ~20 W… so it may work…
Next we need to figure out how to configure the card for aggregation/RDMA… For now I am not sure what we need to do…
What we need is someone to test it… and a Linux network expert to configure it… if possible…

And in the end, for llama.cpp, we would need to add tensor parallelism with MPI to use RDMA’s low-latency functions… :wink:

(Intel E810-XXVAM2 Ethernet network card, dual-port 25G SFP28, PCIe 4.0 x8, comparable to the Intel E810-XXVDA2, low profile and full height – FS.com Europe + 0.5 m (2 ft) Intel-compatible 25G SFP28 passive Twinax direct-attach copper cable – FS.com Europe + GLOTRENDS 100 mm PCIe 4.0 x4 riser cable for M.2, WiFi, FireWire, USB, sound cards, etc.: Amazon.fr: Computers ???)

Thanks! Phew, so now we’re looking at an active Thunderbolt cable. I found this one: Cable Matters [Intel Certified] 40Gbps Thunderbolt 4 Cable 0.3 m with 8K Video and 240 W Charging, which is Intel certified.

Well, money-wise I’m open to buying whatever needs to be bought to get 20+ Gbps xD. Price does not really matter. I’m in it for the technology.

@entropy4936 thanks man, I’ll get that one.

@Djip So one strategy that needs to be tested is to get a PCIe x4-to-x8 riser and a card like the E810.

So you basically need 2 of those cards and 2 of those risers?

You already have a 40 Gbps USB4 interface on the back of the thing.

Yeah, that’s what I was trying to say in my original message here, but I guess it didn’t come across properly - I’m not using any additional cards or adapters; I have the cable connecting directly to the USB4 port on the back.

yes, and 2 SFP28 cable connections :wink:

There are 2 of them :wink:
But can you share a Linux config for TCP over USB4 at 40 Gb/s? So far I have not seen TCP benchmark results above 10 Gb/s for this AMD APU.
What bandwidth and latency are you getting? Were you able to aggregate the two ports?

Thanks! I just bought a TB4 USB-C cable from Cable Matters. Will test and report back.

This is AI generated, but I actually have some networking knowledge, so I did guide it. Please let me know how well it works! I would recommend using IPv6 for the addressing since it uses SLAAC to configure itself. AI solution follows.


To get raw speed (40Gbps+) between two Strix Halos, you must set the IOMMU to “Passthrough” mode. This allows the USB4 controller to write directly to RAM without the CPU checking every permission bit.

Run on BOTH Strix Halos:

  1. Open Grub config:

    sudo nano /etc/default/grub
    
    
  2. Find GRUB_CMDLINE_LINUX_DEFAULT and append:

    iommu=pt amd_iommu=on
    
    
  3. Update Grub and Reboot:

    sudo update-grub  # or 'grub-mkconfig -o /boot/grub/grub.cfg' on Arch/Alpine
    sudo reboot
    
    
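After the reboot it’s worth verifying the flags actually took effect before blaming the network (simple checks, nothing exotic):

```shell
# The running kernel command line should now contain both flags
grep -o 'amd_iommu=on' /proc/cmdline
grep -o 'iommu=pt' /proc/cmdline

# The kernel log should report the IOMMU in passthrough mode
sudo dmesg | grep -i iommu
```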

2. The “Forgiving” Kernel Tuning

The Strix Halo is fast, but balance-rr is chaotic. You must tune the TCP stack to accept out-of-order IPv6 packets without panicking.

Run on BOTH computers:

# Allow massive packet reordering (Essential for Bond Mode 0)
sudo sysctl -w net.ipv4.tcp_reordering=127

# Use BBR (Better for high-throughput/variable-latency links)
sudo sysctl -w net.core.default_qdisc=fq
sudo sysctl -w net.ipv4.tcp_congestion_control=bbr

# Maximize Memory Buffers (Crucial for >25Gbps)
sudo sysctl -w net.core.rmem_max=134217728
sudo sysctl -w net.core.wmem_max=134217728
sudo sysctl -w net.ipv4.tcp_rmem="4096 87380 67108864"
sudo sysctl -w net.ipv4.tcp_wmem="4096 65536 67108864"

(Note: I doubled the buffer sizes from the previous guide because two Strix Halos can actually fill them.)
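One thing the guide above leaves out: sysctl -w settings are lost on reboot. A sketch for making the same tuning persistent (the filename is arbitrary):

```shell
# Write the tuning values to a sysctl drop-in so they survive reboots
sudo tee /etc/sysctl.d/99-tb-cluster.conf >/dev/null <<'EOF'
net.ipv4.tcp_reordering = 127
net.core.default_qdisc = fq
net.ipv4.tcp_congestion_control = bbr
net.core.rmem_max = 134217728
net.core.wmem_max = 134217728
net.ipv4.tcp_rmem = 4096 87380 67108864
net.ipv4.tcp_wmem = 4096 65536 67108864
EOF

# Apply all sysctl config files now
sudo sysctl --system
```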


3. The IPv6-Only Bond Configuration

This creates a single aggregated pipe (bond0) that stripes packets across both cables.

Run on BOTH computers:

# 1. Reset links
sudo ip link set thunderbolt0 down
sudo ip link set thunderbolt1 down

# 2. Create Bond (Mode 0 = Round Robin Striping)
sudo ip link add bond0 type bond mode balance-rr miimon 100

# 3. Enable Jumbo Frames (Required for CPU efficiency)
sudo ip link set bond0 mtu 65528

# 4. Enslave physical ports
sudo ip link set thunderbolt0 master bond0
sudo ip link set thunderbolt1 master bond0

# 5. IPv6 only: simply leave bond0 without an IPv4 address
#    (Linux has no disable_ipv4 sysctl; skipping IPv4 configuration is enough)

# 6. Bring everything up
sudo ip link set thunderbolt0 up
sudo ip link set thunderbolt1 up
sudo ip link set bond0 up
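To confirm the bond actually came up with both slaves (again, just standard checks, nothing specific to the Strix Halo):

```shell
# Mode, MII status, and the list of enslaved interfaces
cat /proc/net/bonding/bond0

# The jumbo MTU should have propagated to the slaves
ip link show thunderbolt0 | grep -o 'mtu [0-9]*'
```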


4. Assigning Addresses (IPv6 ULA)

We use fd00:: (Unique Local Address) so it behaves like a static private LAN.

On Strix Halo A:

sudo ip -6 addr add fd00::1/64 dev bond0

On Strix Halo B:

sudo ip -6 addr add fd00::2/64 dev bond0
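With the addresses assigned, a minimal end-to-end check from Strix Halo A (fd00::2 being machine B as set above):

```shell
# Reachability over the bond
ping -6 -c 3 fd00::2

# Confirm the kernel routes this traffic via bond0
ip -6 route get fd00::2
```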

I’m waiting for my 2nd board for now… batch 18…
but if someone can test it, I’d be happy to know what you get with this config.

Do you have a “bench” command to use?
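On the bench question: iperf3 is the usual tool for this kind of link (addresses follow the ULA scheme from the guide above; parallel streams help saturate a 40 Gbps path):

```shell
# On machine B: start the server
iperf3 -s

# On machine A: 4 parallel TCP streams for 10 seconds
iperf3 -c fd00::2 -P 4 -t 10

# Rough latency numbers
ping -6 -c 100 -i 0.01 fd00::2
```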

Waiting for batch 18 too. Will report back when I finish the cluster
