Wifi crashes on Ubuntu 24.04 with syncthing running

Which Linux distro are you using? Ubuntu 24.04.04

Which kernel are you using? Linux 6.17.0-22-generic

Which BIOS version are you using? 03.18

Which Framework Laptop 13 model are you using? AMD Ryzen™ 7040 Series

My wifi connection crashes as soon as Syncthing, running on my Android phone, connects to the network. The output of journalctl shows the following errors:

wpa_supplicant[78715]: TDLS: Creating peer entry for f2:6d:2d:e4:d2:31
wpa_supplicant[78715]: wlan0: CTRL-EVENT-SIGNAL-CHANGE above=1 signal=-47 noise=9999 txrate=2161300
wpa_supplicant[78715]: TDLS: Dialog Token in TPK M1 220
wpa_supplicant[78715]: nl80211: kernel reports: key addition failed
wpa_supplicant[78715]: TDLS: Failed to set TPK to the driver
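
To see how often this failure occurs, the two error lines above can be counted in the journal. This is a minimal sketch: the function name `count_tdls_key_failures` is made up for illustration, and it assumes the messages appear exactly as in the excerpt above.

```shell
#!/bin/sh
# Hypothetical helper: count TDLS key-install failures in log text on stdin.
# The patterns are copied from the journal excerpt above.
count_tdls_key_failures() {
    grep -c -E 'nl80211: kernel reports: key addition failed|TDLS: Failed to set TPK to the driver'
}

# Example (assumes wpa_supplicant logs under the syslog identifier "wpa_supplicant"):
#   journalctl -b -t wpa_supplicant --no-pager | count_tdls_key_failures
```

Each failed TDLS setup in the excerpt produces both lines, so the count is roughly twice the number of failed attempts.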


I believe the kernel error is in fact an error in the mt7921e Wifi driver.

I have tried switching to iwd as the backend (which fixed this problem, since iwd does not support TDLS), but then I could not connect to the eduroam network at my university, so that is not an option. I also tried to disable TDLS for wpa_supplicant through /etc/wpa_supplicant/wpa_supplicant.conf, but it seems not to respect that setting, as the same error continues to occur.

Do you have any idea how to fix the driver or at least mitigate the problem?

Thanks in advance for your help!

Okay, after another long chat with an AI chatbot :grimacing: I got to this “solution”:

I could not find a way to disable TDLS with wpa_supplicant and NetworkManager. So I ended up creating a file that runs “set tdls_disabled 1” every time my network comes up.

sudo nano /etc/NetworkManager/dispatcher.d/99-disable-tdls
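#!/bin/sh
# NetworkManager dispatcher hook: $1 is the interface, $2 is the action.
INTERFACE=$1
ACTION=$2

# Disable TDLS in wpa_supplicant each time wlan0 comes up.
if [ "$INTERFACE" = "wlan0" ] && [ "$ACTION" = "up" ]; then
    /usr/bin/wpa_cli -i "$INTERFACE" set tdls_disabled 1 | logger -t "NM-TDLS-FIX"
    echo "TDLS disabled for $INTERFACE" | logger -t "NM-TDLS-FIX"
fi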

sudo chmod +x /etc/NetworkManager/dispatcher.d/99-disable-tdls
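
The hook can be exercised by hand without waiting for a reconnect. NetworkManager passes the interface name as the first argument and the action as the second, so calling the script the same way reproduces the `up` event. A minimal sketch: the helper name `run_tdls_hook` is hypothetical, and it most likely needs to run as root so that `wpa_cli` can reach the control socket.

```shell
#!/bin/sh
# Hypothetical helper: invoke a dispatcher hook the way NetworkManager would
# for an "up" event. $1 = hook path, $2 = interface name.
run_tdls_hook() {
    hook=${1:-/etc/NetworkManager/dispatcher.d/99-disable-tdls}
    iface=${2:-wlan0}
    if [ ! -x "$hook" ]; then
        echo "hook $hook is missing or not executable" >&2
        return 1
    fi
    "$hook" "$iface" up
}

# Example (as root), then read back what the hook logged:
#   run_tdls_hook
#   journalctl -t NM-TDLS-FIX -n 5 --no-pager
```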

I would love to report this bug, but I am not sure where to do so.

I think this is a bug in either the Ubuntu 24.04 kernel (for the MediaTek WiFi drivers) or in Ubuntu’s wpa_supplicant package, but it is definitely worth raising with Syncthing’s GitHub and PPA. But I’m glad you raised it here; I’ll watch out for Syncthing doing legitimate but weird things on WiFi when things go wrong.

Thanks, I reported it on Ubuntu’s Launchpad (launchpad.net). Syncthing can be excluded as the source, as it happens with GSConnect (a KDE Connect clone) as well.
I got a laptop today with the same software config but a different wifi module, and TDLS was activated without problems.

Spent a few days digging into this. The root cause is a 2013-era mac80211 bug: when wpa_supplicant calls NL80211_CMD_NEW_KEY to install the TDLS TPK, mac80211 returns -ENOENT because the peer hasn’t reached WLAN_STA_ASSOC yet (TDLS peers defer that transition). wpa_supplicant gives up, the half-set-up peer is left behind as a zombie, and on some drivers that cascades into the main BSS link breaking.

The check was added by Johannes Berg in 2013 as an FT roaming fix: https:[//]git[]kernel[]org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1626e0fa740dec8665a973cf2349405cdfeb46dc - the TODO comment directly above it describes the TDLS case but it never got fixed.
Sorry for that ugly link; as a new user I cannot attach more than two links in one reply.

Same symptom was filed as kernel bug 216421 on MT7922 in 2022 and closed UNREPRODUCIBLE: Bug 216421 - [regression] nl80211: kernel reports: key addition failed

Fix is one line in net/mac80211/cfg.c - add sta->sta.tdls to the existing exemption alongside sta->sta.epp_peer (example). To be honest, I’m not really sure if this is the right way to fix the problem, but at least it works.

Hey, I am amazed by your understanding of the matter and your ability to fix it.
Do you think your fix has any downsides?

As mentioned above, I thought it was primarily a firmware bug, as it worked with the same kernel and a different wifi module. But if I understand you correctly, it is also possible that the nl80211 kernel code handles the firmware differently. Have you filed that commit somewhere it will get tested and pushed upstream?

Hey Lukas, glad you find it useful.

On downsides: I can’t really test thoroughly because my only TDLS-initiating device is a Samsung Galaxy S25 Ultra, and it drops the link after a few seconds for reasons I haven’t been able to figure out (no root on the phone). What the patch does is narrow: it widens an existing -ENOENT exemption to also cover TDLS peers, so we stop rejecting key install in the specific case where a TDLS peer is still in the NONE state. The TODO comment directly above the guard from 2013 literally describes this case but it never got fixed, so in principle nothing should break.

Worth noting: the bug isn’t Syncthing/GSConnect specific, it triggers with anything that does P2P traffic over the same LAN, even plain ssh or adb over Wi-Fi.

On the firmware vs kernel question (for MT7925 specifically): the key-install rejection happens in mac80211’s ieee80211_add_key, which is generic kernel code, not driver/firmware. What differs is what happens after the rejection: wpa_supplicant gives up, the half-set-up TDLS peer sits in a zombie state, and the MT7925 firmware reacts to that zombie progressively worse over time. In my case after frequent setup-and-drop cycles the tx/rx bitrate eventually drops to 6 Mbps and the only thing that fixes it is reconnecting to the AP. There’s a recent linux-firmware update (commit) that improves the situation noticeably but doesn’t fully solve it.
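
The bitrate collapse described above can be caught by polling the link. A rough sketch, assuming `iw dev <interface> link` prints a line of the form `tx bitrate: 1200.9 MBit/s ...`; the helper names and the 6 MBit/s default threshold are illustrative choices, not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: print the tx bitrate (in MBit/s) from `iw ... link`
# output on stdin. Only the first "tx bitrate:" line is used.
tx_bitrate_mbit() {
    awk '/tx bitrate:/ { print $3; exit }'
}

# Hypothetical helper: succeed if rate $1 is at or below threshold $2
# (default 6). An empty rate is treated as 0 and therefore as degraded.
bitrate_degraded() {
    awk -v r="$1" -v t="${2:-6}" 'BEGIN { exit !(r + 0 <= t + 0) }'
}

# Example: warn when the link has fallen back to the lowest rate.
#   rate=$(iw dev wlan0 link | tx_bitrate_mbit)
#   bitrate_degraded "$rate" && echo "tx bitrate degraded: $rate MBit/s"
```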

On upstream: I’ve never submitted a kernel patch before, so I’d want a couple more people whose setups actually benefit from it (not just mine) to confirm before trying to send it to linux-wireless. Happy to help with testing in the meantime. Out of curiosity, which Wi-Fi card was in there when TDLS was working for you?

Hi,

I’m not a Framework user, but I have exactly the same issue after switching from a self-compiled driver (which did not support TDLS) to the rtw88 kernel driver. I’ve been searching for weeks, and the only “solutions” I found were switching to the iwd backend, which also doesn’t support TDLS, or downgrading the driver. Really nice to see someone who has the skill to pin it down. I found many posts on the web about the same problem, but most end either without a solution or with the switch to iwd.

My test setup is the base S25 as the TDLS client and a small home server running over wifi (Realtek 8821cu). SSH, web, or nearly anything else triggers TDLS, and I end up with a broken connection and have to restart wifi on the server.

Some links to people with the same issue:

https://www.mail-archive.com/ubuntu-bugs@lists.ubuntu.com/msg6271568.html
https://www.reddit.com/r/linuxquestions/comments/1s6xib3/wifi_disconnects_after_smb_share_is_accessed/
https://discussion.fedoraproject.org/t/fedora-41-kde-lose-internet-on-pc-when-accessing-services-from-my-phone-anyone-else-had-this-issue/142484
https://discussion.fedoraproject.org/t/network-gets-cut-off-when-connecting-to-android-14-wpa-supplicant/151557
https://lists.archlinux.org/archives/list/arch-general@lists.archlinux.org/thread/JDOV6PB253UJA3H66YDGJCFNXHWV6OBG/
https://discussion.fedoraproject.org/t/fedora-cockpit-samba-share-cant-connect-to-them-after-initial-connection/172832
https://community.home-assistant.io/t/network-issues-ha-looses-connection-for-few-seconds/796816

Okay, thank you for the explanation of how the kernel would act differently!

What I mean is that the initial key rejection did not happen on the Intel wifi module with the iwlwifi driver. Or at least there is no error thrown by mac80211. It just logs TDLS: Creating peer entry … TPK set, or something like that, right on the first try.

EDIT: I don’t think this new firmware will find its way into my Ubuntu 24.04 install, or will it? I am still on version 2505Something.

To be honest, I don’t even know what’s going on in Ubuntu, or how to properly update linux-firmware other than by rebuilding the package. For now, I’d recommend just trying the kernel patch.

After several rounds of debugging the firmware, kernel, and sniffing the surrounding network traffic, I managed to get TDLS working with the Samsung Galaxy S25 Ultra!

I’ll try to clean up the patches and publish them soon.

Here’s roughly what it looks like now:

❯ iw dev wlp4s0 station dump
Station aa:bb:cc:dd:ee:ff (on wlp4s0)
authorized:	yes
authenticated:	yes
associated:	yes
preamble:	long
WMM/WME:	yes
MFP:		yes
TDLS peer:	no
inactive time:	2344 ms
rx bytes:	26843082
rx packets:	90255
tx bytes:	236745103
tx packets:	204918
tx retries:	7059
tx failed:	0
beacon loss:	0
rx drop misc:	714
signal:  	-41 [-42, -47] dBm
signal avg:	-41 [-41, -49] dBm
tx bitrate:	1200.9 MBit/s 80MHz HE-MCS 11 HE-NSS 2 HE-GI 0 HE-DCM 0
tx duration:	14913024 us
rx bitrate:	1200.9 MBit/s 80MHz HE-MCS 11 HE-NSS 2 HE-GI 0 HE-DCM 0
rx duration:	19864473 us
last ack signal:-37 dBm
avg ack signal:	-37 dBm
airtime weight: 256
DTIM period:	1
beacon interval:100
short slot time:yes
connected time:	9772 seconds
associated at [boottime]:	39449.609s
associated at:	1777726376627 ms
current time:	1777736148828 ms

Station ff:ee:dd:cc:bb:aa (on wlp4s0)
authorized:	yes
authenticated:	yes
associated:	yes
preamble:	long
WMM/WME:	yes
MFP:		no
TDLS peer:	yes
inactive time:	0 ms
rx bytes:	7903555758
rx packets:	5145532
tx bytes:	188713662
tx packets:	2188161
tx retries:	38117
tx failed:	0
beacon loss:	0
rx drop misc:	3
signal:  	-53 [-55, -57] dBm
signal avg:	-53 [-55, -57] dBm
tx bitrate:	1200.9 MBit/s 80MHz HE-MCS 11 HE-NSS 2 HE-GI 0 HE-DCM 0
tx duration:	18109325 us
rx bitrate:	864.8 MBit/s 80MHz HE-MCS 8 HE-NSS 2 HE-GI 0 HE-DCM 0
rx duration:	115579005 us
last ack signal:-47 dBm
avg ack signal:	-47 dBm
airtime weight: 256
DTIM period:	1
beacon interval:100
short slot time:yes
connected time:	163 seconds
associated at [boottime]:	49058.184s
associated at:	1777735985201 ms
current time:	1777736148828 ms
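
To pick out only the TDLS peers from a dump like the one above, the `Station` and `TDLS peer:` lines are enough. A small sketch; the function name `tdls_peers` is hypothetical, and it assumes the field layout shown above.

```shell
#!/bin/sh
# Hypothetical helper: print the MAC address of every station whose
# "TDLS peer:" field is "yes", reading `iw dev <if> station dump` from stdin.
tdls_peers() {
    awk '
        $1 == "Station"               { mac = $2 }    # remember the current station
        /TDLS peer:/ && $NF == "yes"  { print mac }   # emit it if it is a TDLS peer
    '
}

# Example:
#   iw dev wlp4s0 station dump | tdls_peers
```

For the dump above this would print only the second station, since the first one is the AP link with `TDLS peer: no`.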

For those who want to test the fix: Commits · ElXreno/linux · GitHub
Don’t mind the branch name, I just find it easier this way.

Thanks for the effort!

If I do not compile the kernel myself I can’t test these patches, right?

Edit: sorry for asking noob questions :see_no_evil_monkey:

Yes, that’s right.

In principle, you could even just recompile certain modules, but the easiest way is to take the last two commits from my repository and apply them to the entire kernel.

At this stage, my patches haven’t been accepted into the kernel yet because I disable hardware encapsulation/decapsulation acceleration, but I haven’t noticed any significant performance degradation, and it’s better than a completely broken TDLS.
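
For anyone who wants to try this, one way to carry the last two commits over to another kernel tree is to export them as patch files and apply them in order. This is only a sketch under assumptions: the helper name and both paths are placeholders, the patches may not apply cleanly to a different kernel version, and building and installing the patched kernel is a separate, distro-specific step.

```shell
#!/bin/sh
# Hypothetical helper: export the last N commits of one git tree as patches
# and apply them to the working tree of another.
# $1 = source repo (with the fix checked out), $2 = target tree, $3 = N (default 2)
apply_last_commits() {
    src=$1
    dst=$2
    n=${3:-2}
    patches=$(mktemp -d) || return 1
    # format-patch numbers the files (0001-..., 0002-...) in commit order.
    git -C "$src" format-patch -"$n" -o "$patches" HEAD >/dev/null || return 1
    for p in "$patches"/*.patch; do
        git -C "$dst" apply "$p" || return 1
    done
}

# Example (placeholder paths):
#   apply_last_commits ~/src/linux-with-fix ~/src/linux 2
```

`git apply` only changes the working tree and does not create commits, so the result can be inspected with `git diff` before building.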