[TRACKING] WiFi drops out (especially when downloading a lot) but fixes instantly by reconnecting (AMD)

Hi,

I have seen similar symptoms but in a different scenario.
A laptop wifi client and an AP that has a mediatek wifi chip in it.
At times I see packets being lost over wifi, and a manual reconnect of the wifi link on the laptop restores normal operation.
There are no messages in the syslog logs when the problem occurs.
Also, power cycling the AP, as opposed to just soft rebooting it, seems to make the problem go away for longer.
Summary:
Take both the Laptop wifi and the AP wifi chips into account when diagnosing a problem like this. So, the above people mentioning problems should also mention which wifi access point and software version the AP is using, in addition to the Laptop wifi chip being used.

Also, due to various bugs in the Linux kernel wifi code, it is possible that an extra wifi client on the AP can affect this one. So tests should be done with all other wifi devices switched off, or with mobile phone wifi in airplane mode.

Okay folks, I now have access to a WiFi 6E router (mesh, actually, adding even more points to fail and track down for repro). Early testing, zero issues thus far. Some interesting findings here, though.

My eero pro 6E has a feature called steering. Great for coverage, but not always the best for speed.

  • On a lark, I disabled steering. My speed climbed an additional 45 Mbps just with this change.

  • It bounced itself to the 5 GHz range and gained another 35 Mbps. This was tested against multiple speed test services, with the numbers averaged. Clearly it decided to switch things up.

  • Tested while running a 4K video at 2160p the entire time to watch for drops.

  • Kernel and distro : 6.8.0-31 generic Ubuntu 24.04.

  • MediaTek MT7922 WIFI card, Framework Laptop 13 AMD Ryzen 7040 Series.

  • WPA3 Set on router.

  • Any and all QoS or steering is OFF.

Going to sit on this config while working tomorrow, to see if this sticks or if I was just lucky.

I have AX210s on track for testing this week as well, to compare my findings, Ubuntu 24.04, same kernel as described above.

So what I need now is this:

  • Easy-to-duplicate workflows I can bang against. I need to be able to repro this, but I have limited cycles available, so quick "paste this and run it" situations are ideal. If it’s something with very specific steps, provide them and I can set aside some time to repro. But I need hyper specifics: actions, packages, kernels used, ideally sticking within the Fedora/Ubuntu space please.
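To make the "hyper specifics" easier to paste into a reply, here is a minimal sketch of a report helper. The function name `wifi_env_report` is made up for this post, and it assumes `lspci` (pciutils) and `iw` may or may not be installed, so each is checked before use. It only reads information and changes nothing.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of any Framework tooling): print the
# kernel, distro, firmware package, and Wi-Fi card details for a report.
wifi_env_report() {
  echo "kernel: $(uname -r)"

  # /etc/os-release is present on Fedora and Ubuntu; PRETTY_NAME is the distro string.
  if [ -r /etc/os-release ]; then
    echo "distro: $(. /etc/os-release && echo "$PRETTY_NAME")"
  fi

  # Firmware package version: rpm on Fedora, dpkg-query on Ubuntu.
  if command -v rpm >/dev/null 2>&1; then
    rpm -q linux-firmware 2>/dev/null || true
  fi
  if command -v dpkg-query >/dev/null 2>&1; then
    dpkg-query -W linux-firmware 2>/dev/null || true
  fi

  # Wi-Fi cards show up as "Network controller"; -k adds the kernel driver in use.
  if command -v lspci >/dev/null 2>&1; then
    lspci -nnk | grep -i -A3 'network controller' || true
  fi

  # Current power save state for every wireless interface.
  if command -v iw >/dev/null 2>&1; then
    iw dev | awk '$1 == "Interface" { print $2 }' | while read -r ifc; do
      echo "$ifc: $(iw dev "$ifc" get power_save 2>/dev/null)"
    done
  fi
  return 0
}
```

Running `wifi_env_report` and pasting the output, along with the router model and its firmware version, should cover most of what is being asked for here.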

Okay, some testing against WiFi 6E (as I’ve never been able to repro on WiFi 5), Framework 16, default MediaTek card, power save off.

Using Bazzite as my control for testing, as I can control my default environment through their devs. As I mentioned previously, I have Bazzite using a previous firmware release (per my request for FW images). First call of the day was a video call over G Meet.

Used Brave and a patched kernel (which makes for a nice control, as it is known to be highly optimized). These are Bazzite defaults that I know are very good, which is why I am testing against it: it’s basically a highly customized F40 Atomic image.

Butter smooth, absolutely delightful over 4 mesh hops. Bazzite’s lead has had the same experience in calls as I have had, FW 16, MediaTek.


Fedora 39 MediaTek testing:
Now the next test will be Fedora 39, fully updated including the latest firmware (which differs from Bazzite). I will again be testing against Brave, and this time against Chrome as well.

Fedora 40 MediaTek testing:
Now the next test will be Fedora 40, fully updated including the latest firmware (which differs from Bazzite). I will again be testing against Brave, and this time against Chrome as well.

F39/F40 use the same MediaTek firmware release I believe, when fully updated. If by some chance I am still unable to repro the call drop issue or the drop issue pushing large files back and forth, those affected here can compare notes.

AX210 testing:
Tomorrow. This is my next phase of focus. Same 6E network. Ubuntu 24.04 and Fedora 40.

(Image has IP data removed, it is present)

Now testing Fedora 39 on FW 16, current MediaTek firmware provided by dnf. Kernel is 6.8.9.

Wi-Fi 6E.
WPA3 enabled.
No QoS or “client steering” enabled on the router. <—this is important
Bluetooth on, Bluetooth off.
3.03 FW 16 BIOS.

Testing has been a mix of Google Meet video calls on the Brave browser while pushing 4k YT scenic videos for hours at a time in the background.


Tomorrow, I will do some testing with the Intel AX210 (albeit not officially supported by AMD) on AMD Ryzen boards. Same basic principles in testing as above.

So it’s looking good on Bazzite?

Yes, and I ran F39 last night: 10-hour 4K video, no drops there either.

Okay, F39 is also good to go. Fully updated, of course.

Testing F39/F40, the AX210 card is seeing similar speeds to MediaTek, almost a match. This is on Wi-Fi 6E, 6 GHz confirmed as well.

Folks experiencing drops, please run this script for a full 60 minutes, then attach both logs it creates and provide them to support along with a link to this forum post.

Please use this: GitHub - FrameworkComputer/network-tester: MediaTek/Intel Wi-Fi Drop Tester

For AX210 on FW 16 (not officially supported), using the above script, I captured two drops and their time stamps while video chatting for a full hour. Going to compare this with the dmesg/journal now. Below are the details of one of them.

Without my script, I’d have missed this, as I was multitasking and these drops were only 10 seconds long.
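For anyone curious how short drops like this can be caught at all, here is a minimal sketch of a link sampler. This is not the network-tester script itself; it only assumes the general approach of recording `iw dev <iface> link` with a time stamp at a fixed interval. The function names, the 10-second default, and the log file name in the usage line are all placeholders for illustration.

```shell
#!/usr/bin/env bash
# Append one time-stamped sample of the current link state to a log file.
log_link_sample() {
  local iface="$1" logfile="$2"
  {
    echo "---------------------------"
    echo "$(date):"
    # Prints "Not connected." when there is no association.
    iw dev "$iface" link
  } >> "$logfile" 2>&1
}

# Take COUNT samples, INTERVAL seconds apart (defaults: 360 samples, 10 s).
watch_link() {
  local iface="$1" logfile="$2" interval="${3:-10}" count="${4:-360}"
  local i=0
  while [ "$i" -lt "$count" ]; do
    log_link_sample "$iface" "$logfile"
    i=$((i + 1))
    if [ "$i" -lt "$count" ]; then
      sleep "$interval"
    fi
  done
  return 0
}

# Example (one hour at 10-second spacing, interface name is an assumption):
#   watch_link wlp5s0 iw_sample.log 10 360
```

A fixed interval means a drop shorter than the interval can still slip between two samples, so a smaller interval trades log size for coverage.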

This was during a video call. Using a fake SSID for these posts.


Here is the clipped area my script caught for iw_logfile, giving me a time stamp.

---------------------------
Thu May 16 01:54:45 PM PDT 2024:
Connected to d8:8e:d4:7d:2e:c8 (on wlp5s0)
	SSID: MEH
	freq: 6135.0
	RX: 1338030973 bytes (974265 packets)
	TX: 804649852 bytes (812745 packets)
	signal: -51 dBm
	rx bitrate: 1921.5 MBit/s 160MHz HE-MCS 9 HE-NSS 2 HE-GI 0 HE-DCM 0
	tx bitrate: 1729.6 MBit/s 160MHz HE-MCS 8 HE-NSS 2 HE-GI 0 HE-DCM 0
	bss flags: short-slot-time
	dtim period: 2
	beacon int: 100
---------------------------
Thu May 16 01:54:55 PM PDT 2024:
Not connected.
---------------------------
Thu May 16 01:55:05 PM PDT 2024:
Connected to e8:d3:eb:b8:d2:c7 (on wlp5s0)
	SSID: MEH
	freq: 5180.0
	RX: 682332 bytes (894 packets)
	TX: 550518 bytes (751 packets)
	signal: -64 dBm
	rx bitrate: 432.4 MBit/s 80MHz HE-MCS 8 HE-NSS 1 HE-GI 0 HE-DCM 0
	tx bitrate: 576.4 MBit/s 80MHz HE-MCS 5 HE-NSS 2 HE-GI 0 HE-DCM 0
	bss flags: short-slot-time
	dtim period: 2
	beacon int: 100
	
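With an hour of samples, hunting for the "Not connected." entries by eye gets tedious. Below is a rough sketch that prints the time stamp of every disconnected sample, assuming the log layout shown above (a line of dashes, then a time stamp line ending in a colon, then the `iw` output). `find_drops` is a name invented for this example.

```shell
#!/usr/bin/env bash
# Print the time stamp of every sample that reported "Not connected.".
# Reads the log from the given file(s), or from stdin when none are given.
find_drops() {
  awk '
    # A separator line is followed by the time stamp line for that sample.
    /^-+$/ {
      if ((getline ts) > 0) sub(/:$/, "", ts)
      next
    }
    /Not connected\./ { print "drop at " ts }
  ' "$@"
}
```

Usage would be something like `find_drops iw_logfile`, where the file name is whatever the logging script wrote.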

And with this and a little journal action:

journalctl --since "2024-05-16 13:50:00" --until "2024-05-16 13:58:00"

May 16 13:54:48 fedora kernel: wlp5s0: Connection to AP d8:8e:d4:7d:2e:c8 lost
May 16 13:54:48 fedora wpa_supplicant[2341]: wlp5s0: CTRL-EVENT-DISCONNECTED bssid=d8:8e:d4:7d:2e:c8 reason=4 locally_generated=1
May 16 13:54:48 fedora wpa_supplicant[2341]: BSSID d8:8e:d4:7d:2e:c8 ignore list count incremented to 2, ignoring for 10 seconds
May 16 13:54:48 fedora NetworkManager[2272]: <info>  [1715892888.9478] device (wlp5s0): supplicant interface state: completed ->>
May 16 13:54:48 fedora NetworkManager[2272]: <info>  [1715892888.9479] device (p2p-dev-wlp5s0): supplicant management interface >
May 16 13:54:49 fedora NetworkManager[2272]: <info>  [1715892889.0391] device (wlp5s0): supplicant interface state: disconnected>
May 16 13:54:49 fedora NetworkManager[2272]: <info>  [1715892889.0391] device (p2p-dev-wlp5s0): supplicant management interface >
May 16 13:54:49 fedora wpa_supplicant[2341]: wlp5s0: SME: Trying to authenticate with d8:8e:d4:7d:2e:c7 (SSID='MEH' freq=5180>
May 16 13:54:49 fedora kernel: wlp5s0: authenticate with d8:8e:d4:7d:2e:c7 (local address=9a:25:6c:43:ef:61)
May 16 13:54:49 fedora kernel: wlp5s0: send auth to d8:8e:d4:7d:2e:c7 (try 1/3)
May 16 13:54:49 fedora NetworkManager[2272]: <info>  [1715892889.2763] device (wlp5s0): supplicant interface state: scanning -> >
May 16 13:54:49 fedora NetworkManager[2272]: <info>  [1715892889.2764] device (p2p-dev-wlp5s0): supplicant management interface >
May 16 13:54:50 fedora kernel: iwlwifi 0000:05:00.0: Not associated and the session protection is over already...
May 16 13:54:50 fedora kernel: wlp5s0: Connection to AP d8:8e:d4:7d:2e:c7 lost
May 16 13:54:51 fedora kernel: wlp5s0: send auth to d8:8e:d4:7d:2e:c7 (try 2/3)
May 16 13:54:52 fedora kernel: iwlwifi 0000:05:00.0: Not associated and the session protection is over already...
May 16 13:54:52 fedora kernel: wlp5s0: Connection to AP d8:8e:d4:7d:2e:c7 lost
May 16 13:54:53 fedora kernel: wlp5s0: send auth to d8:8e:d4:7d:2e:c7 (try 3/3)
May 16 13:54:53 fedora geoclue[5751]: Failed to query location: Could not connect to location.services.mozilla.com: No route to >
May 16 13:54:54 fedora kernel: iwlwifi 0000:05:00.0: Not associated and the session protection is over already...
May 16 13:54:54 fedora kernel: wlp5s0: Connection to AP d8:8e:d4:7d:2e:c7 lost
May 16 13:54:54 fedora kernel: wlp5s0: aborting authentication with d8:8e:d4:7d:2e:c7 by local choice (Reason: 3=DEAUTH_LEAVING)
May 16 13:54:54 fedora wpa_supplicant[2341]: BSSID d8:8e:d4:7d:2e:c7 ignore list count incremented to 2, ignoring for 10 seconds
May 16 13:54:54 fedora NetworkManager[2272]: <info>  [1715892894.3034] device (wlp5s0): supplicant interface state: authenticati>
May 16 13:54:54 fedora NetworkManager[2272]: <info>  [1715892894.3035] device (p2p-dev-wlp5s0): supplicant management interface >
May 16 13:54:54 fedora NetworkManager[2272]: <info>  [1715892894.8079] device (wlp5s0): supplicant interface state: disconnected>
May 16 13:54:54 fedora NetworkManager[2272]: <info>  [1715892894.8080] device (p2p-dev-wlp5s0): supplicant management interface >
May 16 13:54:55 fedora wpa_supplicant[2341]: wlp5s0: SME: Trying to authenticate with e8:d3:eb:b8:d2:c7 (SSID='MEH' freq=5180>
May 16 13:54:55 fedora kernel: wlp5s0: authenticate with e8:d3:eb:b8:d2:c7 (local address=9a:25:6c:43:ef:61)
May 16 13:54:55 fedora NetworkManager[2272]: <info>  [1715892895.0422] device (wlp5s0): supplicant interface state: scanning -> >
May 16 13:54:55 fedora kernel: wlp5s0: send auth to e8:d3:eb:b8:d2:c7 (try 1/3)
May 16 13:54:55 fedora NetworkManager[2272]: <info>  [1715892895.0422] device (p2p-dev-wlp5s0): supplicant management interface >
May 16 13:54:55 fedora wpa_supplicant[2341]: wlp5s0: Trying to associate with e8:d3:eb:b8:d2:c7 (SSID='MEH' freq=5180 MHz)
May 16 13:54:55 fedora kernel: wlp5s0: authenticated
May 16 13:54:55 fedora kernel: wlp5s0: associate with e8:d3:eb:b8:d2:c7 (try 1/3)
May 16 13:54:55 fedora NetworkManager[2272]: <info>  [1715892895.0813] device (wlp5s0): supplicant interface state: authenticati>
May 16 13:54:55 fedora NetworkManager[2272]: <info>  [1715892895.0814] device (p2p-dev-wlp5s0): supplicant management interface >
May 16 13:54:55 fedora kernel: wlp5s0: RX AssocResp from e8:d3:eb:b8:d2:c7 (capab=0x1111 status=0 aid=4)
May 16 13:54:55 fedora wpa_supplicant[2341]: wlp5s0: Associated with e8:d3:eb:b8:d2:c7
May 16 13:54:55 fedora wpa_supplicant[2341]: wlp5s0: CTRL-EVENT-SUBNET-STATUS-UPDATE status=0
May 16 13:54:55 fedora kernel: wlp5s0: associated
May 16 13:54:55 fedora NetworkManager[2272]: <info>  [1715892895.1092] device (wlp5s0): supplicant interface state: associating >
May 16 13:54:55 fedora NetworkManager[2272]: <info>  [1715892895.1092] device (p2p-dev-wlp5s0): supplicant management interface >
May 16 13:54:55 fedora kernel: wlp5s0: Limiting TX power to 30 (30 - 0) dBm as advertised by e8:d3:eb:b8:d2:c7
May 16 13:54:55 fedora NetworkManager[2272]: <info>  [1715892895.1255] device (wlp5s0): supplicant interface state: associated ->
May 16 13:54:55 fedora NetworkManager[2272]: <info>  [1715892895.1256] device (p2p-dev-wlp5s0): supplicant management interface >
May 16 13:54:55 fedora wpa_supplicant[2341]: wlp5s0: WPA: Key negotiation completed with e8:d3:eb:b8:d2:c7 [PTK=CCMP GTK=CCMP]
May 16 13:54:55 fedora wpa_supplicant[2341]: wlp5s0: CTRL-EVENT-CONNECTED - Connection to e8:d3:eb:b8:d2:c7 completed [id=0 id_s>
May 16 13:54:55 fedora NetworkManager[2272]: <info>  [1715892895.1738] device (wlp5s0): supplicant interface state: 4way_handsha>
May 16 13:54:55 fedora wpa_supplicant[2341]: wlp5s0: CTRL-EVENT-SIGNAL-CHANGE above=1 signal=-56 noise=9999 txrate=245000
May 16 13:54:55 fedora NetworkManager[2272]: <info>  [1715892895.1745] device (wlp5s0): ip:dhcp4: restarting
May 16 13:54:55 fedora NetworkManager[2272]: <info>  [1715892895.1746] dhcp4 (wlp5s0): canceled DHCP transaction
May 16 13:54:55 fedora NetworkManager[2272]: <info>  [1715892895.1746] dhcp4 (wlp5s0): activation: beginning transaction (timeou>
May 16 13:54:55 fedora NetworkManager[2272]: <info>  [1715892895.1746] dhcp4 (wlp5s0): state changed no lease
May 16 13:54:55 fedora NetworkManager[2272]: <info>  [1715892895.1746] dhcp4 (wlp5s0): activation: beginning transaction (timeou>
May 16 13:54:55 fedora NetworkManager[2272]: <info>  [1715892895.1749] device (p2p-dev-wlp5s0): supplicant management interface >
May 16 13:54:55 fedora NetworkManager[2272]: <info>  [1715892895.2098] dhcp4 (wlp5s0): state changed new lease, address=192.168.>
May 16 13:54:55 fedora NetworkManager[2272]: <info>  [1715892895.2099] dhcp4 (wlp5s0): state changed new lease, address=192.168.>
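The journalctl window above was typed by hand from the drop's time stamp. As a convenience, here is a small sketch that builds the same kind of command from a time stamp plus some padding. It assumes GNU `date` (for `-d` and `@epoch`), and `journal_window` is a made-up helper name. It only prints the command, it does not run it.

```shell
#!/usr/bin/env bash
# Print a journalctl command covering PAD seconds either side of a time stamp.
# Assumes GNU date; the time stamp is interpreted in the local time zone.
journal_window() {
  local ts="$1" pad="${2:-300}" epoch
  # Going through epoch seconds avoids date's relative-time parsing quirks.
  epoch="$(date -d "$ts" +%s)" || return 1
  printf 'journalctl --since "%s" --until "%s"\n' \
    "$(date -d "@$((epoch - pad))" '+%Y-%m-%d %H:%M:%S')" \
    "$(date -d "@$((epoch + pad))" '+%Y-%m-%d %H:%M:%S')"
}
```

For example, `journal_window "2024-05-16 13:54:55"` prints a command covering five minutes either side of the drop. GNU date usually accepts the `date`-style stamps from the iw log as well, but that depends on locale, so check the printed window before trusting it.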

From the logs, this screams firmware to me (Intel AX210 in this case)

  • reason=4: Reason code 4 means “Disassociated due to inactivity”.

  • locally_generated=1: This indicates that the disconnection was initiated locally rather than by the access point.
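To pull just these two fields out of a long journal, something like the sketch below may help. It reads journal text on stdin and summarizes each wpa_supplicant `CTRL-EVENT-DISCONNECTED` line. Only reason codes 3 and 4 (the two that appear in the log above) are spelled out. The helper name and the output wording are my own, and a line without `locally_generated=1` is reported as "not flagged local" rather than assumed to come from the AP.

```shell
#!/usr/bin/env bash
# Summarize wpa_supplicant disconnect events from journal text on stdin.
summarize_disconnects() {
  awk '
    /CTRL-EVENT-DISCONNECTED/ {
      bssid = ""; reason = ""; origin = "not flagged local"
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^bssid=/)                 bssid = substr($i, 7)
        else if ($i ~ /^reason=/)           reason = substr($i, 8)
        else if ($i == "locally_generated=1") origin = "local"
      }
      # Only the codes seen in this thread are decoded here.
      text = "see the IEEE 802.11 reason code table"
      if (reason == "3") text = "Deauthenticated because sending station is leaving"
      if (reason == "4") text = "Disassociated due to inactivity"
      printf "%s reason=%s (%s) origin=%s\n", bssid, reason, text, origin
    }
  '
}
```

Typical use would be piping a journal window into it, for example `journalctl -u wpa_supplicant --since "..." | summarize_disconnects`, with the time window filled in by hand.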

Historically, the fix in this case has been to disable power saving. This applies to the AX210 and is likely a regression in the linux-firmware package, not something specific to us.

To determine if power save is on (and it likely is):

iw dev | grep Interface | awk '{print $2}' | xargs -I {} iw dev {} get power_save

If this returns: Power save: on

Run this script:

echo -e "[connection]\nwifi.powersave = 2" | sudo tee /etc/NetworkManager/conf.d/default-wifi-powersave-on.conf && sleep 1 && echo -e "\033[1;33mProcess is complete\033[0m"

and then

sudo systemctl restart NetworkManager

Check again:

iw dev | grep Interface | awk '{print $2}' | xargs -I {} iw dev {} get power_save

Should output: Power save: off
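One thing that trips people up: the file name ends in "powersave-on", yet the value 2 written into it turns power save off. NetworkManager's `wifi.powersave` takes 0 (use the default), 1 (ignore, leave the current setting alone), 2 (disable), or 3 (enable). Here is a small sketch that reports what each config file actually asks for. `describe_powersave` is a name made up for this post.

```shell
#!/usr/bin/env bash
# Report the meaning of every wifi.powersave line in the given config files.
describe_powersave() {
  awk -F'=' '
    /^[[:space:]]*wifi\.powersave[[:space:]]*=/ {
      v = $2
      gsub(/[[:space:]]/, "", v)   # strip spaces around the value
      m = "unknown value"
      if (v == "0") m = "use the default"
      if (v == "1") m = "ignore, leave current setting alone"
      if (v == "2") m = "disable power save"
      if (v == "3") m = "enable power save"
      printf "%s: wifi.powersave=%s (%s)\n", FILENAME, v, m
    }
  ' "$@"
}
```

Running `describe_powersave /etc/NetworkManager/conf.d/*.conf` shows whether more than one file sets the value, which is worth ruling out if power save still reports on after the restart.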


Completed testing of AX210 with power saving OFF.

Flawless, no issues whatsoever, no drops. Script and journal confirmed it.

Suggestion for anyone using AX210: keep Wi-Fi power save disabled, using the same check, config change, and NetworkManager restart described above.


Final Results

Intel AX210 on FW 16 does indeed see drops, each lasting about 10 seconds.
Repeated this with power saving for wifi disabled and did not see the drop offs.

Noticed something here. The reconnect did not just come back, it jumped from 6 GHz (freq 6135) to 5 GHz (freq 5180) on a different BSSID:

Thu May 16 01:54:45 PM PDT 2024:
Connected to d8:8e:d4:7d:2e:c8 (on wlp5s0)
SSID: (fake)
freq: 6135.0

Thu May 16 01:54:55 PM PDT 2024:
Not connected.

Thu May 16 01:55:05 PM PDT 2024:
Connected to e8:d3:eb:b8:d2:c7 (on wlp5s0)
SSID: (fake)
freq: 5180.0
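To spot this kind of band or BSSID jump across a whole log, here is a rough sketch that reduces each sample to one line. The band edges used (2400 to 2500 MHz, 5150 to 5925 MHz, 5925 to 7125 MHz) are my working approximations, and `link_summary` is a hypothetical helper name. It keys off time stamp lines ending in a four-digit year and a colon, as in the excerpts in this post.

```shell
#!/usr/bin/env bash
# Reduce an iw link log to one line per sample: time | BSSID | band.
link_summary() {
  awk '
    function band(f) {
      if (f >= 2400 && f < 2500)  return "2.4 GHz"
      if (f >= 5150 && f < 5925)  return "5 GHz"
      if (f >= 5925 && f <= 7125) return "6 GHz"
      return "unknown band"
    }
    # Time stamp line, e.g. "Thu May 16 01:54:45 PM PDT 2024:"
    / [0-9][0-9][0-9][0-9]:$/ { ts = $0; sub(/:$/, "", ts); next }
    $1 == "Connected" && $2 == "to" { bssid = $3; next }
    $1 == "freq:" { print ts " | " bssid " | " band($2 + 0); next }
    /Not connected\./ { print ts " | not connected" }
  ' "$@"
}
```

A change in the second or third column between consecutive lines marks a roam or a band switch, which is the pattern to look for around each drop.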




MediaTek card on FW 16 did not see any drop off, even with the defaults of having power save on.

Wi-Fi 6E.
WPA3 enabled.
No QoS or “client steering” enabled on the router. <—this is important
Bluetooth on, Bluetooth off.
3.03 FW 16 BIOS.


Pinned this for two weeks for visibility.