[TRACKING] WiFi drops out (especially when downloading a lot) but fixes instantly by reconnecting (AMD)

hi! i’m running Fedora 39 Silverblue (actually Universal Blue, also with tuned instead of PPD) and I keep having my WiFi just stop working randomly. it seems to happen way more when i’m using more bandwidth. it fixes itself very quickly by just reconnecting to the network manually, and usually connections are also restored that way.

i believe i’m running the latest firmware:

[  111.699177] mt7921e 0000:01:00.0: enabling device (0000 -> 0002)
[  111.730285] mt7921e 0000:01:00.0: ASIC revision: 79220010
[  111.806542] mt7921e 0000:01:00.0: HW/SW Version: 0x8a108a10, Build Time: 20240219103244a
[  111.831743] mt7921e 0000:01:00.0: WM Firmware Version: ____000000, Build Time: 20240219103337

any clue? i’ve looked at the other threads but i haven’t found anyone that had the exact issue, even less so with the latest firmware.

@Mario_Limonciello (for when you return). This may be a regression. Also looping in @Jorge_Castro in case any of the Ublue team have seen this?

any updates on this? it makes long file transfers impossible and is generally pretty annoying. thanks

As it’s happening on the newer firmware, I think this should be raised on the wireless mailing lists. Here are the people you should mail about it:

$ ./scripts/get_maintainer.pl drivers/net/wireless/mediatek/
Kalle Valo <kvalo@kernel.org> (maintainer:NETWORKING DRIVERS (WIRELESS))
Felix Fietkau <nbd@nbd.name> (commit_signer:264/290=91%,authored:17/290=6%)
Lorenzo Bianconi <lorenzo@kernel.org> (commit_signer:88/290=30%,authored:74/290=26%)
Shayne Chen <shayne.chen@mediatek.com> (commit_signer:71/290=24%)
Deren Wu <deren.wu@mediatek.com> (commit_signer:60/290=21%,authored:18/290=6%)
Peter Chiu <chui-hao.chiu@mediatek.com> (commit_signer:31/290=11%,authored:24/290=8%)
Ming Yen Hsieh <mingyen.hsieh@mediatek.com> (authored:23/290=8%)
linux-wireless@vger.kernel.org (open list:NETWORKING DRIVERS (WIRELESS))
linux-kernel@vger.kernel.org (open list)

Feel free to drop a link here so others can follow along with your discussion.
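If it helps when composing the mail, the get_maintainer.pl output above can be reduced to a plain address list. This is only a rough sketch: maintainer_emails is a made-up helper name, and it assumes the output format shown above (either "Name <address> (role)" or "address (role)").

```shell
# Hypothetical helper: read get_maintainer.pl output on stdin and print the
# addresses as one comma-separated line (handy for a To:/Cc: header).
maintainer_emails() {
    # First expression: keep the address inside <...> when a display name is present.
    # Second expression: otherwise keep the first word (mailing lists have no name).
    sed -e 's/^[^<]*<\([^>]*\)>.*/\1/' -e 's/ .*//' | paste -sd, -
}

# Usage, from the top of a kernel source tree:
#   ./scripts/get_maintainer.pl drivers/net/wireless/mediatek/ | maintainer_emails
```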

Can confirm - behaviour has gotten worse with the F40 betas; I’m noticing dropouts on video calls quite frequently.

Does your experience improve after running

nmcli con mod "$CONN" 802-11-wireless.powersave 2

where $CONN is the connection name of the network?
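For reference, NetworkManager documents four numeric values for 802-11-wireless.powersave: 0 (default), 1 (ignore), 2 (disable) and 3 (enable), so the command above turns power saving off. A tiny lookup helper, just as a sketch (powersave_label is a made-up name, not part of nmcli):

```shell
# Hypothetical helper: translate a numeric 802-11-wireless.powersave value
# into the label NetworkManager documents for it.
powersave_label() {
    case "$1" in
        0) echo "default" ;;   # use the globally configured value
        1) echo "ignore" ;;    # leave the driver's setting untouched
        2) echo "disable" ;;   # turn Wi-Fi power saving off
        3) echo "enable" ;;    # turn Wi-Fi power saving on
        *) echo "unknown" ;;
    esac
}

# Usage:
#   powersave_label 2
```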

Please do file a bug report with the Fedora team so they can address this. I am pretty heads down, and you all may be able to get this done faster than I can, as I need cycles to repro. But yes, as Mario indicated, reaching out directly via email may be faster/smoother, as this is a matter for the wireless mailing list.

Yes. Please do test this. We have found that changing power save from 3 (enabled) to 2 (disabled) can address the issues described; at one time this was also an issue for Intel cards.

First see if power save is enabled (3) or disabled (2).

interface=$(iw dev | awk '/Interface/ {print $2}'); powersave=$(nmcli -t -f 802-11-wireless.powersave connection show "$(nmcli -t -f NAME,DEVICE con show --active | grep ":${interface}$" | cut -d: -f1)"); echo "The name of your wireless interface is: $interface, and the power save setting is: $powersave"

If it shows “the power save setting is: 802-11-wireless.powersave:enable”, test with it disabled and see if this improves.

nmcli con mod "$(nmcli -t -f NAME,DEVICE con show --active | grep ":$(iw dev | awk '/Interface/ {print $2}')" | cut -d: -f1)" 802-11-wireless.powersave 2

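Then check again and see if the connection behaves. (Note that nmcli con mod saves the setting to the connection profile, so it persists across reboots; set it back to 3 afterwards if you only want this for testing. You may need to reconnect for the change to take effect.)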

interface=$(iw dev | awk '/Interface/ {print $2}'); powersave=$(nmcli -t -f 802-11-wireless.powersave connection show "$(nmcli -t -f NAME,DEVICE con show --active | grep ":${interface}$" | cut -d: -f1)"); echo "The name of your wireless interface is: $interface, and the power save setting is: $powersave"
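One caveat with the connection-name lookup in these one-liners: in terse (-t) mode nmcli escapes any ':' inside a connection name as '\:', so cut -d: -f1 would truncate such a name. A slightly more defensive lookup could look like the sketch below; conn_for_iface is a made-up helper name, and it reads the nmcli output from stdin.

```shell
# Hypothetical helper: read `nmcli -t -f NAME,DEVICE con show --active`
# output on stdin and print the connection name bound to the given interface.
conn_for_iface() {
    iface="$1"
    # Keep only lines ending in ":<iface>" and strip that suffix, then undo
    # nmcli's terse-mode backslash escaping (e.g. '\:' becomes ':').
    sed -n "s/:${iface}\$//p" | sed 's/\\\(.\)/\1/g'
}

# Usage (requires nmcli; wlp1s0 is the interface name seen in this thread):
#   nmcli -t -f NAME,DEVICE con show --active | conn_for_iface wlp1s0
```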

Spoke with a dev member of the Bluefin/Ublue team, this is not being seen on the FW16 for them as of yet.

Even with power-save set to 2, I just had the same issue happen again.

Gotcha. Then I need to run this one up the flagpole, so to speak, and reach out to the mailing list.

Everyone dealing with this: I need to gather specific details.

Distro/Release:

Kernel:

MediaTek Wireless card firmware version: (Just grab what comes up when you run this in your terminal)
sudo dmesg | grep mt7921e

Steps to reproduce: (Easy-to-duplicate steps are ideal: YouTube, large file transfer over LAN or to a cloud provider, etc.)

Wi-Fi Router Type: (Wi-Fi 5, 6, 6E, etc?)

With this, I can escalate the issue internally and get resources behind it. I can also justify my own limited cycles to see whether @Loell_Framework or I can repro this ourselves, gather logs, and get this moving forward.
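To save some typing, the first three fields of the template could be collected with something like the sketch below. It assumes /etc/os-release exists (it does on Fedora) and reads the firmware lines from stdin; wifi_report is a made-up helper name. The steps to reproduce and the router type still need to be filled in by hand.

```shell
# Hypothetical helper: print the distro, kernel and firmware fields of the
# template. Firmware lines are read from stdin so they can be piped in.
wifi_report() {
    distro="unknown"
    # /etc/os-release carries the human-readable distro name in PRETTY_NAME.
    [ -r /etc/os-release ] && distro=$(. /etc/os-release && echo "${PRETTY_NAME:-unknown}")
    echo "Distro/Release: $distro"
    echo "Kernel: $(uname -sr)"
    echo "MediaTek Wireless card firmware version:"
    cat   # pass the piped-in dmesg lines through unchanged
}

# Usage:
#   sudo dmesg | grep mt7921e | wifi_report
```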

Distro/Release: Fedora 40 Universal Blue (Bluefin)

Kernel: Linux fedora 6.8.7-300.fc40.x86_64

MediaTek Wireless card firmware version:

[   28.870498] mt7921e 0000:01:00.0: ASIC revision: 79220010
[   28.950129] mt7921e 0000:01:00.0: HW/SW Version: 0x8a108a10, Build Time: 20240219103244a
[   29.342888] mt7921e 0000:01:00.0: WM Firmware Version: ____000000, Build Time: 20240219103337
[   30.544827] mt7921e 0000:01:00.0 wlp1s0: renamed from wlan0

Steps to reproduce: (Easy-to-duplicate steps are ideal: YouTube, large file transfer over LAN or to a cloud provider, etc.)

Use bandwidth. Usually, the more you use, the faster it happens. (Seriously; it has happened when downloading Linux ISOs over BitTorrent, while watching YouTube, while being on a Jitsi Meet, etc.)

Wi-Fi Router Type: Wi-Fi 5

Thank you - I have also asked folks in the linux channel of our Discord to chime in if they have seen this as well, using my above template.

Hrm - I have been monitoring this relatively closely and I am thinking it’s to do with AP setups which have different regulatory domains set. I have a relatively complex network and I am seeing two potential antecedents to this, which involve bgscan and changing the regulatory domain. My guess is the mt driver isn’t able to cleanly deal with switching between APs where a different regulatory domain is set, as I get dropouts whenever I get steered from one AP to another, and I am seeing this:

May 03 12:29:42 emiemi wpa_supplicant[3280]: wlp1s0: CTRL-EVENT-REGDOM-CHANGE init=CORE type=WORLD
May 03 12:29:42 emiemi wpa_supplicant[3280]: wlp1s0: Trying to associate with 24:4b:fe:62:67:04 (SSID='kainga-atawhai' freq=5500 MHz)
May 03 12:29:42 emiemi wpa_supplicant[3280]: wlp1s0: CTRL-EVENT-REGDOM-CHANGE init=USER type=COUNTRY alpha2=NZ
May 03 12:29:42 emiemi NetworkManager[1456]: <info>  [1714696182.3650] device (wlp1s0): supplicant interface state: authenticating -> associating
May 03 12:29:42 emiemi NetworkManager[1456]: <info>  [1714696182.3650] device (p2p-dev-wlp1s0): supplicant management interface state: authenticating -> associating
May 03 12:29:42 emiemi kernel: wlp1s0: associate with 24:4b:fe:62:67:04 (try 1/3)
May 03 12:29:42 emiemi kernel: wlp1s0: RX ReassocResp from 24:4b:fe:62:67:04 (capab=0x1511 status=0 aid=2)
May 03 12:29:42 emiemi kernel: wlp1s0: associated
May 03 12:29:42 emiemi wpa_supplicant[3280]: wlp1s0: Associated with 24:4b:fe:62:67:04
May 03 12:29:42 emiemi wpa_supplicant[3280]: wlp1s0: CTRL-EVENT-SUBNET-STATUS-UPDATE status=0
May 03 12:29:42 emiemi kernel: wlp1s0: Limiting TX power to 24 (24 - 0) dBm as advertised by 24:4b:fe:62:67:04
May 03 12:29:42 emiemi wpa_supplicant[3280]: wlp1s0: CTRL-EVENT-REGDOM-CHANGE init=COUNTRY_IE type=COUNTRY alpha2=US
May 03 12:29:42 emiemi wpa_supplicant[3280]: wlp1s0: WPA: Key negotiation completed with 24:4b:fe:62:67:04 [PTK=CCMP GTK=CCMP]
May 03 12:29:42 emiemi wpa_supplicant[3280]: wlp1s0: CTRL-EVENT-CONNECTED - Connection to 24:4b:fe:62:67:04 completed [id=0 id_str=]
May 03 12:29:42 emiemi wpa_supplicant[3280]: bgscan simple: Failed to enable signal strength monitoring
May 03 12:29:42 emiemi NetworkManager[1456]: <info>  [1714696182.5240] device (wlp1s0): supplicant interface s

Hi Matias,

This isn’t an ideal situation but it’s worth a shot, we reverted the firmware on the card to an older version in the latest update to Bluefin last night to see if we can nail it down. If you wouldn’t mind kicking the tyres it would be great feedback, thanks!

I’m not sure if this applies to me; the AP that I usually connect to is always the same AP, with a unique SSID. I can’t find CTRL-EVENT-REGDOM-CHANGE in my dmesg.
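Worth noting: in the log posted earlier in this thread, the CTRL-EVENT-REGDOM-CHANGE lines are written by wpa_supplicant, not the kernel, so they would show up in the journal rather than in dmesg. A rough sketch for pulling them out of journal text follows; regdom_changes is a made-up helper name, and the field layout is assumed from those log lines.

```shell
# Hypothetical helper: read journal text on stdin and print
# "<initiator> <type> <country>" for each regulatory-domain change event.
# A "-" is printed when a field (such as alpha2) is absent from the line.
regdom_changes() {
    awk '/CTRL-EVENT-REGDOM-CHANGE/ {
        init = kind = alpha2 = "-"
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^init=/)   init   = substr($i, 6)
            if ($i ~ /^type=/)   kind   = substr($i, 6)
            if ($i ~ /^alpha2=/) alpha2 = substr($i, 8)
        }
        print init, kind, alpha2
    }'
}

# Usage (current boot, messages tagged wpa_supplicant):
#   journalctl -b -t wpa_supplicant | regdom_changes
```

If this prints nothing, no regulatory-domain change was logged during the current boot.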

Acknowledged that CTRL-EVENT-REGDOM-CHANGE is missing from your dmesg. We think this may be related, yes. The idea is that we have a custom environment; it would mean an install (a spare NVMe works, if you have one), but it would let us see whether the behavior replicates on the previous firmware with Bluefin (a HIGHLY customized Fedora Silverblue image).

I asked Jorge and the team to see if they could help me with this, putting this together as a method of verifying firmware differences to see if the issue clears up.

The idea is to expand on what @jwp indicated and see if the firmware is the culprit in the hand-off. I know this isn’t ideal, but as we cannot repro this here, we want to flag the correct fault: the AP, the Wi-Fi environment or, as I suspect, the firmware.

Testing this now, on Bluefin pushing 4k video on two separate instances with no drop out. The older firmware was auto-updated.

Please do keep us updated as we have yet to be able to repro this and we lack logs to pass along for a bug report.

I just was able to accidentally reproduce it while on a Jitsi Meet call with version:

[   15.976914] mt7921e 0000:01:00.0: ASIC revision: 79220010
[   16.054399] mt7921e 0000:01:00.0: HW/SW Version: 0x8a108a10, Build Time: 20240219103244a
[   16.451150] mt7921e 0000:01:00.0: WM Firmware Version: ____000000, Build Time: 20240219103337
[   17.611006] mt7921e 0000:01:00.0 wlp1s0: renamed from wlan0

I don’t see anything in the dmesg that refers to the drop-out, only messages when I manually reconnect.
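Since nothing is logged at the moment of the drop, one option is to record the time of each drop yourself so it can be lined up against the journal afterwards. The following is only a sketch: watch_link is a made-up helper name, and the probe (for example a single ping to your router) is whatever command you pass in.

```shell
# Hypothetical helper: run a probe command repeatedly and print a timestamped
# line whenever the link state changes between "up" and "down".
# Usage: watch_link <iterations> <interval-seconds> <probe command...>
watch_link() {
    iterations="$1"; interval="$2"; shift 2
    state="unknown"
    i=0
    while [ "$i" -lt "$iterations" ]; do
        # The probe's exit status decides the state; its output is discarded.
        if "$@" >/dev/null 2>&1; then new="up"; else new="down"; fi
        if [ "$new" != "$state" ]; then
            echo "$(date '+%F %T') link $new"
            state="$new"
        fi
        i=$((i + 1))
        [ "$i" -lt "$iterations" ] && sleep "$interval"
    done
    return 0
}

# Usage (192.168.1.1 is a placeholder for your router's address):
#   watch_link 3600 1 ping -c 1 -W 1 192.168.1.1 | tee wifi-drops.log
```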

I believe this is the updated firmware. Should I try the outdated firmware then?

Let’s get you into a ticket so I can review your logs.

Please link to this thread and, because it sounds like you’re on Bluefin, please run:

ujust logs-this-boot > thisboot.txt

Then attach the thisboot.txt file, which will appear in the home directory, to the ticket. Ask the agent to send it to my assigned ticket queue.

I have had multiple folks try to replicate this on Fedora and two others on Bluefin, no luck thus far.