Kernel panic from Wifi - MediaTek MT7925 nullptr dereference

Hitting a kernel panic freeze that I thought was related to Thunderbolt 5, but it turns out it’s wifi related. Switched to Ethernet for now.

Which Linux distro are you using?

Ubuntu

Which release version?
Ubuntu 25.10

(If rolling release, last date updated?)
N/A (not a rolling release)

Which kernel are you using?
6.17.0-8-generic

Which BIOS version are you using?
03.04 (released 11/19/2025)

Which Framework Desktop model are you using?
AMD Ryzen™ AI Max 395 Series

System freezes/lockups caused by kernel panic in the MediaTek MT7925e WiFi driver

The root cause seems to be a NULL pointer dereference in the mt76_connac_mcu_uni_add_dev() function (at offset 0xba) during the WiFi reset workflow. It occurs when the driver attempts to reconnect after WiFi association failures or timeouts.

quasi-stack trace: mt7925_mac_reset_work → ieee80211_iterate_interfaces → mt7925_vif_connect_iter → mt76_connac_mcu_uni_add_dev [NULL pointer dereference]
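For anyone comparing their own crash against this one, here is a minimal shell sketch that pulls the faulting function and a register value out of an oops. The helper names `oops_rip` and `oops_reg` are made up for illustration and only assume the oops text arrives on stdin in the dmesg format shown in this post. As a side note, the bytes at the fault marker in the Code: line, `<48> 8b 16`, decode to `mov (%rsi),%rdx`, and RSI is 0 in the register dump. On x86-64 RSI carries the second argument, which is `bss_conf` in the call made from mt7925_vif_connect_iter(), assuming the compiler has not reused the register by that point.

```shell
# Print the symbol+offset from the first "RIP:" line of an oops on stdin.
oops_rip() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "RIP:") {
                sub(/^[0-9a-f]+:/, "", $(i + 1))   # strip the "0010:" segment prefix
                print $(i + 1)
                exit
            }
    }'
}

# Print the value of one register (e.g. RSI) from the first line that lists it.
oops_reg() {
    awk -v name="$1:" '{
        for (i = 1; i < NF; i++)
            if ($i == name) { print $(i + 1); exit }
    }'
}
```

Possible usage, assuming a persistent journal so the previous boot is still available: `journalctl -k -b -1 | oops_rip` and `journalctl -k -b -1 | oops_reg RSI`. A RIP in mt76_connac_mcu_uni_add_dev with an all-zero RSI would suggest the same crash.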

Dump

[ 325.069133] [ T1940] wlp192s0: Limiting TX power to 30 (30 - 0) dBm as advertised by d8:b3:70:f8:9e:7d
[ 329.056447] [ T1948] wlp192s0: Limiting TX power to 30 (30 - 0) dBm as advertised by d8:b3:70:f8:9e:7c
[ 493.378637] [ T215] audit: type=1400 audit(1767142510.031:274): apparmor="AUDIT" operation="userns_create" class="namespace" info="Userns create - transitioning profile" profile="unconfined" pid=35160 comm="(insights)" requested="userns_create" target="unprivileged_userns" execpath="/usr/lib/systemd/systemd-executor"
[ 493.378689] [ T215] audit: type=1400 audit(1767142510.032:275): apparmor="AUDIT" operation="userns_create" class="namespace" info="Userns create - transitioning profile" profile="unconfined" pid=35161 comm="(insights)" requested="userns_create" target="unprivileged_userns" execpath="/usr/lib/systemd/systemd-executor"
[ 493.379084] [ T215] audit: type=1400 audit(1767142510.032:276): apparmor="DENIED" operation="capable" class="cap" profile="unprivileged_userns" pid=35160 comm="(insights)" capability=21 capname="sys_admin"
[ 493.379172] [ T215] audit: type=1400 audit(1767142510.032:277): apparmor="DENIED" operation="capable" class="cap" profile="unprivileged_userns" pid=35161 comm="(insights)" capability=21 capname="sys_admin"
[ 633.912593] [ T2428] wlp192s0: disconnect from AP d8:b3:70:f8:9e:7b for new auth to d8:b3:70:f8:9e:7b
[ 634.298743] [ T2428] wlp192s0: [link 2] regulatory prevented using AP config, downgraded
[ 634.458674] [ T2428] wlp192s0: authenticate with d8:b3:70:f8:9e:7b (local address=fe:6c:b0:38:0d:3e)
[ 634.597675] [ T2428] wlp192s0: send auth to d8:b3:70:f8:9e:7b (try 1/3)
[ 634.604083] [ T209] wlp192s0: authenticated
[ 634.607417] [ T2428] wlp192s0: aborting authentication with d8:b3:70:f8:9e:7b by local choice (Reason: 3=DEAUTH_LEAVING)
[ 635.031798] [ T2428] wlp192s0: authenticate with d8:b3:70:f8:9e:7b (local address=92:fb:4b:57:88:de)
[ 635.171103] [ T2428] wlp192s0: send auth to d8:b3:70:f8:9e:7b (try 1/3)
[ 635.178869] [ T209] wlp192s0: authenticated
[ 635.183695] [ T209] wlp192s0: associate with d8:b3:70:f8:9e:7b (try 1/3)
[ 635.301217] [ T209] wlp192s0: associate with d8:b3:70:f8:9e:7b (try 2/3)
[ 635.405206] [ T257] wlp192s0: associate with d8:b3:70:f8:9e:7b (try 3/3)
[ 635.434065] [ T209] wlp192s0: RX AssocResp from d8:b3:70:f8:9e:7b (capab=0x1111 status=30 aid=0)
[ 635.434122] [ T209] wlp192s0: d8:b3:70:f8:9e:7b rejected association temporarily; comeback duration 1000 TU (1024 ms)
[ 635.437631] [ T209] wlp192s0: RX AssocResp from d8:b3:70:f8:9e:7b (capab=0x1111 status=30 aid=1902)
[ 635.437731] [ T209] wlp192s0: d8:b3:70:f8:9e:7b rejected association temporarily; comeback duration 896 TU (917 ms)
[ 635.443989] [ T209] wlp192s0: RX AssocResp from d8:b3:70:f8:9e:7b (capab=0x1111 status=30 aid=1070)
[ 635.444021] [ T209] wlp192s0: d8:b3:70:f8:9e:7b rejected association temporarily; comeback duration 795 TU (814 ms)
[ 636.292753] [ T257] wlp192s0: association with d8:b3:70:f8:9e:7b timed out
[ 639.364878] [ T257] mt7925e 0000:c0:00.0: Message 00020002 (seq 6) timeout
[ 645.381238] [ T257] mt7925e 0000:c0:00.0: Message 00020003 (seq 7) timeout
[ 648.516818] [ T257] mt7925e 0000:c0:00.0: Message 00020002 (seq 8) timeout
[ 651.524770] [ T257] mt7925e 0000:c0:00.0: Message 00020002 (seq 9) timeout
[ 654.532940] [ T257] mt7925e 0000:c0:00.0: Message 00020001 (seq 10) timeout
[ 654.626615] [ T12] mt7925e 0000:c0:00.0: HW/SW Version: 0x8a108a10, Build Time: 20250721232852a

[ 654.968920] [ T12] mt7925e 0000:c0:00.0: WM Firmware Version: ____000000, Build Time: 20250721232943
[ 655.737302] [ T12] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 655.737320] [ T12] #PF: supervisor read access in kernel mode
[ 655.737324] [ T12] #PF: error_code(0x0000) - not-present page
[ 655.737328] [ T12] PGD 0 P4D 0
[ 655.737334] [ T12] Oops: Oops: 0000 [#1] SMP NOPTI
[ 655.737342] [ T12] CPU: 20 UID: 0 PID: 12 Comm: kworker/u128:0 Kdump: loaded Tainted: G OE 6.17.0-8-generic #8-Ubuntu PREEMPT(voluntary)
[ 655.737350] [ T12] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[ 655.737351] [ T12] Hardware name: Framework Desktop (AMD Ryzen AI Max 300 Series)/FRANMFCP06, BIOS 03.04 11/19/2025
[ 655.737354] [ T12] Workqueue: mt76 mt7925_mac_reset_work [mt7925_common]
[ 655.737370] [ T12] RIP: 0010:mt76_connac_mcu_uni_add_dev+0xba/0x1f0 [mt76_connac_lib]
[ 655.737385] [ T12] Code: cc 66 44 89 5d d2 44 88 45 d4 44 88 4d d5 88 65 d7 c6 45 dc 01 88 55 dd 0f b7 97 b8 00 00 00 88 4d ef 66 89 55 e4 66 89 55 ea <48> 8b 16 8b 12 83 fa 03 0f 84 0c 01 00 00 77 1b 83 fa 01 0f 84 f5
[ 655.737388] [ T12] RSP: 0018:ffffd07fc018fcb0 EFLAGS: 00010282
[ 655.737392] [ T12] RAX: 000000000000ff00 RBX: ffff8a4449442040 RCX: 0000000000000000
[ 655.737394] [ T12] RDX: 0000000000000013 RSI: 0000000000000000 RDI: ffff8a44c7d7a4b0
[ 655.737396] [ T12] RBP: ffffd07fc018fcf8 R08: 0000000000000001 R09: 0000000000000000
[ 655.737397] [ T12] R10: 0000000000000000 R11: 0000000000000020 R12: ffff8a4449442040
[ 655.737399] [ T12] R13: ffff8a44c7d79f08 R14: 0000000000000000 R15: ffff8a44c7d78a80
[ 655.737401] [ T12] FS: 0000000000000000(0000) GS:ffff8a53ca47f000(0000) knlGS:0000000000000000
[ 655.737403] [ T12] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 655.737404] [ T12] CR2: 0000000000000000 CR3: 00000009e2a40000 CR4: 0000000000f50ef0
[ 655.737406] [ T12] PKRU: 55555554
[ 655.737408] [ T12] Call Trace:
[ 655.737411] [ T12] <TASK>
[ 655.737416] [ T12] mt7925_vif_connect_iter+0xcb/0x240 [mt7925_common]
[ 655.737423] [ T12] __iterate_interfaces+0x92/0x130 [mac80211]
[ 655.737500] [ T12] ? __pfx_mt7925_vif_connect_iter+0x10/0x10 [mt7925_common]
[ 655.737506] [ T12] ieee80211_iterate_interfaces+0x3d/0x60 [mac80211]
[ 655.737549] [ T12] ? __pfx_mt7925_vif_connect_iter+0x10/0x10 [mt7925_common]
[ 655.737553] [ T12] mt7925_mac_reset_work+0x105/0x190 [mt7925_common]
[ 655.737559] [ T12] process_one_work+0x18b/0x370
[ 655.737567] [ T12] worker_thread+0x317/0x450
[ 655.737570] [ T12] ? __pfx_worker_thread+0x10/0x10
[ 655.737573] [ T12] kthread+0x108/0x220
[ 655.737577] [ T12] ? __pfx_kthread+0x10/0x10
[ 655.737579] [ T12] ret_from_fork+0x131/0x150
[ 655.737585] [ T12] ? __pfx_kthread+0x10/0x10
[ 655.737587] [ T12] ret_from_fork_asm+0x1a/0x30
[ 655.737592] [ T12] </TASK>
[ 655.737594] [ T12] Modules linked in: tls snd_seq_dummy snd_hrtimer xt_mark veth nf_conntrack_netlink xt_nat xt_tcpudp xt_conntrack xt_MASQUERADE bridge stp llc xfrm_user xfrm_algo xt_set ip_set nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_addrtype nft_compat ccm nf_tables rfcomm cmac algif_hash algif_skcipher af_alg overlay qrtr bnep binfmt_misc nls_iso8859_1 intel_rapl_msr amd_atl intel_rapl_common snd_hda_codec_alc269 edac_mce_amd snd_hda_scodec_component snd_hda_codec_realtek_lib snd_hda_codec_atihdmi snd_hda_codec_generic mt7925e snd_hda_codec_hdmi btusb mt7925_common btrtl kvm_amd btintel mt792x_lib snd_hda_intel btbcm snd_hda_codec mt76_connac_lib btmtk leds_cros_ec kvm gpio_cros_ec irqbypass cros_ec_sysfs led_class_multicolor cros_ec_hwmon cros_ec_debugfs snd_hda_core cros_ec_chardev mt76 polyval_clmulni ghash_clmulni_intel bluetooth snd_seq_midi spd5118 cros_ec_dev aesni_intel snd_intel_dspcfg rapl mac80211 snd_seq_midi_event snd_intel_sdw_acpi snd_rawmidi snd_hwdep wmi_bmof cfg80211 snd_seq
[ 655.737656] [ T12] snd_pcm amd_pmf amdxdna snd_seq_device amdtee gpu_sched snd_timer input_leds i2c_piix4 snd i2c_smbus libarc4 amd_sfh ccp soundcore joydev tee cros_ec_lpcs platform_profile cros_ec amd_pmc cros_ec_proto mac_hid sch_fq_codel msr parport_pc ppdev lp parport efi_pstore nfnetlink dmi_sysfs ip_tables x_tables autofs4 typec_displayport typec_thunderbolt hid_generic ucsi_acpi typec_ucsi usbhid typec hid amdgpu(OE) amddrm_ttm_helper(OE) amdttm(OE) amddrm_buddy(OE) amdxcp(OE) amddrm_exec(OE) drm_suballoc_helper amd_sched(OE) amdkcl(OE) drm_panel_backlight_quirks i2c_algo_bit drm_ttm_helper ttm drm_display_helper nvme cec nvme_core gpio_keys rc_core r8169 nvme_keyring thunderbolt realtek nvme_auth video wmi soc_button_array 8250_dw
[ 655.737713] [ T12] CR2: 0000000000000000
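
The lead-up is worth reading alongside the oops: five MCU message timeouts (seq 6 through 10, between 639 s and 654 s) come right before the firmware version lines at 654 s and the NULL dereference at 655.7 s. A small sketch for listing those timeouts from a saved log follows. `mcu_timeouts` is a made-up helper name, and it assumes dmesg-style `[ seconds ]` timestamps like the ones above, not journalctl's date prefix.

```shell
# List firmware (MCU) command timeouts from a dmesg-style log on stdin.
# Output: one "<timestamp> <message-id> <seq>" line per timeout.
mcu_timeouts() {
    sed -n 's/^\[ *\([0-9.]*\)\].*Message \([0-9a-fA-F]*\) (seq \([0-9]*\)) timeout.*/\1 \2 \3/p'
}
```

Possible usage: `sudo dmesg | mcu_timeouts`. A run of timeouts a few seconds apart, followed by the firmware version lines, would match the pattern in this report.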


Temporary workaround: blacklist the MT7925 driver and keep it from loading at all until upstream Linux kernels fix their shit. I filed bugs with Ubuntu too.

sudo tee /etc/modprobe.d/blacklist-mt7925.conf << 'EOF'
# Blacklist MT7925 WiFi 7 driver and all dependencies
blacklist mt7925e
blacklist mt7925_common
blacklist mt792x_lib
blacklist mt76_connac_lib
blacklist mt76
# Prevent auto-loading
install mt7925e /bin/false
install mt7925_common /bin/false
install mt792x_lib /bin/false
install mt76_connac_lib /bin/false
install mt76 /bin/false
EOF

sudo modprobe -r mt7925e mt7925_common mt792x_lib mt76_connac_lib mt76

sudo update-initramfs -u
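
To confirm the workaround took effect, something like the following sketch can check that none of the five modules is still loaded. `mt76_still_loaded` is a hypothetical helper name; it only assumes `lsmod`-style output on stdin, with the module name in the first column.

```shell
# Report mt76-family modules that are still loaded.
# Reads lsmod-style output on stdin, prints each offending module name,
# and returns non-zero if any of the blacklisted modules is present.
mt76_still_loaded() {
    awk '
        $1 ~ /^(mt7925e|mt7925_common|mt792x_lib|mt76_connac_lib|mt76)$/ { print $1; found = 1 }
        END { exit (found ? 1 : 0) }
    '
}
```

Possible usage: `lsmod | mt76_still_loaded && echo "MT7925 driver is not loaded"`. To undo the workaround later, removing /etc/modprobe.d/blacklist-mt7925.conf and running `sudo update-initramfs -u` again should be enough.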

I hacked up a kernel patch. I don’t know if this makes sense since I don’t know this code, but at least it won’t dereference NULL.

From 6790e656030fb23527aa5c0d6eaa28ce029335b1 Mon Sep 17 00:00:00 2001
From: Zac Bowling <zac@zacbowling.com>
Date: Tue, 30 Dec 2025 20:32:56 -0800
Subject: [PATCH] wifi: mt76: mt7925: fix NULL pointer dereference in vif
 iteration loops

mt792x_vif_to_bss_conf() can return NULL when iterating over valid_links
during HW reset or other state transitions, because the link configuration
in mac80211 may not be set up yet even though the driver's valid_links
bitmap has the link marked as valid.

This causes a NULL pointer dereference in mt76_connac_mcu_uni_add_dev()
when it tries to access bss_conf->vif->type, and similar crashes in other
functions that use bss_conf without checking.

The crash manifests as:
  BUG: kernel NULL pointer dereference, address: 0000000000000000
  RIP: 0010:mt76_connac_mcu_uni_add_dev+0xba/0x1f0 [mt76_connac_lib]
  Call Trace:
   mt7925_vif_connect_iter+0xcb/0x240 [mt7925_common]
   __iterate_interfaces+0x92/0x130 [mac80211]
   ieee80211_iterate_interfaces+0x3d/0x60 [mac80211]
   mt7925_mac_reset_work+0x105/0x190 [mt7925_common]

Add NULL checks for bss_conf in all loops that iterate over valid_links
and call mt792x_vif_to_bss_conf(), skipping links where the mac80211
link configuration is not yet available.

Reported-by: Zac Bowling <zac@zacbowling.com>
Signed-off-by: Zac Bowling <zac@zacbowling.com>
---
 drivers/net/wireless/mediatek/mt76/mt7925/mac.c  | 6 ++++++
 drivers/net/wireless/mediatek/mt76/mt7925/main.c | 8 ++++++++
 2 files changed, 14 insertions(+)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mac.c b/drivers/net/wireless/mediatek/mt76/mt7925/mac.c
index 871b67101..184efe8af 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/mac.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/mac.c
@@ -1271,6 +1271,12 @@ mt7925_vif_connect_iter(void *priv, u8 *mac,
 		bss_conf = mt792x_vif_to_bss_conf(vif, i);
 		mconf = mt792x_vif_to_link(mvif, i);
 
+		/* Skip links that don't have bss_conf set up yet in mac80211.
+		 * This can happen during HW reset when link state is inconsistent.
+		 */
+		if (!bss_conf)
+			continue;
+
 		mt76_connac_mcu_uni_add_dev(&dev->mphy, bss_conf, &mconf->mt76,
 					    &mvif->sta.deflink.wcid, true);
 		mt7925_mcu_set_tx(dev, bss_conf);
diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
index 2d358a966..3001a62a8 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
@@ -1304,6 +1304,8 @@ mt7925_mlo_pm_iter(void *priv, u8 *mac, struct ieee80211_vif *vif)
 	mt792x_mutex_acquire(dev);
 	for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) {
 		bss_conf = mt792x_vif_to_bss_conf(vif, i);
+		if (!bss_conf)
+			continue;
 		mt7925_mcu_uni_bss_ps(dev, bss_conf);
 	}
 	mt792x_mutex_release(dev);
@@ -1630,6 +1632,8 @@ static void mt7925_ipv6_addr_change(struct ieee80211_hw *hw,
 
 	for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) {
 		bss_conf = mt792x_vif_to_bss_conf(vif, i);
+		if (!bss_conf)
+			continue;
 		__mt7925_ipv6_addr_change(hw, bss_conf, idev);
 	}
 }
@@ -1861,6 +1865,8 @@ static void mt7925_vif_cfg_changed(struct ieee80211_hw *hw,
 	if (changed & BSS_CHANGED_ARP_FILTER) {
 		for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) {
 			bss_conf = mt792x_vif_to_bss_conf(vif, i);
+			if (!bss_conf)
+				continue;
 			mt7925_mcu_update_arp_filter(&dev->mt76, bss_conf);
 		}
 	}
@@ -1876,6 +1882,8 @@ static void mt7925_vif_cfg_changed(struct ieee80211_hw *hw,
 			} else if (mvif->mlo_pm_state == MT792x_MLO_CHANGED_PS) {
 				for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) {
 					bss_conf = mt792x_vif_to_bss_conf(vif, i);
+					if (!bss_conf)
+						continue;
 					mt7925_mcu_uni_bss_ps(dev, bss_conf);
 				}
 			}
-- 
2.51.0
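
Since the patch description says every loop that calls mt792x_vif_to_bss_conf() gets a NULL check, a rough way to double-check a diff is sketched below. This is only a heuristic with a made-up helper name, `unguarded_bss_conf`: it scans a unified diff on stdin and flags any lookup whose next non-comment use of `bss_conf` is something other than `if (!bss_conf)`. It only sees lines that appear in the diff context, so it cannot prove that loops outside the hunks are covered.

```shell
# Flag mt792x_vif_to_bss_conf() lookups in a unified diff (stdin) that are
# not followed by an "if (!bss_conf)" check before bss_conf is used.
# Prints each unguarded lookup and returns non-zero if any was found.
unguarded_bss_conf() {
    awk '
        # Diff and hunk headers start a new context; forget any pending lookup.
        /^(diff |index |--- |\+\+\+ |@@ )/ { pending = 0; next }
        {
            line = substr($0, 2)          # drop the diff marker column
            sub(/^[ \t]+/, "", line)      # drop indentation
        }
        line ~ /^(\/\*|\*)/ { next }      # ignore comment lines
        pending && line ~ /bss_conf/ {
            # First real use after the lookup: it should be the NULL check.
            if (line !~ /^if \(!bss_conf\)/) { print lookup; bad = 1 }
            pending = 0
        }
        line ~ /= mt792x_vif_to_bss_conf\(/ { pending = 1; lookup = line }
        END { exit (bad ? 1 : 0) }
    '
}
```

Possible usage, with the patch saved under whatever name you like: `unguarded_bss_conf < mt7925-null-check.patch && echo "all lookups in the diff are guarded"`.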

Submitted the patch upstream. No idea if this fixes the underlying issue of why the link state can be inconsistent in the first place, but it does stop the NULL dereference from panicking the kernel, at least in these code paths.


For Wi-Fi/Bluetooth, I installed an MT7925 card, which works perfectly under both Linux Zorin and Windows 11.

Probably because you have an outdated kernel from before this bug was introduced, or you have been getting lucky, since this bug is a race condition that shows up when reconnecting or roaming between WiFi APs.

Similar problem on NixOS (unstable, kernel 6.12.63). The system hangs: some CLI commands still go through (ls), while anything network-related hangs indefinitely (ip a). Kill doesn’t work on the hung processes, and reboot hangs as well. Deadlock during a syscall?
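
Processes that ignore kill are usually in uninterruptible sleep (state D), which fits the hung-task reports in the logs in this post, where NetworkManager and wpa_supplicant are stuck in `__mutex_lock`. A minimal sketch for listing such tasks while the machine still responds is below. `d_state_tasks` is a made-up helper name, and it assumes procps-style `ps` output on stdin with the state in the second column.

```shell
# Keep the header line plus every task whose state starts with "D"
# (uninterruptible sleep). Expects `ps -eo pid,stat,wchan:32,comm` on stdin.
d_state_tasks() {
    awk 'NR == 1 || $2 ~ /^D/'
}
```

Possible usage: `ps -eo pid,stat,wchan:32,comm | d_state_tasks`. The WCHAN column, where the kernel exposes it, hints at what each task is waiting on. If sysrq is enabled, `echo w | sudo tee /proc/sysrq-trigger` should also dump the blocked tasks into the kernel log.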

Logs

Dec 29 16:40:26.292082 vglfr kernel: wlp192s0: authenticate with 44:ac:85:c4:80:e7 (local address=dc:56:7b:02:b4:1f)

Dec 29 16:40:26.658307 vglfr kernel: wlp192s0: send auth to 44:ac:85:c4:80:e7 (try 1/3)

Dec 29 16:40:26.743083 vglfr kernel: wlp192s0: authenticated

Dec 29 16:40:26.748088 vglfr kernel: wlp192s0: associate with 44:ac:85:c4:80:e7 (try 1/3)

Dec 29 16:40:26.771095 vglfr kernel: wlp192s0: RX AssocResp from 44:ac:85:c4:80:e7 (capab=0x1111 status=0 aid=1)

Dec 29 16:40:26.813077 vglfr kernel: wlp192s0: associated

Dec 29 16:40:27.059223 vglfr kernel: wlp192s0: Limiting TX power to 30 (30 - 0) dBm as advertised by 44:ac:85:c4:80:e7

Dec 29 16:40:28.258171 vglfr kernel: warning: `ThreadPoolForeg' uses wireless extensions which will stop working for Wi-Fi 7 hardware; use nl80211

Dec 29 16:45:36.951733 vglfr kernel: wlp192s0: disconnect from AP 44:ac:85:c4:80:e7 for new auth to 44:ac:85:bc:eb:68

Dec 29 16:45:37.178707 vglfr kernel: wlp192s0: authenticate with 44:ac:85:bc:eb:68 (local address=dc:56:7b:02:b4:1f)

Dec 29 16:45:37.534778 vglfr kernel: wlp192s0: send auth to 44:ac:85:bc:eb:68 (try 1/3)

Dec 29 16:45:40.017753 vglfr kernel: wlp192s0: authenticate with 44:ac:85:c4:80:e7 (local address=dc:56:7b:02:b4:1f)

Dec 29 16:45:40.030679 vglfr kernel: wlp192s0: send auth to 44:ac:85:c4:80:e7 (try 1/3)

Dec 29 16:45:40.135727 vglfr kernel: wlp192s0: send auth to 44:ac:85:c4:80:e7 (try 2/3)

Dec 29 16:45:40.136043 vglfr kernel: wlp192s0: authenticated

Dec 29 16:45:40.140659 vglfr kernel: wlp192s0: associate with 44:ac:85:c4:80:e7 (try 1/3)

Dec 29 16:45:40.199701 vglfr kernel: wlp192s0: RX AssocResp from 44:ac:85:c4:80:e7 (capab=0x1111 status=0 aid=1)

Dec 29 16:45:40.243780 vglfr kernel: wlp192s0: associated

Dec 29 16:45:40.307617 vglfr kernel: wlp192s0: Limiting TX power to 30 (30 - 0) dBm as advertised by 44:ac:85:c4:80:e7

Dec 29 16:49:50.289619 vglfr kernel: wlp192s0: deauthenticating from 44:ac:85:c4:80:e7 by local choice (Reason: 3=DEAUTH_LEAVING)

Dec 29 16:50:02.415711 vglfr kernel: wlp192s0: authenticate with 44:ac:85:c4:80:e7 (local address=dc:56:7b:02:b4:1f)

Dec 29 16:50:02.433702 vglfr kernel: wlp192s0: send auth to 44:ac:85:c4:80:e7 (try 1/3)

Dec 29 16:50:02.442658 vglfr kernel: wlp192s0: authenticated

Dec 29 16:50:02.447621 vglfr kernel: wlp192s0: associate with 44:ac:85:c4:80:e7 (try 1/3)

Dec 29 16:50:02.472649 vglfr kernel: wlp192s0: RX AssocResp from 44:ac:85:c4:80:e7 (capab=0x1111 status=0 aid=1)

Dec 29 16:50:02.516663 vglfr kernel: wlp192s0: associated

Dec 29 16:50:02.517605 vglfr kernel: wlp192s0: Limiting TX power to 30 (30 - 0) dBm as advertised by 44:ac:85:c4:80:e7

Dec 29 16:52:55.394602 vglfr kernel: wlp192s0: deauthenticating from 44:ac:85:c4:80:e7 by local choice (Reason: 3=DEAUTH_LEAVING)

Dec 29 16:53:01.925669 vglfr kernel: wlp192s0: authenticate with 44:ac:85:c4:80:e7 (local address=dc:56:7b:02:b4:1f)

Dec 29 16:53:01.942652 vglfr kernel: wlp192s0: send auth to 44:ac:85:c4:80:e7 (try 1/3)

Dec 29 16:53:01.952641 vglfr kernel: wlp192s0: authenticated

Dec 29 16:53:01.957656 vglfr kernel: wlp192s0: associate with 44:ac:85:c4:80:e7 (try 1/3)

Dec 29 16:53:01.982682 vglfr kernel: wlp192s0: RX AssocResp from 44:ac:85:c4:80:e7 (capab=0x1111 status=0 aid=1)

Dec 29 16:53:02.026648 vglfr kernel: wlp192s0: associated

Dec 29 16:53:02.041976 vglfr kernel: wlp192s0: Limiting TX power to 30 (30 - 0) dBm as advertised by 44:ac:85:c4:80:e7

Dec 29 16:53:28.682603 vglfr kernel: wlp192s0: deauthenticating from 44:ac:85:c4:80:e7 by local choice (Reason: 3=DEAUTH_LEAVING)

Dec 29 16:54:37.993771 vglfr kernel: wlp192s0: authenticate with 44:ac:85:c4:80:e7 (local address=dc:56:7b:02:b4:1f)

Dec 29 16:54:38.011676 vglfr kernel: wlp192s0: send auth to 44:ac:85:c4:80:e7 (try 1/3)

Dec 29 16:54:38.021664 vglfr kernel: wlp192s0: authenticated

Dec 29 16:54:38.026117 vglfr kernel: wlp192s0: associate with 44:ac:85:c4:80:e7 (try 1/3)

Dec 29 16:54:38.048674 vglfr kernel: wlp192s0: RX AssocResp from 44:ac:85:c4:80:e7 (capab=0x1111 status=0 aid=1)

Dec 29 16:54:38.089645 vglfr kernel: wlp192s0: associated

Dec 29 16:54:38.091656 vglfr kernel: wlp192s0: Limiting TX power to 30 (30 - 0) dBm as advertised by 44:ac:85:c4:80:e7

Dec 29 16:59:47.391645 vglfr kernel: wlp192s0: disconnect from AP 44:ac:85:c4:80:e7 for new auth to 44:ac:85:c4:80:e8

Dec 29 16:59:47.594893 vglfr kernel: wlp192s0: authenticate with 44:ac:85:c4:80:e8 (local address=dc:56:7b:02:b4:1f)

Dec 29 16:59:47.613684 vglfr kernel: wlp192s0: send auth to 44:ac:85:c4:80:e8 (try 1/3)

Dec 29 16:59:50.177708 vglfr kernel: wlp192s0: authenticate with 44:ac:85:c4:80:e7 (local address=dc:56:7b:02:b4:1f)

Dec 29 16:59:50.195675 vglfr kernel: wlp192s0: send auth to 44:ac:85:c4:80:e7 (try 1/3)

Dec 29 16:59:50.302444 vglfr kernel: wlp192s0: send auth to 44:ac:85:c4:80:e7 (try 2/3)

Dec 29 16:59:50.305637 vglfr kernel: wlp192s0: authenticated

Dec 29 16:59:50.309646 vglfr kernel: wlp192s0: associate with 44:ac:85:c4:80:e7 (try 1/3)

Dec 29 16:59:50.334681 vglfr kernel: wlp192s0: RX AssocResp from 44:ac:85:c4:80:e7 (capab=0x1111 status=0 aid=1)

Dec 29 16:59:50.375650 vglfr kernel: wlp192s0: associated

Dec 29 16:59:50.411691 vglfr kernel: wlp192s0: Limiting TX power to 30 (30 - 0) dBm as advertised by 44:ac:85:c4:80:e7

Dec 29 17:01:10.971616 vglfr kernel: wlp192s0: deauthenticating from 44:ac:85:c4:80:e7 by local choice (Reason: 3=DEAUTH_LEAVING)

Dec 29 17:02:33.782775 vglfr kernel: wlp192s0: authenticate with 44:ac:85:c4:80:e7 (local address=dc:56:7b:02:b4:1f)

Dec 29 17:02:33.799664 vglfr kernel: wlp192s0: send auth to 44:ac:85:c4:80:e7 (try 1/3)

Dec 29 17:02:33.812708 vglfr kernel: wlp192s0: authenticated

Dec 29 17:02:33.816611 vglfr kernel: wlp192s0: associate with 44:ac:85:c4:80:e7 (try 1/3)

Dec 29 17:02:33.839678 vglfr kernel: wlp192s0: RX AssocResp from 44:ac:85:c4:80:e7 (capab=0x1111 status=0 aid=1)

Dec 29 17:02:33.883636 vglfr kernel: wlp192s0: associated

Dec 29 17:02:33.949628 vglfr kernel: wlp192s0: Limiting TX power to 30 (30 - 0) dBm as advertised by 44:ac:85:c4:80:e7

Dec 29 17:02:58.527601 vglfr kernel: wlp192s0: deauthenticating from 44:ac:85:c4:80:e7 by local choice (Reason: 3=DEAUTH_LEAVING)

Dec 29 17:03:47.085694 vglfr kernel: wlp192s0: authenticate with 44:ac:85:c4:80:e7 (local address=dc:56:7b:02:b4:1f)

Dec 29 17:03:47.102647 vglfr kernel: wlp192s0: send auth to 44:ac:85:c4:80:e7 (try 1/3)

Dec 29 17:03:47.119634 vglfr kernel: wlp192s0: authenticated

Dec 29 17:03:47.123648 vglfr kernel: wlp192s0: associate with 44:ac:85:c4:80:e7 (try 1/3)

Dec 29 17:03:47.149668 vglfr kernel: wlp192s0: RX AssocResp from 44:ac:85:c4:80:e7 (capab=0x1111 status=0 aid=1)

Dec 29 17:03:47.194627 vglfr kernel: wlp192s0: associated

Dec 29 17:03:47.198651 vglfr kernel: wlp192s0: Limiting TX power to 30 (30 - 0) dBm as advertised by 44:ac:85:c4:80:e7

Dec 29 17:06:43.729612 vglfr kernel: tun: Universal TUN/TAP device driver, 1.6

Dec 29 17:08:56.509635 vglfr kernel: wlp192s0: disconnect from AP 44:ac:85:c4:80:e7 for new auth to 44:ac:85:c4:80:e8

Dec 29 17:08:56.721087 vglfr kernel: wlp192s0: authenticate with 44:ac:85:c4:80:e8 (local address=dc:56:7b:02:b4:1f)

Dec 29 17:08:56.732681 vglfr kernel: wlp192s0: send auth to 44:ac:85:c4:80:e8 (try 1/3)

Dec 29 17:08:59.250043 vglfr kernel: wlp192s0: authenticate with 44:ac:85:c4:80:e7 (local address=dc:56:7b:02:b4:1f)

Dec 29 17:08:59.263902 vglfr kernel: wlp192s0: send auth to 44:ac:85:c4:80:e7 (try 1/3)

Dec 29 17:08:59.371696 vglfr kernel: wlp192s0: send auth to 44:ac:85:c4:80:e7 (try 2/3)

Dec 29 17:08:59.390652 vglfr kernel: wlp192s0: authenticated

Dec 29 17:08:59.394648 vglfr kernel: wlp192s0: associate with 44:ac:85:c4:80:e7 (try 1/3)

Dec 29 17:08:59.422654 vglfr kernel: wlp192s0: RX AssocResp from 44:ac:85:c4:80:e7 (capab=0x1111 status=0 aid=1)

Dec 29 17:08:59.465650 vglfr kernel: wlp192s0: associated

Dec 29 17:08:59.579693 vglfr kernel: wlp192s0: Limiting TX power to 30 (30 - 0) dBm as advertised by 44:ac:85:c4:80:e7

Dec 29 17:14:08.769642 vglfr kernel: wlp192s0: disconnect from AP 44:ac:85:c4:80:e7 for new auth to 44:ac:85:c4:80:e8

Dec 29 17:14:08.944993 vglfr kernel: wlp192s0: authenticate with 44:ac:85:c4:80:e8 (local address=dc:56:7b:02:b4:1f)

Dec 29 17:14:08.964233 vglfr kernel: wlp192s0: send auth to 44:ac:85:c4:80:e8 (try 1/3)

Dec 29 17:14:14.042757 vglfr kernel: wlp192s0: authenticate with 44:ac:85:c4:80:e7 (local address=dc:56:7b:02:b4:1f)

Dec 29 17:14:14.052647 vglfr kernel: wlp192s0: send auth to 44:ac:85:c4:80:e7 (try 1/3)

Dec 29 17:14:14.127632 vglfr kernel: wlp192s0: authenticated

Dec 29 17:14:14.131645 vglfr kernel: wlp192s0: associate with 44:ac:85:c4:80:e7 (try 1/3)

Dec 29 17:14:14.154651 vglfr kernel: wlp192s0: RX AssocResp from 44:ac:85:c4:80:e7 (capab=0x1111 status=0 aid=1)

Dec 29 17:14:14.197646 vglfr kernel: wlp192s0: associated

Dec 29 17:14:14.261625 vglfr kernel: wlp192s0: Limiting TX power to 30 (30 - 0) dBm as advertised by 44:ac:85:c4:80:e7

Dec 29 17:19:23.488624 vglfr kernel: wlp192s0: disconnect from AP 44:ac:85:c4:80:e7 for new auth to 44:ac:85:c4:80:e8

Dec 29 17:19:23.683665 vglfr kernel: wlp192s0: authenticate with 44:ac:85:c4:80:e8 (local address=dc:56:7b:02:b4:1f)

Dec 29 17:19:23.694733 vglfr kernel: wlp192s0: send auth to 44:ac:85:c4:80:e8 (try 1/3)

Dec 29 17:23:05.951741 vglfr kernel: INFO: task NetworkManager:1975 blocked for more than 122 seconds.

Dec 29 17:23:05.952209 vglfr kernel: Not tainted 6.12.63 #1-NixOS

Dec 29 17:23:05.952253 vglfr kernel: task:NetworkManager state:D stack:0 pid:1975 tgid:1975 ppid:1 flags:0x00000002

Dec 29 17:23:05.952278 vglfr kernel: Call Trace:

Dec 29 17:23:05.952297 vglfr kernel: <TASK>

Dec 29 17:23:05.952347 vglfr kernel: __schedule+0x426/0x12c0

Dec 29 17:23:05.952370 vglfr kernel: ? drain_stock+0x68/0xa0

Dec 29 17:23:05.952389 vglfr kernel: ? __refill_stock+0x81/0x90

Dec 29 17:23:05.952407 vglfr kernel: ? generic_permission+0x39/0x220

Dec 29 17:23:05.952426 vglfr kernel: schedule+0x27/0xf0

Dec 29 17:23:05.952443 vglfr kernel: schedule_preempt_disabled+0x15/0x30

Dec 29 17:23:05.952461 vglfr kernel: __mutex_lock.constprop.0+0x3d0/0x6d0

Dec 29 17:23:05.952478 vglfr kernel: ? security_capable+0x59/0xc0

Dec 29 17:23:05.952496 vglfr kernel: rtnetlink_rcv_msg+0xff/0x3f0

Dec 29 17:23:05.952515 vglfr kernel: ? ep_autoremove_wake_function+0x23/0x50

Dec 29 17:23:05.952534 vglfr kernel: ? __wake_up_common+0x75/0xa0

Dec 29 17:23:05.952554 vglfr kernel: ? __pfx_rtnetlink_rcv_msg+0x10/0x10

Dec 29 17:23:05.952570 vglfr kernel: netlink_rcv_skb+0x50/0x100

Dec 29 17:23:05.952588 vglfr kernel: netlink_unicast+0x251/0x3a0

Dec 29 17:23:05.952622 vglfr kernel: netlink_sendmsg+0x21b/0x470

Dec 29 17:23:05.952639 vglfr kernel: ____sys_sendmsg+0x3a3/0x3e0

Dec 29 17:23:05.952657 vglfr kernel: ___sys_sendmsg+0x9a/0xe0

Dec 29 17:23:05.952674 vglfr kernel: __sys_sendmsg+0x7a/0xd0

Dec 29 17:23:05.952692 vglfr kernel: do_syscall_64+0xb7/0x200

Dec 29 17:23:05.952712 vglfr kernel: entry_SYSCALL_64_after_hwframe+0x77/0x7f

Dec 29 17:23:05.952733 vglfr kernel: RIP: 0033:0x7fd7d392577b

Dec 29 17:23:05.952750 vglfr kernel: RSP: 002b:00007ffede5ceab0 EFLAGS: 00000246 ORIG_RAX: 000000000000002e

Dec 29 17:23:05.952766 vglfr kernel: RAX: ffffffffffffffda RBX: 00005648856f8510 RCX: 00007fd7d392577b

Dec 29 17:23:05.952780 vglfr kernel: RDX: 0000000000000000 RSI: 00007ffede5ceb00 RDI: 000000000000000c

Dec 29 17:23:05.952799 vglfr kernel: RBP: 00007ffede5cead0 R08: 0000000000000000 R09: 0000000000000000

Dec 29 17:23:05.952815 vglfr kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffede5ceb00

Dec 29 17:23:05.952834 vglfr kernel: R13: 0000000000000231 R14: 00007ffede5ced0c R15: 0000000000000000

Dec 29 17:23:05.952923 vglfr kernel: </TASK>

Dec 29 17:23:05.952939 vglfr kernel: INFO: task wpa_supplicant:2040 blocked for more than 122 seconds.

Dec 29 17:23:05.952954 vglfr kernel: Not tainted 6.12.63 #1-NixOS

Dec 29 17:23:05.952983 vglfr kernel: task:wpa_supplicant state:D stack:0 pid:2040 tgid:2040 ppid:1 flags:0x00000002

Dec 29 17:23:05.952992 vglfr kernel: Call Trace:

Dec 29 17:23:05.952998 vglfr kernel: <TASK>

Dec 29 17:23:05.953006 vglfr kernel: __schedule+0x426/0x12c0

Dec 29 17:23:05.953013 vglfr kernel: ? schedule_hrtimeout_range_clock+0x108/0x1b0

Dec 29 17:23:05.953021 vglfr kernel: schedule+0x27/0xf0

Dec 29 17:23:05.953029 vglfr kernel: schedule_preempt_disabled+0x15/0x30

Dec 29 17:23:05.953035 vglfr kernel: __mutex_lock.constprop.0+0x3d0/0x6d0

Dec 29 17:23:05.953042 vglfr kernel: ? __nla_parse+0x24/0x30

Dec 29 17:23:05.953048 vglfr kernel: nl80211_pre_doit+0x28/0x270 [cfg80211]

Dec 29 17:23:05.953057 vglfr kernel: genl_family_rcv_msg_doit+0xda/0x150

Dec 29 17:23:05.953067 vglfr kernel: genl_rcv_msg+0x1b7/0x2c0

Dec 29 17:23:05.953075 vglfr kernel: ? __pfx_nl80211_pre_doit+0x10/0x10 [cfg80211]

Dec 29 17:23:05.953083 vglfr kernel: ? __pfx_nl80211_abort_scan+0x10/0x10 [cfg80211]

Dec 29 17:23:05.953093 vglfr kernel: ? __pfx_nl80211_post_doit+0x10/0x10 [cfg80211]

Dec 29 17:23:05.953102 vglfr kernel: ? __pfx_genl_rcv_msg+0x10/0x10

Dec 29 17:23:05.953114 vglfr kernel: netlink_rcv_skb+0x50/0x100

Dec 29 17:23:05.953119 vglfr kernel: genl_rcv+0x28/0x40

Dec 29 17:23:05.953127 vglfr kernel: netlink_unicast+0x251/0x3a0

Dec 29 17:23:05.953131 vglfr kernel: netlink_sendmsg+0x21b/0x470

Dec 29 17:23:05.953137 vglfr kernel: ____sys_sendmsg+0x3a3/0x3e0

Dec 29 17:23:05.953144 vglfr kernel: ___sys_sendmsg+0x9a/0xe0

Dec 29 17:23:05.953152 vglfr kernel: __sys_sendmsg+0x7a/0xd0

Dec 29 17:23:05.953158 vglfr kernel: do_syscall_64+0xb7/0x200

Dec 29 17:23:05.953166 vglfr kernel: entry_SYSCALL_64_after_hwframe+0x77/0x7f

Dec 29 17:23:05.953172 vglfr kernel: RIP: 0033:0x7fbb7c125734

Dec 29 17:23:05.953181 vglfr kernel: RSP: 002b:00007ffe7af5b8e8 EFLAGS: 00000202 ORIG_RAX: 000000000000002e

Dec 29 17:23:05.953189 vglfr kernel: RAX: ffffffffffffffda RBX: 0000557de3c92c50 RCX: 00007fbb7c125734

Dec 29 17:23:05.953200 vglfr kernel: RDX: 0000000000000000 RSI: 00007ffe7af5b920 RDI: 0000000000000006

Dec 29 17:23:05.953206 vglfr kernel: RBP: 00007ffe7af5b910 R08: 0000000000000000 R09: 0000000000000000

Dec 29 17:23:05.953212 vglfr kernel: R10: 0000000000000000 R11: 0000000000000202 R12: 0000557de3d4ac00

Dec 29 17:23:05.953221 vglfr kernel: R13: 0000557de3c92b60 R14: 00007ffe7af5b920 R15: 0000000000000000

Dec 29 17:23:05.953228 vglfr kernel:

Dec 29 17:23:05.953233 vglfr kernel: INFO: task .i3status-rs-wr:2280 blocked for more than 122 seconds.

Dec 29 17:23:05.953243 vglfr kernel: Not tainted 6.12.63 #1-NixOS

Dec 29 17:23:05.953253 vglfr kernel: task:.i3status-rs-wr state:D stack:0 pid:2280 tgid:2280 ppid:2268 flags:0x00000002

Dec 29 17:23:05.953259 vglfr kernel: Call Trace:

Dec 29 17:23:05.953263 vglfr kernel:

Dec 29 17:23:05.953270 vglfr kernel: __schedule+0x426/0x12c0

Dec 29 17:23:05.953276 vglfr kernel: ? rtnl_fill_ifinfo.isra.0+0x12fe/0x1530

Dec 29 17:23:05.953285 vglfr kernel: ? __nla_validate_parse+0x5f/0xcb0

Dec 29 17:23:05.953292 vglfr kernel: schedule+0x27/0xf0

Dec 29 17:23:05.953299 vglfr kernel: schedule_preempt_disabled+0x15/0x30

Dec 29 17:23:05.953305 vglfr kernel: __mutex_lock.constprop.0+0x3d0/0x6d0

Dec 29 17:23:05.953312 vglfr kernel: ? nl80211_dump_wiphy_parse.constprop.0+0x166/0x1b0 [cfg80211]

Dec 29 17:23:05.953318 vglfr kernel: nl80211_dump_interface+0xe4/0x2a0 [cfg80211]

Dec 29 17:23:05.953325 vglfr kernel: genl_dumpit+0x33/0x90

Dec 29 17:23:05.953332 vglfr kernel: netlink_dump+0x147/0x340

Dec 29 17:23:05.953341 vglfr kernel: __netlink_dump_start+0x1eb/0x310

Dec 29 17:23:05.953352 vglfr kernel: genl_family_rcv_msg_dumpit+0x9a/0x100

Dec 29 17:23:05.953362 vglfr kernel: ? __pfx_genl_start+0x10/0x10

Dec 29 17:23:05.953369 vglfr kernel: ? __pfx_genl_dumpit+0x10/0x10

Dec 29 17:23:05.953377 vglfr kernel: ? __pfx_genl_done+0x10/0x10

Dec 29 17:23:05.953386 vglfr kernel: genl_rcv_msg+0x149/0x2c0

Dec 29 17:23:05.953394 vglfr kernel: ? __pfx_nl80211_dump_interface+0x10/0x10 [cfg80211]

Dec 29 17:23:05.953399 vglfr kernel: ? __pfx_genl_rcv_msg+0x10/0x10

Dec 29 17:23:05.953404 vglfr kernel: netlink_rcv_skb+0x50/0x100

Dec 29 17:23:05.953411 vglfr kernel: genl_rcv+0x28/0x40

Dec 29 17:23:05.953416 vglfr kernel: netlink_unicast+0x251/0x3a0

Dec 29 17:23:05.953420 vglfr kernel: netlink_sendmsg+0x21b/0x470

Dec 29 17:23:05.953426 vglfr kernel: __sys_sendto+0x1dd/0x1f0

Dec 29 17:23:05.953434 vglfr kernel: __x64_sys_sendto+0x24/0x30

Dec 29 17:23:05.953442 vglfr kernel: do_syscall_64+0xb7/0x200

Dec 29 17:23:05.953448 vglfr kernel: entry_SYSCALL_64_after_hwframe+0x77/0x7f

Dec 29 17:23:05.953455 vglfr kernel: RIP: 0033:0x7f5d8b3255ba

Dec 29 17:23:05.953461 vglfr kernel: RSP: 002b:00007ffc9f544680 EFLAGS: 00000246 ORIG_RAX: 000000000000002c

Dec 29 17:23:05.953468 vglfr kernel: RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f5d8b3255ba

Dec 29 17:23:05.953474 vglfr kernel: RDX: 0000000000000014 RSI: 0000559ae0d704c0 RDI: 0000000000000015

Dec 29 17:23:05.953477 vglfr kernel: RBP: 00007ffc9f5446b0 R08: 0000000000000000 R09: 0000000000000000

Dec 29 17:23:05.953485 vglfr kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 0000001700000003

Dec 29 17:23:05.953489 vglfr kernel: R13: 8000000000000000 R14: 0000559ae0d704c0 R15: 0000559ae0d91b10

Dec 29 17:23:05.953495 vglfr kernel:

Dec 29 17:23:05.953501 vglfr kernel: INFO: task kworker/u129:1:6110 blocked for more than 122 seconds.

Dec 29 17:23:05.953507 vglfr kernel: Not tainted 6.12.63 #1-NixOS

Dec 29 17:23:05.953516 vglfr kernel: task:kworker/u129:1 state:D stack:0 pid:6110 tgid:6110 ppid:2 flags:0x00004000

Dec 29 17:23:05.953524 vglfr kernel: Workqueue: events_unbound cfg80211_wiphy_work [cfg80211]

Dec 29 17:23:05.953532 vglfr kernel: Call Trace:

Dec 29 17:23:05.953538 vglfr kernel:

Dec 29 17:23:05.953544 vglfr kernel: __schedule+0x426/0x12c0

Dec 29 17:23:05.953549 vglfr kernel: ? __schedule+0x42e/0x12c0

Dec 29 17:23:05.953552 vglfr kernel: ? iommu_map+0x5c/0xd0

Dec 29 17:23:05.953559 vglfr kernel: schedule+0x27/0xf0

Dec 29 17:23:05.953563 vglfr kernel: schedule_timeout+0x12f/0x160

Dec 29 17:23:05.953567 vglfr kernel: wait_for_completion+0x8a/0x160

Dec 29 17:23:05.953576 vglfr kernel: __flush_work+0x2b3/0x3b0

Dec 29 17:23:05.953583 vglfr kernel: ? __pfx_wq_barrier_func+0x10/0x10

Dec 29 17:23:05.953590 vglfr kernel: cancel_work_sync+0x5e/0x80

Dec 29 17:23:05.953599 vglfr kernel: mt7925_roc_abort_sync+0x2d/0x60 [mt7925_common]

Dec 29 17:23:05.953607 vglfr kernel: mt7925_mac_sta_remove_links.isra.0+0x1e2/0x470 [mt7925_common]

Dec 29 17:23:05.953614 vglfr kernel: mt7925_mac_sta_remove+0x32/0x70 [mt7925_common]

Dec 29 17:23:05.953620 vglfr kernel: __mt76_sta_remove+0x6a/0xc0 [mt76]

Dec 29 17:23:05.953627 vglfr kernel: mt76_sta_state+0x94/0x270 [mt76]

Dec 29 17:23:05.953635 vglfr kernel: drv_sta_state+0xf5/0x600 [mac80211]

Dec 29 17:23:05.953642 vglfr kernel: __sta_info_destroy_part2+0x198/0x1d0 [mac80211]

Dec 29 17:23:05.953649 vglfr kernel: sta_info_destroy_addr+0x33/0x40 [mac80211]

Dec 29 17:23:05.953656 vglfr kernel: ieee80211_destroy_auth_data+0x67/0xb0 [mac80211]

Dec 29 17:23:05.953662 vglfr kernel: ieee80211_sta_work+0x2b1/0x530 [mac80211]

Dec 29 17:23:05.953670 vglfr kernel: ? finish_task_switch.isra.0+0x99/0x2e0

Dec 29 17:23:05.953677 vglfr kernel: ? skb_dequeue+0x72/0x80

Dec 29 17:23:05.953682 vglfr kernel: ? ieee80211_iface_work+0x166/0x490 [mac80211]

Dec 29 17:23:05.953689 vglfr kernel: cfg80211_wiphy_work+0xef/0x160 [cfg80211]

Dec 29 17:23:05.953697 vglfr kernel: process_one_work+0x18a/0x350

Dec 29 17:23:05.953704 vglfr kernel: worker_thread+0x220/0x350

Dec 29 17:23:05.953712 vglfr kernel: ? __pfx_worker_thread+0x10/0x10

Dec 29 17:23:05.953719 vglfr kernel: kthread+0xcd/0x100

Dec 29 17:23:05.953725 vglfr kernel: ? __pfx_kthread+0x10/0x10

Dec 29 17:23:05.953731 vglfr kernel: ret_from_fork+0x31/0x50

Dec 29 17:23:05.953739 vglfr kernel: ? __pfx_kthread+0x10/0x10

Dec 29 17:23:05.953745 vglfr kernel: ret_from_fork_asm+0x1a/0x30

Dec 29 17:23:05.953750 vglfr kernel:

Dec 29 17:23:05.953754 vglfr kernel: INFO: task kworker/u128:1:8823 blocked for more than 122 seconds.

Dec 29 17:23:05.953761 vglfr kernel: Not tainted 6.12.63 #1-NixOS

Dec 29 17:23:05.953770 vglfr kernel: task:kworker/u128:1 state:D stack:0 pid:8823 tgid:8823 ppid:2 flags:0x00004000

Dec 29 17:23:05.953778 vglfr kernel: Workqueue: phy0 mt7925_roc_work [mt7925_common]

Dec 29 17:23:05.953783 vglfr kernel: Call Trace:

Dec 29 17:23:05.953787 vglfr kernel:

Dec 29 17:23:05.953791 vglfr kernel: __schedule+0x426/0x12c0

Dec 29 17:23:05.953795 vglfr kernel: ? hrtimer_try_to_cancel.part.0+0x50/0xe0

Dec 29 17:23:05.953802 vglfr kernel: ? psi_group_change+0x126/0x300

Dec 29 17:23:05.953810 vglfr kernel: schedule+0x27/0xf0

Dec 29 17:23:05.953815 vglfr kernel: schedule_preempt_disabled+0x15/0x30

Dec 29 17:23:05.953820 vglfr kernel: __mutex_lock.constprop.0+0x3d0/0x6d0

Dec 29 17:23:05.953827 vglfr kernel: mt7925_roc_work+0x39/0xa0 [mt7925_common]

Dec 29 17:23:05.953835 vglfr kernel: process_one_work+0x18a/0x350

Dec 29 17:23:05.953841 vglfr kernel: worker_thread+0x220/0x350

Dec 29 17:23:05.953845 vglfr kernel: ? __pfx_worker_thread+0x10/0x10

Dec 29 17:23:05.953850 vglfr kernel: kthread+0xcd/0x100

Dec 29 17:23:05.953855 vglfr kernel: ? __pfx_kthread+0x10/0x10

Dec 29 17:23:05.953860 vglfr kernel: ret_from_fork+0x31/0x50

Dec 29 17:23:05.953866 vglfr kernel: ? __pfx_kthread+0x10/0x10

Dec 29 17:23:05.953872 vglfr kernel: ret_from_fork_asm+0x1a/0x30

Dec 29 17:23:05.953877 vglfr kernel:

Dec 29 17:23:05.953883 vglfr kernel: INFO: task kworker/u129:7:13048 blocked for more than 122 seconds.

Dec 29 17:23:05.953893 vglfr kernel: Not tainted 6.12.63 #1-NixOS

Dec 29 17:23:05.953902 vglfr kernel: task:kworker/u129:7 state:D stack:0 pid:13048 tgid:13048 ppid:2 flags:0x00004000

Dec 29 17:23:05.953913 vglfr kernel: Workqueue: events_power_efficient reg_check_chans_work [cfg80211]

Dec 29 17:23:05.953920 vglfr kernel: Call Trace:

Dec 29 17:23:05.953925 vglfr kernel:

Dec 29 17:23:05.953930 vglfr kernel: __schedule+0x426/0x12c0

Dec 29 17:23:05.953934 vglfr kernel: ? blake2s_final+0x53/0x90

Dec 29 17:23:05.953942 vglfr kernel: schedule+0x27/0xf0

Dec 29 17:23:05.953948 vglfr kernel: schedule_preempt_disabled+0x15/0x30

Dec 29 17:23:05.953956 vglfr kernel: __mutex_lock.constprop.0+0x3d0/0x6d0

Dec 29 17:23:05.953959 vglfr kernel: reg_check_chans_work+0x31/0x580 [cfg80211]

Dec 29 17:23:05.953967 vglfr kernel: ? crng_reseed+0xf0/0x190

Dec 29 17:23:05.953975 vglfr kernel: process_one_work+0x18a/0x350

Dec 29 17:23:05.953982 vglfr kernel: worker_thread+0x220/0x350

Dec 29 17:23:05.953986 vglfr kernel: ? __pfx_worker_thread+0x10/0x10

Dec 29 17:23:05.953992 vglfr kernel: kthread+0xcd/0x100

Dec 29 17:23:05.953997 vglfr kernel: ? __pfx_kthread+0x10/0x10

Dec 29 17:23:05.954002 vglfr kernel: ret_from_fork+0x31/0x50

Dec 29 17:23:05.954005 vglfr kernel: ? __pfx_kthread+0x10/0x10

Dec 29 17:23:05.954009 vglfr kernel: ret_from_fork_asm+0x1a/0x30

Dec 29 17:23:05.954012 vglfr kernel:

Dec 29 17:23:05.954020 vglfr kernel: INFO: task openvpn:44647 blocked for more than 122 seconds.

Dec 29 17:23:05.954024 vglfr kernel: Not tainted 6.12.63 #1-NixOS

Dec 29 17:23:05.954033 vglfr kernel: task:openvpn state:D stack:0 pid:44647 tgid:44647 ppid:1 flags:0x00000002

Dec 29 17:23:05.954071 vglfr kernel: Call Trace:

Dec 29 17:23:05.954075 vglfr kernel:

Dec 29 17:23:05.954078 vglfr kernel: __schedule+0x426/0x12c0

Dec 29 17:23:05.954081 vglfr kernel: ? __schedule+0x42e/0x12c0

Dec 29 17:23:05.954086 vglfr kernel: ? get_nohz_timer_target+0x2f/0x140

Dec 29 17:23:05.954090 vglfr kernel: schedule+0x27/0xf0

Dec 29 17:23:05.954094 vglfr kernel: schedule_preempt_disabled+0x15/0x30

Dec 29 17:23:05.954097 vglfr kernel: __mutex_lock.constprop.0+0x3d0/0x6d0

Dec 29 17:23:05.954102 vglfr kernel: ? security_capable+0x59/0xc0

Dec 29 17:23:05.954106 vglfr kernel: rtnetlink_rcv_msg+0xff/0x3f0

Dec 29 17:23:05.954109 vglfr kernel: ? __mod_memcg_lruvec_state+0x9c/0x150

Dec 29 17:23:05.954114 vglfr kernel: ? __pfx_rtnetlink_rcv_msg+0x10/0x10

Dec 29 17:23:05.954118 vglfr kernel: netlink_rcv_skb+0x50/0x100

Dec 29 17:23:05.954123 vglfr kernel: netlink_unicast+0x251/0x3a0

Dec 29 17:23:05.954126 vglfr kernel: netlink_sendmsg+0x21b/0x470

Dec 29 17:23:05.954130 vglfr kernel: ____sys_sendmsg+0x3a3/0x3e0

Dec 29 17:23:05.954133 vglfr kernel: ___sys_sendmsg+0x9a/0xe0

Dec 29 17:23:05.954138 vglfr kernel: __sys_sendmsg+0x7a/0xd0

Dec 29 17:23:05.954142 vglfr kernel: do_syscall_64+0xb7/0x200

Dec 29 17:23:05.954147 vglfr kernel: entry_SYSCALL_64_after_hwframe+0x77/0x7f

Dec 29 17:23:05.954151 vglfr kernel: RIP: 0033:0x7f920f325734

Dec 29 17:23:05.954154 vglfr kernel: RSP: 002b:00007ffc6f96d5e8 EFLAGS: 00000202 ORIG_RAX: 000000000000002e

Dec 29 17:23:05.954159 vglfr kernel: RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f920f325734

Dec 29 17:23:05.954163 vglfr kernel: RDX: 0000000000000000 RSI: 00007ffc6f96d640 RDI: 0000000000000003

Dec 29 17:23:05.954168 vglfr kernel: RBP: 00007ffc6f9716c0 R08: 0000000000000000 R09: 0000000000000020

Dec 29 17:23:05.954171 vglfr kernel: R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000000

Dec 29 17:23:05.954176 vglfr kernel: R13: 0000000000000004 R14: 00007ffc6f96d624 R15: 00007ffc6f96d614

Dec 29 17:23:05.954180 vglfr kernel:

Dec 29 17:23:05.954183 vglfr kernel: INFO: task ThreadPoolForeg:46505 blocked for more than 122 seconds.

Dec 29 17:23:05.954187 vglfr kernel: Not tainted 6.12.63 #1-NixOS

Dec 29 17:23:05.954194 vglfr kernel: task:ThreadPoolForeg state:D stack:0 pid:46505 tgid:46487 ppid:46466 flags:0x00000002

Dec 29 17:23:05.954197 vglfr kernel: Call Trace:

Dec 29 17:23:05.954201 vglfr kernel:

Dec 29 17:23:05.954205 vglfr kernel: __schedule+0x426/0x12c0

Dec 29 17:23:05.954208 vglfr kernel: ? __submit_bio+0x1b5/0x280

Dec 29 17:23:05.954211 vglfr kernel: ? __rmqueue_pcplist+0x73/0x1140

Dec 29 17:23:05.954215 vglfr kernel: schedule+0x27/0xf0

Dec 29 17:23:05.954219 vglfr kernel: schedule_preempt_disabled+0x15/0x30

Dec 29 17:23:05.954223 vglfr kernel: __mutex_lock.constprop.0+0x3d0/0x6d0

Dec 29 17:23:05.954227 vglfr kernel: ? __pfx_rtnl_dump_all+0x10/0x10

Dec 29 17:23:05.954232 vglfr kernel: rtnl_dumpit+0x74/0xa0

Dec 29 17:23:05.954235 vglfr kernel: netlink_dump+0x147/0x340

Dec 29 17:23:05.954239 vglfr kernel: __netlink_dump_start+0x1eb/0x310

Dec 29 17:23:05.954243 vglfr kernel: ? __pfx_rtnl_dump_all+0x10/0x10

Dec 29 17:23:05.954246 vglfr kernel: rtnetlink_rcv_msg+0x2ae/0x3f0

Dec 29 17:23:05.954249 vglfr kernel: ? __pfx_rtnl_dumpit+0x10/0x10

Dec 29 17:23:05.954254 vglfr kernel: ? __pfx_rtnl_dump_all+0x10/0x10

Dec 29 17:23:05.954258 vglfr kernel: ? __pfx_rtnetlink_rcv_msg+0x10/0x10

Dec 29 17:23:05.954262 vglfr kernel: netlink_rcv_skb+0x50/0x100

Dec 29 17:23:05.954267 vglfr kernel: netlink_unicast+0x251/0x3a0

Dec 29 17:23:05.954270 vglfr kernel: netlink_sendmsg+0x21b/0x470

Dec 29 17:23:05.954273 vglfr kernel: __sys_sendto+0x1dd/0x1f0

Dec 29 17:23:05.954278 vglfr kernel: __x64_sys_sendto+0x24/0x30

Dec 29 17:23:05.954283 vglfr kernel: do_syscall_64+0xb7/0x200

Dec 29 17:23:05.954286 vglfr kernel: entry_SYSCALL_64_after_hwframe+0x77/0x7f

Dec 29 17:23:05.954291 vglfr kernel: RIP: 0033:0x7f3505f25874

Dec 29 17:23:05.954297 vglfr kernel: RSP: 002b:00007f34f1bf7740 EFLAGS: 00000246 ORIG_RAX: 000000000000002c

Dec 29 17:23:05.954301 vglfr kernel: RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f3505f25874

Dec 29 17:23:05.954305 vglfr kernel: RDX: 0000000000000014 RSI: 00007f34f1bf8870 RDI: 0000000000000061

Dec 29 17:23:05.954310 vglfr kernel: RBP: 00007f34f1bf7780 R08: 00007f34f1bf8814 R09: 000000000000000c

Dec 29 17:23:05.954314 vglfr kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000061

Dec 29 17:23:05.954316 vglfr kernel: R13: 00007f34f1bf8870 R14: 00007f34f1bf8830 R15: 00000000ee1c49c8

Dec 29 17:23:05.954320 vglfr kernel:

Dec 29 17:23:05.954325 vglfr kernel: INFO: task kworker/u128:3:46765 blocked for more than 122 seconds.

Dec 29 17:23:05.954330 vglfr kernel: Not tainted 6.12.63 #1-NixOS

Dec 29 17:23:05.954337 vglfr kernel: task:kworker/u128:3 state:D stack:0 pid:46765 tgid:46765 ppid:2 flags:0x00004000

Dec 29 17:23:05.954341 vglfr kernel: Workqueue: ipv6_addrconf addrconf_verify_work

Dec 29 17:23:05.954346 vglfr kernel: Call Trace:

Dec 29 17:23:05.954349 vglfr kernel:

Dec 29 17:23:05.954354 vglfr kernel: __schedule+0x426/0x12c0

Dec 29 17:23:05.954358 vglfr kernel: schedule+0x27/0xf0

Dec 29 17:23:05.954361 vglfr kernel: schedule_preempt_disabled+0x15/0x30

Dec 29 17:23:05.954363 vglfr kernel: __mutex_lock.constprop.0+0x3d0/0x6d0

Dec 29 17:23:05.954368 vglfr kernel: addrconf_verify_work+0x12/0x30

Dec 29 17:23:05.954372 vglfr kernel: process_one_work+0x18a/0x350

Dec 29 17:23:05.954375 vglfr kernel: worker_thread+0x220/0x350

Dec 29 17:23:05.954378 vglfr kernel: ? __pfx_worker_thread+0x10/0x10

Dec 29 17:23:05.954380 vglfr kernel: ? __pfx_worker_thread+0x10/0x10

Dec 29 17:23:05.954385 vglfr kernel: kthread+0xcd/0x100

Dec 29 17:23:05.954389 vglfr kernel: ? __pfx_kthread+0x10/0x10

Dec 29 17:23:05.954393 vglfr kernel: ret_from_fork+0x31/0x50

Dec 29 17:23:05.954396 vglfr kernel: ? __pfx_kthread+0x10/0x10

Dec 29 17:23:05.954400 vglfr kernel: ret_from_fork_asm+0x1a/0x30

Dec 29 17:23:05.954403 vglfr kernel:

Dec 29 17:23:05.954407 vglfr kernel: INFO: task kworker/u128:0:48737 blocked for more than 122 seconds.

Dec 29 17:23:05.954421 vglfr kernel: task:kworker/u128:0 state:D stack:0 pid:48737 tgid:48737 ppid:2 flags:0x00004000

Dec 29 17:23:05.954424 vglfr kernel: Workqueue: mt76 mt7925_mac_reset_work [mt7925_common]

Dec 29 17:23:05.954429 vglfr kernel: Call Trace:

Dec 29 17:23:05.954432 vglfr kernel:

Dec 29 17:23:05.954437 vglfr kernel: __schedule+0x426/0x12c0

Dec 29 17:23:05.954441 vglfr kernel: ? kobject_uevent_env+0x179/0x6f0

Dec 29 17:23:05.954445 vglfr kernel: ? kfree+0x33a/0x410

Dec 29 17:23:05.954452 vglfr kernel: schedule+0x27/0xf0

Dec 29 17:23:05.954456 vglfr kernel: schedule_preempt_disabled+0x15/0x30

Dec 29 17:23:05.954461 vglfr kernel: __mutex_lock.constprop.0+0x3d0/0x6d0

Dec 29 17:23:05.954464 vglfr kernel: mt7925_mac_reset_work+0x85/0x170 [mt7925_common]

Dec 29 17:23:05.954468 vglfr kernel: process_one_work+0x18a/0x350

Dec 29 17:23:05.954473 vglfr kernel: worker_thread+0x220/0x350

Dec 29 17:23:05.954477 vglfr kernel: ? __pfx_worker_thread+0x10/0x10

Dec 29 17:23:05.954480 vglfr kernel: kthread+0xcd/0x100

Dec 29 17:23:05.954484 vglfr kernel: ? __pfx_kthread+0x10/0x10

Dec 29 17:23:05.954487 vglfr kernel: ret_from_fork+0x31/0x50

Dec 29 17:23:05.954491 vglfr kernel: ? __pfx_kthread+0x10/0x10

Dec 29 17:23:05.954495 vglfr kernel: ret_from_fork_asm+0x1a/0x30

Dec 29 17:23:05.954498 vglfr kernel:

It looks like every 5 minutes the adapter tries to hop to a better BSSID, at which point the deadlock kicks in. Sometimes it happens on the first attempt, sometimes after several hours, and sometimes not at all.

My quick fix is pinning the BSSID in nmtui. The BSSID switch attempts stopped and everything’s fine. Will update if it fails again.
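For anyone sifting through a hung-task dump like the one above, a small script can show at a glance which tasks are blocked and what they are waiting in. This is only a rough sketch: it assumes journal-style lines like the ones posted here, and `summarize_hung_tasks` and the regexes are made up for illustration, not part of any tool.

```python
import re

# "INFO: task <comm>:<pid> blocked for more than N seconds."
TASK_RE = re.compile(r"INFO: task (?P<comm>\S+):(?P<pid>\d+) blocked for more than \d+ seconds\.")
# A reliable stack frame such as "kernel: schedule+0x27/0xf0".
# Frames prefixed with "? " are unreliable and deliberately do not match.
FRAME_RE = re.compile(r"kernel:\s+(?P<sym>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")
# Scheduler frames that every sleeping task shares and that say nothing useful.
SCHED_NOISE = {"__schedule", "schedule", "schedule_preempt_disabled", "schedule_timeout"}


def summarize_hung_tasks(log_text, depth=2):
    """Map each blocked task to its first `depth` non-scheduler frames."""
    summary = {}
    frames = None
    for line in log_text.splitlines():
        task = TASK_RE.search(line)
        if task:
            # Start collecting frames for a new blocked task.
            frames = summary.setdefault(f"{task['comm']}:{task['pid']}", [])
            continue
        if frames is None or len(frames) >= depth:
            continue
        frame = FRAME_RE.search(line)
        if frame and frame["sym"] not in SCHED_NOISE:
            frames.append(frame["sym"])
    return summary
```

Fed the dump above, this should separate the tasks that sit in `__mutex_lock` from the one worker that sits in `wait_for_completion`, which is usually the interesting one.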

Here is a second patch. Now that my kernel is no longer panicking, I caught the deadlock caused by the race between different reset events, like auth errors and reconnections. I’m not 100% sure this is the most correct way to fix it, because I’m not a great expert on this module’s code, but if things work the way I assume (and AI confirms), this should be right. Races and re-entrancy are always weird because you have to understand the bigger systems at play, but this is definitely better than the racy code that is there right now, and in the worst case it creates a different deadlock from the one that exists. Anyway, it seems to be working better here with this patch.
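Reading the hung-task traces above, one plausible picture of the deadlock is this: the `cfg80211_wiphy_work` worker is inside `mt7925_roc_abort_sync` → `cancel_work_sync`, waiting for `mt7925_roc_work` to finish, while `mt7925_roc_work` itself is stuck in `__mutex_lock`. If the first worker is holding the mutex the second one wants (an assumption, since the trace does not name the lock), neither can make progress. The toy Python model below only illustrates that shape. The names are hypothetical and it uses timeouts so it reports the hang instead of actually hanging.

```python
import threading


def cancel_completes(hold_mutex_while_waiting, timeout=0.5):
    """Return True if the 'cancel and wait' step finishes, False if it would hang."""
    dev_mutex = threading.Lock()        # stands in for the driver mutex (assumption)
    work_started = threading.Event()
    work_finished = threading.Event()
    give_up = threading.Event()

    def roc_work():
        # Models a work item that needs the mutex before it can finish.
        work_started.set()
        while not give_up.is_set():
            if dev_mutex.acquire(timeout=0.05):
                dev_mutex.release()
                work_finished.set()
                return

    dev_mutex.acquire()                 # caller takes the mutex first
    worker = threading.Thread(target=roc_work)
    worker.start()
    work_started.wait()
    if not hold_mutex_while_waiting:
        dev_mutex.release()             # safe ordering: drop the mutex before waiting

    # Models cancel_work_sync(): wait for the already-running work to finish.
    completed = work_finished.wait(timeout)

    if hold_mutex_while_waiting:
        dev_mutex.release()             # only released here so the model can clean up
    give_up.set()
    worker.join()
    return completed
```

With `hold_mutex_while_waiting=True` the wait times out, which is the user-space equivalent of the "blocked for more than 122 seconds" messages. With `False` it completes right away.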

Yeah your stack trace is exactly the same issue I’m seeing. My two patches should solve it.

I created a repo with these patches until they get merged upstream.

I kept digging and found other similar bugs that have existed in this code since 2023, and created another patch to fix some other races caused by assuming locks were held in places where they were not. This driver was forked from a driver for an older chipset that has the same bugs, but the driver for the really old MediaTek chipset has the right patterns for holding the locks correctly. The code is a ticking time bomb of pain. If you stop BSSID switches, you might be able to mitigate it partially, but you won’t fully mitigate other types of resets. I had Claude analyze all the other wireless drivers in Linux to see what they do around these same APIs, and it shows they all have more correct patterns (either using mutexes, or atomics if they don’t sleep). The MediaTek mt7921 driver is also broken by the same bugs (in case someone googles the same errors and ends up here).
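To make the locking point a bit more concrete, here is a toy user-space model of the pattern: take the lock, then check and use the shared state under that same lock, and skip anything that has gone away instead of dereferencing it. This is not driver code. `Device`, `Vif`, `link`, and `reset_reconnect` are made-up names, and a `None` link stands in for the pointer that a disconnect may clear.

```python
import threading
from dataclasses import dataclass
from typing import Optional


@dataclass
class Vif:
    name: str
    link: Optional[dict] = None   # cleared (set to None) when the link goes away


class Device:
    def __init__(self):
        self._lock = threading.Lock()
        self._vifs = {}

    def add_vif(self, vif):
        with self._lock:
            self._vifs[vif.name] = vif

    def disconnect(self, name):
        # Every writer takes the same lock as the reset path below.
        with self._lock:
            vif = self._vifs.get(name)
            if vif is not None:
                vif.link = None

    def reset_reconnect(self):
        """Re-add every interface that still has a link, skipping the rest."""
        readded = []
        with self._lock:              # check and use happen under one lock
            for vif in self._vifs.values():
                if vif.link is None:  # look before leaping
                    continue
                readded.append((vif.name, vif.link["band"]))
        return readded
```

The important part is that `disconnect` and `reset_reconnect` agree on one lock. The `None` check alone would not help if another thread could clear `link` between the check and the use.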

I also sent this work as a PR to OpenWrt, since the maintainers also use that repo to maintain this work under BSD3.

OK, all of this has led to 10 total patches for this driver. 4 fix critical, real race conditions and null dereferences; the other 6 are hopefully just edge-case handling around checking return values for nulls and other errors.

I can’t stop. 11 patches total.

Wow, you discovered the bug only 3 days ago and have submitted 17 patches already!?

Probably because you have an older kernel from before this bug was introduced.

Did you figure out when the crash bug got introduced? I also haven’t seen any crashes myself.

Fyi @Mario_Limonciello

I haven’t seen any of this myself either, but glad you have come up with a reproducer and a ton of fixes!

The Linux wifi code is generally pretty badly written code. There are, in particular, a lot of problems in relation to wifi disconnects. I hope those patches are accepted.

It looks like it has been there since the start. Things that can aggravate the race issues are:

  1. Enter the wrong Wi-Fi pre-shared key and get auth errors over and over, or have a rogue AP with the same SSID keep rejecting you. You are likely to lose the race somewhere and either crash or deadlock.
  2. Have a mesh network or multiple APs with the same SSID that you can roam between. If your AP has features to steer you to a different AP based on signal strength or load, this can really piss things off with reauths. My home has 3 APs, and my desktop is right between 2 of them. My wife turns on the microwave and I get steered to the other one. Boom, panic or kernel deadlock.
  3. The really big trigger is having a Wifi 7 AP with MLO support enabled (like the Ubiquiti 7 Pro). The constant state transitions of each band connecting and disconnecting can really piss this thing off, and you can lose the race and hit a null dereference really quickly. MLO support seems to be just barely working, and all that code isn’t well vetted.
  4. Do something annoying with Bluetooth that causes the whole device to reset, so that the kernel crashes in the wifi module from these same bugs as the interface goes up and down during the reset.

The crash dump sent me directly to the bug. Fixing the null deref was easy. Then I got a bad code smell from things and wanted to know the underlying cause: why could I get into a state where it could be null at all? I used lots of AI tools (Claude, Cursor, etc.) to generate a metric ton of docs explaining the code, and to research other crashes found via Google and other searches. Then I went into deep analysis mode and had my agent generate stress tests to reproduce the race conditions from the logs I saw. That exposed new issues. Lots of rebooting and going back to the AI to help analyze new dumps.

In the later patches, I basically used Claude and Gemini to go through things line by line. I even had my AI validate all changes by comparing them against other drivers: how other new drivers handle MLO state change logic, how they use the same kernel APIs, etc.

Basically, a whole lot of well-informed prompts. My advantage here is I’m a former kernel dev and Bluetooth expert, and I know my way around a bit and what to look for.

The best part is that MediaTek already sent another patch that fixes one of the 17 things in a similar but slightly different way, fixing one of the crash cases. A lot of these fixes are defensive and may never trigger, but a lot more code will now look before it leaps and won’t crash if something unexpected happens.
