Framework 16 Fails to Resume From Hibernate

@Charlie_6 According to callahad, downgrading to 6.10 should work.

Since we mailed the issue to the maintainer and regressions@kernel.org, I tried this on 715ca9dd687f89ddaac8ec8ccb3b5e5a30311a99 from torvalds/linux (we’ll be asked for this anyway), and got a similar error, with slightly different output:

[ 59.152179] R13: ffffffffc232d278 R14: ffffffffc232d278 R15: ffff8dd293661050
[ 59.152190] FS: 0000000000000000(0000) GS:ffff8de15fd00000(0000) knlGS:0000000000000000
[ 59.152203] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 59.152213] CR2: 0000000000000000 CR3: 0000000f47022000 CR4: 0000000000f50ef0
[ 59.152225] PKRU: 55555554
[ 59.152232] Call Trace:
[ 59.152241]
[ 59.152250] ? __die_body.cold+0x19/0x27
[ 59.152265] ? die_addr+0x3c/0x60
[ 59.152277] ? exc_general_protection+0x17d/0x480
[ 59.152290] ? ep_poll_callback+0x24d/0x2a0
[ 59.152308] ? asm_exc_general_protection+0x26/0x30
[ 59.152334] ? hci_unregister_dev+0x45/0x1f0 [bluetooth 32e96f6383663851b5f844c13363e0f147e537f6]
[ 59.152388] ? hci_unregister_dev+0x3e/0x1f0 [bluetooth 32e96f6383663851b5f844c13363e0f147e537f6]
[ 59.152438] btusb_disconnect+0x67/0x170 [btusb 592e11ea3c86183de886179434a855630ccda5d9]
[ 59.152457] usb_unbind_interface+0x90/0x290
[ 59.152475] device_release_driver_internal+0x19c/0x200
[ 59.152492] usb_forced_unbind_intf+0x75/0xb0
[ 59.152506] unbind_marked_interfaces.isra.0+0x59/0x80
[ 59.152520] ? __pfx_usb_dev_restore+0x10/0x10
[ 59.152535] usb_resume+0x5a/0x60
[ 59.152544] dpm_run_callback+0x47/0x150
[ 59.152559] device_resume+0xb0/0x280
[ 59.152572] async_resume+0x1d/0x30
[ 59.152584] async_run_entry_fn+0x31/0x140
[ 59.152597] process_one_work+0x17b/0x330
[ 59.152612] worker_thread+0x2ce/0x3f0
[ 59.152626] ? __pfx_worker_thread+0x10/0x10
[ 59.152637] kthread+0xcf/0x108
[ 59.152649] ? __pfx_kthread+0x10/0x10
[ 59.152663] ret_from_fork+0x31/0x50
[ 59.152673] ? __pfx_kthread+0x10/0x10
[ 59.152685] ret_from_fork_asm+0x1a/0x30
[ 59.152705]
[ 59.152711] Modules linked in: snd_seq_dummy rfcomm snd_hrtimer snd_seq snd_seq_device ccm algif_aead crypto_null des3_ede_x86_64 des_generic libdes algif_sk
[ 60.115414] [drm] ring gfx_32772.1.1 was added
[...] mes_kiq_3.1.0 uses VM inv eng 13 on hub 0
[...] 28 90 cb c1 e8 42 fc 05 d6 48 8b 43 88 48 8b 13 <48> 3b 18 0f
[ 60.117332] [drm] ring compute_32772.2.2 was added
[ 60.118682] [drm] ring sdma_32772.3.3 was added
[ 60.119395] [drm] ring gfx_32772.1.1 ib test pass
[ 60.122263] [drm] ring compute_32772.2.2 ib test pass
[ 60.123241] [drm] ring sdma_32772.3.3 ib test pass
[ 60.133435] usb 1-4.3: reset full-speed USB device number 8 using xhci_hcd
[ 60.228314] usb 1-4.3: unable to get BOS descriptor set
[ 61.627046] mt7921e 0000:04:00.0: Message 00020007 (seq 8) timeout
[ 61.629043] mt7921e 0000:04:00.0: PM: dpm_run_callback(): pci_pm_restore returns -110
[ 61.629834] mt7921e 0000:04:00.0: PM: failed to restore async: error -110
[ 61.707394] mt7921e 0000:04:00.0: HW/SW Version: 0x8a108a10, Build Time: 20240716163242a
[ 62.081740] mt7921e 0000:04:00.0: WM Firmware Version: ____000000, Build Time: 20240716163327

Yes, hibernation worked perfectly on 6.10; the regression was introduced with 6.11.

Unfortunately, I cannot easily bisect between 6.10 and 6.11 because I’m running bcachefs, which changed its on-disk format with kernel 6.11. Very happy to test patches, though!

Dropping this script into /etc/systemd/system-sleep/rfkill-before-hibernate.sh seems to be a decent workaround in the meantime.

#!/bin/sh
if [ "$1-$SYSTEMD_SLEEP_ACTION" = "pre-hibernate" ]; then
  /usr/sbin/rfkill block bluetooth
fi

Thorsten noted over email that this seems similar to
the symptoms that Marc Payne (now CCed) reported a while ago:
https://lore.kernel.org/linux-bluetooth/ZsTh7Jyug7MbZsLE@mdpsys.co.uk/

And notes an abandoned patch for this:
https://lore.kernel.org/all/20240822052310.25220-1-hao.qin@mediatek.com/

The patch does not work for me on torvalds/linux HEAD as of an hour ago, but perhaps it will work for you. I’ll post a link to lore or lkml once I can find the conversation in an index.

Does 6.11.4 help? It has a fix for a bluetooth suspend problem.

I first observed the issue on 6.11.4, so unfortunately not.

The patch referenced in [PATCH] Bluetooth: btmtk: Remove resetting mt7921 before downloading the fw - Hao Qin does work for me, insofar as I am able to successfully resume from hibernation on kernels with it applied. I tested the patch on both 6.11.4 and 6.12-rc3 and observed the same result.

That’s not to say everything works: I’m having trouble discovering new devices, but haven’t fully investigated it. Maybe it also fails without the patch? Who knows; I’ll investigate further. This seems like a known regression, likely fixed in 6.12-rc4?

Things I’ve observed so far:

Trying discoverable on or scan on in bluetoothctl always fails with Failed to start discovery: org.bluez.Error.InProgress and I’ll see bluetoothd[1145]: Failed to set mode: Busy (0x0a) in the system journal.

Coming out of hibernation, I see the following in the journal:

kernel: mt7921e 0000:05:00.0: Message 00020007 (seq 12) timeout
kernel: mt7921e 0000:05:00.0: PM: dpm_run_callback(): pci_pm_restore returns -110
kernel: mt7921e 0000:05:00.0: PM: failed to restore async: error -110
kernel: PM: hibernation: Basic memory bitmaps freed
kernel: OOM killer enabled.
kernel: Restarting tasks ... done.
kernel: PM: hibernation: hibernation exit
kernel: rfkill: input handler enabled
kernel: rfkill: input handler disabled
bluetoothd[1145]: Controller resume with wake event 0x0
[...]
kernel: Bluetooth: hci0: Device setup in 2378080 usecs
kernel: Bluetooth: hci0: HCI Enhanced Setup Synchronous Connection command is advertised, but not supported.
bluetoothd[1145]: Battery Provider Manager created
kernel: Bluetooth: MGMT ver 1.23
bluetoothd[1145]: src/device.c:device_set_wake_support() Unable to set wake_support without RPA resolution

Elsewhere, the hardware identifies itself as:

kernel: mt7921e 0000:05:00.0: ASIC revision: 79220010
kernel: Bluetooth: hci0: HW/SW Version: 0x008a008a, Build Time: 20240716163633
kernel: mt7921e 0000:05:00.0: HW/SW Version: 0x8a108a10, Build Time: 20240716163242a
kernel: mt7921e 0000:05:00.0: WM Firmware Version: ____000000, Build Time: 20240716163327

Kernel mailing list link: [REGRESSION] bluetooth: mt7921: Crash on Resume From Suspend And Hibernate

@callahad, in case you want to say something there, since it worked for you.

I’ll try to see if this patch fixes the issue after deleting everything-- maybe something just didn’t get rebuilt. I see this when I start the build:

====================================== User Patches ======================================
 -> Applying user patch 0001-btusb-hao.patch...
patching file drivers/bluetooth/btmtk.c

I double-checked the paths in the EFI and datestamps on vmlinuz-git+initramfs-git, and it appears to be running what was built this morning with the patch… There could also be something wrong with AUR (en) - linux-git. It’s also possible that something else broke between the last RC and HEAD, but probably somewhat more likely that something is wrong between the keyboard and the chair, as they say.

EDIT: I also tried 6.12-rc3 and read the code myself to make sure the patch was applied correctly by my build script. It was. I just don’t know what to say-- this patch doesn’t fix it for me. Very strange.

@callahad Are you sure that you don’t have any fixes running, such as your sleep hook that runs rfkill?

I’m 110% certain that none of the workarounds were present in the attempts that worked; I did double-check that the rfkill script was not present on patched attempts, but I’m also running NixOS, so the system state is fully declarative and explicit.

I’ve been reluctant to jump into the mailing list until I can properly bisect (which will hopefully be pretty easy on NixOS?) The main blocker there is reformatting to a sensible filesystem that works in both 6.10 and 6.11. Should have a new NVMe drive I can work with in a day or two.

Huh. Just tested with 6.11.5 and 6.12-rc4. Both fail to resume with or without the patch. The rfkill script is still a functional workaround.

Will start bisecting over the weekend.

Which solution is better: the patch, or staying on 6.10?

Per the most recent discussion, the kernel patch does not appear to resolve the issue on mainline with this hardware.

Among doing nothing, the rfkill script, and staying at 6.10, that probably depends on your distro and personality. I wouldn’t want to use a distro like Arch too far from mainline-- things will start to break pretty quickly, so my plan is to just remember to press the airplane button before closing the lid until this is fixed. The way I’ve got things set up, it sleeps for an hour before hibernating, so I can always just wake it up and press the button again if I forget. I don’t want to install a sleep hook, because I’ll probably just forget about it and then wonder why my laptop always wakes up with bluetooth off a year from now.

I’d be happy to offer some guidance if you care to share exactly what you’re running.

FWIW, it seems like the sleep hook runs after the systemd service that persists wireless state (systemd-rfkill.service) so my laptop resumes from hibernation with bluetooth on if it was on when the hibernation sequence began.
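
For anyone whose Bluetooth does not come back on its own after resume, here is a rough sketch of the same hook with a post branch added. It only unblocks when it was the hook that did the blocking. The RFKILL and MARKER variables, the bt_hook function, and the marker path are hypothetical additions for illustration and dry-running; none of them come from the original script, and this has not been tested on the affected hardware.

```shell
#!/bin/sh
# Hypothetical hook, e.g. /etc/systemd/system-sleep/rfkill-around-hibernate.sh.
# systemd-sleep runs hooks with $1 = pre|post, and SYSTEMD_SLEEP_ACTION names
# the sleep action being processed (suspend, hibernate, ...).

# Both variables are assumptions, added so the hook can be dry-run safely.
RFKILL=${RFKILL:-/usr/sbin/rfkill}
MARKER=${MARKER:-/run/bt-blocked-by-sleep-hook}

bt_hook() {
    case "$1-$2" in
        pre-hibernate)
            # Block Bluetooth and leave a marker saying the hook did it.
            "$RFKILL" block bluetooth && : > "$MARKER"
            ;;
        post-hibernate)
            # Only undo a block that this hook applied itself.
            if [ -e "$MARKER" ]; then
                rm -f "$MARKER"
                "$RFKILL" unblock bluetooth
            fi
            ;;
    esac
}

bt_hook "${1:-}" "${SYSTEMD_SLEEP_ACTION:-}"
```

A dry run such as RFKILL=echo MARKER=/tmp/bt-marker SYSTEMD_SLEEP_ACTION=hibernate sh ./rfkill-around-hibernate.sh pre should just print "block bluetooth" instead of touching the radio. One limitation: the sketch does not check whether Bluetooth was already blocked before hibernating, so it would turn it back on after resume in that case.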

Working on bisecting, but things are failing in unexpected ways (some revs won’t boot at all, others won’t finish entering hibernation, etc.) Slowly working through it.

Not a solution, rather a workaround. I tossed something together that is testing nicely on Fedora 41 (6.11 kernel).

  • Before this, suspend was a no-go with Bluetooth enabled.

  • This script sets up both an rfkill block and an unblock for suspend and hibernate. It’s a workaround.

#!/bin/bash

# Define service files and paths
SUSPEND_SERVICE="/etc/systemd/system/bluetooth-rfkill-suspend.service"
RESUME_SERVICE="/etc/systemd/system/bluetooth-rfkill-resume.service"

# Create and configure the suspend service
echo "Setting up bluetooth-rfkill-suspend.service..."
sudo tee "$SUSPEND_SERVICE" > /dev/null <<EOF
[Unit]
Description=Soft block Bluetooth on suspend/hibernate
Before=sleep.target
StopWhenUnneeded=yes

[Service]
Type=oneshot
ExecStart=/usr/sbin/rfkill block bluetooth
ExecStartPost=/bin/sleep 3
RemainAfterExit=yes

[Install]
WantedBy=suspend.target hibernate.target suspend-then-hibernate.target
EOF

# Create and configure the resume service
echo "Setting up bluetooth-rfkill-resume.service..."
sudo tee "$RESUME_SERVICE" > /dev/null <<EOF
[Unit]
Description=Unblock Bluetooth on resume
After=suspend.target hibernate.target suspend-then-hibernate.target

[Service]
Type=oneshot
ExecStart=/usr/sbin/rfkill unblock bluetooth

[Install]
WantedBy=suspend.target hibernate.target suspend-then-hibernate.target
EOF

# Reload systemd to recognize the new services
echo "Reloading systemd..."
sudo systemctl daemon-reload

# Enable both services
echo "Enabling bluetooth-rfkill-suspend.service..."
sudo systemctl enable bluetooth-rfkill-suspend.service

echo "Enabling bluetooth-rfkill-resume.service..."
sudo systemctl enable bluetooth-rfkill-resume.service

# Suggest reboot to finalize setup
echo "Both Bluetooth rfkill services have been set up and enabled successfully."
echo "It's recommended to reboot now to apply the changes."

# Prompt for reboot confirmation
read -p "Would you like to reboot now? (y/n): " response
if [[ "$response" == "y" || "$response" == "Y" ]]; then
    sudo reboot
else
    echo "Reboot skipped. Please remember to reboot later to finalize the setup."
fi

Any reason to reboot instead of systemctl enable x.service --now?

Preference. Enabling and starting or restarting services is “fine” until something misfires and isn’t caught, and then it becomes a ticket. Rare, but it happens.

Mediatek have released some new wifi and bluetooth firmware. The MT7922 is in the FW16.
Maybe someone who is seeing this problem can test.

I picked up the latest linux-firmware from Arch, which is dated 2024-11-11, but got a similar panic screen on resume from hibernate. The panic includes the mt7921e firmware build time, which is listed as 20240716163327. I suppose it’s likely that Arch hasn’t picked up this firmware yet somehow. Strange. Am I missing something here?

EDIT: I see this was merged on 11/12, a day after Arch built their firmware release. I’ll wait for the one this week and try it out then.

Sorry for the delay; had a bit of a family emergency.

I can confirm that resuming from hibernation works correctly with the 20241106... firmware builds (on kernel 6.12.1). Unfortunately, those just barely missed the November linux-firmware release, so most folks will be waiting for a few more weeks before things Just Work again.

I’m not totally happy using new firmware to fix a software regression, but it does work, and I’m going to be tracking the latest kernel anyway, so this is a Good Enough resolution for me.
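
To check which firmware build a machine is actually running, the build date can be pulled out of the kernel log. This is a minimal sketch, assuming the "WM Firmware Version ... Build Time" line format quoted in this thread. The function names are hypothetical, and the 20241106 threshold is simply the first build reported as working here, not an official cutoff.

```shell
#!/bin/sh
# Print the 8-digit build date (YYYYMMDD) from the last
# "WM Firmware Version" line found on stdin.
fw_build_date() {
    sed -n 's/.*WM Firmware Version: .*Build Time: \([0-9]\{8\}\).*/\1/p' | tail -n 1
}

# Succeed when the date in $1 is at least the threshold in $2.
# The default threshold is an assumption taken from this thread.
fw_is_new_enough() {
    [ -n "$1" ] && [ "$1" -ge "${2:-20241106}" ]
}

# Usage on a live system (dmesg may need root):
#   date=$(dmesg | fw_build_date)
#   fw_is_new_enough "$date" && echo "firmware build $date looks new enough"
```

With the log lines posted earlier in the thread, fw_build_date prints 20240716, which is older than the builds reported to fix the resume.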

I did a suspend to disk (lid down) on 6.12 Debian testing; it resumed and then crashed (could only get a photo :frowning: )

at +56 lib/list_debug.c:

I might try bumping up my mt7921e firmware as it seems old:

dmesg | grep mt7921e
[   24.830459] mt7921e 0000:01:00.0: ASIC revision: 79220010
[   24.908656] mt7921e 0000:01:00.0: HW/SW Version: 0x8a108a10, Build Time: 20240716163242a
[   25.280718] mt7921e 0000:01:00.0: WM Firmware Version: ____000000, Build Time: 20240716163327
[   26.398263] mt7921e 0000:01:00.0 wlp1s0: renamed from wlan0

Also

ethtool -i wlp1s0
driver: mt7921e
version: 6.12.0-amw-dirty
firmware-version: ____000000-20240716163327
expansion-rom-version: 
bus-info: 0000:01:00.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: no

Hmm, none of the images in the linux-firmware mediatek tree seem to have mt7921 in their names??

I updated as per How to update Media Tek USB wifi mt7921 firmware
and rebooted, but still have:

ethtool -i wlp1s0
...
firmware-version: ____000000-20240716163327
...

But it’s fetching the firmware out of the initrd …

So I quickly ran

update-initramfs -v -u

and rebooted, but I’m still getting:

firmware-version: ____000000-20240716163327
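
One way to narrow this down is to list which MediaTek blobs are in the initramfs and compare them with what is under /lib/firmware. This is a small sketch, assuming the blobs are named after the chip (MT7922, matching the 7922 ASIC revision above) rather than the mt7921e driver, which would also explain why nothing in the firmware tree has mt7921 in its name. The mt7922_blobs helper is hypothetical, and the name pattern is a guess based on the linux-firmware mediatek/ directory.

```shell
#!/bin/sh
# Filter a file listing on stdin down to MediaTek MT7922 firmware blobs.
# The pattern is an assumption about how the files are named.
mt7922_blobs() {
    grep -i 'mediatek/.*MT7922' || true
}

# Usage on Debian (lsinitramfs ships with initramfs-tools):
#   lsinitramfs "/boot/initrd.img-$(uname -r)" | mt7922_blobs
#   ls -l /lib/firmware/mediatek/ | grep -i MT7922
```

If the initramfs listing comes back empty, the firmware is probably being loaded from the root filesystem instead, and the file dates under /lib/firmware/mediatek/ are the ones to check.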