[RESPONDED] FW13 AMD Fedora 39: System clock advances 50+ years during overnight suspend

Mario_Limonciello · November 16, 2023, 5:55pm

that the EC can, in general, have an indirect effect on RTC behavior/use during s2idle.

IIRC the Framework EC is connected over eSPI, through which it’s possible to read RTC time values. Given that all these failures are happening around the s2idle sequence, is it plausible that the EC is requesting RTC time values at the same time as Linux is?

Thomas_Weissschuh · November 16, 2023, 6:26pm

Maybe it’s relevant:

The probing of cros_ec_lpc fails.
The ID read via MEC is “0x00 0x00” and via non-MEC it’s “0xff 0xff”.

I applied the provided patch, maybe I can reproduce it.

qemu-system-x86_64 · November 16, 2023, 6:47pm

Johnny ?

jwp · November 16, 2023, 10:16pm

Yeah, as far as I can tell the Framework ec_sros_lpc patches that went in sometime around the 6.2 series don’t support the newer EC in the AMD Framework.

They are certainly not in any of the mainline trees, if they exist at all. I have asked whether ec_cros_lpc loads with the magic OEM kernel people mention for the Ubuntu distro, but I haven’t found anything in any of the trees I’ve looked through.

There is an ec_tool EFI loadable I’ve tried, and it also doesn’t support the EC on the AMD Framework; it spits out an invalid checksum.

dimitris · November 16, 2023, 10:25pm

Just to check, did you mean cros_ec_lpcs?

$ lsmod |grep cros
cros_ec_lpcs           20480  0
cros_ec                20480  1 cros_ec_lpcs

$ dmesg |grep cros_ec
[   20.641961] cros_ec_lpcs cros_ec_lpcs.0: EC ID not detected

Mario_Limonciello · November 16, 2023, 10:27pm

The cros-ec support for Framework AMD is this patch series: [PATCH v1 0/4] cros_ec: add support for newer versions of the Framework Laptop (kernel.org)

jwp · November 16, 2023, 10:34pm

Thanks @Mario_Limonciello, my Google-fu is not as good as yours.

This is still out of tree for 6.7 currently yah?

jwp · November 16, 2023, 10:35pm

@dimitris ; yup - dyslexia strikes again

Mario_Limonciello · November 16, 2023, 10:41pm

Well it helps that I was CC’ed on the series

Yes, Dustin didn’t submit a v2 AFAIK to take into account the trivial review feedback.

Mario_Limonciello · November 17, 2023, 12:06am

I noticed that I linked the wrong debugging patch (sorry!). I edited the post.
So if anyone has built a kernel with it, please pick it again and rebuild.

The linked patch significantly increases the number of iterations mc146818_avoid_UIP will try, and logs a warning when more than 100 are needed. With this patch in place, if you reproduce the issue you’ll see a warning in your logs:

reading the RTC time required %d loop iterations

But hopefully your clock doesn’t jump forward. Please share logs with that patch in place to see how many iterations it required.
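For anyone collecting logs, a small script can pull the iteration counts out of that warning. This is only a minimal sketch: it assumes the message appears exactly as the format string quoted above, with %d replaced by a decimal integer. The function name and the sample lines are made up for illustration.

```python
import re

# Matches the warning quoted above, with %d replaced by a decimal integer.
# Assumption: the patched kernel prints the message exactly in this form.
RTC_WARNING = re.compile(r"reading the RTC time required (\d+) loop iterations")


def rtc_loop_iterations(log_lines):
    """Return the iteration count from every matching log line, in order."""
    counts = []
    for line in log_lines:
        match = RTC_WARNING.search(line)
        if match:
            counts.append(int(match.group(1)))
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines, not taken from a real machine.
    sample = [
        "reading the RTC time required 137 loop iterations",
        "some unrelated kernel message",
    ]
    print(rtc_loop_iterations(sample))
```

The input can be any iterable of lines, for example the saved output of dmesg split on newlines.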

Matt_Hartley · November 17, 2023, 12:21am

Looks like Mario has begun trucking through this, but I am CCing engineering now.

Mario_Limonciello · November 20, 2023, 2:23pm

I’ve sent this series up to the mailing list for this issue.

https://lore.kernel.org/linux-rtc/20231120141555.458-1-mario.limonciello@amd.com/T/#m5234a9a5cd4c320efa69fc591d626efa89c5bf5d

I have never reproduced the issue though so please let me know if you reproduced it with that patch series applied.

Matt_Hartley · November 20, 2023, 11:32pm

Amazing work, thank you!

jwp · November 21, 2023, 4:45am

Have run up a new build of my patched kernel with this against the fedora 6.7-rc2 os-build tree. And removed the rtc kernel flag - will let you know if I encounter any time skipping.
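A simple way to watch for time skipping is to compare a timestamp saved before suspend with one taken after resume. The following is a minimal sketch under stated assumptions: the function name and the two-day plausibility bound are illustrative choices, not something from this thread, and the bound should be set to the longest suspend you actually expect.

```python
from datetime import datetime, timedelta


def clock_jumped(before, after, max_plausible=timedelta(days=2)):
    """Return True if the wall clock went backwards or advanced implausibly far.

    `before` and `after` are datetime objects sampled before suspend and
    after resume. `max_plausible` is an assumed upper bound on how long the
    machine could really have been asleep.
    """
    delta = after - before
    return delta < timedelta(0) or delta > max_plausible


if __name__ == "__main__":
    before = datetime(2023, 11, 21, 22, 0)
    # An ordinary overnight suspend.
    print(clock_jumped(before, datetime(2023, 11, 22, 6, 0)))
    # A jump of roughly 50 years, like the one in the thread title.
    print(clock_jumped(before, datetime(2073, 11, 22, 6, 0)))
```

Because a running script is frozen during suspend, the "before" timestamp would need to be written somewhere persistent (a file, for instance) just before the machine goes to sleep.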

jwp · November 21, 2023, 4:56am

I am still seeing these:

2023-11-21 17:49:38,716 DEBUG:  [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=14
2023-11-21 17:49:38,717 DEBUG:  [drm:amdgpu_mes_reg_write_reg_wait [amdgpu]] *ERROR* failed to reg_write_reg_wait

consistently during resume when running the amd_s2idle.py script. Is there an open bug in the AMD GitLab for this? It’s still there with the latest mainline patches and linux-firmware for Phoenix.

Mario_Limonciello · November 21, 2023, 5:54am

And removed the rtc kernel flag - will let you know if I encounter any time skipping.

There are two sets of patches, one for using ACPI for RTC alarm and one for UIP clear not happening in 10ms. Make sure that you’ve got both in your test kernel if you’re not using the kernel command line parameter.
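For readers unfamiliar with UIP: it is the update-in-progress bit the RTC sets while it is changing its time registers, and a reader has to wait for it to clear before trusting what it reads. The idea of polling for that, and warning when it takes unusually long, can be sketched roughly as below. This is a simulation under stated assumptions, not the kernel code: `uip_is_set` and `read_time` are hypothetical callables standing in for hardware register access, the default of 1000 iterations is arbitrary, and only the 100-iteration warning threshold comes from this thread.

```python
def read_rtc_avoiding_uip(uip_is_set, read_time, max_iterations=1000, warn_after=100):
    """Poll until no update is in progress, then read the time.

    Returns (value, iterations) on success, or (None, max_iterations) if
    the update-in-progress flag never stayed clear long enough to read.
    """
    for iteration in range(1, max_iterations + 1):
        if uip_is_set():
            # An update is in progress; a real driver would delay here.
            continue
        value = read_time()
        if uip_is_set():
            # An update started during the read, so the value may be torn.
            continue
        if iteration > warn_after:
            print(f"reading the RTC time required {iteration} loop iterations")
        return value, iteration
    return None, max_iterations


if __name__ == "__main__":
    # Simulated hardware: UIP is set for the first two polls, then clear.
    flags = iter([True, True, False, False])
    print(read_rtc_avoiding_uip(lambda: next(flags), lambda: 42))
```

If the flag never clears within the iteration budget, the caller gets `None` back instead of a possibly corrupt time value, which is the situation the increased iteration limit is meant to make less likely.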

I am still seeing these:

Functionally harmless, right?

consistently during resume when running the amd_s2idle.py script. Is there an open bug in the AMD GitLab for this? It’s still there with the latest mainline patches and linux-firmware for Phoenix.

Nothing is open in the AMD GitLab for this. FWIW, I believe it’s caused by firmware included in the BIOS, not by Linux, in this case.

jwp · November 21, 2023, 8:54am

Ahh, for some reason patchwork is titling them the same:

https://patchwork-proxy.ozlabs.org/project/rtc-linux/list/?submitter=81779

Mario_Limonciello · November 21, 2023, 1:21pm

Here’s the other one.

https://lore.kernel.org/linux-rtc/20231106162310.85711-1-mario.limonciello@amd.com/

jwp · November 21, 2023, 6:16pm

One thing I do need to create a bug report for is dynamically changing resolution in Wayland. I.e., changing to 1920x1080 from Display settings or xrandr nets a black (but backlit) screen. It works OK in an X11 session, and if it’s set from the login manager or from a TTY, but not once a Wayland session is running. Not sure if it’s a Plasma/Wayland or DRM/amdgpu bug, but it was happening in the fc39 kernel as well as the 6.6 and rawhide 6.7 userspace.

I also noticed the panel reports that it does 48 Hz as well as 60 Hz. Does this mean it implements PSR? I don’t see VRR support in drm_info etc.

Mario_Limonciello · November 21, 2023, 7:46pm

Might be a slightly different manifestation of eDP non-native res still corrupt on laptop (affects 6.5.0-9 Ubuntu, and 6.6.1 custom) (#2995) · Issues · drm / amd · GitLab. If it’s not, open a new issue with details and logs.