[With the appropriate disclaimer that I’m not an engineer at Framework Computer,] we¹ now know what’s going on thanks to all the reports here and the case you’ve built up!
The cros_ec_lpcs driver is generic for any laptop that has a ChromeOS EC on the LPC bus. The patches to add support for the Framework Laptop really just add a device identifier and fix port allocation, but don’t themselves cause the issue.
At the end of the day, it’s the same root cause as the equivalent issue Kieran filed against my CrosEC Windows driver (“EC access is sometimes corrupted”, Issue #3 in DHowett/FrameworkWindowsUtils on GitHub); it will likely also reproduce with coolstar’s crosecbus driver.
The power and battery state of the machine are managed by ACPI, and the ACPI methods for querying those things call the EC directly³. When they do so, they take a mutex that can’t be shared with the OS². There are also a couple of ACPI-driven exchanges that occur during wake from sleep. Now, because cros_ec_lpcs (Linux) and CrosEC (Windows) use the LPC bus directly, an inflight request from ACPI can collide with an inflight request from one of these drivers.
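To make the collision concrete, here’s a toy model in Python. It is a sketch under loud assumptions: the three-step “host command” and the `SharedMailbox` class are made up for illustration and are not the real EC protocol or any driver’s code. The only point is that two callers sharing one request/response window, with no lock in common, can read each other’s replies.

```python
# Toy model only: a "host command" here is three steps against one shared
# mailbox. The real EC host-command protocol is more involved.
STEPS = ("write_request", "trigger", "read_response")


class SharedMailbox:
    """Hypothetical stand-in for the EC's shared host-command window."""

    def __init__(self):
        self.request = None
        self.response = None

    def write_request(self, who):
        self.request = who

    def trigger(self, who):
        # The EC answers whatever request is in the window right now.
        self.response = f"reply-to-{self.request}"

    def read_response(self, who):
        return self.response


def run(schedule):
    """Play a schedule of actor names; return the actors that read a wrong reply."""
    box = SharedMailbox()
    progress = {}
    corrupted = set()
    for who in schedule:
        step = STEPS[progress.get(who, 0)]
        progress[who] = progress.get(who, 0) + 1
        result = getattr(box, step)(who)
        if step == "read_response" and result != f"reply-to-{who}":
            corrupted.add(who)
    return corrupted


# Fully serialized: what a lock shared by both sides would enforce.
serialized = ["acpi"] * 3 + ["driver"] * 3
# Interleaved: what can happen when ACPI's mutex is invisible to the driver.
interleaved = ["acpi", "driver", "acpi", "driver", "acpi", "driver"]

if __name__ == "__main__":
    print("serialized :", run(serialized))
    print("interleaved:", run(interleaved))
```

In the interleaved schedule the driver overwrites the request before ACPI triggers it, so ACPI reads a reply meant for the driver. Nothing in the model stops that, just as nothing stops it when only one side takes the mutex.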
Since the cros_ec_debugfs driver (not _lpcs, mind!) seems to query the EC console repeatedly to surface it via the debugfs interface, it causes a lot of traffic, especially around system startup, and that traffic runs a chance of stomping the one ACPI exchange that clears the preOS bit⁴.
Letting ectool do raw port I/O will “fix” it only because it reduces the incidence of host command exchanges. If you run it in a tight loop starting from the moment the machine wakes, you’ll still encounter some corruption of inflight packets.
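Here’s a back-of-the-envelope way to see why less traffic means fewer collisions but never zero. Everything in it is an assumption for illustration: time is sliced into arbitrary ticks, the driver issues one command every `period` ticks, and the durations are invented rather than measured on real hardware.

```python
def collision_fraction(period, driver_len, acpi_len):
    """Fraction of ACPI start offsets that overlap a periodic driver command.

    The driver is busy on ticks 0..driver_len-1 of every `period`-tick cycle.
    An ACPI exchange lasting `acpi_len` ticks may start at any offset in a cycle.
    All units are made-up ticks, not real timings.
    """
    hits = 0
    for start in range(period):
        acpi_ticks = range(start, start + acpi_len)
        if any((t % period) < driver_len for t in acpi_ticks):
            hits += 1
    return hits / period


if __name__ == "__main__":
    # Frequent polling, occasional polling, and a back-to-back tight loop.
    for period in (10, 100, 2):
        print(period, collision_fraction(period, driver_len=2, acpi_len=3))
```

In this model, stretching the gap between driver commands shrinks the overlap fraction, but it only reaches zero if the driver never talks at all, and a tight loop (`period` equal to `driver_len`) overlaps every time.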
I wonder… if you put cros_ec_debugfs on the disallowlist instead of cros_ec_lpcs, does it do anything for this issue⁵? (@Matt_Hartley, I would love it if you had some spare cycles to help figure out with the community whether ..._debugfs is an effective workaround; if so, people could still use ectool!)
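For anyone who wants to try that experiment, here’s a small Python helper, offered as a sketch rather than a tested recipe. It prints the one-line modprobe.d entry and checks `/proc/modules` to see which of the two modules is actually loaded. The file name you save the entry under (something like `/etc/modprobe.d/cros-ec-debugfs.conf`) is my assumption, and note that a `blacklist` line only stops automatic alias-based loading; the module can still be loaded explicitly.

```python
from pathlib import Path

MODULE = "cros_ec_debugfs"


def blacklist_snippet(module: str = MODULE) -> str:
    """Return modprobe.d text that stops alias-based autoloading of `module`."""
    return f"blacklist {module}\n"


def loaded_modules(proc_modules_text: str) -> set:
    """Parse /proc/modules content; the first field on each line is the module name."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


if __name__ == "__main__":
    # Text to place in a modprobe.d file (the file name is up to you).
    print(blacklist_snippet(), end="")
    proc = Path("/proc/modules")
    if proc.exists():
        mods = loaded_modules(proc.read_text())
        for name in (MODULE, "cros_ec_lpcs"):
            state = "loaded" if name in mods else "not loaded"
            print(f"{name}: {state}")
```

After a reboot, the state you’re hoping for is cros_ec_lpcs loaded and cros_ec_debugfs not loaded; that way ectool keeps working through the kernel driver while the console polling is gone.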
¹ I’m comfortable saying “we” only because I’m the person who caused this issue
² There’s another generic method (FWMI) that would allow for the OS to communicate with the EC via ACPI instead of using the I/O ports directly, but using it would need a solid chunk of driver work.
³ Beware, this file is huge. DSDT from 11th gen v3.17 > EC0.M001
⁴ The one that you noted earlier, which is used to determine whether to ignore/respect Fn
⁵ This might help all users, minus the subset of people who really are using ectool during early boot. It would be a more hit-or-miss fix for those folks. It will unequivocally reduce host command traffic!