Looks like your ticket is waiting on you to provide the requested logs. From there, the ticket is escalated and, if need be, sent to engineering.
The kernel logs are already here and in the linked kernel.org bugs, but I just replied to the ticket with a copy of them there, too.
It looks like Kernel 6.9 will be getting a fix that will allow the thermal zones to register despite the bogus trip values. However, that does not eliminate the need for Framework to put valid thermal zone data in the ACPI tables for the various operations that may utilize the trip values.
Is this bug report now tracked by Framework support?
I’m currently on kernel 6.9.5 and the issue is still present.
Also on Debian Trixie, with kernel Linux phoenix 6.9.12-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.9.12-1 (2024-07-27) x86_64 GNU/Linux, the issue still exists:
Aug 04 11:34:59 phoenix kernel: ACPI: thermal: [Firmware Bug]: Invalid critical threshold (-274000)
Aug 04 11:34:59 phoenix kernel: ACPI: thermal: [Firmware Bug]: No valid trip points!
Aug 04 11:34:59 phoenix kernel: thermal LNXTHERM:00: registered as thermal_zone0
Aug 04 11:34:59 phoenix kernel: ACPI: thermal: Thermal Zone [TZ00] (41 C)
Aug 04 11:34:59 phoenix kernel: ACPI: thermal: [Firmware Bug]: Invalid critical threshold (-274000)
Aug 04 11:34:59 phoenix kernel: ACPI: thermal: [Firmware Bug]: No valid trip points!
Aug 04 11:34:59 phoenix kernel: thermal LNXTHERM:01: registered as thermal_zone1
Aug 04 11:34:59 phoenix kernel: ACPI: thermal: Thermal Zone [TZ01] (41 C)
Aug 04 11:34:59 phoenix kernel: ACPI: thermal: [Firmware Bug]: Invalid critical threshold (-274000)
Aug 04 11:34:59 phoenix kernel: ACPI: thermal: [Firmware Bug]: No valid trip points!
Aug 04 11:34:59 phoenix kernel: thermal LNXTHERM:02: registered as thermal_zone2
Aug 04 11:34:59 phoenix kernel: ACPI: thermal: Thermal Zone [TZ02] (40 C)
Aug 04 11:34:59 phoenix kernel: ACPI: thermal: [Firmware Bug]: Invalid critical threshold (-274000)
Aug 04 11:34:59 phoenix kernel: ACPI: thermal: [Firmware Bug]: No valid trip points!
Aug 04 11:34:59 phoenix kernel: thermal LNXTHERM:03: registered as thermal_zone3
Aug 04 11:34:59 phoenix kernel: ACPI: thermal: Thermal Zone [TZ03] (78 C)
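For anyone who wants to compare logs across machines or kernel versions, here is a rough Python sketch that groups these messages per thermal zone. It assumes the lines arrive in the order shown above (the two firmware-bug lines, then the registration line, then the zone line) and that the threshold is printed in millidegrees Celsius, which would make -274000 a value below absolute zero. The function and field names are my own, not anything from the kernel.

```python
import re

# Assumption: the threshold in the log is in millidegrees Celsius.
ABSOLUTE_ZERO_MILLI_C = -273150

CRIT_RE = re.compile(r"Invalid critical threshold \((-?\d+)\)")
REG_RE = re.compile(r"registered as (thermal_zone\d+)")
ZONE_RE = re.compile(r"Thermal Zone \[(\w+)\] \((-?\d+) C\)")

# First zone from the log above, copied verbatim.
SAMPLE = """\
Aug 04 11:34:59 phoenix kernel: ACPI: thermal: [Firmware Bug]: Invalid critical threshold (-274000)
Aug 04 11:34:59 phoenix kernel: ACPI: thermal: [Firmware Bug]: No valid trip points!
Aug 04 11:34:59 phoenix kernel: thermal LNXTHERM:00: registered as thermal_zone0
Aug 04 11:34:59 phoenix kernel: ACPI: thermal: Thermal Zone [TZ00] (41 C)
"""


def _empty():
    return {"invalid_crit": None, "no_trips": False, "sysfs": None}


def summarize(lines):
    """Collect one record per zone; the 'Thermal Zone [...]' line closes a record."""
    zones = []
    pending = _empty()
    for line in lines:
        if m := CRIT_RE.search(line):
            pending["invalid_crit"] = int(m.group(1))
        elif "No valid trip points" in line:
            pending["no_trips"] = True
        elif m := REG_RE.search(line):
            pending["sysfs"] = m.group(1)
        elif m := ZONE_RE.search(line):
            zones.append({"acpi_name": m.group(1), "temp_c": int(m.group(2)), **pending})
            pending = _empty()
    return zones


def is_physically_impossible(milli_c):
    """True if the value lies below absolute zero (under the unit assumption above)."""
    return milli_c < ABSOLUTE_ZERO_MILLI_C
```

Feeding it the full output of something like `journalctl -k` should give one record per zone, which makes it easy to see whether a BIOS or kernel update changed anything.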
I don’t know about the messages mentioned above. But I would very much like to see each temp sensor named/described in the ACPI tables.
I.e., give a name to temp1, temp2, temp3, temp4, etc.
All I have currently is:
acpitz-acpi-0
Adapter: ACPI interface
temp1: +46.8°C
temp2: +48.8°C
temp3: +46.8°C
temp4: +45.8°C
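Until proper names exist, one workaround is to relabel the readings yourself. The Python sketch below parses `sensors`-style text like the output above and applies a label map. The labels in `LABELS` are placeholders I made up for illustration, since the real sensor locations are not confirmed anywhere in this thread; fill them in only once you have verified them on your own machine.

```python
import re

# Matches lines such as "temp1: +46.8°C".
READING_RE = re.compile(r"^(temp\d+):\s*([+-]?\d+(?:\.\d+)?)°C")

# Hypothetical placeholder labels, NOT verified sensor names.
LABELS = {
    "temp1": "unknown-sensor-1",
    "temp2": "unknown-sensor-2",
}

# Copied from the output above.
SAMPLE = """\
acpitz-acpi-0
Adapter: ACPI interface
temp1: +46.8°C
temp2: +48.8°C
temp3: +46.8°C
temp4: +45.8°C
"""


def parse_sensors(text):
    """Return {"temp1": 46.8, ...} for every tempN line in the text."""
    readings = {}
    for line in text.splitlines():
        m = READING_RE.match(line.strip())
        if m:
            readings[m.group(1)] = float(m.group(2))
    return readings


def relabel(readings, labels):
    """Swap in a label where one is known; keep the raw tempN name otherwise."""
    return {labels.get(name, name): value for name, value in readings.items()}
```

Keeping the raw `tempN` name as the fallback means unlabeled sensors still show up instead of silently disappearing.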
good news, i can KINDA help with that
if you use dhowett’s ectool fork, you can correlate the temperatures ectool shows with the acpitz-acpi-0 sensors. i did that for my fw16, and also was guided to a framework hosted repository that has config files for lm-sensors to show names for these values
of course, ectool and that set of config files disagree, and if you look you can see i commented on the linked pull request with my findings and the results of some testing. (plus the pull request has ‘double-check sensor names’ as a todo, so it seems plausible these aren’t final/validated values)
my understanding is the temp1 through temp4 values SHOULD be the same for an amd fw13 (or an fw16 without dgpu), it’s temps 5 through 8 that are specific to having a dgpu.
in a more distant future, newer linux kernels (6.11 ish, as i understand it?) are expected to have working embedded controller drivers for our machines, which will let the kernel get at things more directly
While I feel like it should be easy enough for a Framework employee to just ask an engineer, in the absence of direct info from Framework, if you have not already considered it, it might be quicker and easier to use a hair dryer/heat gun to blow some warm air on the board and find the sensor locations that way.
hmmm. hadn’t considered that, no. well, now i guess a hilarious new conversation starter with the spouse is on the table.
“My laptop needs your hair dryer …”
“Your laptop needs what ???”
“It needs your hair dryer”
“what does it need that for ???”
“to dry its hair of course …”
“but it doesn’t have any hair!!!”
“You haven’t seen the hairy things I’m going to do to it …”
How about this problem - any updates on this?
This certainly needs to get fixed, and the fix should be simple.
Does anyone know if the Linux patch will ship with 6.12?
Or any previous version (6.9 does not seem to fix it)?
Which sensors do people think are missing?
The “amdgpu_top” tool seems to have temp sensors for each of the 8 CPU cores. What else are people looking for?
I guess it would make sense for those amdgpu_top sensors to also appear in “sensors”, but it is good enough for now.
I had a quick look, the current kernel code (6.12) includes the fix.
Next step is to find out if there is an Ubuntu/Fedora kernel package which already includes the fix …
UPDATE:
Bad news for Ubuntu: the most recent kernel package (6.11.0-9.9) does not include the fix. So there is no use updating to the latest version; the alternative is to build the kernel yourself.
Thought I might mention, the 3.06 BIOS beta has realistic temperature thresholds now. So this issue no longer occurs.
Hi.
Which sensors got missed?
I have not noticed any new sensors appearing after upgrading the BIOS.
This is for the FW 13, I guess. The most recent FW 16 BIOS update (3.05) doesn’t mention such a fix.
ACPI thermal sensors are not registered anymore after a “fix” to the Linux kernel from ~6.7 on. See the first post by Quentin.
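A quick way to check whether the zones are registered on a given kernel/BIOS combination is to look at sysfs directly. This is a minimal Python sketch, assuming the standard layout under /sys/class/thermal (a `type` file and a `temp` file in millidegrees Celsius per zone, plus `trip_point_N_temp` files when trip points exist). The function name and the returned fields are my own choice.

```python
from pathlib import Path


def list_thermal_zones(root="/sys/class/thermal"):
    """Return one dict per thermal_zone* directory found under root."""
    zones = []
    for zone in sorted(Path(root).glob("thermal_zone*")):
        try:
            zone_type = (zone / "type").read_text().strip()
            milli_c = int((zone / "temp").read_text().strip())
        except (OSError, ValueError):
            # Zone vanished or the sensor read failed; skip it.
            continue
        trip_files = sorted(p.name for p in zone.glob("trip_point_*_temp"))
        zones.append(
            {
                "zone": zone.name,
                "type": zone_type,
                "temp_c": milli_c / 1000,
                "trip_files": trip_files,
            }
        )
    return zones


if __name__ == "__main__":
    found = list_thermal_zones()
    if not found:
        print("no thermal zones registered")
    for z in found:
        print(z["zone"], z["type"], z["temp_c"], z["trip_files"])
```

On an affected setup I would expect the ACPI zones (as far as I know they show up with type `acpitz`) to be missing from the list, or to be present with no trip point files, but treat that as a guess until you compare before and after an update.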
Thanks to everyone who has posted here so far. I, too, am seeing this exact issue on my FW16. Are there any plans to provide the FW13 update to FW16 for this issue?
The missing sensors issue was tracked at 218586 – No ACPI Thermal Zones after Kernel 6.8 and fixed by kernel version 6.9. Proper ACPI thermal zone fixes were mentioned in the BIOS 3.04 beta notes. Now I guess that the 3.04 beta is abandoned and will never be released as a non-beta version, but presumably the ACPI table fix also exists in the 3.05 beta, since there would be no reason to leave it out.