Brief update: this is very likely a hardware issue. I was able to demonstrate to support on video(*) that USB-3 on the front left port was dead, and to show USB-3 shutting down live on a rear port. (USB-2 was OK.) Support agreed to send a replacement board, which was installed today. Now wait and see…
(*) The recorded tests were set up like this:
- Before testing, all Framework modules were removed to exclude them as a possible source of errors.
- To check for an existing USB connection, the USB configuration was polled continuously in a terminal by running the following command 10 times per second:
lsusb -vt
- The output for the test device (if found on USB-3) was highlighted in the terminal. Each poll output covered about half the screen, so two poll results with their highlights were visible at a time.
- For the actual test, a test device was repeatedly plugged into all ports of the laptop while the terminal output was observed. This was recorded in the videos.
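For anyone who wants to repeat this kind of test, the polling setup can be sketched roughly as follows. This is a minimal sketch, not the exact script used in the videos. The function names highlight_usb3 and poll_usb and the file name poll.sh are made up for illustration, and it assumes GNU grep and sleep, and that lsusb -t reports USB-3 link speeds as 5000M, 10000M or 20000M.

```shell
#!/usr/bin/env bash
# Sketch only: poll the USB tree and highlight USB-3 link speeds.
# Assumption: lsusb -t reports USB-3 links as 5000M, 10000M or 20000M.

# Pass every line through unchanged, colouring only USB-3 speeds.
# The trailing '|$' alternative matches every line, so nothing is dropped.
highlight_usb3() {
    grep --color=always -E '(5000|10000|20000)M|$'
}

# Print one highlighted snapshot per interval (0.1 s is roughly 10 polls
# per second; the real rate is lower because lsusb itself takes time).
# The screen is not cleared, so earlier snapshots stay visible.
poll_usb() {
    local interval="${1:-0.1}"
    while true; do
        date '+--- %H:%M:%S ---'
        lsusb -vt | highlight_usb3
        sleep "$interval"   # fractional seconds need GNU sleep
    done
}

# Only start the endless loop when explicitly asked to, e.g.: ./poll.sh run
# (poll.sh is a hypothetical file name for this sketch)
if [ "${1:-}" = "run" ]; then
    poll_usb 0.1
fi
```

While the loop runs, plug the test device into one port after another. A port with working USB-3 should show a highlighted speed on the device's line; if the device only appears at 480M, it has fallen back to USB-2.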
=============
Below are the conclusions I also sent to support, appended to a very detailed report of observations, tests, and issues. Maybe they will help others hunt down a whack-a-mole issue. If you want, you can skip to the last three paragraphs for the conclusion.
Note that I found the other users mentioned below while researching the black screen issue. I now think the black screen is an early sign of a failing USB-3 in connection with an HDMI module. I hope I’m wrong.
My earlier tests (some details can be found in the black screen support conversation) were done to find the cause of a black screen when waking from hibernation.
They showed that I could boot without a black screen into an installed Fedora 38 or a Fedora 38 live ISO, and likewise into an installed Manjaro or a Manjaro live ISO (ISOs booted via Ventoy).
Fedora suspension was also OK. But Fedora has no hibernation OOB, so the actual problem could not be tested. Hacking around to enable hibernation would result in a non-standard configuration, so I did not go there.
Testing for the post-hibernation black screen with an ISO is not possible, because ISOs have no writable swap file or partition. The black screen originally happened only after hibernation; later it also appeared on boot, though rarely.
Searching the Community forum for clues, I found other users suffering from the same issue. Some of them saw it later than I did. Did they start using their laptops later, or does the time from deployment to the first appearance of the issue vary?
Below are the respective forum posts. I did not search elsewhere; there may be others who posted outside the Community forum.
[TRACKING] HDMI screen black after wake from hibernation, only on left front port
- David_Alexander, also on Manjaro. At the time of his post he already saw it on multiple ports. I had more affected ports later on.
- Scott_H had the same issue running Windows.
- Robin_Fruytier1, OS not mentioned
HDMI expansion adapter does not work in one slot - #2 by Robin_Fruytier1
HDMI expansion adapter does not work in one slot - #19 by Philip_Lawton
- debuser on Debian 12, Debian Trixie, and Ubuntu live on both left ports (right not tested).
- Philip_Lawton: “I am having the same issue in the same slot on the AMD 7040 MoBo”
So this issue manifests on machines running
-
-
-
-
-
-
-
-
This does not exclude others. OSs 3, 7, and 8 might be identical (Windows 11), though.
These are all data points indicating that the issue is not solely caused by an unfortunate choice of distros on my part.
Also, Ubuntu is a distro officially supported by Framework, and its upstream Debian is not immune either.
Could the RTC rework play a causal role? AFAIK, none of these users applied the RTC rework.
And if the black screen were related to the rework, it would have manifested immediately after the rework (and I would certainly have seen the connection instead of complaining about it in the Community forum two months later!).
For these reasons I think we can safely rule out mainboard damage from the rework.
Tests with powered hubs for all power-hungry USB devices rule out that too low voltage is the cause.
So do the tests shown in the video: some ports work, others don’t. The exception would be a mainboard with voltage issues on some ports and not on others, which would be a design or manufacturing flaw.
I think that at this point we can skip checking my various earlier USB setups. The issue manifests with a single stick and without even the FW modules plugged in.
Why some distros (Fedora, mostly Ubuntu) appear immune may be due to different driver versions they shipped. But…
A driver that first affects a single port and then more and more ports would have to evolve over time. I have never heard of anything like that.
Also (although a really exotic race condition is theoretically conceivable), why would a driver affect USB intermittently, and do so on several rather different operating systems?
And Windows and Linux do not use the same driver. Even if both were built from the same code base with the same compiler, it is extremely unlikely that this would result in the same erratic runtime behavior on such fundamentally different OSs.
Software is a dead end, so my thinking turns to hardware: a coincidence of several weak points, all of which have to be severe enough that together they manifest the issue. Possibly driver, firmware, or operating system form part of this causal nexus.
This sounds exotic, but it does happen. As an example of such a complex hardware issue appearing in an apparently unsystematic pattern, you may have heard about the recent SanDisk SSD defects that show up after varying periods of use. Bad solder, too small a contact area between parts and board, and heat that caused the solder to form bubbles: together these did not allow a clear pattern of breakdowns to emerge.
SanDisk Extreme Pro Failures Result From Design and Manufacturing Flaws, Says Data Recovery Firm | Tom's Hardware
The sample size of Framework users might be too small, and/or they might not yet have used their machines long enough for a clear pattern to appear. IMO, the affected boards seen so far already point in that direction, though. With longer use and more motherboards in service this would become more visible, I guess.