Catastrophic disk failure or ... not?

I have had a problem twice now.

First, when I open my laptop, I cannot authenticate. The screen shows “Authentication failed”, and that message flickers as if it were constantly refreshing. There is also a square cursor that I can move around, but I cannot type anything.

Then I power down and power up and the OS cannot find a disk from which to boot.

Both times, I powered down and waited at least several minutes, and then everything came up fine. The logs that I have been able to check show no disk failures.

Obviously I need to make sure my backups are up to date.

But what else can I do? Which logs might show me what the problem is? Has anyone seen this kind of issue?

I will also be sending a note to support, but sometimes that leads to a productive exchange of information and sometimes it does not.

Any suggestions? System details below. Sorry if I have selected the wrong tag above. It is not obvious from the system info which option to pick.

System Details Report

  • Date generated: 2025-05-16 10:20:16

Hardware Information:

  • Hardware Model: Framework Laptop 13th Gen Intel Core
  • Memory: 32.0 GiB
  • Processor: 13th Gen Intel® Core™ i5-1340P × 16
  • Graphics: Intel® Graphics (RPL-P)
  • Disk Capacity: 1.0 TB

Software Information:

  • Firmware Version: 03.04
  • OS Name: Ubuntu 24.04.2 LTS
  • OS Build: (null)
  • OS Type: 64-bit
  • GNOME Version: 46
  • Windowing System: Wayland
  • Kernel Version: Linux 6.11.0-25-generic

I would check all the partitions with fsck. As a second step, check the SMART info; maybe there are some hints there…
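
Alongside fsck and SMART, the kernel log from the session that misbehaved may be worth a look. Below is a minimal sketch, assuming the systemd journal is kept across reboots; the function names `storage_lines` and `scan_boot` are made up for illustration. If the drive dropped off the bus, the journal may not have been able to record anything, so an empty result proves little.

```shell
#!/bin/sh
# Keep only lines on stdin that mention NVMe, PCIe, or I/O errors.
storage_lines() {
    grep -iE 'nvme|pcie|i/o error' || true
}

# Scan the kernel log of an earlier boot (default: -1, the previous boot).
# Reading the journal may require sudo or membership in the adm group.
scan_boot() {
    journalctl -k -b "${1:--1}" --no-pager 2>/dev/null | storage_lines
}
```

Running `journalctl --list-boots` first shows which boot offsets exist; then `scan_boot -1` or `scan_boot -2` covers the sessions before the forced power-off.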

I had looked at the SMART info. I will include the results here, but the takeaway is the line that says “No Errors Logged”.

# smartctl --all /dev/nvme0
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.11.0-25-generic] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       Samsung SSD 990 EVO 1TB
Serial Number:                      S7M3NL0X909209R
Firmware Version:                   0B2QKXJ7
PCI Vendor/Subsystem ID:            0x144d
IEEE OUI Identifier:                0x002538
Total NVM Capacity:                 1,000,204,886,016 [1.00 TB]
Unallocated NVM Capacity:           0
Controller ID:                      1
NVMe Version:                       2.0
Number of Namespaces:               1
Namespace 1 Size/Capacity:          1,000,204,886,016 [1.00 TB]
Namespace 1 Utilization:            450,909,331,456 [450 GB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            002538 2941a023f7
Local Time is:                      Fri May 16 12:46:22 2025 PDT
Firmware Updates (0x16):            3 Slots, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x00df):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp Verify
Log Page Attributes (0x2f):         S/H_per_NS Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg Log0_FISE_MI
Maximum Data Transfer Size:         128 Pages
Warning  Comp. Temp. Threshold:     85 Celsius
Critical Comp. Temp. Threshold:     85 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     7.47W       -        -    0  0  0  0        0       0
 1 +     7.47W       -        -    1  1  1  1      500     500
 2 +     7.47W       -        -    2  2  2  2     1100    3600
 3 -   0.0800W       -        -    3  3  3  3     3700    2400
 4 -   0.0070W       -        -    4  4  4  4     3700   45000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        38 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    2%
Data Units Read:                    4,167,521 [2.13 TB]
Data Units Written:                 22,251,512 [11.3 TB]
Host Read Commands:                 36,419,617
Host Write Commands:                543,523,903
Controller Busy Time:               11,530
Power Cycles:                       121
Power On Hours:                     3,520
Unsafe Shutdowns:                   38
Media and Data Integrity Errors:    0
Error Information Log Entries:      0
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               45 Celsius
Temperature Sensor 2:               38 Celsius
Thermal Temp. 1 Transition Count:   52
Thermal Temp. 1 Total Time:         79

Error Information (NVMe Log 0x01, 16 of 64 entries)
No Errors Logged

Self-test Log (NVMe Log 0x06)
Self-test status: No self-test in progress
No Self-tests Logged
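
For a quicker read of a report like the one above, the main health counters can be filtered out of the full text. This is only a sketch; `health_summary` is an illustrative name, and the field labels are the ones smartctl printed in this report.

```shell
#!/bin/sh
# Reduce `smartctl --all` text on stdin to a handful of health counters.
health_summary() {
    grep -E '^(Critical Warning|Available Spare|Percentage Used|Power Cycles|Unsafe Shutdowns|Media and Data Integrity Errors|Error Information Log Entries):'
}
```

Usage would be something like `sudo smartctl --all /dev/nvme0 | health_summary`. Since the log shows no self-tests, `sudo smartctl -t short /dev/nvme0` followed a few minutes later by `sudo smartctl -l selftest /dev/nvme0` may add a data point, assuming the installed smartctl build supports NVMe self-tests.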

There are no errors.
On the GRUB boot screen, boot an alternative (older) kernel.
If the screen still flickers at the login screen, switch to a text console with Ctrl+Alt+F4.
Press Enter a few times so that systemd spawns a console, then log in.
Once logged in, run:
sudo apt update && sudo apt dist-upgrade
to apply all system updates, then reboot and try again.
If that still does not work, boot into an older kernel from the GRUB advanced options.
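
To see which kernels are available to choose from in the GRUB advanced options, something like the following may help. It is a sketch that assumes the stock Ubuntu `linux-image-*` package naming; `installed_only` and `list_kernels` are hypothetical helper names.

```shell
#!/bin/sh
# Keep only fully installed packages from "name want flag status" lines on stdin.
installed_only() {
    awk '$2 == "install" && $3 == "ok" && $4 == "installed" { print $1 }'
}

# Print the running kernel, then the installed kernel image packages.
list_kernels() {
    echo "running: $(uname -r)"
    dpkg-query -W -f='${Package} ${Status}\n' 'linux-image-*' 2>/dev/null | installed_only
}
```

Calling `list_kernels` prints the running kernel first, so it is easy to tell whether an older image is still installed to fall back to.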

I had a similar problem with a PC, not a Framework, where the hard disk would disappear and then reappear. Also other strange symptoms. Turned out to be a failing CMOS battery. Once I replaced it, problems went away.

Hugh