The “best” hardware failures are binary: it works and then it doesn’t. When that happens you replace the failed component and you’re golden.
The behaviour you’re describing hints at an intermittent problem: the worst kind, and most techs will tell you it’s also the most common. I’m guessing it’s storage: you’re running a common O/S and the applications that are giving you grief are in common use. There is nothing close to bleeding edge in what you describe. That stuff should just work.
Now when I say “storage”, that’s a very complex subsystem with lots of different layers of software, firmware and hardware that are interconnected in ways that can become brittle.
As frustrating as the journey will be, methodically capturing evidence that identifies the fault is the only way forward (short of complete replacement). I’m sure you’d have bought an MBP if you thought replacement was a fix.
Thank you, I’ll continue collecting information.
Earlier, after a long period of nothing working, “Fix MergeList problems” from Software Sources randomly worked and got rid of the APT error. (At least it was probably random. I tried the extra steps from this thread, “Wensheng: apt update issue: Problem parsing dependency”, but they gave no output.)
Firefox opened, I was able to install a backup tool, and I might have successfully backed up /home/ to my SSD. I’m not certain, since the process ended while I was away from the computer, and afterwards Firefox crashed and the native Linux backup tool wouldn’t open.
dmesg -kw said it was segfaults
Right now, Firefox still won’t open but there is no APT error.
Update: After a reboot and ten minutes, the APT error is back (minus an extra bit about being unable to parse something). Also, segfault with something called nemo[3457] and error 4 in libpython3.8 etc etc
Currently, no sudo commands with rm work, and sudo apt-clean doesn’t output anything either. Nothing new shows in dmesg when I try.
Strangely, redownloading the cache after doing fix MergeList Problems doesn’t run into an issue.
I’m not sure there’s anything more I can add to what has been said. The seemingly random user-space failures that you’re experiencing are almost certainly a consequence of things your kernel will have logged. I expect there would have been hundreds of kernel log messages based on the various failures you mentioned in your last couple of posts. Without the information contained in those messages it’s impossible to suggest what might be failing.
I think your priority should be to understand how to effectively view and save kernel log entries. I suggested dmesg because it’s simple and doesn’t depend on an error-free storage system. That simplicity might be its downfall; it’s not interactive and you need to be comfortable with manipulating a terminal session. There are alternatives: journalctl and the kern.log* files in /var/log. However, they may be unreliable if your NVMe device is failing.
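If it helps, here is a minimal sketch of one way to save the kernel log each time something breaks. It is only an illustration: the function names and the destination folder are made up for this example, and it assumes Python 3 and the standard dmesg tool are available. The point is to write the file somewhere other than the suspect NVMe drive, such as a USB stick.

```python
import subprocess
from datetime import datetime
from pathlib import Path


def capture_dmesg() -> str:
    """Return the current kernel ring buffer as text."""
    # --kernel: kernel messages only; --ctime: human-readable timestamps.
    # On many distros reading the kernel log needs root, so run under sudo.
    result = subprocess.run(
        ["dmesg", "--kernel", "--ctime"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def save_kernel_log(text: str, dest_dir: Path) -> Path:
    """Write log text to a timestamped file in dest_dir and return its path."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    out = dest_dir / f"dmesg-{stamp}.txt"
    out.write_text(text, encoding="utf-8")
    return out


# Example use (hypothetical mount point for a USB stick):
#   save_kernel_log(capture_dmesg(), Path("/media/usb/kernel-logs"))
```

Each run leaves a separate timestamped file, so you can compare the log from before and after a crash.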
You should limit the interactions with your system to the minimum you need to get work done. Until you understand the nature of any low-level failure, higher level repairs are futile since they will almost certainly be undone by subsequent low-level failures. This is borne out by what you’ve told us about your recurring problems with the apt cache.
Whenever a high-level failure occurs (e.g. Firefox crashes with a segfault), scan back through the log looking for messages about storage I/O errors. Copy them somewhere safe. They will tell a story that will help pinpoint precisely what is broken.
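As a rough guide to what to look for, the sketch below filters saved log lines for phrases that commonly show up in Linux kernel messages when storage misbehaves. The pattern list is my own assumption, not a complete catalogue: exact wording varies with kernel version and driver, so treat a match as a pointer to read the surrounding lines, and treat no match as inconclusive.

```python
import re

# Phrases often seen in kernel messages about block storage or filesystem
# trouble. Illustrative only; wording differs between kernels and drivers.
STORAGE_PATTERNS = re.compile(
    r"I/O error"
    r"|EXT4-fs error"
    r"|Remounting filesystem read-only"
    r"|nvme\w*: .*(timeout|reset|abort)"
    r"|critical medium error",
    re.IGNORECASE,
)


def storage_errors(log_lines):
    """Return the log lines that look like storage or filesystem errors."""
    return [
        line.rstrip("\n")
        for line in log_lines
        if STORAGE_PATTERNS.search(line)
    ]


# Example use on a previously saved log (hypothetical file name):
#   with open("dmesg-saved.txt", encoding="utf-8") as fh:
#       for hit in storage_errors(fh):
#           print(hit)
```

A segfault line on its own (like the nemo one) is a symptom and will not match; what matters is whether lines like these appear shortly before it.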
I do have some insight into how frustrating this is. I wish you luck and a speedy resolution.
Thank you again. I’m set to meet with a data science professor Friday for help, hopefully things will be solved then (or, at least maybe I’ll be able to reinstall Mint and see if that works or not).
How do I recognize storage I/O errors? I haven’t seen anything like that, as far as I can tell, not with dmesg.
@uraneaUmbra just out of curiosity, what kernel are you running? You can use uname -r in a terminal to get the version. Also, it would help to know what processor and what NVMe drive you have, and whether you have attempted to reseat the drive and RAM. Also, is the drive on the latest firmware? Finally, what BIOS version?
@Matt_Hartley Oh, I wouldn’t have thought to avoid updating once I have a live USB to boot from in order to diagnose the issue, thanks!
@nadb The kernel I’m running is 5.15.0-60-generic. Processor is 11th Gen Intel Core i5-1135G7 2.40GHz. When I do lsblk it tells me nvme0n1 splits into nvme0n1p1 (/boot/efi) and nvme0n1p2 (/). I haven’t tried to reset the drive and RAM. As far as deleting things, I’ve only used sudo rm -fr /var/lib/apt/lists/* which had temporarily gotten rid of the error a few times. I do not know about my drive’s firmware, but I have only been updating things through the update manager.
My BIOS version is 03.07.
@uraneaUmbra I meant more what brand and model your drive is; some have initial firmware issues that get resolved with later updates. You should be able to pull up the data in a terminal using fwupdmgr, or you can install the GUI for it if it is available.
The /var/lib/apt/lists corruption issue is fairly common. If I remember correctly, the software management application on Mint has a maintenance tab, like a lot of package managers, where you should be able to take care of that. The other stuff does sound like a hardware issue, which is why I recommend looking at your drive firmware and seeing if something newer is available, along with the obvious reseat. Also, you have one stable BIOS release available to update to; this may also benefit your overall experience.
TLDR: Check and see if you have a firmware update available for your drive.
I’m trying to install Mint MATE 21.1 and I’m getting errno 5… I verified the ISO and the flash drive is brand new.
Until that, it was working well through live boot.
I do not have another stick, but I’ll try to get one.
First a computer science professor used my USB to get the ISO, and when it failed to install I assumed it was because he didn’t do the verification step, since he did it quickly. So I redid it and the verification was successful, but then the same error happened.
I followed the Framework guide so I downloaded the ISO file from a mirror, right clicked and did ‘save link as’ to get the two files for verification, followed a guide for Windows in order to verify it on my friends laptop, and then I wrote it to the USB with Rufus.
I’m not sure what the professor used the first time, but he had to do something extra in order to make a Sandisk USB bootable at all.
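For anyone repeating the checksum part of the verification described above, here is a minimal sketch in Python. It assumes you have the ISO and the text of the sha256sum.txt file from the mirror; the function names are made up for this example, and it does not cover the GPG signature check. It also only confirms the download is intact. It says nothing about what was written to the USB stick or about the drive being installed to.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def expected_digest(sums_text, iso_name):
    """Find the digest listed for iso_name in sha256sum-style text, or None."""
    for line in sums_text.splitlines():
        parts = line.split(maxsplit=1)
        # Lines look like "<digest>  <name>" or "<digest> *<name>".
        if len(parts) == 2 and parts[1].lstrip("*") == iso_name:
            return parts[0].lower()
    return None


def iso_matches(iso_path, sums_text, iso_name):
    """True when the file's digest equals the one listed for iso_name."""
    want = expected_digest(sums_text, iso_name)
    return want is not None and sha256_of(iso_path) == want
```

If iso_matches returns True and the install still fails, the download itself is unlikely to be the cause.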