I recently took delivery of a 1334U variant of the new Framework 12, and I’m experiencing some unusual crashes and btrfs corruption issues. Without muddying the waters with other details, I can reproduce the issue by doing the following:
1. Booting with the latest Arch ISO
2. Connecting a known good btrfs drive (encrypted with LUKS using argon2id, and with btrfs using the xxhash64 checksum algorithm, in case that matters)
3. Mounting it and running a btrfs scrub
Doing the above on a known good drive (i.e., one that passed a btrfs scrub on another machine without issue) usually results in several btrfs read and checksum errors, on both internal and external drives.
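For anyone who wants to try the same thing, the steps above could be scripted roughly like this. It is only a sketch: the device path `/dev/sdX`, the mapper name `scrubtest`, and the mountpoint `/mnt` are placeholders I made up, and it defaults to a dry run that just prints the commands. Running it for real needs root, and `cryptsetup` will prompt for the passphrase.

```python
import shlex
import subprocess


def build_repro_commands(device, mapper_name="scrubtest", mountpoint="/mnt"):
    """Return the commands for: unlock the LUKS volume, mount it, scrub it.

    All three arguments are placeholders; adjust them for your own setup.
    """
    return [
        # Unlock the LUKS container as /dev/mapper/<mapper_name>
        ["cryptsetup", "open", device, mapper_name],
        # Mount the unlocked btrfs filesystem
        ["mount", f"/dev/mapper/{mapper_name}", mountpoint],
        # -B keeps the scrub in the foreground, -d prints per-device statistics
        ["btrfs", "scrub", "start", "-B", "-d", mountpoint],
    ]


def run_repro(device, dry_run=True, **kwargs):
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in build_repro_commands(device, **kwargs):
        print("+", shlex.join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit status
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical device path; the dry run only prints the commands.
    run_repro("/dev/sdX")
```

Call `run_repro("/dev/sdX", dry_run=False)` once the placeholders match your system. With `check=True`, a scrub that exits non-zero will raise an exception rather than pass silently.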
I’ve tried this now with several known good disks and clones of said disks that don’t have a problem in any other machine but this one.
So if anyone has any ideas, I’m open to them.
P.S. RAM has been tested with memtest that’s on the Arch ISO and it passes. It’s specifically a 64GB Crucial.
To my surprise, it was the RAM. I put a spare 48GB stick in it and have had no issues since. Ugh.
Turned off the “Hardware Prefetcher” and “Adjacent Cache Line Prefetch” options in the advanced CPU settings, and that seems to have fixed it over a couple of reboots… will confirm in a little bit with some more testing.
If the btrfs scrub fails randomly on different files, you are getting random bit flips. If the mainboard supported ECC, you would have been told about them, but without ECC the bit flips happen silently.
I am surprised a few cycles of memtest did not find it, so it is most likely a cpu issue.
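To illustrate why a single flipped bit is enough to make a scrub complain, here is a small sketch. It uses CRC32 from the Python standard library as a stand-in for the xxhash64 checksum mentioned above (xxhash is not in the standard library), and the 4 KiB block and the flipped bit position are arbitrary choices, not anything btrfs-specific.

```python
import zlib


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


def verify(block: bytes, stored_csum: int) -> bool:
    """Recompute the checksum and compare it with the stored one."""
    return zlib.crc32(block) == stored_csum


# A 4 KiB block of sample data and the checksum recorded when it was written
block = bytes(range(256)) * 16
stored_csum = zlib.crc32(block)

# Simulate a bit flipping somewhere between the disk and the checksum check
corrupted = flip_bit(block, 12345)

print("clean block verifies:    ", verify(block, stored_csum))
print("corrupted block verifies:", verify(corrupted, stored_csum))
```

The data on disk can be perfectly fine: if the bit flips in RAM or in the CPU while the block is being read and checked, the recomputed checksum no longer matches and the read is reported as an error.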
I’m going to stick it in another machine and let it run for a while; I only did one pass of memtest and assumed that was sufficient.
Anyway, I don’t believe it’s the CPU. The problems disappeared instantly when I swapped the RAM out, and I’ve now put a different 64GB stick in it and it’s still fine.
This doesn’t make any sense… I’ve been running memtest for 16 hours now on this RAM stick and it’s passing… yet running with this stick causes a handful of problems that all point to memory, and said symptoms entirely disappear with another identical RAM stick…
I’m a bit curious how you have 64GB of RAM working when the official specs say only up to 48GB are supported, and whether the official limit is artificially lower than it should be, or if using 64GB is getting into the realm of weird edge case behaviour that may cause intermittent failures like this.
Edit: found a sort of answer to my own question. This was asked about back in April, and it was noted that while there is at least one person running a 64GB stick in their FW12, neither Framework nor Intel has validated this, so they can’t speak to compatibility.
Basically, as I understand it: ‘yes it works, but it’s not fully tested, so you may have weirdness.’
…which… is entirely consistent with having a RAM stick that passes every test on another computer but fails on that one, and an identical stick that works fine on that one. It’s fricken gremlins, I tell you!
(Mine’s 48GB.)
That could be it, but then this would be the first example I’ve seen of a machine being able to POST with ram that’s going to cause problems just because it’s outside of the specified maximum. For example I have a handful of DDR5 machines that just won’t POST with any memory sticks larger than 32GB.