@olenananas
I don’t know how AMD arranges RAM addresses vs RAM chips.
It might be interleaved. I.e. byte 0 on RAM chip1, byte 1 on RAM chip2, byte 2 on RAM chip1, byte 3 on RAM chip2.
I believe ECC is used in the L1, L2 and L3 caches, and inside the DDR5 RAM chip itself (on-die ECC).
I don’t believe ECC is used on the data lines between the CPU and the RAM chip.
In that case, it would be useful to find out how AMD arranges the RAM addresses, then look at the offsets of the mismatching data in the errors and work out from there which RAM chip has the problem. Or, as you said, if the mismatch does not move with the RAM chip, it must be a motherboard problem.
I have always thought that ECC which also covers the data lines between chips is the way to go, but as far as I know Intel and AMD only offer that on servers.
The cause might also be RF interference. Do you have any mobile phones, washing machines or other electrical items near the laptop when running the memtest?
I found out how the AMD 7840 maps physical addresses to RAM chips. One RAM chip is Channel A, the other is Channel B. The least significant bit (LSB) of the 64-bit address selects Channel A or B.
So, it is actually as I described above: Byte 0 → Channel A, Byte 1 → Channel B, etc.
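As a quick illustration of that mapping, here is a minimal Python sketch. It assumes, as described above, that bit 0 of the physical address alone selects the channel; the function name `channel_for_address` is made up for this example and is not from any AMD documentation.

```python
def channel_for_address(addr: int) -> str:
    """Return "A" or "B" for a physical address.

    Assumption (from the post above): bit 0 of the address selects
    the channel, so even addresses go to A and odd addresses to B.
    """
    return "A" if (addr & 1) == 0 else "B"


# Show the channel for the first four byte addresses.
for addr in range(4):
    print(f"Byte {addr} -> Channel {channel_for_address(addr)}")
```

If the real interleave turns out to be coarser than one byte, only the bit tested in `channel_for_address` would need to change.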
Looking at the data from the pics you posted:
1e21 679d 38c6 5c6b < Expected
e121 679d c7c6 5c6b < Found
0741 6ba0 6cf4 4a2c < Expected
f841 6ba0 93f4 4a2c < Found
cc7b c3f8 0b75 4987 < Expected
337b c3f8 f475 4987 < Found
dcb4 2aee 0ad6 c2eb < Expected
23b4 2aee f5d6 c2eb < Found
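The errors are all when LSB=0, and OK when LSB=1. In each row the mismatches are the leading byte of the first and third groups (for example 1e → e1 and 38 → c7); all the other bytes match.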
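To make the mismatches easier to see, here is a small Python sketch that XORs each Expected value with its Found value. The hex strings are copied from the rows above; the helper names `xor_mask` and `bad_positions` are just for this example. Positions are counted left to right in the printed value, so how they map to actual address offsets depends on how memtest prints the 64-bit word, which I am not assuming here.

```python
# Expected/Found pairs copied from the memtest output quoted above.
PAIRS = [
    ("1e21 679d 38c6 5c6b", "e121 679d c7c6 5c6b"),
    ("0741 6ba0 6cf4 4a2c", "f841 6ba0 93f4 4a2c"),
    ("cc7b c3f8 0b75 4987", "337b c3f8 f475 4987"),
    ("dcb4 2aee 0ad6 c2eb", "23b4 2aee f5d6 c2eb"),
]


def xor_mask(expected: str, found: str) -> bytes:
    """XOR two hex strings byte by byte; non-zero bytes mark mismatches."""
    e = bytes.fromhex(expected.replace(" ", ""))
    f = bytes.fromhex(found.replace(" ", ""))
    return bytes(a ^ b for a, b in zip(e, f))


def bad_positions(expected: str, found: str) -> list[int]:
    """Return the printed byte positions (0 = leftmost) that differ."""
    return [i for i, b in enumerate(xor_mask(expected, found)) if b]


for expected, found in PAIRS:
    mask = xor_mask(expected, found)
    print(mask.hex(), bad_positions(expected, found))
```

A mask byte of `ff` means every bit in that byte is flipped, and `00` means the byte matches.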
So, this points to a fault on one of the DDR5 RAM channels.
So, if you swap the RAM chips, the errors should move from LSB=0 to LSB=1.
But, looking at some values further down, the LSB=1 bytes are the bad ones.
E.g.
Via CPU 3 == LSB=0 bad
Via CPU 9 == LSB=1 bad.
The sample size there is not big enough to be sure, though.
I therefore suspect a motherboard change is needed. It might even be the CPU that has the fault, but the CPU is soldered to the motherboard anyway.