ECC support?

Although we have not yet found the root cause of the following problems, it is plausible that they are all caused by bit flips, and that ECC RAM could have prevented them.

Although DDR5 has some on-die ECC inside the memory chip itself, there is a gap in error detection on the path between the DDR5 chip and the CPU. The CPU also protects its cache lines with ECC internally. It seems a shame that error detection and correction is not end-to-end.
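
To illustrate the gap, here is a toy sketch in Python. It is not how real DDR5 works: it uses a tiny Hamming(7,4) code on a 4-bit value, and the function names are made up for this example. The point is only that check bits stripped off inside the memory chip cannot protect the data once it is on the bus, whereas check bits that travel with the data to the CPU side can.

```python
def hamming74_encode(nibble):
    """Encode 4 data bits as a 7-bit Hamming codeword (positions 1..7)."""
    d = [(nibble >> i) & 1 for i in range(4)]  # d[0] is the least significant bit
    p1 = d[0] ^ d[1] ^ d[3]  # covers positions 1, 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]  # covers positions 2, 3, 6, 7
    p4 = d[1] ^ d[2] ^ d[3]  # covers positions 4, 5, 6, 7
    return [p1, p2, d[0], p4, d[1], d[2], d[3]]


def hamming74_decode(codeword):
    """Correct up to one flipped bit and return the 4 data bits."""
    c = list(codeword)
    syndrome = 0
    for pos in range(1, 8):
        if c[pos - 1]:
            syndrome ^= pos  # XOR of the positions of all set bits
    if syndrome:
        c[syndrome - 1] ^= 1  # a non-zero syndrome names the flipped position
    data_bits = [c[2], c[4], c[5], c[6]]
    return sum(bit << i for i, bit in enumerate(data_bits))


data = 0b1011

# Case 1: ECC only inside the memory chip (toy model of on-die ECC).
stored = hamming74_encode(data)
stored[4] ^= 1                          # bit flip inside the memory array
read_out = hamming74_decode(stored)     # corrected before leaving the chip
on_bus = read_out ^ 0b0100              # bit flip on the way to the CPU
undetected_corruption = on_bus != data  # nothing on the CPU side can notice

# Case 2: check bits travel with the data and are verified on the CPU side.
in_transit = hamming74_encode(data)
in_transit[5] ^= 1                      # bit flip on the way to the CPU
end_to_end = hamming74_decode(in_transit)

print(read_out == data, undetected_corruption, end_to_end == data)
```

In the first case the flip in the array is repaired, but the flip on the bus reaches the CPU as silently wrong data. In the second case the same kind of flip is corrected, because the check happens at the receiving end.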
