ECC support?

ECC error logging just told me that the RAM overclock on my desktop PC is no longer stable: about one corrected error per hour. I configured that overclock about a year ago, with extensive stability testing (days of memory load) that came back error-free. So either my memory has degraded, my power supply is less stable than before, or one of a plethora of other possible causes is at play.
But this example shows how ECC is a useful addition even for systems that are not used professionally. Without ECC, these hourly errors would have slowly corrupted my data, and it would probably have had to get a lot worse before I actually noticed something being wrong with the hardware. Instead of knowing that corrected errors are happening, I would see software crash every now and then, or maybe my checksumming file system would catch some files with checksum mismatches, or stuff like that. All of it would be easily attributable to software bugs, updates, or whatever else changed. Hardware is usually the last thing I think about in such cases.
One could argue that it is my own fault for overclocking the RAM, but think about how many people run XMP/EXPO profiles on non-ECC memory. The typical advice for such configs is to do a few rounds of Memtest86+ and call it a day. My RAM does not run even close to the settings that the same Samsung B-die memory was used at in “gaming” sticks.
I’m happy to be in the know about this potential hardware problem. I can now dial the overclock back a bit or give it a bit more voltage, and just keep an eye on my error logs.
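
For anyone wondering what "keeping an eye on the error logs" can look like in practice, here is a minimal sketch. It assumes a Linux system with an EDAC driver loaded, which exposes per-memory-controller counters under /sys/devices/system/edac/mc. The post does not say which OS or tool is in use, so treat this as one possible way to do it. The function names are made up for illustration.

```python
from pathlib import Path

# Standard Linux EDAC sysfs location (assumption: Linux with an EDAC driver loaded).
EDAC_ROOT = Path("/sys/devices/system/edac/mc")


def read_counter(path):
    """Return the integer stored in a sysfs counter file, or None if unreadable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def read_edac_counts(root=EDAC_ROOT):
    """Map each memory controller (mc0, mc1, ...) to its error counters."""
    root = Path(root)
    counts = {}
    if not root.is_dir():
        return counts  # no EDAC driver loaded, or not a Linux system
    for mc in sorted(root.glob("mc[0-9]*")):
        counts[mc.name] = {
            "corrected": read_counter(mc / "ce_count"),    # errors ECC fixed
            "uncorrected": read_counter(mc / "ue_count"),  # errors ECC could not fix
        }
    return counts


if __name__ == "__main__":
    counts = read_edac_counts()
    if not counts:
        print("No EDAC memory controllers found")
    for name, c in counts.items():
        print(f"{name}: corrected={c['corrected']} uncorrected={c['uncorrected']}")
```

Running this from a cron job or timer and comparing against the previous value would show whether the corrected count is still climbing after backing off the overclock. The counters normally start from zero when the driver loads, so the difference between two readings matters more than the absolute number.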
