FW13 AMD AI 300 (HX 370): 48 Data Fabric Sync Flood crashes in 2 months — comprehensive data

Hi everyone,

I’m sharing a detailed report of persistent Data Fabric Sync Flood crashes (0x08000800) on my Framework 13 AMD Ryzen AI 300 in the hope that the data helps Framework and AMD engineers root-cause this issue. I’ve been systematically logging every crash since December 2025.

@Jesse_Darnley mentioned finding a reproducible trigger in June 2025 (power-adapter related, since fixed), but I haven’t seen further updates. This post adds a large, methodical dataset from a different angle: my crashes happen during normal use, not just at sleep/wake, and reproduce on a stock Ubuntu live USB — ruling out custom kernels and installed software.

System Information

| Component | Value |
| --- | --- |
| Laptop | Framework Laptop 13 (AMD Ryzen AI 300 Series) |
| CPU | AMD Ryzen AI 9 HX 370 w/ Radeon 890M |
| RAM | 2×48 GB Crucial DDR5 (96 GB total); originally 1×48 GB (Framework stock) |
| Storage | 1 TB WD_BLACK SN770 NVMe, firmware 731100WD |
| Wi-Fi | Intel AX210 |
| BIOS | 03.05 (2025-10-30) |
| Kernel | 6.18.0-fw13 (custom built from mainline); previously 6.14-1016 (Ubuntu) |
| OS | Ubuntu 24.04.3 LTS |
| Kernel args | amdgpu.dcdebugmask=0x12 (disables PSR + Stutter mode); just changed to 0x412 (adds Panel Replay disable) |
| Power profile | Balanced |

The Problem

The dmesg message after every crash:

x86/amd: Previous system reset reason [0x08000800]: an uncorrected error caused a data fabric sync flood event

The crash is near-instantaneous — no kernel panic, no oops, no pstore data, no kdump capture. The hardware simply resets. Occasionally I notice a brief freeze (~5 seconds) before the reset, sometimes with a CPU core spiking to 100% in the system monitor. The only post-mortem evidence is the reset reason register read at next boot.
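
For anyone wanting to check their own machine, this is roughly what I run after each unexpected reboot (a sketch; the pstore path assumes the default Ubuntu mount):

journalctl -b 0 -k | grep -i "reset reason"   # prints the 0x08000800 line if the previous shutdown was a sync flood
ls /sys/fs/pstore/                            # empty on my unit; no crash record survives the reset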

Crash Statistics: 48 Sync Floods

I log every crash with DIMM temperatures (from collectd/spd5118), awake uptime between crashes, and activity at crash time. DIMM temperature monitoring was added starting with crash #7. Here is the full table:

| # | Date | Uptime | RAM | Kernel | DIMM temps (°C) |
| --- | --- | --- | --- | --- | --- |
| 1 | 2025-12-02 11:58 | ? | 1×48 | 6.14 | n/a |
| 2 | 2025-12-02 12:35 | < 1 h | 1×48 | 6.14 | n/a |
| 3 | 2025-12-03 20:15 | ~28 h | 1×48 | 6.14 | n/a |
| 4 | 2025-12-11 18:38 | ? | 2×48 | 6.14 | n/a |
| 5 | 2025-12-11 19:28 | < 1 h | 2×48 | 6.14 | n/a |
| 6 | 2025-12-11 20:13 | < 1 h | 2×48 | 6.14 | n/a |
| 7 | 2025-12-15 15:46 | ~41 h | 2×48 | 6.18 | 56–61 |
| 8 | 2025-12-23 16:19 | ~7 h | 2×48 | 6.18 | 47–50 |
| 9 | 2025-12-24 10:24 | ~1 h | 2×48 | 6.18 | 59–67 |
| 10 | 2025-12-25 07:04 | ~21 h | 2×48 | 6.18 | 61–66 |
| 11 | 2025-12-25 14:48 | ~8 h | 2×48 | 6.18 | 50–53 |
| 12 | 2025-12-26 04:56 | ~2 h | 2×48 | 6.18 | 65–72 |
| 13 | 2025-12-26 06:36 | ~1 h 24 min | 2×48 | 6.18 | 42–46 |
| 14 | 2025-12-28 05:50 | ~23 h | 2×48 | 6.18 | 47–51 |
| 15 | 2025-12-31 04:30 | ~35 h | 2×48 | 6.18 | 52–54 |
| 16 | 2025-12-31 12:09 | ~4 h | 2×48 | 6.18 | 68–73 |
| 17 | 2026-01-01 07:16 | ~10 h | 2×48 | 6.18 | 57–71 |
| 18 | 2026-01-01 10:06 | ~3 h | 2×48 | 6.18 | 60–66 |
| 19 | 2026-01-06 09:00 | ~64 h | 2×48 | 6.18 | 57–60 |
| 20 | 2026-01-06 10:39 | ~1 h 37 min | 2×48 | 6.18 | 61–65 |
| 21 | 2026-01-06 11:32 | ~51 min | 2×48 | 6.18 | 52–54 |
| 22 | 2026-01-07 08:39 | ~12 h | 2×48 | 6.18 | 56–66 |
| 23 | 2026-01-10 10:24 | ~41 h | 2×48 | 6.18 | 57–64 |
| 24 | 2026-01-12 02:54 | ~23 h | 2×48 | 6.18 | 49–51 |
| 25 | 2026-01-12 15:31 | ~12 h | 2×48 | 6.18 | 54–58 |
| 26 | 2026-01-14 05:53 | ~20 h | 2×48 | 6.18 | 55–57.5 |
| 27 | 2026-01-15 10:58 | ~21 h | 2×48 | 6.18 | 57–62 |
| 28 | 2026-01-15 13:09 | ~2 h | 2×48 | 6.18 | 50–53 |
| 29 | 2026-01-17 01:14 | ~18 h | 2×48 | 6.18 | 48.5–64 |
| 30 | 2026-01-19 05:49 | ~26 h | 2×48 | 6.18 | 51.5–53.5 |
| 31 | 2026-01-20 11:36 | ~20 h | 2×48 | 6.18 | 75–81 |
| 32 | 2026-01-24 08:29 | ~54 h | 2×48 | 6.18 | 61–71 |
| 33 | 2026-01-26 04:09 | ~14 h | 2×48 | 6.18 | 56–63 |
| 34 | 2026-01-27 07:47 | ~18 h | 2×48 | 6.18 | 62–71.5 |
| 35 | 2026-01-27 10:04 | ~2 h 17 min | 2×48 | 6.18 | 63–68.5 |
| 36 | 2026-01-28 03:45 | ~11 h | 2×48 | 6.18 | 54–61.5 |
| 37 | 2026-01-28 04:02 | ~15 min | 2×48 | 6.18 | 62–69 |
| 38 | 2026-01-30 13:27 | ~37 h 30 min | 2×48 | 6.18 | 57.5–61.5 |
| 39 | 2026-01-31 08:15 | ~9 h 58 min | 2×48 | 6.18 | 60–71 |
| 40 | 2026-01-31 08:37 | ~22 min | 2×48 | 6.18 | 60–67 |
| 41 | 2026-01-31 08:45 | ~7 min | 2×48 | 6.18 | 63.5–72.5 |
| 42 | 2026-01-31 12:44 | ~3 h 55 min | 2×48 | 6.18 | 48–51 |
| 43 | 2026-02-01 10:54 | ~8 h 53 min | 2×48 | 6.18 | 50–52.5 |
| 44 | 2026-02-01 16:19 | ~8 min | 2×48 | 6.18 | 60–67 |
| 45 | 2026-02-02 17:26 | ~18 h | 2×48 | 6.18 | 62.5–67.5 |
| 46 | 2026-02-03 01:41 | ~1 h 36 min | 2×48 | 6.18 | 62–68 |
| 47 | 2026-02-03 01:55 | ~13 min | 2×48 | 6.18 | 64.5–72 |
| 48 | 2026-02-03 ~04:50 | ~2 h 51 min | 2×48 | 6.11* | 67–69 |

* Crash #48 occurred on a stock Ubuntu 24.04.3 live USB (kernel 6.11.0-17-generic, no custom kernel args, no amdgpu.dcdebugmask, no encrypted root, no collectd/Docker). Same 0x08000800 reset code.

Uptime between crashes ranges from 7 minutes to 64 hours. Average is roughly 12–15 hours of awake time. Last week (Jan 27 – Feb 3): 15 crashes, average ~7 h 40 min, min 7 min, max 37 h 30 min — the frequency is increasing. Note: uptime is cumulative awake time only — suspend periods are excluded. Longer uptimes span multiple wake/suspend cycles (e.g., the 64 h entry spans 11 sessions over 5 days).
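
For reference, this is roughly how the DIMM temperatures in the table are collected. A sketch only: the spd5118 hwmon driver exposes the DDR5 SPD hub temperature sensor and collectd samples the same values; the hwmon index varies per machine, so this matches on the driver name instead.

for hw in /sys/class/hwmon/hwmon*; do
  [ "$(cat "$hw/name")" = "spd5118" ] || continue
  echo "$hw: $(( $(cat "$hw/temp1_input") / 1000 )) °C"
done

The sensors command from lm-sensors reports the same readings if you prefer that.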

What I’ve Ruled Out

| Variable | Tested | Result |
| --- | --- | --- |
| Temperature | Crashes at DIMM temps of 42–46 °C (cold) and 75–81 °C (hot). Ran a 1 h 45 min video call at 75–77 °C without a crash. Ran a 1 h+ session at DIMM2 83–84 °C SPD (potentially 96–101 °C hotspot) without a crash. The next crash was at 57 °C. | Not the cause |
| Cooling | Used a laptop cooling stand with fans for weeks: dramatic temperature reduction, zero impact on crash frequency | Not the cause |
| Kernel | 6.14-1016 (Ubuntu stock), 6.18.0-fw13 (custom mainline), 6.11.0-17 (stock Ubuntu live USB). 6 crashes on 6.14, 41 on 6.18, 1 on stock 6.11 | Not the cause |
| Custom software | Live USB test: stock Ubuntu 24.04.3, no custom kernel args, no amdgpu.dcdebugmask, no encrypted root, no collectd/Docker. Crashed after ~2 h 19 min | Not the cause |
| RAM config | 1×48 GB from Framework → 2×48 GB Crucial DDR5 | Not the cause |
| iGPU VRAM | BIOS allocation: 0.5 GB → 16 GB | No effect |
| Power supply | Framework charger + third-party 100 W PSU | No effect |
| CPU load | Crashes during idle, terminal work, compilation, and Firefox use | No correlation |
| amdgpu PSR | amdgpu.dcdebugmask=0x12 fixed an earlier, much worse crash pattern (crashes within minutes of boot). Sync floods still occur with it. | Mitigates a different issue |

What I Haven’t Tried Yet

  • amdgpu.dcdebugmask=0x412 — just applied, adds Panel Replay disable (DC_DISABLE_REPLAY) to my existing PSR + Stutter disable. No data yet on whether it changes crash frequency.
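
For completeness, this is roughly how I apply and verify the flag (a sketch; it assumes GRUB, the Ubuntu default):

# in /etc/default/grub, append to the existing args, e.g.:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash amdgpu.dcdebugmask=0x412"
# then regenerate the bootloader config and reboot
sudo update-grub && sudo reboot
# confirm the value actually in effect (printed in decimal: 0x412 = 1042)
cat /sys/module/amdgpu/parameters/dcdebugmask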

Key Observations

  1. Reproduces on stock Ubuntu live USB. Crash #48 occurred on an unmodified Ubuntu 24.04.3 live USB (kernel 6.11.0-17-generic) — no custom kernel args, no amdgpu.dcdebugmask, no encrypted root, no installed software. This rules out my kernel build, configuration, and software stack as contributing factors. The issue is firmware or hardware.

  2. amdgpu.dcdebugmask=0x12 mitigates a related but separate issue. Without it, my first install on the HX 370 board had crashes within minutes of boot — sometimes before the kernel fully loaded. With it, I get daily-ish crashes instead. However, the live USB crashed after ~2 h 19 min without this flag, suggesting the display controller / PSR triggers a more aggressive crash pattern, while the sync floods are a distinct underlying problem.

  3. Crashes happen during active use AND idle. Several crashes occurred while I was away from the computer (lid open, system idle, no screensaver). One notable crash (#14) happened a few minutes after I left to eat — could be a power state transition.

  4. Clustering pattern: Jan 31 had 4 crashes (08:15, 08:37, 08:45, 12:44). Once the system starts crashing, it tends to crash again soon. The first three came only 22 and then 7 minutes apart.

  5. The RDSEED bug exists on my CPU. dmesg reports: "RDSEED32 is broken. Disabling the corresponding CPUID bit." This is a known AMD hardware bug on the HX 370; the kernel works around it for random number generation, but it signals silicon-level issues on this platform. (A quick check for your own unit is sketched after this list.)
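
To check point 5 on your own unit, the workaround message is easy to spot in the kernel log (a sketch; exact wording may differ between kernel versions):

journalctl -k -b 0 | grep -i rdseed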

What Would Help

  • Framework engineering: Is there any firmware/EC diagnostic I can run? I’m happy to install fw-ectool, run custom kernels, or enable any debug tracing you need. I have collectd logging temperatures, detailed Framework diagnostic logs for each crash, and can provide anything else.

  • Other FW13 AI 300 (HX 370) users: Are you seeing 0x08000800 in your dmesg? Run journalctl -b 0 | grep "reset reason" after an unexpected reboot (a slightly fuller collection sketch follows this list). Please report your findings here. Also, if your FW13 HX 370 is running stable on Linux, I’d love to hear about it — I’m trying to determine whether this is a widespread platform issue or specific to my unit, and positive data points matter as I’m considering a replacement.

  • Framework team: It would help the community to know roughly how many RMAs have been filed for sync flood / 0x08000800 crashes on any FW (FW13/FW16) using AMD procs. Understanding whether this affects a small batch or a significant portion of units would help owners decide whether to wait for a fix or request a replacement.
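
If you want to share a data point (crashing or stable), this is roughly the minimum worth collecting; a sketch, adjust as needed:

journalctl -k -b 0 | grep -i "reset reason"            # 0x08000800 after an unexpected reboot?
sudo dmidecode -s bios-version                         # BIOS level
uname -r                                               # kernel
sudo dmidecode -t memory | grep -E "Part Number|Size"  # RAM configuration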

RMA Status

I have an open support ticket with Framework. They’ve asked me to provide diagnostic logs (using their log-helper script), which I’ve done for every crash. Awaiting next steps.


What devices do you have plugged into the card slots? I have proved that devices can cause this.

The devices I have tested show enough variation, and at the same time enough stability, to undermine the hypothesis that they contribute to the issue. I have two setups with different screens/docks, and I have also tested with nothing connected at all; I had at least one crash in each configuration. Many crashes occurred with the same setup, unchanged across runs, at intervals ranging from minutes after boot to several days. A crash has never happened while connecting or disconnecting a device. Some of my devices are clearly faulty (I have a dock that develops visible problems after a while), which triggers frequent connection and hardware-detection issues in Ubuntu, but I have never been able to link any of that to a crash. I have driven two different 4K screens for long periods through two different docks, and also connected them directly, with no change in crash frequency. The crashes also happen with no dock and no devices connected at all.

By any chance, do you own an AMD Framework? Do you get random freezes?

Hi,

I started this, so yes:

For background, there have been multiple false negatives while investigating this, so nothing should be ruled out unless you have physically proven it yourself, or it has at least been reproduced by multiple people.

It might also be mitigated with the kernel parameter:
processor.max_cstate=1
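
For anyone trying this, a quick way to confirm the parameter is actually in effect (a sketch; standard paths on Ubuntu):

grep -o "processor.max_cstate=1" /proc/cmdline               # confirm the parameter was applied
grep . /sys/devices/system/cpu/cpu0/cpuidle/state*/name      # list the C-states the idle driver still offers

As far as I understand it, deeper C-states should drop out of that list once the limit is active.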

Can you see if it helps your situation, as you seem to be able to reproduce it more readily than I can?

Taken from: