[TRACKING] Freezing Arch Linux AMD

:frowning:

Yes, I’m on 03.03

Are you noticing any other failures in your journal or dmesg relating to amdgpu?

Any progress on the i2c errors?

I had the same issue yesterday. After a reboot everything works again, but dmesg shows this:
i2c_designware AMDI0010:00: i2c_dw_handle_tx_abort: lost arbitration

The machine is working perfectly, but the following messages appeared in the output of journalctl -g i2c:

Jan 13 14:38:23 mls.frame kernel: i2c_hid_acpi i2c-FRMW0004:00: failed to change power setting.
Jan 13 14:38:23 mls.frame kernel: i2c_hid_acpi i2c-FRMW0004:00: PM: dpm_run_callback(): acpi_subsys_resume+0x0/0x80 returns -121
Jan 13 14:38:23 mls.frame kernel: i2c_hid_acpi i2c-FRMW0004:00: PM: failed to resume async: error -121
Jan 15 10:33:36 mls.frame kernel: i2c_designware AMDI0010:00: i2c_dw_handle_tx_abort: lost arbitration
Jan 15 11:22:57 mls.frame kernel: i2c_hid_acpi i2c-FRMW0005:00: i2c_hid_get_input: incomplete report (7/65535)
Jan 15 12:05:05 mls.frame kernel: i2c_hid_acpi i2c-FRMW0005:00: i2c_hid_get_input: incomplete report (7/65535)
Jan 15 13:00:20 mls.frame kernel: i2c_hid_acpi i2c-FRMW0005:00: i2c_hid_get_input: incomplete report (7/65535)
Jan 15 13:12:56 mls.frame kernel: i2c_hid_acpi i2c-FRMW0005:00: i2c_hid_get_input: incomplete report (7/65535)
Jan 15 13:15:47 mls.frame kernel: i2c_hid_acpi i2c-FRMW0005:00: i2c_hid_get_input: incomplete report (7/65535)
Jan 15 13:18:37 mls.frame kernel: i2c_hid_acpi i2c-FRMW0005:00: i2c_hid_get_input: incomplete report (7/65535)
Jan 15 13:26:46 mls.frame kernel: i2c_hid_acpi i2c-FRMW0005:00: i2c_hid_get_input: incomplete report (7/65535)
Jan 15 13:51:20 mls.frame kernel: i2c_designware AMDI0010:00: i2c_dw_handle_tx_abort: lost arbitration
Jan 15 14:30:11 mls.frame kernel: i2c_designware AMDI0010:00: i2c_dw_handle_tx_abort: lost arbitration
Jan 15 15:07:15 mls.frame kernel: i2c_designware AMDI0010:00: i2c_dw_handle_tx_abort: lost arbitration
Jan 15 15:07:37 mls.frame kernel: i2c_designware AMDI0010:00: i2c_dw_handle_tx_abort: lost arbitration
Jan 15 17:14:19 mls.frame kernel: i2c_hid_acpi i2c-FRMW0005:00: i2c_hid_get_input: incomplete report (7/65535)
Jan 15 17:42:25 mls.frame kernel: i2c_hid_acpi i2c-FRMW0005:00: i2c_hid_get_input: incomplete report (7/65535)
Jan 15 17:55:55 mls.frame kernel: i2c_designware AMDI0010:00: i2c_dw_handle_tx_abort: lost arbitration
Jan 15 19:00:26 mls.frame kernel: i2c_hid_acpi i2c-FRMW0005:00: i2c_hid_get_input: incomplete report (7/65535)
Jan 15 19:18:04 mls.frame kernel: i2c_hid_acpi i2c-FRMW0005:00: i2c_hid_get_input: incomplete report (7/65535)
Jan 15 19:47:19 mls.frame kernel: i2c_designware AMDI0010:00: i2c_dw_handle_tx_abort: lost arbitration
Jan 15 19:47:22 mls.frame kernel: i2c_hid_acpi i2c-FRMW0005:00: i2c_hid_get_input: incomplete report (7/65535)
Jan 15 20:01:58 mls.frame kernel: i2c_hid_acpi i2c-FRMW0005:00: i2c_hid_get_input: incomplete report (7/65535)
Jan 15 20:30:16 mls.frame kernel: i2c_hid_acpi i2c-FRMW0005:00: i2c_hid_get_input: incomplete report (7/65535)
Jan 15 21:19:40 mls.frame kernel: i2c_hid_acpi i2c-FRMW0005:00: i2c_hid_get_input: incomplete report (7/65535)
Jan 15 21:25:16 mls.frame kernel: i2c_hid_acpi i2c-FRMW0005:00: i2c_hid_get_input: incomplete report (7/65535)
Jan 15 22:16:14 mls.frame kernel: i2c_hid_acpi i2c-FRMW0005:00: i2c_hid_get_input: incomplete report (7/65535)
Jan 15 22:18:14 mls.frame kernel: i2c_hid_acpi i2c-FRMW0005:00: i2c_hid_get_input: incomplete report (7/65535)
Jan 15 22:28:22 mls.frame kernel: i2c_hid_acpi i2c-FRMW0005:00: i2c_hid_get_input: incomplete report (7/65535)
Jan 15 23:07:40 mls.frame kernel: i2c_designware AMDI0010:00: i2c_dw_handle_tx_abort: lost arbitration
Jan 15 23:39:58 mls.frame kernel: i2c_designware AMDI0010:00: i2c_dw_handle_tx_abort: lost arbitration
Jan 15 23:55:12 mls.frame kernel: i2c_designware AMDI0010:00: i2c_dw_handle_tx_abort: lost arbitration
Jan 16 00:00:07 mls.frame kernel: i2c_hid_acpi i2c-FRMW0005:00: i2c_hid_get_input: incomplete report (7/65535)
Jan 16 16:14:19 mls.frame kernel: i2c_designware AMDI0010:00: i2c_dw_handle_tx_abort: lost arbitration
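
For anyone comparing logs across machines, a tally per driver and message can be easier to read than the raw dump. Below is a minimal sketch, assuming the short journalctl line format shown above (month, day, time, host, `kernel:`); the function name `tally_kernel_messages` and the regular expression are illustrative and not from any existing tool.

```python
import re
from collections import Counter

# Matches short-format journal lines such as:
#   Jan 15 10:33:36 host kernel: i2c_designware AMDI0010:00: message text
# (assumed layout; other journalctl output formats will not match)
JOURNAL_LINE = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]{8}\s+\S+\s+kernel:\s+"
    r"(?P<driver>\S+)\s+(?P<device>\S+):\s+(?P<msg>.*)$"
)


def tally_kernel_messages(lines):
    """Count (driver, device, message) triples in short-format journal lines."""
    counts = Counter()
    for line in lines:
        match = JOURNAL_LINE.match(line.strip())
        if match:  # lines in any other format are skipped silently
            counts[(match["driver"], match["device"], match["msg"])] += 1
    return counts
```

One way to use it would be to pipe `journalctl -g i2c` into a small script that calls `tally_kernel_messages(sys.stdin)` and prints `counts.most_common()`, which makes it obvious whether "lost arbitration" or "incomplete report" dominates.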

Just experienced the GPU crashing again. I’m on kernel 6.7.1.

Jan 26 09:58:15 smigs-space kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* Error queueing DMUB command: status=2
Jan 26 09:58:15 smigs-space kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* Error queueing DMUB command: status=2
Jan 26 09:58:15 smigs-space kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* Error queueing DMUB command: status=2
Jan 26 09:58:16 smigs-space kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* Error queueing DMUB command: status=2
Jan 26 09:58:16 smigs-space kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* Error queueing DMUB command: status=2
Jan 26 09:58:16 smigs-space kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* Error queueing DMUB command: status=2
Jan 26 09:58:16 smigs-space kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* Error queueing DMUB command: status=2
Jan 26 09:58:17 smigs-space kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* Error queueing DMUB command: status=2
Jan 26 09:58:17 smigs-space kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* Error queueing DMUB command: status=2
Jan 26 09:58:17 smigs-space kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* Error queueing DMUB command: status=2
Jan 26 09:58:17 smigs-space kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* Error queueing DMUB command: status=2
Jan 26 09:58:18 smigs-space kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* Error queueing DMUB command: status=2
Jan 26 09:58:18 smigs-space kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* Error queueing DMUB command: status=2
Jan 26 09:58:18 smigs-space kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* Error queueing DMUB command: status=2
Jan 26 09:58:19 smigs-space kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* Error queueing DMUB command: status=2
Jan 26 09:58:19 smigs-space kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* Error queueing DMUB command: status=2
Jan 26 09:58:19 smigs-space kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* Error queueing DMUB command: status=2
Jan 26 09:58:19 smigs-space kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* Error queueing DMUB command: status=2

Pretty certain this can be resolved with these firmware blobs. I’ve had them installed about a week now and haven’t had this specific crash happen anymore. I haven’t had any other crashes, so I can’t speak to those.

Try downloading the two files from this amdgpu commit and placing them in /lib/firmware/amdgpu/, then run the following (it’s probably good practice to back up the existing files before replacing them):

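# Keep in mind that '-k all' will operate on all installed kernels. Skip this flag if you only want to affect the latest kernel
# Note: update-initramfs is the Debian/Ubuntu tool; on Arch the usual equivalent is 'sudo mkinitcpio -P'
sudo update-initramfs -c -k all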

Source: AMDGPU crash Error queuing DMUB command: status=2, Error waiting for DMUB idle: status=3 (#2862) · Issues · drm / amd · GitLab
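
The backup step mentioned above could be sketched like this. It is only an illustration: the thread does not name the two firmware files, so they are passed in as an argument, and the function name `backup_files` and the backup location are hypothetical.

```python
import shutil
from pathlib import Path


def backup_files(firmware_dir, filenames, backup_dir):
    """Copy the named files from firmware_dir into backup_dir.

    Returns the paths of the copies that were made. Files that do not
    exist in firmware_dir are skipped rather than treated as errors.
    """
    firmware_dir, backup_dir = Path(firmware_dir), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in filenames:
        source = firmware_dir / name
        if source.is_file():
            # copy2 also preserves timestamps and permission bits where possible
            copied.append(Path(shutil.copy2(source, backup_dir / name)))
    return copied
```

Called with `/lib/firmware/amdgpu` as `firmware_dir` and the names of the two files from the commit, this leaves untouched copies to restore if the new blobs cause trouble. Writing under /lib/firmware itself needs root; the backup directory can be anywhere you can write.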

Any news on the internal ticket for the i2c issue, @Matt_Hartley?

I’ve just noticed these messages on my AMD FW on kernel 6.1.82 (LTS kernel):

[106124.933095] i2c_hid_acpi i2c-FRMW0005:00: i2c_hid_get_input: incomplete report (7/65535)
[106376.420125] i2c_designware AMDI0010:00: i2c_dw_handle_tx_abort: lost arbitration
[106922.161361] i2c_designware AMDI0010:00: i2c_dw_handle_tx_abort: lost arbitration

I believe I’ve also noticed the other issue:

[91789.549107] [drm:dc_dmub_srv_cmd_queue [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
[93661.864250] [drm:dc_dmub_srv_cmd_queue [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
[101086.107607] [drm:dc_dmub_srv_cmd_queue [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3

@cmstew has this issue about DMUB idle status=3 been fixed upstream on your side?

Working the tickets in order, so as I come across it I will be replying there. :slight_smile:

I think I’m having the same issue on a FW 16 running Arch. I’m getting the following errors in dmesg, and the symptom is that my mouse and keyboard input becomes extremely laggy:

[ 1388.384724] amdgpu 0000:c1:00.0: [drm] *ERROR* Error queueing DMUB command: status=2
[ 1388.622583] amdgpu 0000:c1:00.0: [drm] *ERROR* Error queueing DMUB command: status=2
[ 1388.861489] amdgpu 0000:c1:00.0: [drm] *ERROR* Error queueing DMUB command: status=2
[ 1389.099321] amdgpu 0000:c1:00.0: [drm] *ERROR* Error queueing DMUB command: status=2
[ 1389.338171] amdgpu 0000:c1:00.0: [drm] *ERROR* Error queueing DMUB command: status=2
[ 1389.578341] amdgpu 0000:c1:00.0: [drm] *ERROR* Error queueing DMUB command: status=2
[ 1389.817280] amdgpu 0000:c1:00.0: [drm] *ERROR* Error queueing DMUB command: status=2
[ 1390.057236] amdgpu 0000:c1:00.0: [drm] *ERROR* Error queueing DMUB command: status=2
[ 1390.296214] amdgpu 0000:c1:00.0: [drm] *ERROR* Error queueing DMUB command: status=2
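
To put a number on how often the error repeats, the gaps between consecutive dmesg timestamps can be computed. This is a minimal sketch, assuming the default `[seconds.microseconds]` dmesg prefix seen above; `timestamp_gaps` is an illustrative name.

```python
import re

# Default dmesg prefix: "[ 1388.384724] ..." (seconds since boot)
DMESG_TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")


def timestamp_gaps(lines):
    """Return the seconds elapsed between consecutive timestamped dmesg lines."""
    stamps = []
    for line in lines:
        match = DMESG_TIMESTAMP.match(line)
        if match:  # lines without a timestamp prefix are ignored
            stamps.append(float(match.group(1)))
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]
```

On the nine lines above the gaps all come out at roughly 0.24 seconds, so in that excerpt the error fires about four times per second.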

I’ve been seeing the same errors on my FW16 for the past two weeks or so, both before and after BIOS 3.0.3. They happen at random, and I don’t know of any triggers yet.

There seem to be others with the same or a similar issue.

While the issue is occurring, these errors repeat (it looks like every 1-2 seconds), and my keyboard, trackpad and screen all update very slowly. I usually just cold reboot.

I’m currently using @Mario_Limonciello’s suggestion from here (I think it’s just for debugging purposes), which is a kernel parameter:

amdgpu.dcdebugmask=0x10

This parameter disables PSR (Panel Self Refresh).
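
After a reboot it is worth confirming that the parameter actually reached the kernel. Here is a minimal sketch that parses a kernel command line string; the helper names are illustrative, the 0x10 bit is taken from this thread, and keeping the last occurrence of the parameter is an assumption of the sketch.

```python
def dcdebugmask_from_cmdline(cmdline):
    """Return the amdgpu.dcdebugmask value from a kernel command line, or None."""
    value = None
    for token in cmdline.split():
        key, sep, raw = token.partition("=")
        if key == "amdgpu.dcdebugmask" and sep:
            # base 0 accepts both hex (0x10) and decimal (16) spellings;
            # if the parameter appears twice, this sketch keeps the last one
            value = int(raw, 0)
    return value


def psr_disabled(cmdline, psr_bit=0x10):
    """True if the mask is present and has the PSR-disable bit set."""
    mask = dcdebugmask_from_cmdline(cmdline)
    return mask is not None and bool(mask & psr_bit)
```

On a running system the string to pass in can be read from /proc/cmdline, for example with `pathlib.Path("/proc/cmdline").read_text()`.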

It’s been nearly two days since I set that kernel parameter and the issue hasn’t cropped up yet, but it may be too early to tell as it’s a random issue.

I’m using Arch Linux (Linux 6.8.4) with KDE Plasma 6.0.3 and (gitlab) linux-firmware commit 2180c887.

I’m on the latest BIOS and kernel 6.8.5-arch1-1, and I’ve noticed things are running a lot smoother.

Haven’t seen an i2c error in a while either.

Since I started using the kernel parameter, I’ve not seen the error, and that’s across kernels 6.8.4, 6.8.5, 6.8.6 and 6.8.7.

It’s been 2 weeks now.