[RESPONDED] Deterministically Flaky DisplayPort - Linux, Intel 12th Gen, Multiple Docks

I’m having lots of fun trying to track down a DisplayPort issue when using multiple monitors connected to multiple docks. My setup is as follows:

  • 12th Gen Intel Framework in the Cooler Master case, so no laptop screen.
  • Caldigit Element Hub plugged directly into one port on the mainboard.
  • Pluggable hub (this one) plugged directly into an adjacent port on the mainboard.
  • Linux kernel 6.3 on Void Linux, displays managed via X.

The two docks are required to get the correct mix of ports for my use case. I normally run with only the Caldigit hub being powered, but I have also tried powering the pluggable hub simultaneously with no change in the observed operation.

This whole setup is built into a ruggedized portable rack that I use for video work in a more self-contained form factor than my previous setup, which was multiple milk-crates full of cables. I use 2 portable DisplayPort monitors connected over USB-C cables to the Caldigit hub to provide the screens I’m looking at, and I would also like to use 2 HDMI ports coming off the Pluggable hub to provide additional auxiliary outputs. This adds up to the 4 total screens I would expect to be able to drive from the Intel GPU, and crucially all are internally DisplayPort.

If I boot the system with just my DP portable screens plugged in then they both light up and the dual head X session loads up just fine. In this state, the DP ports on the pluggable dock are not even enumerated and do not show up in either xrandr --query or find /sys/devices -name "edid". Just to be sure, I also checked the output of xrandr --listproviders and only one provider is shown, as expected.
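For anyone reproducing this check, here is a minimal sketch of the same enumeration done from Python instead of find. It assumes the standard DRM sysfs layout, where each connector directory under /sys/class/drm holds a status file and an edid file. The function name list_drm_connectors and the root parameter are made up for illustration.

```python
from pathlib import Path


def list_drm_connectors(root="/sys/class/drm"):
    """Return {connector_dir_name: (status, edid_size_in_bytes)}.

    Hypothetical helper: `root` is a parameter only so the function can be
    pointed at a copy of the sysfs tree.
    """
    connectors = {}
    for entry in sorted(Path(root).glob("card*-*")):
        status_file = entry / "status"
        if not status_file.is_file():
            continue  # not a connector directory
        status = status_file.read_text().strip()
        edid_file = entry / "edid"
        # An empty EDID blob means no sink answered on that connector.
        edid_len = len(edid_file.read_bytes()) if edid_file.is_file() else 0
        connectors[entry.name] = (status, edid_len)
    return connectors
```

Calling list_drm_connectors() once in each boot state and comparing the two dictionaries should show the same thing the find command does: connectors that never appear at all, as opposed to connectors that exist but report disconnected.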

In this state, the output of xrandr shows DP-1 through DP-4, with my monitors plugged into DP-3 and DP-4. When viewed from the top oriented with the audio connector on the lower left, the caldigit hub is plugged into the top left port and the pluggable hub is plugged into the lower left port. I’m not sure where DP-1 and DP-2 are coming from, as there is nothing else plugged in that provides a DisplayPort sink.

If I reboot with an HDMI monitor plugged into the pluggable dock, then it lights up and DP-4 works, but DP-3 appears to become a dead port and I cannot enable the monitor connected to it via any invocation I have yet tried. The output of the find command expands to the expected 6 devices, and the naming convention of the new devices makes me think that DP-MST is involved in the pluggable dock, though I do not have documentation to support this claim.

I can reboot, swap cables, restart sddm and xorg, but regardless of any changes I make these two states are stable, and the only transition that allows moving between them appears to be a full reboot. The monitor output is the same during the UEFI post, so while I am running Linux, I don’t think that’s actually germane to the given problem.
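Since the two states only change across a reboot, it can help to save the xrandr --query output in each state and compare them mechanically. The following is a rough sketch of that comparison; parse_xrandr and diff_states are hypothetical helper names, and the regular expression only assumes the usual "NAME connected ..." / "NAME disconnected ..." connector lines that xrandr prints.

```python
import re

# Matches connector lines such as "DP-3 connected 1920x1080+0+0 ..."
# in `xrandr --query` output; mode lines are indented and do not match.
CONNECTOR_LINE = re.compile(r"^(\S+) (connected|disconnected|unknown connection)\b")


def parse_xrandr(text):
    """Map each output name to its connection state."""
    states = {}
    for line in text.splitlines():
        match = CONNECTOR_LINE.match(line)
        if match:
            states[match.group(1)] = match.group(2)
    return states


def diff_states(before, after):
    """Return {output: (state_before, state_after)} for outputs that differ.

    Outputs missing from one capture are reported as "absent".
    """
    changed = {}
    for name in sorted(set(before) | set(after)):
        old = before.get(name, "absent")
        new = after.get(name, "absent")
        if old != new:
            changed[name] = (old, new)
    return changed
```

Feeding it the text of one capture per boot state gives a short list of which outputs appeared, vanished, or flipped between connected and disconnected.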

Any advice is appreciated here to achieve a state where:

  • All DP interfaces are reliably detected.
  • A total of 4 DP sinks can be enabled at once.
  • An understanding of what is actually happening that causes the pluggable hub’s DisplayPort sinks to become invisible.

This could be an issue as the docks fall outside of our control. The number one rock solid piece of advice I offer is avoid docks for HDMI/DP duties in Linux. They “should” work, but, when they don’t work, our hands are tied.

The results here are going to be pretty frustrating. Docks are best for USB as they do this well.

Expansion cards however, that we can control and help with.

Now, while this is totally not something we can fix per se, you can try different kernels, or Xorg vs Wayland, and see if you get a better result.

Hi @Matt_Hartley, thanks for the reply.

I think a key thing that got latched onto here, and that I’d like to step back from, is that I’m observing this behavior while running Linux. The behavior is also present in the UEFI configuration screens alone, so we can reasonably remove the kernel, the windowing system, and all code not managed by Framework from the picture entirely.

I don’t think you meant it this way, but this comes across as really disingenuous and a “buy our products” type answer. It also ignores that expansion cards are not a viable solution to drive 4 simultaneous displays and do anything else at the same time.

An expansion card is by definition a single-port dock. If the mainboard conforms to the Thunderbolt specification, which I have no reason to doubt beyond these DP conformance issues, then multiple streams to multiple sinks is a fully supported part of the DisplayPort standard as embedded in the Thunderbolt standard. Given that a conformant TB source must support at least one full DP interface (4 data lanes plus the standard AUX lane), I should always be able to see at least one DP interface on each dock, even in degraded 2-lane mode when the TB controller never clocks the port up out of the dual-personality DP/USB3 mode. Even in that reduced-functionality mode all the hardware should enumerate, which it does not.

Since my original post, I secured a different dock to test with from a different manufacturer that uses a different internal controller layout and chipset. Given that it exhibits the same behavior, and given that the only point of commonality in that test is the framework, all tests right now are pointing at a probable firmware defect. If there are additional tests or signal topologies you would like me to test I am happy to do so, but this really doesn’t look like a bug on the dock end of the equation. Unless the mainboard ports are somehow sharing TB control hardware, an enumeration failure like this should be impossible.

Shortly after sending that I couldn’t shake the feeling that I hadn’t actually confirmed 100% for sure that the pluggable dock is actually a Thunderbolt dock. A quick check with boltctl confirmed my suspicions:

maldridge@omnicast:~$ boltctl
 ● CalDigit, Inc. Element Hub
   ├─ type:          peripheral
   ├─ name:          Element Hub
   ├─ vendor:        CalDigit, Inc.
   ├─ uuid:          <redacted>
   ├─ generation:    USB4
   ├─ status:        authorized
   │  ├─ domain:     <redacted>
   │  ├─ rx speed:   40 Gb/s = 2 lanes * 20 Gb/s
   │  ├─ tx speed:   40 Gb/s = 2 lanes * 20 Gb/s
   │  └─ authflags:  boot
   ├─ authorized:    Wed 27 Sep 2023 08:00:44 AM UTC
   ├─ connected:     Wed 27 Sep 2023 08:00:44 AM UTC
   └─ stored:        Wed 27 Sep 2023 07:47:43 AM UTC
      ├─ policy:     iommu
      └─ key:        no
maldridge@omnicast:~$

The absence of the pluggable dock suggests I’m almost certainly dealing with conventionally broken USB behavior. This further seems to be the case given that I can plug in one of my known good Lenovo docks directly and drive additional monitors off of it. The Lenovo dock still isn’t stable over a reboot, but it does suggest that the pluggable dock is doing something dumb at the USB level, and explains why behavior that absolutely has to work for TB conformance doesn’t work here, since the device isn’t a thunderbolt device to begin with.
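The "is this dock actually enumerated as Thunderbolt?" check can be scripted. Below is a minimal sketch that parses the tree layout shown in the boltctl listing above; it is based only on that one listing, so other boltctl versions or device states may print things it does not handle. parse_boltctl and is_thunderbolt_device are hypothetical names.

```python
def parse_boltctl(text):
    """Parse the tree printed by `boltctl` into a list of {field: value} dicts."""
    devices = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("●"):
            # A bullet starts a new device; the rest of the line is its label.
            devices.append({"label": line.lstrip("● ").strip()})
            continue
        if not line or line[0] not in "│├└":
            continue  # shell prompts, blank lines, anything outside the tree
        # Drop the box-drawing prefix, then split "key: value" on the first colon.
        field = line.lstrip("│├└─ ")
        key, sep, value = field.partition(":")
        if devices and sep:
            devices[-1][key.strip()] = value.strip()
    return devices


def is_thunderbolt_device(devices, name_fragment):
    """True if any enumerated device name contains name_fragment (case-insensitive)."""
    fragment = name_fragment.lower()
    return any(fragment in d.get("name", "").lower() for d in devices)
```

With the listing above as input, is_thunderbolt_device(devices, "element") is true and is_thunderbolt_device(devices, "pluggable") is false, which is the same conclusion reached by eye. The text can come from subprocess.run(["boltctl"], capture_output=True, text=True).stdout.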

So now I think my quest has become a second reasonably priced thunderbolt dock that provides onboard breakout instead of just providing more thunderbolt ports.

One quick question: is the onboard display operational during these attempts? I did not notice anywhere in your posts where you say that it is off. The built-in display counts towards the maximum number of displays.

@nadb

xrandr does not report that the eDP port is even attempting to be driven, so I think we can reasonably assert that it is not enabled during any of these tests.

I believe what Matt was trying to say with this answer was “the issue could be that the dock you are using is having issues with Linux, in which case Framework support won’t be able to help you, as they would only be able to offer support if you were having a problem with Linux and one of their expansion cards.” This is not to say that you need to buy their products, only that they can only offer support for the parts of your setup that they sell.

To be ultimately clear, we have no control over how third party devices interact. We do our best, but when an incompatibility happens, the answer is to remove said device and try something we’ve vetted.

Yes, this is basically it. I have docks, too. Some work great and some do not. Unless I have a matching dock sitting in front of me, my hands are tied.

Sorry I missed that. One thing I have noticed while testing a variety of docks, before eventually landing on the OWC 11-Port Thunderbolt Dock, is that direct HDMI and DP connections usually resulted in the most buggy behavior. Direct USB-C connections, or adapters converting to USB-C, worked best. Also, the 3.05 BIOS was hot garbage as far as any dock connections were concerned. The 3.06 beta cleared up all my Thunderbolt issues.

Progress has happened. I sourced another actual thunderbolt 3 dock, none of this USB-thunderbolt-compatible business, actual thunderbolt that shows up in boltctl. Everything works now and I’m willing to consider this solved, but there’s an artifact of the thunderbolt controller architecture that I’m not getting that might be worth putting in the docs.

When I put both docks on one side, only one can pull 2 DP outputs, whereas if I put them on opposite sides, I can pull all 4 DP outputs out. This is behaving almost like there’s some shared mux between the two ports on each side, but there doesn’t seem to be anything in the external monitors support page that confirms that. Is there a shared mux per side?

What I’d really like to see is a diagram like this one that explains the various controllers and datapaths, and what’s shared across what physical interfaces: https://www.caldigit.com/element-hubs-controllers-and-data-paths/. I think if I’d had that to start with, I would have been able to answer a lot of my own questions about why certain configurations were unstable.

Yes. The Intel CPUs only have two TB4 outputs. They’re directed to each side and Intel TB4 drivers/retimers split the signal over the two side ports.

There’s a code name for them - Maple Ridge? JHL8540? Intel's Maple Ridge (JHL8540) Thunderbolt 4 Controller Now Shipping
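To make the per-side sharing concrete, here is a toy model of the behavior reported in this thread. It is a sketch under stated assumptions, not a description of the actual hardware: it assumes each side of the mainboard has a budget of 2 DisplayPort streams shared by its two ports, and that docks are served first come, first served. STREAMS_PER_SIDE and usable_displays are made-up names.

```python
from collections import defaultdict

# Assumption taken from the behavior observed in this thread,
# not from vendor documentation: 2 DP streams per side of the board.
STREAMS_PER_SIDE = 2


def usable_displays(docks):
    """docks: list of (side, displays_wanted) in plug order.

    Returns how many displays each dock can light up, in the same order,
    under the assumed first-come-first-served per-side budget.
    """
    remaining = defaultdict(lambda: STREAMS_PER_SIDE)
    granted = []
    for side, wanted in docks:
        got = min(wanted, remaining[side])
        remaining[side] -= got
        granted.append(got)
    return granted
```

Under this model usable_displays([("left", 2), ("left", 2)]) returns [2, 0], matching "only one can pull 2 DP outputs", while usable_displays([("left", 2), ("right", 2)]) returns [2, 2], matching the working four-display layout with one dock per side.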