USB-C Problems with USB-PD cycling every second (ANG)

Hi,

Note: ANG stands for “ACCEPT, No GoodCRC”

Equipment:

  1. FW16 laptop
  2. USB-C external NVME enclosure.
  3. Non-FW16 laptop.
  4. A USB-C cable with breakout.

The problem I see is that the same NVME enclosure and USB-C cable work reliably on a Non-FW16 laptop (e.g. an HP laptop), but on the FW16 laptop the connection keeps cycling, about once every second, and never connects.

I have done some analysis with USB-C sniffers/testers and a data capture oscilloscope.

Some details can be found here:

In summary, the SRC (FW laptop) sends an “Accept” on CC1 but the SNK (device, NVME enclosure) never sends a GoodCRC back.

I have looked in detail at the content of the “Accept” message and compared the oscilloscope data captures between the FW16 laptop and the Non-FW laptop.

The CC messages themselves are all pretty much the same.
So, why is the SNK never replying to the “Accept”?
After a lot of testing, the best theory I have is that the SNK (device, NVME enclosure) is not detecting the IDLE between CC messages, and thus never cuts in and tries to transmit the GoodCRC.

I have found one major difference between the two.

  1. FW16 laptop idles at about 0.88 Volts.
  2. Non-FW laptop idles at about 1.67 Volts.

The normal swing (peak-to-peak) of CC messages is between 0V and 1.2V.
The idle is supposed to be when the output is set to “high impedance”.

As far as I can tell, the only thing different that can affect the idle volts is the value of “Rd” and “Rp” in the laptop.

Note: Another observation is that changing the USB-C cable changes the outcome. I.e. some cables work, some do not work on the FW16, but all the same cables work on a Non-FW laptop.

There are possibly other causes for the NVME not responding to “Accept”, but I need more time to look into those. E.g.:

  1. As the idle voltage is only 0.88V, the first bit of the 64-bit preamble is not detected, or an extra one is added, so detection is not 100% reliable. With the idle voltage at 1.67V, the 64 bits are very reliably detected. But that is why one has the “Sync-1” tokens, so the receiver can get back into sync even if there are not exactly 64 bits of preamble.
  2. The SNK (NVME device) does not have enough power to respond. Needs further investigation. The SNK has been able to send “GoodCRC” and “Request” messages prior to receiving the SRC “Accept” messages.

The USB-PD R3.2 standard sections relating to “Idle” are:
5.7 Collision Avoidance
5.8.5.4 Inter-Frame Gap
5.8.6.1 Definition of Idle.

From those sections, an endpoint detects idle by detecting fewer than nTransitionCount transitions in the time window tTransitionWindow. Thus “idle” detection is due to a lack of transitions rather than any idle voltage level. So, I don’t think the 1.67 vs 0.88 Volts is a problem unless the “high impedance” requirement is not being met.
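The transition-counting rule described above can be sketched in a few lines. This is only an illustration, not how any real PD controller implements it: the function name, the edge-timestamp input, and the default values of 3 transitions and a 12 µs window are my assumptions and should be checked against the tables in the standard.

```python
from bisect import bisect_right

# Assumed values for illustration only; check them against the USB-PD tables.
N_TRANSITION_COUNT = 3   # assumed transition-count threshold
WINDOW_US = 12.0         # assumed window length in microseconds


def is_idle(edge_times_us, now_us,
            n_count=N_TRANSITION_COUNT, window_us=WINDOW_US):
    """Return True if fewer than n_count edges fall in (now - window, now].

    edge_times_us must be a sorted list of edge timestamps in microseconds,
    e.g. extracted from an oscilloscope capture of the CC line.
    """
    first = bisect_right(edge_times_us, now_us - window_us)
    last = bisect_right(edge_times_us, now_us)
    return (last - first) < n_count
```

Note that the voltage level does not appear anywhere in this check. Only the edge timestamps matter, which is the point made above.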

I am hoping that someone else, with more knowledge of USB-PD than me, can let me know if these differences are significant or not.

Looking at the screenshot.
This is an oscilloscope trace showing the end of a USB-PD CC message, showing it entering the “idle” period.
The White trace is a FW16.
The blue trace is a non-FW laptop.

Notice that the non-FW laptop holds the level at 0V for longer, before it lets it return to its high impedance state.
The FW is holding it for about 1 µs, whereas the non-FW laptop is holding it for about 8 µs.
The maximum allowed hold time is 23 µs, the minimum is 1 µs.
It looks to me that the FW16 USB-PD is cutting it fine, and should probably hold it for longer.
Note: I think the line is only wobbly because it was captured with only an 8-bit oscilloscope at a relatively low sample rate. There are only about 5-6 samples per wave.
Notice also, that the FW16 high impedance level is below the wave peaks, but the non-FW laptop high impedance level is way above the wave peaks.

Imagine the things we could do if the pd controller was open source too…

@Adrian_Joachim
I know. I could then do crazy stuff like actually fix all the FW USB compatibility problems!!!

Here we have a clear and obvious bug:

  1. It looks like the main problem here is that the Laptop (SRC) sends the USB-PD CC1 (BMC encoded) “ACCEPT” but never receives a GOODCRC in return from the device (SNK).

I don’t know what the cause of this is yet; it could be any of (guesses):

  1. Signal noise or distortion or wrong levels on the Laptop TX side so that when it arrives at the destination device, it sees CRC errors. Note: There is no reporting of bad CRC, so this will be hard to be sure about.
    A: From looking at the oscilloscope traces, there may be a levels problem. I.e. the voltage level during idle periods might be too low. There is no obvious noise or distortion problem otherwise.
  2. The Laptop TX side does not wait long enough between transmissions for the device to react.
    A: Minimum inter-frame-gap is 25 µs. The FW16 waited and retried “Accept” after 1099 µs. So, it waited long enough.
  3. The Laptop RX side is not sensitive enough and sees more CRC errors than other non-FW laptops.
    In summary, I need to find a way for the spy/monitor to report bad CRC errors and maybe hook up a digital oscilloscope to look for problems in the signal.
    As TX and RX share the same wire, it will be difficult to know which side is sending when looking at the oscilloscope display.
    A: From looking at the oscilloscope traces, this looks unlikely. The SNK (device) never actually sends the message to the SRC(laptop), so not likely a Laptop RX sensitivity problem.
  4. The laptop is not providing enough power to the device to complete the CC power negotiation process.
    A: Not investigated yet.
  5. The SNK (device) is simply not responding to the SRC (laptop) ACCEPT request.
    I have connected a data capture oscilloscope to the CC1 pin and then decoded the resulting CC messages. I checked all their CRCs and they are OK, but (5) is confirmed: the SNK is simply not responding to the SRC ACCEPT request, as no pulses for it appear on the oscilloscope.
    A: This is clearly the current symptom / bug.
  6. The SNK (device) is not correctly detecting when the CC link is “idle”, and thus not responding when it should. This needs further investigation. The “gap” between messages does appear to be different when comparing FW and non-FW laptops.
    A: This is the most likely cause currently.
  7. Observation, but no impact on this problem: Neither the FW nor the Non-FW laptop sends requests to find out which type of cable is attached.
    A: Not investigated yet.

Note: It is interesting that the “idle” voltage is not mentioned anywhere in the USB-PD standards document. It does not say whether the “idle” voltage affects the detection of the first bit of the next message, and it has no commentary on how the receiving device should react to this. So, if this “idle” voltage is the real problem, no USB test equipment would ever test for it!!!

Caveat: When I mention non-FW laptop, I have only tested with a single non-FW laptop. So I cannot have a view on whether all non-FW laptops behave the same as this single non-FW laptop.

Excellent work! As you find out more, your continued sharing here is very helpful for all the others who might be trying to run this problem to ground.

Trying to understand why the “idle” voltage is 0.88V for the FW16, but 1.67V for the Non-FW laptop.

That is “Figure 5.24 Transmitter Load Model for BMC Tx from a Sink” from the USB-PD R3.2 standard doc.

Here the left hand side is the Sink (Device) wishing to transmit to the SRC (Laptop) on the right hand side.
During idle in the FW16 case:
The Sink (Device) has Rd.
The SRC (FW16 Laptop) has Rp

During idle in the Non-FW laptop case:
The Sink (Device) has Rd.
The SRC (Non-FW Laptop) has Rp

The only difference therefore is Rp on the laptop side.
But, the Laptop does have both Rp and Rd. They are programmatically switched in and out of circuit depending on the Laptop role (Source or Sink).

So, could it be that the FW Laptop is accidentally leaving Rd in circuit during idle periods and when it is acting as a SRC, when it should not?

I would need the source code for the PD firmware to know for sure, but I don’t have that.
I am only mentioning it, because I think it is the first thing that FW engineering should check.

Some maths to look into this further.
A simple two-resistor divider gives Vout = Vin * R2 / (R1 + R2)
With Vin = 5V.
For the non-FW laptop case, Vout = 1.67V
Result: R1 = 2 * R2.
So, say R1 = 20k, and R2 is 10k, we would get 1.67V out. So, here Rp = 20k, and Rd = 10k.
Note: Rp and Rd given random values here, the important part in these calculations is their ratio, not their actual values.
What if Vout is 0.88 V?
Result: R1 ≈ 4.68 * R2
So, if R1 is still 20k, R2 would then need to be about 4.3k.
If there were 2 Rd still in circuit, then the 2 Rd in parallel give:
1 / Req = 1/R2a + 1/R2b
1 / Req = 1/10k + 1/10k
Req = 5k.

So, with both Rp and Rd in circuit on the SRC, and Rd in circuit on the Sink,
the result is surprisingly close to what we are seeing in reality.
I.e. 5k is close to 4.3k.
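The arithmetic in this post can be reproduced with a short script. It is only a sketch of the calculation above: the function names are made up for illustration, the 5 V supply is the same assumption used in the post, and the 20k / 10k values are the arbitrary illustrative ones, not real Rp / Rd values.

```python
def pullup_to_pulldown_ratio(v_in, v_out):
    """Solve Vout = Vin * R2 / (R1 + R2) for the ratio R1 / R2."""
    return (v_in - v_out) / v_out


def parallel(*resistors):
    """Equivalent resistance of resistors connected in parallel."""
    return 1.0 / sum(1.0 / r for r in resistors)


V_IN = 5.0  # assumed pull-up supply voltage, as in the post

ratio_non_fw = pullup_to_pulldown_ratio(V_IN, 1.67)  # about 2.0
ratio_fw16 = pullup_to_pulldown_ratio(V_IN, 0.88)    # about 4.68

r1 = 20_000.0                 # illustrative pull-up value from the post
r2_needed = r1 / ratio_fw16   # pull-down that would give 0.88 V with this r1
r2_two_rd = parallel(10_000.0, 10_000.0)  # two illustrative 10k Rd in parallel

print(f"R1/R2 for 1.67 V: {ratio_non_fw:.2f}")
print(f"R1/R2 for 0.88 V: {ratio_fw16:.2f}")
print(f"R2 needed: {r2_needed:.0f} ohm, two Rd in parallel: {r2_two_rd:.0f} ohm")
```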

Note: The USB Standard states that the Rd is 5.1k ± 5%.
The Rp can be (programmatically set to one of): 56k ± 20%, 22k ± 5%, or 10k ± 5%.

But, the analysis in this post is not correct. See the following posts where the Rp is discussed, and where setting it to the 10k (3A) value, instead of the FW-set value of 22k, helps things considerably.
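For comparison, here is the same divider evaluated with the standard Rd and Rp values from the note above. It is a rough sketch only: it uses nominal values, ignores the tolerances, and assumes the pull-up goes to exactly 5 V, which may not be true on either laptop.

```python
RD_OHMS = 5_100.0  # sink pull-down, nominal value from the note above
RP_OHMS = {"56k": 56_000.0, "22k": 22_000.0, "10k": 10_000.0}
V_PULLUP = 5.0     # assumed pull-up supply voltage


def idle_cc_voltage(rp, rd=RD_OHMS, v_pullup=V_PULLUP):
    """Idle CC voltage for a single Rp on the source and a single Rd on the sink."""
    return v_pullup * rd / (rp + rd)


for name, rp in RP_OHMS.items():
    print(f"Rp={name}: {idle_cc_voltage(rp):.2f} V")
```

Under these assumptions 22k gives about 0.94 V and 10k gives about 1.69 V, which is in the same region as the measured 0.88 V and 1.67 V, without needing any extra Rd in circuit.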

I have raised an issue here, to help track it.

Is it possible the two laptops are just “passively” requesting different currents?

The whole pullup and pulldown on CC is used by “dumb” PD devices to signify how much current at 5V they want to/can draw.

I recently had a look at the actual pd spec but I found the parts about how the physical interface is supposed to work pretty hard to read.

As far as I know, the “dumb” PD devices don’t do dumb stuff on USB-C cables.
It’s the old USB-A cables that support the “dumb” stuff.
You can get a USB-A to USB-C cable, but when that happens, CC is not involved at all and it uses the VBUS for signalling.
So, essentially, old “dumb” stuff happens on the VBUS pins, Modern USB-C PD uses the CC1/CC2 pins.

No, there definitely is stuff going on on the CC lines on dumb PD devices.

I don’t mean the whole oldschool resistors on the usb2 data pairs for power stuff.

Ah, are you referring to this document:
Title: “USB Type-C® Cable and Connector Specification Release 2.4”
DocName: “USB Type-C Spec R2.4 - October 2024.pdf”
https://www.usb.org/sites/default/files/USB%20Type-C%202.4%20Release%20202410.zip

“Table 4-36 Sink CC Pin Voltages for Connect and Current Advertisement Detection for Rd ± 10%”
Where CC Voltage ranges can be:
vRd-1.5: Min=0.746 V, Max=1.164 V
vRd-3.0: Min=1.369 V, Max=2.042 V

So, a CC voltage of 0.88V is reasonable if the FW16 SRC is advertising 1.5A.
But the confusing bit is that in the associated CC message, the FW16 SRC advertised 3.0A,
and the SRC (FW16) sent “Accept” for the Sink device’s “Request” for 3.0A.
So, why is the CC voltage 0.88V? It should be in the vRd-3.0 range to match the CC message.
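To make the comparison against the table explicit, here is a small lookup using only the two ranges quoted above. The function name is made up, and other rows of Table 4-36 are deliberately left out because they are not quoted in this thread.

```python
# Ranges in volts, copied from the Table 4-36 values quoted above.
VRD_RANGES = {
    "vRd-1.5": (0.746, 1.164),
    "vRd-3.0": (1.369, 2.042),
}


def classify_cc_voltage(volts):
    """Return the name of the quoted range containing volts, or None."""
    for name, (v_min, v_max) in VRD_RANGES.items():
        if v_min <= volts <= v_max:
            return name
    return None


print(classify_cc_voltage(0.88))  # measured on the FW16
print(classify_cc_voltage(1.67))  # measured on the Non-FW laptop
```

With the measured values, 0.88 V lands in vRd-1.5 and 1.67 V lands in vRd-3.0.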

But none of this really explains why the Sink does not send a “GoodCRC” response back when the SRC sent the “Accept”.

The 1.5A is probably just the default; what it negotiates over active PD later doesn’t have to be related.

The dumb resistor pd power is just needed to power on the pd controller to negotiate more and 1.5A should be enough for that.

Those may be unrelated or at least separate issues.

@Adrian_Joachim
Do you have any ideas why the Sink does not send a “GoodCRC” response back when the SRC sends the “Accept”?

Bug in the pd controller firmware?

Now that I have done quite a few tests, I have thought of some more things to do.
Focusing on the “SRC sends Accept, Sink fails to respond with GoodCRC” problem.

  1. Take the oscilloscope measurements at the Sink (Device) end of the cable. I was doing them at the SRC (Laptop) end previously. The signal might be noisier or distorted in some way at the Sink end.
    A: Not investigated yet.
  2. The Non-FW laptop seems to have placed some “pre-emphasis” on the CC signal that is not present on the FW laptop CC signal. Investigate what effect this has on the signal at the Sink end of the cable.
    A: Not investigated yet.
  3. Look for anything in the signal that might cause the Sink (device) to not be able to decode it.
    A: Not investigated yet.
  4. The problem is cable related. Some cables work, some do not on the FW16 laptop. But all cables work on the Non-FW laptop. This implies that there is a signal quality issue, more than a protocol related bug.
    A: Not investigated yet.

I had a thought about the 0.88 V for FW16 laptop vs the 1.67 V for non-FW laptop.
Could there be an API between the EC and the PD controller that sets which to use?
I already have my own customized EC so I could easily experiment with the API between the EC and PD to maybe set the FW16 to idle at 1.67V instead and see if that helps the problem (Accept, No GoodCRC (ANG) ) at all.
Does anyone have knowledge of that EC to PD API that could tell me what to do to achieve this?

You may not even need to modify the ec for it if you have a ccd, the stock ec already has ccd commands that let you send arbitrary messages to the pd controller. You would have to find out what to send though. At least it does answer if it doesn’t like what you send so there is that.

It is also possible the dumb pd value is a physical resistor on the main-board as it kinda has to be there even if the pd controller is entirely off but we don’t have enough schematics to know that.

The EC code has this function:
cypd_select_rp()
It is used to set the profile to either:
CCG_PD_CMD_SET_TYPEC_1_5A == 1
CCG_PD_CMD_SET_TYPEC_3A == 2

It then prints on the EC CCD console messages like this when it sets it:
P:2 SET TYPEC RP=2

When one disconnects a USB-C cable, it sets it back to 1.5A:
P:2 SET TYPEC RP=1

There is the mention of “safety…” in the comments.
But, what if RP actually stands for setting the Resistor Pull Up value?
One might be able to use it to change the CC1 from 0.88 V to 1.6 V so that it acts more like other Non-FW laptops.

I really need someone to send me the datasheet to be sure:
HPI 001-97863: 001-97863_0N_V.pdf

I can probably test this without even changing anything.
I just need to put an oscilloscope on CC1, plug a device in that sets it to “P:2 SET TYPEC RP=2” and then see what the output from the oscilloscope is. Hopefully it will move to 1.6V <— Not yet tested.
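When lining the oscilloscope captures up with the EC console, a small helper like the one below could pull the Rp changes out of a saved console log. It is a sketch under assumptions: it assumes the console lines look exactly like the “P:2 SET TYPEC RP=2” examples above, the function name is made up, and the name lookup only covers the two constants quoted from the EC code.

```python
import re

# Values taken from the EC constants quoted above.
RP_NAMES = {1: "CCG_PD_CMD_SET_TYPEC_1_5A", 2: "CCG_PD_CMD_SET_TYPEC_3A"}

# Assumes console lines look exactly like "P:2 SET TYPEC RP=2".
_RP_LINE = re.compile(r"P:(\d+) SET TYPEC RP=(\d+)")


def parse_rp_events(console_text):
    """Return (port, rp_value, name) for each matching console line, in order."""
    events = []
    for match in _RP_LINE.finditer(console_text):
        port, rp = int(match.group(1)), int(match.group(2))
        events.append((port, rp, RP_NAMES.get(rp, "unknown")))
    return events


log = "P:2 SET TYPEC RP=2\nP:2 SET TYPEC RP=1\n"
for port, rp, name in parse_rp_events(log):
    print(f"port {port}: RP={rp} ({name})")
```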

Pretty sure that was one of the ones I was playing around with when I tried messing with the pd controller.

If you look at the output of cypd_get_status (ccd), the ports do report being in 1.5A mode by default, so there may be something to it.

This function exposes register write on the pd controller through the ccd.

You can figure out what register to write what value to by following cypd_select_rp, I remember this being one of the few registers that actually did something but I didn’t note down the values as I was just messing around.

In my case I was seeing if I could make the OCP less bad so my nice portable OLED screen would not need external power to work with the 13. That unfortunately didn’t work, but it may have indeed changed the pullup/pulldown config.

That is certainly a possibility.

Does the CCD chip have any OCP on its outputs?
I.e. if it sets 1.5A, and a device tries to pull 2A, will it limit it to 1.5A or switch off?
Or is there no OCP?

I think the limits FW puts on its ports’ current don’t make much sense.
It says that one device can do 3A, and the rest only 1.5A. But that makes the total 7.5A. So that should allow 2x 3A devices, so long as nothing else is plugged in.