XMEGA DFLL USBSOF needs manual COMP tuning - why?


Hey guys, this is beyond my grasp so here's my problem and the workaround I've been using so far:

 

- XMEGA32A4U, using internal oscillator

- 32MHz internal RC tuned to 48MHz, system clock prescaled down to 12MHz

- USB interface enabled, full speed, ASF framework

- DFLL autocalibration using USBSOF

 

The target application is a USB keyboard controller. PCB layout, components and firmware are identical for every board. Everything is running fine - however, as I've just discovered, there's a small glitch that seems to stem from variations of the individual MCUs used.

 

The glitch is this: without DFLL USBSOF autocalibration, the keyboard controller's lock lights would randomly fail to synchronize. For example, when I pressed the Num Lock key on a second keyboard, the corresponding LED connected to my controller board would sometimes fail to light up or stay on. With DFLL USBSOF autocalibration this glitch didn't occur and the LEDs always synchronized - until recently.

Interestingly, when sending data from the client to the host, there's never been a problem at all. The glitch in USB communication only occurs when receiving data from the host, such as a SET_REPORT request (which reports the operating system's lock light status to the connected client devices).

 

I discovered that tweaking the DFLL COMP values, effectively syncing to a higher clock speed than it ought to be, makes the problem magically disappear - but only on two boards that I've been working on recently. On a third board the default DFLL COMP values work just fine:

 

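ISR(USB_BUSEVENT_vect)
{
    // USB bus event has occurred, so USB must be connected
    if(usb_connected == false) usb_connected = true;

    if (udd_is_start_of_frame_event())
    {
        udd_ack_start_of_frame_event();
        udc_sof_notify();

        // On the first SOF, switch the 32 MHz DFLL to the USB SOF reference and enable it
        if((DFLLRC32M.CTRL & DFLL_ENABLE_bm) == 0)
        {
            // COMP = 0xBF68 = 49000 counts per 1 ms frame, i.e. a 49 MHz target
            // (the default for a 48 MHz target is 0xBB80 = 48000)
            DFLLRC32M.COMP1 = 0x68;    // low byte
            DFLLRC32M.COMP2 = 0xBF;    // high byte
            OSC.DFLLCTRL |= OSC_RC32MCREF_USBSOF_gc;
            DFLLRC32M.CTRL |= DFLL_ENABLE_bm;
        }

... // more code here

}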

 

So, obviously, with the default value of 0xBB80 the USB clock is off as far as the two individual MCUs mentioned are concerned, otherwise there would be no glitches when receiving host data. So, can I conclude that with a value of 0xBF68 (= 49 MHz) the USB clock is now in fact closer to 48 MHz than before?
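
As a sanity check on the arithmetic in that question: with the USB SOF as reference, the DFLL compares against one 1 ms frame, so the 16-bit compare value is simply the target frequency in kHz. A minimal sketch in plain C, runnable off-target; the helper name `dfll_comp_to_hz` is made up for illustration and is not part of ASF:

```c
#include <stdint.h>

/* Hypothetical helper (not an ASF function): rebuild the 16-bit compare
 * value from the two register bytes and convert it to the oscillator
 * frequency the DFLL will steer towards, for a reference given in Hz. */
static uint32_t dfll_comp_to_hz(uint8_t comp1_low, uint8_t comp2_high,
                                uint32_t ref_hz)
{
    uint16_t comp = (uint16_t)(((uint16_t)comp2_high << 8) | comp1_low);
    return (uint32_t)comp * ref_hz;
}
```

With a 1000 Hz reference, `dfll_comp_to_hz(0x68, 0xBF, 1000)` gives 49,000,000 and `dfll_comp_to_hz(0x80, 0xBB, 1000)` gives 48,000,000. So 0xBF68 asks the DFLL for a clock roughly 2% above nominal; taken at face value it moves the oscillator away from 48 MHz, not closer to it.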

 

Another thing: if I don't enable DFLL autocalibration at all the mentioned glitches will also disappear - while with a third controller board that runs perfectly with the default DFLL settings, the glitches will only appear when DFLL autocalibration is disabled. So, I've got, technically speaking, identical boards running exactly the same firmware but there seem to be two different categories of MCU that behave completely differently on the same settings. What's going on here?


It doesn't make sense.  Either there is a problem with your PC host, or a problem with your code.

 

It's probably not the host, but if you can plug your stuff into another PC, that would settle it.

 

Can you measure the AVR CPU clock?  I have a module that exports the CPU clock, divided by 1000, on a pin of your choice.  This can be measured by the frequency meter in a DMM, or a scope.

 

Your AVRs aren't identical when not using USB SOF.  The 32 kHz oscillators aren't perfect.  Some are more accurate than others.


Quote:

It doesn't make sense.

 

I agree.

 

Quote:
  Either there is a problem with your PC host, or a problem with your code.

 

It's probably not the host, but if you can plug your stuff into another PC, that would find out.

 

It's probably not the host, but I will check just to be sure and report back.

 

What could be wrong with my code?

 

Quote:
Can you measure the AVR CPU clock?  I have a module that exports the CPU clock, divided by 1000, on a pin of your choice.  This can be measured by the frequency meter in a DMM, or a scope.

 

I tried measuring the CPU clock with and without a clock division method (software, timer, clock output to pin). The clock is not very precise when the DFLL is disabled, but I've never had any problems running an XMEGA off the internal oscillator. What I wanted to find out is the actual clock when the USB SOF calibrated DFLL was on (using a while loop to endlessly toggle a pin), which should be something very close to 48 MHz. However, after enabling the USB interface there is no output anymore. I don't know why. Perhaps we should crack that nut first?

 

Quote:

Your AVRs aren't identical when not using USB SOF.  The 32 kHz oscillators aren't perfect.  Some are more accurate than others.

 

Why do you think tweaking the COMP value helps at all? What exactly happens when you increase/decrease that value? I understand that a higher value would result in a clock >48 MHz while a lower value would result in a clock <48 MHz. Is that how it works? How does DFLL calibration via USB SOF actually work?


mind_prepared wrote:

 

What could be wrong with my code?

 

Unless you are God, anything.  :)

 

If the USB data lines are running all over the place, it might cause erratic USB operation.  I don't use ASF so I can't be of much help.  If you fail to correctly adjust the built-in USB resistors, that might cause erratic operation.

 

DFLL works by comparing the controlled freq. (48MHz) with the "gold standard" freq. which could be SOF or it could be the 32kHz RC osc.

 

It finds how many ticks of the controlled freq. it sees between ticks of the gold standard.  It increments a counter for each tick of the controlled freq.  When it gets a tick from the gold standard, it notes the count and then clears it to zero.  If the count is more than it should be, it decrements the controlled freq. fine adjustment register (CALA), and vice versa.

 

What Atmel calls the comp (compare), I call the multiplier.  It is the controlled freq. divided by the gold standard freq. (1000 Hz from SOF or 1024 Hz from the 32 kHz osc.)

 

So when using SOF, it's 48,000,000 divided by 1000 = 48,000 = 0xbb80.

When using 32kHz osc., it's 48,000,000 divided by 1024 = 46,875 = 0xb71b.
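
The two divisions above can be sketched as a small helper, which also shows how the result splits into the two register bytes. This is a minimal host-side illustration; the function names are hypothetical, and the byte order (COMP1 = low byte, COMP2 = high byte) is taken from the ISR posted at the top of the thread:

```c
#include <stdint.h>

/* Hypothetical helper: compare value = controlled frequency divided by
 * reference frequency, rounded to the nearest integer. */
static uint16_t dfll_comp_for(uint32_t target_hz, uint32_t ref_hz)
{
    return (uint16_t)((target_hz + ref_hz / 2u) / ref_hz);
}

/* Split the 16-bit value into the two register bytes
 * (COMP1 = low byte, COMP2 = high byte). */
static uint8_t dfll_comp_low(uint16_t comp)  { return (uint8_t)(comp & 0xFFu); }
static uint8_t dfll_comp_high(uint16_t comp) { return (uint8_t)(comp >> 8); }
```

For a 48 MHz target this yields 0xBB80 with the 1000 Hz SOF reference and 0xB71B with the 1024 Hz reference, matching the figures above.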

 

When monitoring the CPU freq. at a pin, your program can only use idle sleep, or not sleep at all.  Any deeper sleep will disable the CPU clock.

 

 

 

 

 

 

 


steve17 wrote:
When monitoring the CPU freq. at a pin, your program can only use idle sleep, or not sleep at all. Any deeper sleep will disable the CPU clock.
Actually, that is not quite right.  Duh.   What I, and probably you, are measuring is the PER clock, not the CPU clock.  Any sleep will disable the CPU clock, but that doesn't matter.  A deeper sleep will disable the PER clock, and that will screw up the frequency at the pin.

 

The PER clock and the CPU clock are the same clock except the CPU clock, but not the PER (peripheral) clock, is disabled when in idle sleep.

Last Edited: Sat. Jan 19, 2019 - 04:09 PM

Thanks Steve, that makes sense. However, I don't have any reason to believe that something is wrong with my code or clock configuration. The boards all work fine, there is no problem with USB communication, at least not that I know of. Out of the last 30 or so boards I made, only two are showing the behavior described above. What puzzles me is that these boards exhibit no quirks when the DFLL USBSOF calibration is disabled - while all other boards behave exactly the other way around (as you would expect).

 

Why do you think deep sleep will be enabled when the USB interface is enabled? The only thing that could send the CPU to sleep is a suspend signal from the host but that is not the case as far as I can see - unless Windows issues a suspend signal immediately after enumeration without a resume.

 

Something must be causing a deviation or otherwise DFLL calibration would work as it's supposed to.

 

I need to figure out how to measure actual CPU clock when DFLL USBSOF calibration is enabled first and then report back to be sure we're not wasting time groping in the dark or looking in the wrong place.


Yeah, you are right about the sleep.  When the USB is running, the only sleep you can use is idle sleep.  

 

I don't know why you can't measure the CPU/PER frequency when SOF is used.  I can.  I use a counter/timer.   I can send you a project or 2 that might help to find what's wrong.

 

I think USB requires the freq. be within 0.5%.  Most of my chips' 32 kHz oscillators are, but some aren't.  You can adjust the 32 kHz osc. to be within spec., at least when the temperature doesn't vary a lot from room temperature.  For commercial use you would want to store the adjustment on the chip, maybe in the User Sig Row.  It's easier to use SOF, but for testing it might come in handy.

 

I did find one bug in the silicon.  When the host suspends the device, the device no longer sees the SOF.  The silicon sees this and disables DFLL.  The frequency should remain stable without DFLL as long as the temperature, and maybe the voltage, doesn't change much.  When we subsequently get a resume, the silicon re-enables the DFLL.  That works here when I cause the suspend and resume from Device Manager by disabling and re-enabling the device.  However, when my PC goes to sleep and then awakens, the silicon screws up.  It does see the resume and does re-enable the DFLL, but the DFLL doesn't see the SOF even though we get SOF interrupts.  This causes the DFLL to drive the fine freq. adjustment (CALA) down to 1, and the resultant freq. is about 10% low.  I fix that by monitoring CALA after the resume.  If the value goes to 1, I disable and re-enable the DFLL and that fixes it.

 

I will post a project that allows monitoring of the CPU/PER clock under all conditions.  You can configure the pin to use easily, but if you give me a pin to use, you can use the avr program without rebuilding it.

 

 


OK, so I've got another piece of silicon that needs additional tuning. This time I figured out how to correctly output CPU clock on a GPIO pin. Jfyi, with the standard DFLL COMP setting of 0xBB80 I occasionally get what I presume is some kind of synchronization error when the USB host initiates communication (in this case a SET_REPORT request to update the status of the lock lights). I have a scope but I don't have a hardware USB protocol analyzer to see what's going on on the bus.

 

Here are my findings, quite puzzling still and disappointing, because I haven't gained any new insight:

 

Setup 1:

Default COMP value for DFLL source USB SOF: 0xBB80, i.e. 48000. Main system clock SYSCLK is tuned to 48 MHz, CPU clock is SYSCLK/4, i.e. scaled down to 12 MHz. Clock output read with oscilloscope is 12.0000 MHz +/-0.0160 MHz. I conclude that the DFLL works nicely, clock speeds seem to be quite accurate - but, oh my!, the supposed occasional sync errors occur notwithstanding!

 

Setup 2:

Tuned COMP value for DFLL source USB SOF: 0xBD00, i.e. 48384. Main system clock SYSCLK is tuned to 48.384 MHz, CPU clock is SYSCLK/4, i.e. scaled down to 12.096 MHz. Clock output read with oscilloscope is now 12.0960 MHz +/-0.0160 MHz. I conclude that the DFLL still works nicely, but the clock speeds are now actually significantly off - but, oh my!, the supposed occasional sync errors are gone and everything works consistently fine!
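
To put a number on "significantly off": the offset of a tuned COMP value from the nominal one can be expressed in parts per million. The sketch below is plain host-side C with a made-up helper name. For comparison, as far as I know the USB 2.0 specification allows ±0.25% (2500 ppm) on the full-speed data rate, and the scope uncertainty quoted above (0.0160 MHz on 12 MHz) is itself about 0.13%.

```c
#include <stdint.h>

/* Hypothetical helper: deviation of a compare value from the nominal one,
 * in parts per million, truncated towards zero. */
static int32_t comp_deviation_ppm(uint16_t comp, uint16_t nominal)
{
    int32_t diff = (int32_t)comp - (int32_t)nominal;
    return (int32_t)(((int64_t)diff * 1000000) / nominal);
}
```

`comp_deviation_ppm(0xBD00, 0xBB80)` evaluates to 8000 ppm, i.e. +0.8%, and the earlier 0xBF68 setting comes out at 20833 ppm, a bit over 2%. Both are well outside that nominal tolerance, which suggests the tuning is masking some other effect instead of correcting a slow clock.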

 

I said 'supposed sync errors'. Since tuning the DFLL COMP value, more precisely: since increasing the COMP value, seems to fix the issue for a particular piece of silicon, I conclude two things:

 

1. the clocks of the USB host and my XMEGA client device must be somehow out of sync without tuning. Otherwise I cannot understand why the client would occasionally 'miss' incoming requests (or receive corrupt data or whatnot, I don't know yet what is going on exactly)

 

2. The XMEGA's USB clock must have been running too slow - even though the measured CPU clock speed seems to suggest that the USB clock has increased as expected.

 

What's going on? Does my reasoning even make sense? What else could be causing the described behavior if not the XMEGA itself?

Last Edited: Wed. Apr 10, 2019 - 04:41 PM

You must always keep the PER clock running when attached to the host USB.  The only sleep allowed is idle sleep.  Apparently you weren't doing that at one time because you didn't see the clock on a pin sometimes.

 

Maybe you have a bad USB cable or USB connector.  Maybe the USB traces are quite long or quite crooked.

 

I think I can give you a simple (for USB so not very simple) program you can run on your Xmega that uses USB CDC and SOF and exports the PER clock.   It works fine here.

 

EDIT:   Make sure you are setting the USB PAD for D+ and D-.

Last Edited: Sat. Apr 13, 2019 - 09:35 PM

thanks for your reply, Steve, I appreciate it. Too bad you're the only one chiming in! :(

 

Quote:

You must always keep the PER clock running when attached to the host USB.  The only sleep allowed is idle sleep.  Apparently you weren't doing that at one time because you didn't see the clock on a pin sometimes.

 

Yes, I know.

 

However, the device didn't go to sleep at any point. The first method I tried to export the PER clock was toggling an arbitrary pin as fast as possible. For some reason that didn't work. It probably had something to do with other parts of the code (it's a full blown keyboard firmware). Commenting out a couple of sections apparently was not enough. There's also a chance I had just missed something simple, such as an if(), or I put my pin toggle code inside the wrong while() loop, because I was in a hurry. So, when I tried another method using 'PORTCFG.CLKEVOUT = PORTCFG_CLKOUT_PC7_gc' to export the PER clock to Pin 7 on Port C, it worked right away.

 

Quote:

Maybe you have a bad USB cable or USB connector.  Maybe the USB traces are quite long or quite crooked.

 

 

No. It doesn't have anything to do with the cable or USB connector. Like I said, the XMEGAs are mounted on decent PCBs based on a decent layout and made by a professional fab house. I'm pretty sure I've routed the USB signal lines correctly (as close as possible together, parallel, no vias, the least amount of angles possible). The traces are only about 6 cm in length, I don't see a problem there. I have had absolutely no problems with these PCBs. HOWEVER, I cannot be sure I did a good job unless I run a hardware check on the boards - but I don't have the equipment to do that. The fact of the matter is: 9 out of 10 boards, heck, 90 out of 100 boards, do not exhibit the behavior described. Also - I cannot change anything about the PCB. Changing the COMP settings in the firmware for the XMEGA will fix the issue, so naturally I'm going to inquire into that rather than the PCB.


mind_prepared wrote:
... (it's a full blown keyboard firmware). 

...  - but I don't have the equipment to do that.

low-speed USB can be instrumented by an inexpensive logic analyzer; otherwise, the USB instruments up to full-speed are somewhat inexpensive.

If the first level signals are correct (logic analyzer) then the operating system's USB HCI driver may be enough (along with application-side tools)

 

Protocol decoders - sigrok

Beagle USB 12 Protocol Analyzer - Total Phase

Teledyne LeCroy Mercury T2 USB 2.0 Analyzer :

https://www.digikey.com/products/en/test-and-measurement/equipment-specialty/618?FV=ffec9a71

Universal Serial Bus (USB) - Windows drivers | Microsoft Docs

http://janaxelson.com/hidpage.htm#tools via HID Page | Jan Axelson's

 

edit :

One ASIX logic analyzer is in sigrok and it has a value-added optional USB protocol analyzer :

https://www.asix.net/dbg_sigma_accessories.htm#la-usbpa

via ASIX: Debugging Tools (Logic analyzers)

via OMEGA Advanced Logic Analyser | Kanda

ASIX SIGMA / SIGMA2 - sigrok

 

edit2 :

Windows :

USBPcap

 

Linux :

CaptureSetup/USB - The Wireshark Wiki

 

"Dare to be naïve." - Buckminster Fuller

Last Edited: Mon. Apr 15, 2019 - 06:57 PM

The fact that your device can send okay but not receive makes me think you have a software problem.  Are you using CDC or HID or WinUSB or what?

 

I can send you a program you can put in your Xmega.  If my program works, it would suggest your software is the problem.  If my program does not work, this would suggest your hardware is the problem.  I can give you a hex file or the whole project.

 

 


steve17 wrote:

The fact that your device can send okay but not receive makes me think you have a software problem.  Are you using CDC or HID or WinUSB or what?

 

I can send you a program you can put in your Xmega.  If my program works, it would suggest your software is the problem.  If my program does not work, this would suggest your hardware is the problem.  I can give you a hex file or the whole project.

 

 

 

Erm, Steve, I really appreciate your effort to help me get to the root of the issue, but if there were a software problem then every single board would behave like that - but that's not the case! Only some boards are prone to this issue. I suspect that this has something to do with variations in the precision/calibration of the internal oscillator, hence 'offsetting' the COMP values for the DFLL calibration 'solves' the issue. If I assume, for the sake of argument, that your assumption that there is a software problem were true, the procedure (workaround) I adopted would be equivalent to having unnecessarily introduced a hardware issue (maladjusted internal oscillator) on top of an unknown software issue. It should stand to reason that adding error to error will result in a bigger error and not in a fix, wouldn't you say?


If you are using SOF DFLL, the frequency should be correct regardless of the internal 32kHz oscillator.  You shouldn't need to tweak the DFLL stuff.

 

The Xmega USB has internal resistors in the D+ and D- circuit that are adjustable.  You need to fetch the proper values from the production signature row and put them in the USB pad calibration registers.

 

You need to be careful about receiving data.  As I remember, I set the Nack0 bit in the Endpoint registers before I enable the USB.  To receive you must put a data buffer address in the output (from the host) endpoint registers, and also set the data buffer size.  Then you set the Nack0 bit false.  When you get data, the hardware will set the Nack0 bit true.

 

If you can send okay, it seems to me the USB link is good.  In order to send, the host has to request and receive a bunch of descriptors.  It then has to poll your device.  I don't see how that can happen if the link is somehow bad.

 

 

 


steve17 wrote:

If you are using SOF DFLL, the frequency should be correct regardless of the internal 32kHz oscillator.  You shouldn't need to tweak the DFLL stuff.

 

That's why I want to know what the COMP values are actually for. If there were no intrinsic need for setting these values then Atmel wouldn't do it.

 

Quote:

 

The Xmega USB has internal resistors in the D+ and D- circuit that are adjustable.  You need to fetch the proper values from the production signature row and put them in the USB pad calibration registers.

 

 

OK, interesting, I'll have to do some research on that. That's the first time I'm told that the termination resistors are adjustable. I do not see the connection to the DFLL COMP values, but ok, I'll find out in a minute.

 

Quote:

 

You need to be careful about receiving data.  As I remember, I set the Nack0 bit in the Endpoint registers before I enable the USB.  To receive you must put a data buffer address in the output (from the host) endpoint registers, and also set the data buffer size.  Then you set the Nack0 bit false.  When you get data, the hardware will set the Nack0 bit true.

 

If you can send okay, it seems to me the USB link is good.  In order to send, the host has to request and receive a bunch of descriptors.  It then has to poll your device.  I don't see how that can happen if the link is somehow bad.

 

Like I said, I'm using Atmel's ASF driver. I'd rather have Atmel take care of issues within ASF components. Nevertheless, I don't think this is a driver issue. If it were, we would have heard about it a long time ago and there would be dozens of threads from people reporting about similar issues.

 

 


 

What Atmel calls the comp (compare), I call the multiplier.  It is the controlled freq. divided by the gold standard freq. (1000 Hz from SOF or 1024 Hz from the 32 kHz osc.)

 

So when using SOF, it's 48,000,000 divided by 1000 = 48,000 = 0xbb80.

When using 32kHz osc., it's 48,000,000 divided by 1024 = 46,875 = 0xb71b.

 

 

I posted the above stuff earlier.  I'll explain more.  The "gold standard frequency" is the SOF frequency, which is 1000 Hz.  The controlled frequency is our 48 MHz RC osc. frequency.  So the multiplier is 48,000,000 divided by 1000 = 48,000 = 0xbb80.  The DFLL counts ticks of the 48 MHz clock.  When it gets a tick of the "gold standard" clock, it notes how many 48 MHz ticks it got, and then clears the counter.  If it gets more than 48,000 (0xbb80) it knows the clock is too fast, so it decrements the 48 MHz RC osc. fine freq. adjustment register by one.  If it gets fewer than 48,000, it increments the register by one.

 

Of course the 48MHz rc osc. is actually the 32MHz osc. running at 48MHz.  

 

So you want the multiplier to be 0xbb80 cuz 1000 Hz * 0xbb80 = 48MHz.  This assumes the host's USB clock is accurate.   That's a pretty good assumption, but I suppose you could have a screwed up PC USB hardware.
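
The loop described above can be modeled in a few lines. This is only a sketch of the behavior as the post describes it (one step up or down per reference tick), not a statement about what the silicon really does; the real hardware may well use a different step size. The function name and the 7-bit limit on the fine calibration value are assumptions.

```c
#include <stdint.h>

#define CALA_MAX 0x7Fu  /* assumption: 7-bit fine calibration field */

/* One reference period of the loop as described in the post: compare the
 * number of oscillator ticks counted against the compare value and nudge
 * the fine calibration value by one step. */
static uint8_t dfll_step(uint8_t cala, uint32_t ticks_counted, uint16_t comp)
{
    if (ticks_counted > comp && cala > 0u) {
        return (uint8_t)(cala - 1u);   /* oscillator too fast: slow it down */
    }
    if (ticks_counted < comp && cala < CALA_MAX) {
        return (uint8_t)(cala + 1u);   /* oscillator too slow: speed it up */
    }
    return cala;                       /* on target, or already at a limit */
}
```

In this model the loop only rests when the count equals the compare value, so raising COMP from 0xBB80 to 0xBD00 simply moves the rest point from 48,000 to 48,384 ticks per frame, which is consistent with the 12.096 MHz measured earlier in the thread.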

 

You are using Atmel's driver, but you must handle the incoming data.  When you are done with the data buffer, you must notify the driver so it can re-use it.

 

 

 

 

 

 

 

 

Last Edited: Sun. Apr 28, 2019 - 12:40 PM

I've had this issue before too. I think the problem may be that the DFLL doesn't produce a particularly accurate clock, and combined with the tolerance of certain USB ports and temperature effects it sometimes goes far enough out of sync. USB 3 ports seem to be worse for some reason.

 

I'm afraid I didn't find a fix, except to fit a crystal and use that instead.


The problem is with our SOF DFLL hardware.  It seems to be a timing problem.  I can make it work with a quasi simple fix.  I suppose someone ought to report this to someone somewhere.  

 

Here's the situation.  Our device doesn't always get the SOF.  That's normal.  When our device fails to see anything on the bus for 3 milliseconds, it reports a suspend condition.  When it then sees the SOF or other stuff on the bus, it reports a resume condition.  We go through 2 suspend-resume cycles when we first attach to the bus.  We see another cycle when the PC sleeps and wakes up.

 

When we get the suspend, the hardware disables the DFLL.  If it didn't, the lack of the "gold standard" SOF ticks would drive the freq. adj. register (CALA) down to a value of 1, and my Xmega would then produce a 48MHz clock that is 8 percent low.  Disabling DFLL when getting the suspend works fine.  The problem is what happens when we get the resume.  Under some conditions, it works okay.  The hardware enables DFLL, the DFLL sees the SOF, and it works.  

 

It fails with USB 2 when the PC sleeps and then wakes.  It fails with USB 3 for all suspend-resume cycles.  When it fails, the hardware enables the DFLL when it sees the resume.  However something gets screwed up.  The DFLL apparently doesn't see the SOF ticks, so it drives the freq. adj. register value to 1.  The SOF apparently exists because I get SOF interrupts.

 

Following a resume, I monitor the value in the freq. adj. reg. from the SOF interrupt handler.  When I see it has a value of 1, I disable the DFLL, wait for it to complete, and then re-enable it.  Problem solved.  With USB 3, I find I must monitor that register for at least 40 ms. (40 SOF interrupts) before I see the second failure. 
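
A rough sketch of that check-and-restart step, written against a stand-in register struct so it can be read and run off-target. On the chip the struct would be replaced by the real DFLL registers (`DFLLRC32M` in the first post). The type, the function names, the bit position of the enable flag, and in particular the wait condition are assumptions; the post only says to wait for the disable to complete.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the two DFLL registers involved (hypothetical type). */
typedef struct {
    volatile uint8_t CTRL;
    volatile uint8_t CALA;
} dfll_regs_t;

#define DFLL_ENABLE_BIT 0x01u  /* assumption: enable flag is bit 0 of CTRL */
#define CALA_STUCK      1u     /* value reported after a failed resume */

/* Disable, wait, re-enable. Polling the enable bit until it reads back
 * as zero is an assumed way of waiting for the disable to complete. */
static void dfll_restart(dfll_regs_t *dfll)
{
    dfll->CTRL &= (uint8_t)~DFLL_ENABLE_BIT;
    while (dfll->CTRL & DFLL_ENABLE_BIT) {
        /* busy-wait */
    }
    dfll->CTRL |= DFLL_ENABLE_BIT;
}

/* Intended to be called from the SOF handler after a resume.
 * Returns true if the DFLL was found stuck and has been restarted. */
static bool dfll_recover_if_stuck(dfll_regs_t *dfll)
{
    if (dfll->CALA != CALA_STUCK) {
        return false;
    }
    dfll_restart(dfll);
    return true;
}
```

Keeping the check separate from the restart makes it easy to log how often the recovery actually fires, which is how the failure was spotted in the first place.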

 

By the way, the freq. adj. reg. (CALA) is actually one of the DFLL device registers.

 

I've seen an Atmel video that brags that the Xmega doesn't need a crystal for USB.  Apparently they didn't test enough.  I could report this problem if I knew how to do it.  I guess I could just send them this post.

Last Edited: Mon. May 20, 2019 - 09:22 PM

If I knew about the USB 3 problem, I would have mentioned it earlier in this thread. Apparently when I was using USB 3 a few years ago, I wasn't using SOF DFLL.


steve17 wrote:
I could report this problem if I knew how to do it.
Microchip/Atmel support page | AVR Freaks

 


Okay, I submitted the bug.  


Could you record the value of the calibration registers when you get a valid packet, and then poke it back in when resuming from sleep?

 

Having said that I don't think it can resume fast enough for USB 3.0 to be reliable. It's just a design flaw in the hardware, although since it apparently passed the USB IF tests maybe it's a flaw in their tests. In any case, a crystal looks essential.



mojo-chan wrote:

Could you record the value of the calibration registers when you get a valid packet, and then poke it back in when resuming from sleep?

No.  To be able to set the freq. adj. reg. you must disable DFLL.  But then we would generally want to re-enable it.  When the DFLL is enabled, whether by us or the silicon, the median value (0x40) is automatically put in the register.

 

I was probably wrong when I said it takes 40 ms. for the failure to occur with the USB 3 port.  That is probably the time of the second failure after the first of 2 resumes.  I can check that.   I was only seeing 1 failure in the past.  When I looked at a USB 3 port a few days ago, I did a quick enhancement to my logging to capture 2 failures and probably did it badly.

 

I think this fix is reliable.

 

Below is a log of a failure and fix.  When it fails, the CALA (freq. adj. reg.) is divided by 2 each ms.  It takes 6 ms. to reach 1, where it remains forever if I don't fix it.  Then you can see the 40 where I disable and enable the DFLL.  It takes another 5 or 6 ms. to converge on the correct value.
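
The decay described above (halving once per millisecond until the value bottoms out at 1) is easy to reproduce arithmetically. A tiny model, with a made-up function name, that counts how many milliseconds the halving takes from a given starting value:

```c
#include <stdint.h>

/* Models the decay described in the post: the fine calibration value is
 * halved once per millisecond until it reaches 1. Returns the number of
 * milliseconds that takes. */
static unsigned ms_until_cala_stuck(uint8_t cala)
{
    unsigned ms = 0;
    while (cala > 1u) {
        cala = (uint8_t)(cala / 2u);
        ms++;
    }
    return ms;
}
```

Any starting value from 0x40 to 0x7F reaches 1 after exactly 6 halvings, which agrees with the 6 ms quoted above.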

 

 

 

 

 


I got a message from Microchip.  Apparently they don't want to fix the problem.  It's easier to fix the errata.  :D

 

Created by : Mohit Mo (5/21/2019, 02:43 AM)

Hi Stephen,

This seems to be a known issue with the peripheral based on an internal source:
Entering power save sleep mode causes the DFLL output to be inaccurate. 
After exiting the sleep mode, the CALA register values will be modified to a new value. 
This is known issue with the silicon. The work around is to disable, then re-enable the DFLL, 
the CALA register pops back to its pre-sleep value.

I will contact my internal team and suggest them to add this as an errata 
or publish this in the public knowledge base.

Thank you for taking the time to report this issue.

With Regards,
Mohit M.

 


Microchip thinks the problem is related to power save sleep.  Not so.  I don't use power save sleep, although I probably should according to the USB rules.  My desktop computer don't care nothin' about no stinkin' rules.

 

I told Microchip about this.  



It seems the fix is simple.  Whenever you get a resume interrupt, disable and enable the DFLL.  Apparently you don't have to bother with any SOF interrupts.

 

Now the log looks like this:

 

Last Edited: Tue. May 21, 2019 - 06:19 PM

mojo-chan wrote:

Could you record the value of the calibration registers when you get a valid packet, and then poke it back in when resuming from sleep?

I gave you bad info before.  Enabling the DFLL does not automatically set the CALA value to 0x40, so yes you can do that.  The freakin' hardware that clobbers the DFLL on resumes does do that.  I think you are right.  The value of CALA (fine freq. adj. reg.) shouldn't be changed, but the hardware does it.

 

Actually I find that it is unnecessary to even think about putting the correct value into CALA.  This could be a little more pain than it is worth.  I discovered when I set up the clocks at startup, I didn't wait for the DFLL to enable.  I found it takes from 40 to 240 milliseconds for that to happen.  Then it's another 7 milliseconds for it to find the correct CALA.  But like I said, I found that finding the correct CALA was more complication for nothing.

 

It seems to be fairly simple to get SOF DFLL to work for USB 2 and USB 3.  With USB 3, the freakin' USB bus reset clobbers the DFLL too.  I don't know why.  The fix is the same as for the resume clobber.  Disable the DFLL (and wait) then enable the DFLL.   Don't bother to look at CALA.  At least I think that's all I'm doing now. 

 

I use resume and reset interrupts.  When I get either one, I just disable (and wait) and then enable the DFLL.  I think that's all I'm doing.  I've got a lot of logging in the code too so I can't be sure.

 

There is a little more to do if you want to fix the problem that happens when the PC wakes from sleep.  This is a problem for USB2 and USB3.  For this I use SOF interrupts.  I have these interrupts running all the time for logging purposes, but I don't think you need that.  When a resume interrupt happens, the resume interrupt handler, besides doing the DFLL disable-enable, also sets a "resume happened" bit.

 

The SOF interrupt handler looks at this flag.  If it is not set, the SOF handler does nothing.  If it is set, the SOF interrupt clears that flag, and then checks the CALA value.  If the value is 1, it disables, waits, and enables the DFLL.   I might be able to do the disable-wait-enable without checking the value, but it's only the resume after the PC wakens that the DFLL needs a second whack.  And this still clobbered DFLL should have a constant value of one, seven ms after it was clobbered by the stupid hardware.
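
The decision logic described in the last few paragraphs can be written down as a small state machine that is independent of the hardware. This is a sketch of the scheme as described, not the poster's actual code; the event names, struct, and function are hypothetical. The caller is expected to perform the disable-wait-enable sequence whenever the function returns true.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical event names for the three interrupts involved. */
typedef enum { EV_RESET, EV_RESUME, EV_SOF } usb_event_t;

typedef struct {
    bool resume_pending;  /* the "resume happened" flag */
} dfll_guard_t;

/* Returns true when the caller should disable, wait, and re-enable the
 * DFLL. `cala` is the current fine calibration value. */
static bool dfll_guard_on_event(dfll_guard_t *g, usb_event_t ev, uint8_t cala)
{
    switch (ev) {
    case EV_RESET:
        return true;                 /* bus reset: always restart */
    case EV_RESUME:
        g->resume_pending = true;    /* remember it for the SOF handler */
        return true;                 /* resume: always restart */
    case EV_SOF:
        if (!g->resume_pending) {
            return false;            /* nothing to check */
        }
        g->resume_pending = false;
        return cala == 1u;           /* second whack only if still stuck */
    }
    return false;
}
```

A typical sequence would be a resume (restart, flag set), followed by an SOF that finds CALA at 1 (restart again, flag cleared), after which further SOFs do nothing.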

 

If you want, I could send you a hex file, (or project) that you could test and torture to see if you could get it to fail. 

Last Edited: Wed. Jun 12, 2019 - 12:07 AM

steve17 wrote:
Actually I find that it is unnecessary to even think about putting the correct value into CALA.  This could be a little more pain than it is worth.  I discovered when I set up the clocks at startup, I didn't wait for the DFLL to enable.  I found it takes from 40 to 240 milliseconds for that to happen.  Then it's another 7 milliseconds for it to find the correct CALA.  But like I said, I found that finding the correct CALA was more complication for nothing.

 

My main concern was that it would take so long for DFLL to get back to a working value that it would cause issues on some USB ports. When USB 3.0 came in I found that timing was a lot tighter than with 2.0. If the device didn't start responding to USB packets quickly enough the driver on the host would assume it was broken and try to put it back to sleep. IIRC that was an NEC chipset, but I think I tested Intel as well. I didn't get as far as checking the spec.

 

I'll try to find my old notes on the maximum start-up time, but I have a feeling they are at the last place I used to work...


steve17 wrote:
With USB 3, the freakin' USB bus reset clobbers the DFLL too.  I don't know why.

I don't know the reason either, though USB reset is apparently configurable.

Registry settings for configuring USB driver stack behavior - USB Device Registry Entries - Windows drivers | Microsoft Docs

(bottom)

ResetOnResume

...

Didn't browse further for how to handle a "slow wake-up" USB device (device descriptor?  INF?  WinUSB.DLL?)

 

"Dare to be naïve." - Buckminster Fuller


Oh, I should clarify, I was not talking about wake-up from sleep, I was talking about the initial plug-in and power up of a bus-powered device with a USB 3.0 port.


I doubt there would be any more problems with the USB3 ports than the USB2 ports.  I would think USB3 ports must obey USB2 rules when a USB2 device is plugged in. 

 

The only difference I see in my logs is the number of SOFs we receive between when we get the last reset and when we get the next thing, which is a setup packet.  With USB3 we get 33 SOFs.  With USB2 we get 19 SOFs.  In this case the USB2 seems faster.  In either case, there is more than enough time for the DFLL to get the frequency right.

 

In my logs, it seems that it is only the friggin' reset that sets CALA to 0x40.  The resume seems to leave CALA alone.

 

One difference with USB3 is I get a setup request for a device qualifier descriptor.  This is an "unknown" request for a device that can't run high speed, so we should respond with a stall.  I suppose your driver does that.
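
For anyone checking their own driver, recognizing that request might look something like the sketch below. The request code and descriptor type (both 0x06) come from the USB 2.0 spec; `usb_setup_t` and `is_device_qualifier_request` are made-up names, and how the stall is actually issued depends on your USB stack, so that part is left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define USB_REQ_GET_DESCRIPTOR  0x06u  /* standard request code */
#define USB_DT_DEVICE_QUALIFIER 0x06u  /* descriptor type, high byte of wValue */

/* The 8-byte setup packet, field names as in the USB spec. */
typedef struct {
    uint8_t  bmRequestType;
    uint8_t  bRequest;
    uint16_t wValue;
    uint16_t wIndex;
    uint16_t wLength;
} usb_setup_t;

/* True if this is GET_DESCRIPTOR(DEVICE_QUALIFIER), which a
 * full-speed-only device should answer with a stall. */
static bool is_device_qualifier_request(const usb_setup_t *s)
{
    /* 0x80 = device-to-host, standard request, recipient device */
    bool standard_in_device = (s->bmRequestType == 0x80u);
    uint8_t desc_type = (uint8_t)(s->wValue >> 8);

    return standard_in_device
        && s->bRequest == USB_REQ_GET_DESCRIPTOR
        && desc_type == USB_DT_DEVICE_QUALIFIER;
}
```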

After more investigation, it seems the resumes are harmless.  It's only the resets that make a mess.   

 

So if your goal is to get the startup to work, all you need is to get reset interrupts.  When you get one, do the DFLL disable-enable thing.  This works with USB2 and USB3. 

 

However, if you want to fix the calamity that happens when the host PC wakes up, you will need to get resume interrupts.  Resumes are the only indication that the host may have awakened.  Unfortunately, doing the disable-wait-enable on the DFLL right at this point doesn't fix the problem.  Apparently you need to wait a bit before doing it.  I suggest the resume interrupt handler enables the SOF interrupt.  The SOF interrupt handler can then do the disable-wait-enable thing and disable SOF interrupts again.
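
Put together, the reset and resume handling suggested here might be modelled like this. It is only a host-side sketch: `usb_clock_model_t` and the three `on_*` functions are hypothetical names, and the two booleans stand in for the real DFLL enable bit and SOF interrupt enable bit.

```c
#include <stdbool.h>

/* Hypothetical model of the bits involved; not real register names. */
typedef struct {
    bool     dfll_enabled;     /* DFLL enable bit */
    bool     sof_irq_enabled;  /* SOF interrupt enable bit */
    unsigned rearms;           /* disable-enable cycles performed */
} usb_clock_model_t;

/* Disable, wait, enable. The wait length is not stated in the thread. */
static void dfll_disable_wait_enable(usb_clock_model_t *m)
{
    m->dfll_enabled = false;
    /* short delay would go here on real hardware */
    m->dfll_enabled = true;
    m->rearms++;
}

/* Bus reset: re-arm the DFLL straight away. */
static void on_bus_reset(usb_clock_model_t *m)
{
    dfll_disable_wait_enable(m);
}

/* Resume: too early to re-arm, so just arm the SOF interrupt. */
static void on_resume_irq(usb_clock_model_t *m)
{
    m->sof_irq_enabled = true;
}

/* First SOF after a resume: re-arm once, then switch SOF interrupts off. */
static void on_sof_irq(usb_clock_model_t *m)
{
    if (!m->sof_irq_enabled) {
        return;  /* interrupt would not fire on real hardware */
    }
    dfll_disable_wait_enable(m);
    m->sof_irq_enabled = false;
}
```

Using the SOF interrupt as a one-shot means the re-arm only happens once frames are arriving again, and there is no per-frame interrupt load afterwards.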

 

This works on 2 different computers.  On this computer it works with USB2 and USB3 ports.  The other computer was built before USB3 existed.

 

 


steve17 wrote:

I doubt there would be any more problems with the USB3 ports than the USB2 ports.  I would think USB3 ports must obey USB2 rules when a USB2 device is plugged in. 

 

Maybe it was just the early controllers from NEC and Intel, but they seem to have stuck to the timing requirements for USB 2.0 more strictly than other USB 2.0 controllers.

 

This is quite common. There is the spec, and then there is the fact that people just want their stuff to work when they plug it in, so the driver/firmware developers relax the timings a bit. Then the next generation comes in and is all written to spec again.


The DFLL problems happen when the host suspends SOF for too long a time.  If the suspension is 11 milliseconds, there is no problem.  We get this at startup when plugged into a USB2 port.  If the suspension is 55 milliseconds, the DFLL latches up.  We get this when plugged into a USB3 port.  This is also the problem we get when the host wakes up from sleep.  The fix is to disable and re-enable the DFLL. 
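
If you want to see these gaps in your own logs, one way to measure them is sketched below. It assumes you have some millisecond tick to pass in as `now_ms`; `sof_gap_log_t` and `sof_gap_note` are made-up names. The thread only says that 11 ms is harmless and 55 ms is not, so the sketch just records the gaps and does not guess at a threshold.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical log of the time between consecutive SOFs. */
typedef struct {
    uint32_t last_sof_ms;  /* tick value at the previous SOF */
    uint32_t max_gap_ms;   /* longest gap seen so far */
    bool     have_sof;     /* false until the first SOF arrives */
} sof_gap_log_t;

/* Call from the SOF handler with the current millisecond tick.
 * Returns the gap since the previous SOF (0 for the very first one). */
static uint32_t sof_gap_note(sof_gap_log_t *log, uint32_t now_ms)
{
    uint32_t gap = 0;

    if (log->have_sof) {
        /* unsigned subtraction stays correct across tick wraparound */
        gap = now_ms - log->last_sof_ms;
        if (gap > log->max_gap_ms) {
            log->max_gap_ms = gap;
        }
    }
    log->last_sof_ms = now_ms;
    log->have_sof = true;
    return gap;
}
```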

 

I gave Microchip this info complete with logs.