XMEGA DFLL USBSOF needs manual COMP tuning - why?


Hey guys, this is beyond my grasp so here's my problem and the workaround I've been using so far:

 

- XMEGA32A4U, using internal oscillator

- 32MHz internal RC tuned to 48MHz, system clock prescaled down to 12MHz

- USB interface enabled, full speed, ASF framework

- DFLL autocalibration using USBSOF

 

The target application is a USB keyboard controller. PCB layout, components and firmware are identical for every board. Everything is running fine - however, as I've just discovered, there's a small glitch that seems to stem from variations of the individual MCUs used.

 

The glitch is this: without DFLL USBSOF autocalibration, the keyboard controller's lock lights would randomly fail to synchronize, i.e. when pressing, for example, the Num Lock key on a second keyboard, the corresponding LED connected to my controller board would sometimes not light up or not stay on. With DFLL USBSOF autocalibration this glitch didn't occur and the LEDs always got synchronized - until recently.

Interestingly, when sending data from the client to the host, there's never been a problem at all. The glitch in USB communication only occurs when receiving data from the host, such as a SET_REPORT request (which reports the operating system's lock light status to the connected client devices).

 

I discovered that when tweaking the DFLL COMP values, effectively syncing to a higher clock speed than it ought to be, the problem magically disappears - but only on two boards that I've been working on recently. On a third board the default DFLL COMP values work just fine:

 

ISR(USB_BUSEVENT_vect)
{
    // A USB bus event has occurred, so USB must be connected
    if (usb_connected == false) usb_connected = true;

    if (udd_is_start_of_frame_event())
    {
        udd_ack_start_of_frame_event();
        udc_sof_notify();

        // On the first SOF, select USB SOF as the DFLL reference and enable the DFLL
        if ((DFLLRC32M.CTRL & DFLL_ENABLE_bm) == 0)
        {
            // COMP = 0xBF68 (49000) instead of the default 0xBB80 (48000)
            DFLLRC32M.COMP1 = 0x68;   // low byte
            DFLLRC32M.COMP2 = 0xBF;   // high byte
            OSC.DFLLCTRL |= OSC_RC32MCREF_USBSOF_gc;
            DFLLRC32M.CTRL |= DFLL_ENABLE_bm;
        }

... // more code here

}

 

So, obviously, with the default value of 0xBB80 the USB clock is off as far as the two individual MCUs mentioned are concerned, otherwise there would be no glitches when receiving host data. So, can I conclude that with a value of 0xBF68 (= 49MHz) the USB clock is now in fact closer to 48 MHz than before?
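To make the arithmetic behind that question explicit, here is a minimal sketch in plain C (meant to run on a PC, not on the XMEGA). It combines the two COMP bytes the way the ISR above writes them and converts the result to the frequency the DFLL aims for, assuming the reference is the 1 kHz USB start-of-frame. The helper names `dfll_comp` and `dfll_target_hz` are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: combine the COMP2 (high) and COMP1 (low) bytes
   into the 16-bit compare value. */
static uint16_t dfll_comp(uint8_t comp2_high, uint8_t comp1_low)
{
    return (uint16_t)(((uint16_t)comp2_high << 8) | comp1_low);
}

/* Frequency in Hz that the DFLL steers towards, assuming a reference
   of ref_hz (1000 Hz when the reference is USB SOF). */
static uint32_t dfll_target_hz(uint16_t comp, uint32_t ref_hz)
{
    return (uint32_t)comp * ref_hz;
}
```

For example, `dfll_target_hz(dfll_comp(0xBF, 0x68), 1000)` gives 49000000, while the default `dfll_comp(0xBB, 0x80)` gives 48000000. This only shows what the DFLL is asked to lock to; it says nothing about why the higher target behaves better on some boards.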

 

Another thing: if I don't enable DFLL autocalibration at all the mentioned glitches will also disappear - while with a third controller board that runs perfectly with the default DFLL settings, the glitches will only appear when DFLL autocalibration is disabled. So, I've got, technically speaking, identical boards running exactly the same firmware but there seem to be two different categories of MCU that behave completely differently on the same settings. What's going on here?


It doesn't make sense.  Either there is a problem with your PC host, or a problem with your code.

 

It's probably not the host, but if you can plug your stuff into another PC, that would settle it.

 

Can you measure the AVR CPU clock?  I have a module that exports the CPU clock, divided by 1000, on a pin of your choice.  This can be measured by the frequency meter in a DMM, or a scope.

 

Your AVRs aren't identical when not using USB SOF.  The 32 kHz oscillators aren't perfect.  Some are more accurate than others.


Quote:

It doesn't make sense.

 

I agree.

 

Quote:
  Either there is a problem with your PC host, or a problem with your code.

 

It's probably not the host, but if you can plug your stuff into another PC, that would settle it.

 

It's probably not the host, but I will check just to be sure and report back.

 

What could be wrong with my code?

 

Quote:
Can you measure the AVR CPU clock?  I have a module that exports the CPU clock, divided by 1000, on a pin of your choice.  This can be measured by the frequency meter in a DMM, or a scope.

 

I tried measuring the CPU clock with and without a clock division method (software, timer, clock output to pin). The clock is not very precise when the DFLL is disabled, but I've never had any problems running an XMEGA off the internal oscillator. What I wanted to find out is the actual clock when the USB SOF calibrated DFLL is on (using a while loop to endlessly toggle a pin), which should be something very close to 48MHz. However, after enabling the USB interface there is no output anymore. Why, I don't know. Perhaps we should crack that nut first?

 

Quote:

Your AVRs aren't identical when not using USB SOF.  The 32 kHz oscillators aren't perfect.  Some are more accurate than others.

 

Why do you think tweaking the COMP value helps at all? What exactly happens when you increase/decrease that value? I understand that a higher value would result in a clock >48MHz while a lower value would result in a clock <48MHz. Is that how it works? How does DFLL calibration via USB SOF actually work?


mind_prepared wrote:

 

What could be wrong with my code?

 

Unless you are God, anything.  :)

 

If the USB data lines are running all over the place, it might cause erratic USB operation.  I don't use ASF so I can't be of much help.  If you fail to correctly adjust the built-in USB resistors, that might cause erratic operation.

 

DFLL works by comparing the controlled freq. (48MHz) with the "gold standard" freq. which could be SOF or it could be the 32kHz RC osc.

 

It finds how many ticks of the controlled freq. it sees between ticks of the gold standard.  It increments a counter for each tick of the controlled freq.  When it gets a tick from the gold standard, it notes the count and then clears it to zero.  If the count is more than it should be, it decrements the controlled freq. fine adjustment register (CALA), and vice versa.

 

What Atmel calls the comp (compare), I call the multiplier.  It is the controlled freq. divided by the gold standard freq. (1000 from SOF or 1024 from the 32 kHz osc.)

 

So when using SOF, it's 48,000,000 divided by 1000 = 48,000 = 0xbb80.

When using 32kHz osc., it's 48,000,000 divided by 1024 = 46,875 = 0xb71b.
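The two divisions above can be written as a small helper. This is only a sketch in plain C for checking the numbers on a PC; the function names are hypothetical, and the byte split assumes COMP1 is the low byte and COMP2 the high byte, which matches how the ISR in the first post writes 0xBF68.

```c
#include <stdint.h>

/* Hypothetical helper: DFLL compare value = controlled frequency
   divided by reference frequency (integer division, truncating). */
static uint16_t dfll_comp_for(uint32_t target_hz, uint32_t ref_hz)
{
    return (uint16_t)(target_hz / ref_hz);
}

/* Split the 16-bit value into the two register bytes. */
static uint8_t comp1_low(uint16_t comp)  { return (uint8_t)(comp & 0xFFu); }
static uint8_t comp2_high(uint16_t comp) { return (uint8_t)(comp >> 8); }
```

With these, `dfll_comp_for(48000000, 1000)` is 0xBB80 (SOF reference) and `dfll_comp_for(48000000, 1024)` is 0xB71B (32 kHz oscillator reference, which the DFLL sees as 1024 Hz).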

 

When monitoring the CPU freq. at a pin, your program can only use idle sleep, or not sleep at all.  Any deeper sleep will disable the CPU clock.

 

 

 

 

 

 

 


steve17 wrote:
When monitoring the CPU freq. at a pin, your program can only use idle sleep, or not sleep at all. Any deeper sleep will disable the CPU clock.
Actually, that is not quite right.  Duh.   What I, and probably you, are measuring is the PER clock, not the CPU clock.  Any sleep will disable the CPU clock, but that doesn't matter.  A deeper sleep will disable the PER clock, and that will screw up the frequency at the pin.

 

The PER clock and the CPU clock are the same clock, except that the CPU clock is disabled in idle sleep while the PER (peripheral) clock keeps running.

Last Edited: Sat. Jan 19, 2019 - 04:09 PM

Thanks Steve, that makes sense. However, I don't have any reason to believe that something is wrong with my code or clock configuration. The boards all work fine, there is no problem with USB communication, at least not that I know of. Out of the last 30 or so boards I made, only two are showing the behavior described above. What puzzles me is that these boards exhibit no quirks when the DFLL USBSOF calibration is disabled - while all other boards behave exactly the other way around (as you would expect).

 

Why do you think deep sleep will be enabled when the USB interface is enabled? The only thing that could send the CPU to sleep is a suspend signal from the host but that is not the case as far as I can see - unless Windows issues a suspend signal immediately after enumeration without a resume.

 

Something must be causing a deviation or otherwise DFLL calibration would work as it's supposed to.

 

I need to figure out how to measure actual CPU clock when DFLL USBSOF calibration is enabled first and then report back to be sure we're not wasting time groping in the dark or looking in the wrong place.


Yeah, you are right about the sleep.  When the USB is running, the only sleep you can use is idle sleep.  

 

I don't know why you can't measure the CPU/PER frequency when SOF is used.  I can.  I use a counter/timer.   I can send you a project or 2 that might help to find what's wrong.

 

I think USB requires the freq. be within 0.5%.  Most of my chips' 32 kHz osc. are, but some aren't.  You can adjust the 32 kHz osc. to be within spec., at least when the temperature doesn't vary a lot from room temperature.  For commercial use you would want to store the adjustment on the chip, maybe in the User Sig Row.  It's easier to use SOF, but for testing it might come in handy.

 

I did find one bug in the silicon.  When the host suspends the device, the device no longer sees the SOF.  The silicon sees this and disables DFLL.  The frequency should remain stable without DFLL as long as the temperature, and maybe the voltage, doesn't change much.  When we subsequently get a resume, the silicon re-enables the DFLL.  That works here when I cause the suspend and resume from Device Manager by disabling and re-enabling the device.  However, when my PC goes to sleep and then awakens, the silicon screws up.  It does see the resume and does re-enable the DFLL, but the DFLL doesn't see the SOF even though we get SOF interrupts.  This causes the DFLL to drive the fine freq. adjustment (CALA) down to 1, and the resultant freq. is about 10% low.  I fix that by monitoring CALA after the resume.  If the value goes to 1, I disable and re-enable the DFLL and that fixes it.
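The workaround described above (watch CALA after a resume, restart the DFLL if it has run down to 1) could look roughly like this. It is a sketch, not the poster's actual code: the struct is a host-side stand-in for the two registers involved, so the logic can be tried on a PC, and the names `dfll_regs_t` and `dfll_recover_if_stuck` are hypothetical. On the device the fields would correspond to DFLLRC32M.CTRL and DFLLRC32M.CALA.

```c
#include <stdbool.h>
#include <stdint.h>

/* Host-side stand-in for the DFLL registers touched here (assumption:
   on the XMEGA these would be DFLLRC32M.CTRL and DFLLRC32M.CALA). */
typedef struct {
    volatile uint8_t ctrl;  /* bit 0 = DFLL enable */
    volatile uint8_t cala;  /* fine calibration value */
} dfll_regs_t;

#define DFLL_ENABLE_BIT 0x01u  /* same bit position as DFLL_ENABLE_bm */
#define CALA_STUCK_LOW  1u     /* the value reported in the post above */

/* Call periodically after a USB resume.
   Returns true if the DFLL was found stuck and has been restarted. */
static bool dfll_recover_if_stuck(dfll_regs_t *dfll)
{
    if ((dfll->ctrl & DFLL_ENABLE_BIT) == 0) {
        return false;                      /* DFLL is off, nothing to recover */
    }
    if (dfll->cala > CALA_STUCK_LOW) {
        return false;                      /* calibration looks sane */
    }
    dfll->ctrl &= (uint8_t)~DFLL_ENABLE_BIT;  /* disable ... */
    dfll->ctrl |= DFLL_ENABLE_BIT;            /* ... and re-enable */
    return true;
}
```

Note that right after a restart CALA will still read low until the DFLL has had some reference periods to pull it back up, so a caller would probably want to wait a little before checking again.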

 

I will post a project that allows monitoring of the CPU/PER clock under all conditions.  You can configure the pin to use easily, but if you give me a pin to use, you can use the avr program without rebuilding it.

 

 


OK, so I've got another piece of silicon that needs additional tuning. This time I figured out how to correctly output the CPU clock on a GPIO pin. Just FYI, with the standard DFLL COMP setting of 0xBB80 I occasionally get what I presume is some kind of synchronization error when the USB host initiates communication (in this case a SET_REPORT request to update the status of the lock lights). I have a scope but I don't have a hardware USB protocol analyzer to see what's going on on the bus.

 

Here are my findings, quite puzzling still and disappointing, because I haven't gained any new insight:

 

Setup 1:

Default COMP value for DFLL source USB SOF: 0xBB80, i.e. 48000. Main system clock SYSCLK is tuned to 48MHz, CPU clock is SYSCLK/4, i.e. scaled down to 12 MHz. Clock output read with oscilloscope is 12.0000 MHz +/-0.0160. I conclude that the DFLL works nicely, clock speeds seem to be quite accurate - but, oh my!, the supposed occasional sync errors occur notwithstanding!

 

Setup 2:

Tuned COMP value for DFLL source USB SOF: 0xBD00, i.e. 48384. Main system clock SYSCLK is tuned to 48.384MHz, CPU clock is SYSCLK/4, i.e. scaled down to 12.096 MHz. Clock output read with oscilloscope is now 12.0960 MHz +/-0.0160. I conclude that the DFLL still works nicely, but the clock speeds are now actually significantly off - but, oh my!, the supposed occasional sync errors are gone and everything works consistently fine!
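To put a number on "significantly off", the two setups can be compared in parts per million. The sketch below is plain C for a PC; the function names are hypothetical, and the +/-0.25 % limit is the full-speed data rate tolerance from the USB 2.0 specification as far as I recall it (tighter than the 0.5 % guessed earlier in the thread), so treat that constant as an assumption to verify.

```c
#include <stdint.h>

/* Assumed limit: +/-0.25 % (2500 ppm) for the full-speed data rate. */
#define USB_FS_TOLERANCE_PPM 2500

/* Deviation of a COMP setting from the nominal one, in ppm
   (truncated toward zero). */
static int32_t comp_deviation_ppm(uint16_t comp, uint16_t nominal)
{
    int64_t diff = (int64_t)comp - (int64_t)nominal;
    return (int32_t)(diff * 1000000 / nominal);
}

/* Nonzero if the COMP setting keeps the 48 MHz clock inside the assumed limit. */
static int within_usb_fs_tolerance(uint16_t comp)
{
    int32_t dev = comp_deviation_ppm(comp, 48000);
    return dev >= -USB_FS_TOLERANCE_PPM && dev <= USB_FS_TOLERANCE_PPM;
}
```

By this measure 0xBD00 is 8000 ppm (0.8 %) above nominal and 0xBF68 about 20833 ppm (2.1 %), both outside the assumed limit, while 0xBB80 is at 0 ppm. The scope reading of +/-0.0160 MHz on 12 MHz is itself about +/-1333 ppm, so the scope alone cannot confirm that Setup 1 is comfortably inside the limit. None of this explains why the off-nominal setting cures the glitch; it only quantifies the offset.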

 

I said 'supposed sync errors'. Since tuning the DFLL COMP value, more precisely: since increasing the COMP value, seems to fix the issue for a particular piece of silicon, I conclude two things:

 

1. the clocks of the USB host and my XMEGA client device must be somehow out of sync without tuning. Otherwise I cannot understand why the client would occasionally 'miss' incoming requests (or receive corrupt data or whatnot, I don't know yet what is going on exactly)

 

2. The XMEGA's USB clock must have been running too slow - even though the measured CPU clock speed seems to suggest that the USB clock has increased as expected.

 

What's going on? Does my reasoning even make sense? What else could be causing the described behavior if not the XMEGA itself?

Last Edited: Wed. Apr 10, 2019 - 04:41 PM

You must always keep the PER clock running when attached to the host USB.  The only sleep allowed is idle sleep.  Apparently you weren't doing that at one time because you didn't see the clock on a pin sometimes.

 

Maybe you have a bad USB cable or USB connector.  Maybe the USB traces are quite long or quite crooked.

 

I think I can give you a simple (for USB so not very simple) program you can run on your Xmega that uses USB CDC and SOF and exports the PER clock.   It works fine here.

 

EDIT:   Make sure you are setting the USB PAD for D+ and D-.

Last Edited: Sat. Apr 13, 2019 - 09:35 PM

Thanks for your reply, Steve, I appreciate it. Too bad you're the only one chiming in! :(

 

Quote:

You must always keep the PER clock running when attached to the host USB.  The only sleep allowed is idle sleep.  Apparently you weren't doing that at one time because you didn't see the clock on a pin sometimes.

 

Yes, I know.

 

However, the device didn't go to sleep at any point. The first method I tried to export the PER clock was toggling an arbitrary pin as fast as possible. For some reason that didn't work. It probably had something to do with other parts of the code (it's a full blown keyboard firmware). Commenting out a couple of sections apparently was not enough. There's also a chance I had just missed something simple, such as an if(), or I put my pin toggle code inside the wrong while() loop, because I was in a hurry. So, when I tried another method using 'PORTCFG.CLKEVOUT = PORTCFG_CLKOUT_PC7_gc' to export the PER clock to Pin 7 on Port C, it worked right away.

 

Quote:

Maybe you have a bad USB cable or USB connector.  Maybe the USB traces are quite long or quite crooked.

 

 

No. It doesn't have anything to do with the cable or USB connector. Like I said, the XMEGAs are mounted on decent PCBs based on a decent layout and made by a professional fab house. I'm pretty sure I've routed the USB signal lines correctly (as close as possible together, parallel, no vias, the least amount of angles possible). The traces are only about 6 cm in length, I don't see a problem there. I have had absolutely no problems with these PCBs. HOWEVER, I cannot be sure I did a good job unless I run a hardware check on the boards - but I don't have the equipment to do that. The fact of the matter is: 9 out of 10 boards, heck, 90 out of 100 boards, do not exhibit the behavior described. Also - I cannot change anything about the PCB. Changing the COMP settings in the firmware for the XMEGA will fix the issue, so naturally I'm going to inquire into that rather than the PCB.


mind_prepared wrote:
... (it's a full blown keyboard firmware). 

...  - but I don't have the equipment to do that.

Low-speed USB can be instrumented with an inexpensive logic analyzer; otherwise, USB instruments up to full speed are fairly inexpensive.

If the first-level signals are correct (logic analyzer), then the operating system's USB HCI driver may be enough (along with application-side tools).

 

Protocol decoders - sigrok

Beagle USB 12 Protocol Analyzer - Total Phase

Teledyne LeCroy Mercury T2 USB 2.0 Analyzer :

https://www.digikey.com/products/en/test-and-measurement/equipment-specialty/618?FV=ffec9a71

Universal Serial Bus (USB) - Windows drivers | Microsoft Docs

http://janaxelson.com/hidpage.htm#tools via HID Page | Jan Axelson's

 

edit :

One ASIX logic analyzer is in sigrok and it has a value-added optional USB protocol analyzer:

https://www.asix.net/dbg_sigma_accessories.htm#la-usbpa

via ASIX: Debugging Tools (Logic analyzers)

via OMEGA Advanced Logic Analyser | Kanda

ASIX SIGMA / SIGMA2 - sigrok

 

edit2 :

Windows :

USBPcap

 

Linux :

CaptureSetup/USB - The Wireshark Wiki

 

"Dare to be naïve." - Buckminster Fuller

Last Edited: Mon. Apr 15, 2019 - 06:57 PM

The fact that your device can send okay but not receive makes me think you have a software problem.  Are you using CDC or HID or WinUSB or what?

 

I can send you a program you can put in your Xmega.  If my program works, it would suggest your software is the problem.  If my program does not work, this would suggest your hardware is the problem.  I can give you a hex file or the whole project.

 

 


steve17 wrote:

The fact that your device can send okay but not receive makes me think you have a software problem.  Are you using CDC or HID or WinUSB or what?

 

I can send you a program you can put in your Xmega.  If my program works, it would suggest your software is the problem.  If my program does not work, this would suggest your hardware is the problem.  I can give you a hex file or the whole project.

 

 

 

Erm, Steve, I really appreciate your effort to help me get to the root of the issue, but if there were a software problem then every single board would behave like that - but that's not the case! Only some boards are prone to this issue. I suspect that this has something to do with variations in the precision/calibration of the internal oscillator, hence 'offsetting' the COMP values for the DFLL calibration 'solves' the issue. If I assume, for the sake of argument, that your assumption that there is a software problem were true, the procedure (workaround) I adopted would be equivalent to having unnecessarily introduced a hardware issue (maladjusted internal oscillator) on top of an unknown software issue. It should stand to reason that adding error to error will result in a bigger error and not in a fix, wouldn't you say?


If you are using SOF DFLL, the frequency should be correct regardless of the internal 32kHz oscillator.  You shouldn't need to tweak the DFLL stuff.

 

The Xmega USB has internal resistors in the D+ and D- circuit that are adjustable.  You need to fetch the proper values from the production signature row and put them in the USB pad calibration registers.
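As a rough illustration of that step, the sketch below copies two factory calibration bytes into the pad calibration registers. It is written so it can be compiled off-target: the signature-row read is passed in as a function pointer, and `usb_load_pad_cal`, `sigrow_read_fn` and `fake_sigrow_read` are hypothetical names. On the device, as far as I know, the read would wrap the NVM "read calibration row" command, the offsets would come from the USBCAL0/USBCAL1 fields of the production signature row, and the targets would be USB.CAL0 and USB.CAL1; check the device headers before relying on any of that.

```c
#include <stdint.h>

/* Reads one byte of the production signature row at the given offset.
   Passed in as a parameter so this sketch does not depend on AVR headers. */
typedef uint8_t (*sigrow_read_fn)(uint8_t offset);

/* Hypothetical helper: load both USB pad calibration bytes.
   cal0/cal1 stand for the pad calibration registers. */
static void usb_load_pad_cal(volatile uint8_t *cal0, volatile uint8_t *cal1,
                             sigrow_read_fn read_sigrow,
                             uint8_t usbcal0_offset, uint8_t usbcal1_offset)
{
    *cal0 = read_sigrow(usbcal0_offset);
    *cal1 = read_sigrow(usbcal1_offset);
}

/* Stand-in reader for trying the helper on a PC: returns a made-up
   value derived from the offset, not real calibration data. */
static uint8_t fake_sigrow_read(uint8_t offset)
{
    return (uint8_t)(0xA0u + offset);
}
```

This should happen before the USB module is attached to the bus. A ready-made USB stack may already do this internally, so it is worth checking the driver source before adding it a second time.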

 

You need to be careful about receiving data.  As I remember, I set the Nack0 bit in the Endpoint registers before I enable the USB.  To receive you must put a data buffer address in the output (from the host) endpoint registers, and also set the data buffer size.  Then you clear the Nack0 bit.  When you get data, the hardware will set the Nack0 bit again.

 

If you can send okay, it seems to me the USB link is good.  In order to send, the host has to request and receive a bunch of descriptors.  It then has to poll your device.  I don't see how that can happen if the link is somehow bad.

 

 

 


steve17 wrote:

If you are using SOF DFLL, the frequency should be correct regardless of the internal 32kHz oscillator.  You shouldn't need to tweak the DFLL stuff.

 

That's why I want to know what the COMP values are actually for. If there were no intrinsic need for setting these values then Atmel wouldn't do it.

 

Quote:

 

The Xmega USB has internal resistors in the D+ and D- circuit that are adjustable.  You need to fetch the proper values from the production signature row and put them in the USB pad calibration registers.

 

 

OK, interesting, I'll have to do some research on that. That's the first time I'm told that the termination resistors are adjustable. I do not see the connection to the DFLL COMP values, but ok, I'll find out in a minute.

 

Quote:

 

You need to be careful about receiving data.  As I remember, I set the Nack0 bit in the Endpoint registers before I enable the USB.  To receive you must put a data buffer address in the output (from the host) endpoint registers, and also set the data buffer size.  Then you clear the Nack0 bit.  When you get data, the hardware will set the Nack0 bit again.

 

If you can send okay, it seems to me the USB link is good.  In order to send, the host has to request and receive a bunch of descriptors.  It then has to poll your device.  I don't see how that can happen if the link is somehow bad.

 

Like I said, I'm using Atmel's ASF driver. I'd rather have Atmel take care of issues within ASF components. Nevertheless, I don't think this is a driver issue. If it were, we would have heard about it a long time ago and there would be dozens of threads from people reporting about similar issues.

 

 


 

What Atmel calls the comp (compare), I call the multiplier.  It is the controlled freq. divided by the gold standard freq. (1000 from SOF or 1024 from the 32 kHz osc.)

 

So when using SOF, it's 48,000,000 divided by 1000 = 48,000 = 0xbb80.

When using 32kHz osc., it's 48,000,000 divided by 1024 = 46,875 = 0xb71b.

 

 

I posted the above stuff earlier.  I'll explain more.  The "gold standard frequency" is the SOF frequency, which is 1000 Hz.  The controlled frequency is our 48 MHz RC osc. frequency.  So the multiplier is 48,000,000 divided by 1000 = 48,000 = 0xbb80.  The DFLL counts ticks of the 48 MHz clock.  When it gets a tick of the "gold standard" clock, it notes how many 48 MHz ticks it got, and then clears the counter.  If it gets more than 48,000 (0xbb80), it knows the clock is too fast, so it decrements the 48 MHz RC osc. fine freq. adjustment register by one.  If it gets fewer than 48,000, it increments the register by one.
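The loop described above can be tried out with a toy model. This is a sketch only: it follows the description in this post (count ticks per reference period, compare with COMP, move the fine adjustment by one), the oscillator model with its base frequency and step size is entirely made up, and the 7-bit range for the fine adjustment is an assumption. It is not a model of the real silicon.

```c
#include <stdint.h>

#define CALA_MAX 127  /* assumption: 7-bit fine calibration field */

/* Toy oscillator (made-up numbers): frequency rises linearly with CALA. */
static uint32_t osc_hz(uint32_t base_hz, uint32_t step_hz, uint8_t cala)
{
    return base_hz + step_hz * cala;
}

/* One DFLL step: compare the ticks counted during one reference period
   with COMP and nudge CALA by one in the correcting direction. */
static uint8_t dfll_step(uint8_t cala, uint32_t ticks_counted, uint16_t comp)
{
    if (ticks_counted > comp && cala > 0) {
        return (uint8_t)(cala - 1);   /* too fast: slow down */
    }
    if (ticks_counted < comp && cala < CALA_MAX) {
        return (uint8_t)(cala + 1);   /* too slow: speed up */
    }
    return cala;                      /* on target, or at a limit */
}

/* Run the loop for a number of reference periods; returns the final CALA. */
static uint8_t dfll_run(uint8_t cala, uint16_t comp, uint32_t ref_hz,
                        uint32_t base_hz, uint32_t step_hz, unsigned periods)
{
    for (unsigned i = 0; i < periods; i++) {
        uint32_t ticks = osc_hz(base_hz, step_hz, cala) / ref_hz;
        cala = dfll_step(cala, ticks, comp);
    }
    return cala;
}
```

With a toy oscillator of 46 MHz plus 50 kHz per CALA step, `dfll_run(0, 48000, 1000, 46000000, 50000, 200)` settles at CALA = 40, i.e. exactly 48 MHz. With COMP = 48384 the same toy model cannot hit the target exactly and ends up alternating between 47 and 48, which shows that in this model the loop can only get as close as one adjustment step allows.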

 

Of course the 48MHz rc osc. is actually the 32MHz osc. running at 48MHz.  

 

So you want the multiplier to be 0xbb80 cuz 1000 Hz * 0xbb80 = 48MHz.  This assumes the host's USB clock is accurate.  That's a pretty good assumption, but I suppose you could have screwed up PC USB hardware.

 

You are using Atmel's driver, but you must handle the incoming data.  When you are done with the data buffer, you must notify the driver so it can re-use it.

 

 

 

 

 

 

 

 

Last Edited: Sun. Apr 28, 2019 - 12:40 PM

I've had this issue before too. I think the problem may be that the DFLL doesn't produce a particularly accurate clock, and combined with the tolerance of certain USB ports and temperature effects it sometimes goes far enough out of sync. USB 3 ports seem to be worse for some reason.

 

I'm afraid I didn't find a fix, except to fit a crystal and use that instead.