TC-based microsecond timestamp accuracy/jitter issue

Total votes: 0

Hello all,

 

I've constructed a fairly basic TC-based timestamp on my SAME70. The aim is to be able to timestamp I/O events within the SAME70 with microsecond precision, to be used in further signal/event processing on an external processor. I've used the TC0 Timer Counter (channels 0, 1 and 2) in a daisy-chained manner to implement the timestamp, with the basic setup as follows:

 

  1. TC0_0 provides microsecond counting:
    • Configure TC0_0 to use the PCK_6 programmable clock.
    • Configure the PCK_6 programmable clock to use MCK with a prescale value of 150. This provides PCK_6 with a 1MHz clock.
    • Configure TC0_0 in wave mode, with RC compare triggering.
    • Configure TC0_0 with RA = 500 and RC = 1000. This results in an RC compare trigger every 1ms (1000 ticks @ 1MHz), using a waveform with 50% duty cycle.
  2. TC0_1 provides millisecond counting:
    • Configure TC0_1 to use XC1 as its input.
    • Configure TC0_1 in wave mode, with RC compare triggering.
    • Configure TC0_1 with RA = 500 and RC = 1000. This results in an RC compare trigger every 1s (1000 ticks @ 1ms/tick), using a waveform with 50% duty cycle.
  3. TC0_2 provides second counting:
    • Configure TC0_2 to use XC2 as its input.
    • Configure TC0_2 in wave mode, with RC compare triggering.
    • Configure TC0_2 with RA = 30 and RC = 60. This results in an RC compare trigger every 1min (60 ticks @ 1s/tick), using a waveform with 50% duty cycle.
    • Note: unlike the previous two TC channels, the TC0_2 RC compare trigger is serviced by an interrupt which increments a minute counter. This functionality isn't relevant to this issue/question, so I won't detail it further.

 

I then configure the TC0 block mode, such that the output triggers of TC0_0 and TC0_1 are used as the inputs on TC0_1 and TC0_2 respectively. Note that in point 1 above, I state that MCK is used as the PCK_6 source. I have tried using a range of clock sources (e.g. PLLA, main clock, etc.) with the required PCK_6 prescaler, but the inaccuracy/jitter issue appears to be independent of the selected source clock.
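The arithmetic behind the chain described above can be checked with a small sketch. This is illustrative only: the helper names are made up for this example, and the 150MHz MCK figure is an assumption inferred from the stated prescale of 150 giving 1MHz.

```c
#include <stdint.h>

/* Clock rate seen by TC0_0 after the PCK_6 prescaler. */
uint32_t tick_hz(uint32_t source_hz, uint32_t divide_by)
{
    return source_hz / divide_by;
}

/* Time between RC compare triggers, given the duration of one input tick. */
uint64_t rc_period_us(uint64_t tick_us, uint32_t rc)
{
    return tick_us * rc;
}
```

With these, tick_hz(150000000, 150) gives the 1MHz count rate, and chaining rc_period_us() with RC = 1000, 1000 and 60 gives the 1ms, 1s and 1min trigger periods listed for TC0_0, TC0_1 and TC0_2.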

 

I have a very simple test circuit in place to validate this timestamp design. I have a GPIO configured as an input, and I capture the timestamp in its corresponding rising/falling edge interrupt handler:

 

static void PIOC_Interrupt_Handler(uint32_t id, uint32_t mask)
{
    BaseType_t higherPriorityTaskWoken = pdFALSE;

    // Pack the chained counters: seconds [25:20], milliseconds [19:10], microseconds [9:0]
    timestamp = ((TC0->TC_CHANNEL[2].TC_CV & 0x3F) << 20)
              | ((TC0->TC_CHANNEL[1].TC_CV & 0x3FF) << 10)
              | (TC0->TC_CHANNEL[0].TC_CV & 0x3FF);

    // The timestamp value is then sent to an external processor
}

As shown above, I am using the FreeRTOS kernel, but I don't believe FreeRTOS is causing any issues (as outlined below). Note that 'timestamp' above is defined and instantiated elsewhere in the class. The timestamp value is a combination of the second, millisecond and microsecond TC values, which is then processed for measurement (e.g. frequency, duty cycle, etc.) on the external processor.
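One thing worth noting about combining three counter values: the three TC_CV reads are not atomic, so a rollover of a faster channel between reads could in principle pack mismatched fields. The thread does not identify this as the cause of the jitter, so treat the following as a side note. It is a minimal sketch of a re-read-until-stable approach; the function name is hypothetical, and the pointers simply stand in for the three TC_CV registers so the logic can be exercised off-target.

```c
#include <stdint.h>

/* Hypothetical helper, not from the thread: re-read the slower channels and
   retry if either changed, so a rollover in the middle of the sequence is
   never packed into the result. */
uint32_t read_timestamp_consistent(const volatile uint32_t *sec_cv,
                                   const volatile uint32_t *ms_cv,
                                   const volatile uint32_t *us_cv)
{
    uint32_t sec, ms, us, sec_again, ms_again;

    do {
        sec       = *sec_cv & 0x3Fu;   /* seconds, 0..59 */
        ms        = *ms_cv  & 0x3FFu;  /* milliseconds, 0..999 */
        us        = *us_cv  & 0x3FFu;  /* microseconds, 0..999 */
        ms_again  = *ms_cv  & 0x3FFu;
        sec_again = *sec_cv & 0x3Fu;
    } while (sec != sec_again || ms != ms_again);

    /* Same layout as the handler: seconds [25:20], ms [19:10], us [9:0] */
    return (sec << 20) | (ms << 10) | us;
}
```

On the target, the arguments would presumably be the addresses of TC0->TC_CHANNEL[2].TC_CV, TC0->TC_CHANNEL[1].TC_CV and TC0->TC_CHANNEL[0].TC_CV. The loop normally runs once and only repeats when a rollover lands inside the read sequence.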

 

The issue: the recorded timestamp values exhibit a non-trivial amount of jitter, despite the GPIO being provided with a very stable, low frequency input waveform. I'm using very low frequency square wave inputs in the order of 10Hz - 1kHz, so I would expect the generation and handling of the interrupt, and the recording of the TC0 channel values to be executed much faster than the 1us resolution of TC0_0. Below is the disassembly output of the above 'timestamp' line:

 

00407646 9b.4b                 ldr	r3, [pc, #620]		 
00407648 d3.f8.90.30           ldr.w	r3, [r3, #144]		 
0040764C 1b.05                 lsls	r3, r3, #20		 
0040764E 03.f0.7c.72           and	r2, r3, #66060288		 
00407652 98.4b                 ldr	r3, [pc, #608]		 
00407654 1b.6d                 ldr	r3, [r3, #80]		 
00407656 99.02                 lsls	r1, r3, #10		 
00407658 97.4b                 ldr	r3, [pc, #604]		 
0040765A 0b.40                 ands	r3, r1		 
0040765C 1a.43                 orrs	r2, r3		 
0040765E 95.4b                 ldr	r3, [pc, #596]		 
00407660 1b.69                 ldr	r3, [r3, #16]		 
00407662 c3.f3.09.03           ubfx	r3, r3, #0, #10		 
00407666 13.43                 orrs	r3, r2		 
00407668 4f.f0.00.04           mov.w	r4, #0		 
0040766C 93.4a                 ldr	r2, [pc, #588]		 
0040766E c2.e9.00.34           strd	r3, r4, [r2]

The SAME70 is running at 300MHz, so I would expect the above instructions to be executed in far less than 1us, including the GPIO interrupt latency. However, as my test results below illustrate, the actual timestamp values recorded contain variation between timestamps of over 100us.

 

The below data shows the measured frequency of the GPIO input (the frequency was calculated on the external processor based on the timestamp values of consecutive edges) using a 10Hz input signal.

 

Calculated Frequency (Hz)
10.005
9.99261
9.99251
10.0048
10.0048
10.0048
9.99271
9.99271
10.0048
10.0049
10.0048
9.99251
9.99271
10.0048
10.0048
10.0049
9.99261
9.99261
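For reference, here is a rough sketch of how the external processor might turn two consecutive packed timestamps into a frequency. The actual code on the external processor is not shown in this thread, so the function names and the wrap handling are assumptions; the only thing taken from the post is the bit layout (seconds in bits 25:20, milliseconds in 19:10, microseconds in 9:0).

```c
#include <stdint.h>

#define US_PER_MINUTE 60000000u

/* Unpack the value built in the handler into microseconds within the minute. */
uint32_t timestamp_to_us(uint32_t ts)
{
    uint32_t sec = (ts >> 20) & 0x3Fu;
    uint32_t ms  = (ts >> 10) & 0x3FFu;
    uint32_t us  = ts & 0x3FFu;
    return sec * 1000000u + ms * 1000u + us;
}

/* Edge-to-edge period in microseconds, allowing for one wrap of the seconds
   counter (it restarts every 60s). */
uint32_t period_us(uint32_t earlier_ts, uint32_t later_ts)
{
    uint32_t a = timestamp_to_us(earlier_ts);
    uint32_t b = timestamp_to_us(later_ts);
    return (b >= a) ? (b - a) : (b + US_PER_MINUTE - a);
}

/* Frequency in Hz for a given period in microseconds. */
double frequency_hz(uint32_t period)
{
    return 1e6 / (double)period;
}
```

Working backwards from the list above, 10.0048Hz corresponds to a period of about 99952us and 9.99261Hz to about 100074us, so the two clusters of readings sit roughly 122us apart around the nominal 100000us.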

I have validated the input signal with an oscilloscope; the 10Hz is stable and doesn't exhibit any jitter. I have also added a simple GPIO output in the above PIOC_Interrupt_Handler() function, which either asserts or deasserts a spare GPIO pin based on the input signal state. I have validated on the oscilloscope that this output signal is exactly 10Hz as well, with no jitter. As such, I'm somewhat lost as to why the timestamp values contain jitter, given that the hardware appears to be processing the interrupt without any latency/jitter issues whatsoever (as evidenced by the matched GPIO output signal). Note that I've increased the input signal to 50Hz, 100Hz, 500Hz, 1000Hz, etc. and the result is the same (i.e. the GPIO output signal matches the stability and accuracy of the input signal exactly, while the timestamp value contains considerable jitter).

 

Does anyone have any idea as to why I'm unable to get exact timestamp values with the above approach?

This topic has a solution.
Last Edited: Fri. Jan 7, 2022 - 12:50 AM
Total votes: 0

I've put together some additional data, which may provide some further insight.

 

I enabled a 10Hz input signal, and recorded the corresponding measured frequency on both the SAME70 as well as on my oscilloscope. The first graph below shows the frequency of the 10Hz signal as recorded/calculated by the oscilloscope, while the second graph shows the frequency of the same 10Hz signal as recorded/calculated by the SAME70 using the abovementioned chained TC-based timestamp methodology. Note that the y-axis range has been standardised between the two graphs to more easily illustrate the difference in jitter between the two measurements.

 

Graph 1: Frequency as calculated by the oscilloscope

 

Graph 2: Frequency as calculated by the SAME70

 

As the graphs show, while the 10Hz signal does contain some jitter (as evidenced in the oscilloscope data), the jitter is orders of magnitude worse in the SAME70 data. The average of the recorded frequencies, as well as the standard deviation of the data for both the oscilloscope and SAME70, are shown below:

 

Data Set        Average (Hz)    Standard Deviation (Hz)
Oscilloscope    10.00012152     0.00004055459
SAME70          10.00004171     0.00297141187

 

As shown above, the SAME70 timestamp methodology does result in accurate measurements when the entire sample set is averaged, but the deviation across the sample set (specifically the point-to-point fluctuation) is considerably higher than expected. Is this simply down to the non-deterministic nature of the ARM Cortex-M7 architecture vs. the FPGA-based architecture of the oscilloscope? As stated in my original post, I would have assumed a 300MHz processor would be capable of sub-microsecond accuracy for a task such as this.
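To relate the standard deviations above to time, a small first-order conversion helps. This is a sketch, not part of the original measurements: it assumes the spread is small compared with the mean, so that a frequency deviation sigma_f at mean frequency f maps to a period deviation of roughly sigma_f / f^2. The function name is made up for this example.

```c
/* First-order estimate of period jitter (in microseconds) from the mean and
   standard deviation of a frequency measurement (both in Hz).
   Valid only when sigma_hz is much smaller than mean_hz. */
double period_sigma_us(double mean_hz, double sigma_hz)
{
    return 1e6 * sigma_hz / (mean_hz * mean_hz);
}
```

Plugging in the table values gives roughly 0.4us for the oscilloscope data and roughly 30us for the SAME70 data, which puts the SAME70 spread well above the 1us resolution the timestamp is meant to provide.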

Total votes: 1

Hi,

 

I also use the SAME70 to capture timestamps of digital inputs. The frequency on my digital inputs is about 10kHz. The jitter in my measurement is about +/-100ns. That is the resolution of the timers I use (2 timers with 2 and 3 channels). The main difference from your approach is that I use the capture functionality of the SAME70. I raise an interrupt after I have captured a pulse (rising edge followed by falling edge, using the RA and RB registers). The timestamps of the edges are stored without any jitter by the TC module. In the ISR raised by storing a value to RB, I just collect those values and use them elsewhere in my software. Your approach is to raise an interrupt and read the counter value inside the interrupt. I'm not surprised that you see a large jitter because:

  • maybe the cache is turned on, which leads to big differences in execution speed. Maybe a factor of 3 or even larger, depending on the machine code executing at the point in time when the interrupt occurs.
  • from your code I see that you are using FreeRTOS. Inside FreeRTOS routines interrupts are sometimes blocked, which adds another source of jitter
  • maybe other application ISRs are active at the same time -> the interrupt priority adds an uncertainty (jitter) to the point in time at which your timer interrupt executes

 

My recommendation would be to use the capture functionality. If you really need a high resolution for a comparably slow signal, you might use one capture channel to trigger a DMA transfer that also covers the CV registers of your linked channels (data or µBlock striding can do the job efficiently). That still wouldn't be ideal, but it would be much more predictable, and if you use the 'fastest' channel to trigger the DMA you should have enough time, because I see from your code that the second fastest runs at a resolution of 1s, which should easily be handled by the SAME70.

 

Best Regards

Markus

Total votes: 1

Check if your interrupt handler priority is above configMAX_SYSCALL_INTERRUPT_PRIORITY. Handlers with higher priority will not be blocked when the FreeRTOS scheduler juggles tasks/queues. The downside is that your handler must not invoke any FreeRTOS API calls.

 

Note that to be higher priority than configMAX_SYSCALL_INTERRUPT_PRIORITY, you need to use a lower number on ARM Cortex-M devices (See https://www.freertos.org/RTOS-Co...).

 

Even with the priority configured correctly, you'll still see jitter due to things like CPU caching or other interrupt masking, but it should be less than you currently have.

 

I recommend you investigate the SAMX70 EVENT System for more precise time stamping in hardware.

 

Steve


Total votes: 0

Firstly, thank you very much Markus and Steve for taking the time to read my post and for your valuable insights, it is greatly appreciated.

 

Markus Krug wrote:

I also use the SAME70 to capture timestamps of digital inputs. The frequency on my digital inputs is about 10kHz. The jitter in my measurement is about +/-100ns. That is the resolution of the timers I use (2 timers with 2 and 3 channels). The main difference from your approach is that I use the capture functionality of the SAME70. I raise an interrupt after I have captured a pulse (rising edge followed by falling edge, using the RA and RB registers). The timestamps of the edges are stored without any jitter by the TC module. In the ISR raised by storing a value to RB, I just collect those values and use them elsewhere in my software.

 

I absolutely understand this approach, and if I were only needing to timestamp digital inputs, I would have used this methodology. The issue I have (which wasn't made clear in my original post), is that this timestamp methodology I'm developing is to be used as a global timestamp for all sorts of activity across a range of interfaces within the SAME70. I.e. the timestamp is not only used for capturing edges on input digital signals, but also for ADC conversions, UART data, DMA activity, etc. As such, it isn't possible to use the capture approach.

 

Markus Krug wrote:

Your approach is to raise an interrupt and read the counter value inside the interrupt. I'm not surprised that you see a large jitter because:

  • maybe the cache is turned on, which leads to big differences in execution speed. Maybe a factor of 3 or even larger, depending on the machine code executing at the point in time when the interrupt occurs.
  • from your code I see that you are using FreeRTOS. Inside FreeRTOS routines interrupts are sometimes blocked, which adds another source of jitter
  • maybe other application ISRs are active at the same time -> the interrupt priority adds an uncertainty (jitter) to the point in time at which your timer interrupt executes

 

I've tried my approach with cache both enabled and disabled, and the jitter is exactly the same. Not 'similar', the jitter (measured as the period between consecutive edges measured in microseconds) is exactly the same. I'll touch on this further below, but the consistency of the jitter, despite changing as many of the suspected variables as possible, is where I'm still considerably confused.

 

With regards to FreeRTOS, I've tried a number of approaches here as well, to minimise/eliminate the impact of FreeRTOS. I've set the interrupt priority of the PIO port to 0 (the highest priority), such that it is NOT masked by FreeRTOS (more on this below in response to Steve's comments), and have also tried using critical sections to further prevent FreeRTOS from meddling with the interrupt routine. Again, there is absolutely no difference observed in the measured period jitter, the numbers recorded (i.e. the raw period between consecutive timestamps) are essentially exactly the same.

 

As to your last point, I've essentially disabled all other interfaces for this testing. The only other routines running are FreeRTOS tasks, which as I've outlined above, shouldn't cause a noticeable issue as the PIO interrupt is of much higher priority, and I'm disabling context switching once in the interrupt handler.

 

Markus Krug wrote:

My recommendation would be to use the capture functionality. If you really need a high resolution for a comparably slow signal, you might use one capture channel to trigger a DMA transfer that also covers the CV registers of your linked channels (data or µBlock striding can do the job efficiently). That still wouldn't be ideal, but it would be much more predictable, and if you use the 'fastest' channel to trigger the DMA you should have enough time, because I see from your code that the second fastest runs at a resolution of 1s, which should easily be handled by the SAME70.

 

I've not used the capture process before, and despite reading the associated section in the datasheet, I still don't have a complete understanding. Can multiple sources be used in conjunction with the capture functionality? I.e. if I have 8 digital inputs (across each of the PIO A/B/C/D banks), can each of them be used with the capture approach?

 

scdoubleu wrote:

Check if your interrupt handler priority is above configMAX_SYSCALL_INTERRUPT_PRIORITY. Handlers with higher priority will not be blocked when the FreeRTOS scheduler juggles tasks/queues. The downside is that your handler must not invoke any FreeRTOS API calls.

 

Note that to be higher priority than configMAX_SYSCALL_INTERRUPT_PRIORITY, you need to use a lower number on ARM Cortex-M devices (See https://www.freertos.org/RTOS-Co...).

 

As mentioned above, I've reviewed the FreeRTOS side of the equation, and I don't believe FreeRTOS is to blame. My configMAX_SYSCALL_INTERRUPT_PRIORITY configuration is based on a priority of 4, and I've set the PIOC interrupt to an NVIC priority of 0. My actual PIOC handler does in fact use a FreeRTOS API call (I use an indexed notification), so I've amended the approach so that it doesn't involve any FreeRTOS API calls, allowing me to use the highest priority interrupt. As mentioned above, there is no change whatsoever in the recorded jitter between the priority 0 approach and the FreeRTOS API approach.
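To make the priority comparison concrete, here is a small sketch of the rule being applied. It assumes the usual Cortex-M behaviour where FreeRTOS critical sections raise BASEPRI to configMAX_SYSCALL_INTERRUPT_PRIORITY, which masks every interrupt whose numeric priority is equal to or greater than that value. The 3 implemented priority bits and the helper names are assumptions for illustration, not taken from the project in question.

```c
#include <stdint.h>
#include <stdbool.h>

#define PRIO_BITS 3u  /* assumption: number of NVIC priority bits implemented */

/* Logical priority (0 = highest) shifted into the top bits of the 8-bit field. */
uint8_t to_hw_priority(uint8_t logical)
{
    return (uint8_t)(logical << (8u - PRIO_BITS));
}

/* True if an interrupt at irq_prio can be held off while BASEPRI is raised
   to the level corresponding to max_syscall_prio. */
bool masked_by_kernel(uint8_t irq_prio, uint8_t max_syscall_prio)
{
    return to_hw_priority(irq_prio) >= to_hw_priority(max_syscall_prio);
}
```

With the values quoted above, masked_by_kernel(0, 4) is false, so a priority 0 handler is never held off by the kernel, while a handler at priority 4 or numerically higher would be.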

 

scdoubleu wrote:

Even with the priority configured correctly, you'll still see jitter due to things like CPU caching or other interrupt masking, but it should be less than you currently have.

 

I recommend you investigate the SAMX70 EVENT System for more precise time stamping in hardware.

 

I fully expect to see some jitter, however as you've stated, the size (and more importantly, the consistency/regularity) of the jitter just doesn't make sense. Despite all the above changes, testing, different approaches, etc., I'm still getting a very reliable, repeatable ~63us jitter between groups of 2-3 readings. There's something about the regularity of the jitter which makes me think there's something else going on here. If it was true 'interrupt latency, the processor is handling another interrupt, FreeRTOS is switching contexts, etc.' jitter, I'd expect the jitter to be a little more erratic. The fact that it's so repeatable, almost sinusoidal suggests that something else is going on here.

Total votes: 0

I've collected some more data, which I believe is starting to point me in the right direction.

 

As outlined above, I've set the PIOC interrupt handler priority to 0 (the highest NVIC priority), and am now fairly certain that FreeRTOS is not involved in this issue whatsoever. The updated PIOC interrupt handler looks like this:

 

__attribute__ ((section (".ramfunc"))) void PIOC_Interrupt_Handler(uint32_t id, uint32_t mask)
{
    timestamp = ((TC0->TC_CHANNEL[2].TC_CV & 0x3F) << 20)
              | ((TC0->TC_CHANNEL[1].TC_CV & 0x3FF) << 10)
              | (TC0->TC_CHANNEL[0].TC_CV & 0x3FF);

    uint8_t inputState = (PIOC->PIO_PDSR & mask) ? 1 : 0;

    if (inputState) {
        pio_set_pin_high(PIO_PD17);
    } else {
        pio_set_pin_low(PIO_PD17);
    }
}

As you can see from the snippet above, I've configured PD17 (a spare GPIO) as an output pin, and simply match the state of the input pin. I've also moved the interrupt function to RAM.

 

I've captured the below waveforms on my oscilloscope; the blue waveform is the 10Hz input signal, the red waveform is the PD17 output signal.

 

 

Nothing too exciting there, the two waveforms present as we'd expect. What is very interesting, is the waveform analysis for each of these two waveforms, as shown below.

 

Input signal:

 

Output signal:

 

As the above waveform analyses illustrate, the output signal and input signal are extremely closely aligned, with no noticeable jitter whatsoever. This confirms that the jitter I'm observing in the timestamp is NOT related to PIOC interrupt latency, or concurrent processing delays, or FreeRTOS, etc.; the issue is with the TC values themselves. The SAM E70/S70/V70/V71 datasheet states the following in Section 50.6.2 16-bit Counter:

 

The current value of the counter is accessible in real time by reading the Counter Value register (TC_CV).

So, we've confirmed that the interrupt is being handled with sub-microsecond latency as we'd expect, and we haven't violated the access of the TC value register, so why is there so much consistent, dependable jitter in the TC values?

Last Edited: Wed. Jan 5, 2022 - 11:51 PM
This reply has been marked as the solution. 
Total votes: 0

I've fixed it. It turns out I was looking in the wrong area entirely.

 

I was going through my code line by line and putting together a clean ASF project for the SAME70 XPLD board, to see if the problem persisted on a clean, purpose-built solution without FreeRTOS. In doing so, I noticed that my digital input pins were configured with the internal debounce filter option enabled. I wondered if perhaps that was causing issues with the reading of the TC values (I'm not sure why it would, given that the 'matched GPIO output signal' procedure outlined above worked perfectly without any jitter). I disabled internal debouncing, and now the TC values (and corresponding calculated frequencies) are perfect.
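A rough sketch of why the debounce filter is at least a plausible culprit, though nothing in this thread confirms the mechanism: the PIO debouncing filter samples the pin on a divided slow clock, whose period the datasheet gives (in the PIO_SCDR description) as 2 x (DIV + 1) x the slow clock period. The 32.768kHz slow clock and DIV = 0 used below are assumptions, since the actual settings aren't stated, and the function name is made up for this example.

```c
#include <stdint.h>

/* Period of the divided slow clock used by the PIO debouncing filter, in
   microseconds: 2 * (DIV + 1) slow clock periods. */
double debounce_clock_period_us(double slow_clock_hz, uint32_t div)
{
    return 2.0 * ((double)div + 1.0) * 1e6 / slow_clock_hz;
}
```

Under those assumptions debounce_clock_period_us(32768.0, 0) comes out at about 61us, which is of the same order as the ~63us steps reported earlier in the thread. If edge detection is quantised to that clock, steps of that size in the recorded timestamps would not be surprising.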

 

I note that there is nothing in the SAM x70/71 errata document pertaining to the TC, but the datasheet provides the following possible explanation regarding additional latencies:

 

Hopefully this thread helps someone in the future!

Total votes: 0

Does the SAME70 implement events as the SAMD21 does? If so, you can use an externally triggered event to capture the value of a counter running at 1MHz or more. Configure it so that the event produces an interrupt, whose handler reads the captured value. The error is at most one or two CPU clock cycles and is unaffected by interrupt response times.

I've used this with a SAMD21 to measure the quartz frequency using the PPS output of a GPS receiver; even cheap receivers have an error of only 10-50 nanoseconds.

Best wishes, Jerry

Last Edited: Sun. Jan 9, 2022 - 08:54 AM
Total votes: 0

I was working on a similar project with SAMD51, and ran into a different issue with sub-us timings. It's more than an order of magnitude smaller than your errors, but could be relevant depending on how important us precision is.

 

Given that you are using a /150 prescaler to get a 1 MHz clock, it seems that you are using a DPLL to generate a 150 MHz clock. Certainly on the SAMD51, the PLL is very jittery. When using a 32 kHz REFCLK, the output of the PLL can jitter by 2-3 us (i.e. around 400 output clock cycles). When using a faster REFCLK, e.g. 2.5 MHz, the jitter is reduced, but can still be up to about 300-400 ns (about 20-30 output clock cycles).

 

In order to get reliable sub-us timing, I had to clock the TC directly from a 10 MHz clock source (no PLL). There is still around 1-2 ns of jitter, which I presume is due to ground bounce/other noise on the MCU's internal clock distribution networks.

Last Edited: Mon. Jan 17, 2022 - 04:07 PM