Out of sequence time stamp....?


I'm working on a simple logic probe project using the ATmega168 and for the most part it works, except that every now and then I get a time stamp that is "earlier" than the previous one. I have run various tests and confirmed that the sample data is being sent in order, so I'm down to assuming that the sample data is either being corrupted or being captured incorrectly. With this in mind I have combed the code for possible causes, and I'm now at a loss to explain it.

The three byte time stamp is generated by the 16 bit counter and an eight bit overflow count.
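
As a rough illustration of how such a three-byte stamp can be put together and decoded again on the PC side, here is a minimal host-side sketch. The function names and the most-significant-byte-first order are assumptions made for the example; the attached code may well use a different layout.

```c
#include <stdint.h>

/* Hypothetical helper: combine the 8-bit overflow count and the 16-bit
 * counter value into three bytes. MSB-first byte order is an assumption,
 * not something taken from the original code. */
static void pack_stamp(uint8_t ovf_count, uint16_t tcnt, uint8_t out[3])
{
    out[0] = ovf_count;
    out[1] = (uint8_t)(tcnt >> 8);
    out[2] = (uint8_t)(tcnt & 0xFF);
}

/* Hypothetical PC-side decoder: rebuild the 24-bit tick count. */
static uint32_t unpack_stamp(const uint8_t in[3])
{
    return ((uint32_t)in[0] << 16) | ((uint32_t)in[1] << 8) | in[2];
}
```

With an overflow count of 0x01 and a counter value of 0xFA07 this decodes to 129543, which is one of the values quoted further down the thread.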

As I've already mentioned 99.9% of samples have the correct time stamp but every now and then (I can see no pattern to it) one of the time stamps is earlier than the previous couple of samples. Note that this is not the counter rolling over.... here's the code:

Thanks,

Pete


I didn't spend much time looking at the code, but at first glance the timex handling looks suspicious. What happens if OVF occurs just as you are entering the PCINT1 interrupt, i.e. before you read TCNT1?

Eugene


I think that should be ok. The problem I'm having is that every so often, seemingly with no associated pattern, the output time stamps received over the serial interface are out of sequence, for example:

127602
129543
123054 <<<<< Out of sequence
132301

I don't see how the OVF interrupt firing while the PCINT1 interrupt is being handled could cause this?


Quote:
I don't see how the OVF interrupt firing while the PCINT1 interrupt is being handled could cause this?

Well, I did not look past the interrupt handlers. I guess you are right: your out-of-sequence number does not look like one taken at the TCNT1 wrap-around. In that case it would have been a multiple of 0x10000 plus a little bit, but your number is nothing like that.

Oh, well. I don't know then.

BTW, how do all those register variables even work? You are using them in a context that requires volatile, but avr-gcc's register variables are not volatile. I am surprised that the compiler did not optimize out most of your program.

Eugene


I think I tried setting the register vars to volatile out of desperation and frustration in an attempt to resolve this issue!

If the code looks OK, could it be the hardware between the AVR and the receiving code? The AVR UART Tx and Rx feed into an FTDI FT232 (serial to USB), which is handled by a Linux USB-to-serial driver. I'm not using handshaking, but the out of sequence error can occur at relatively slow rates, so I doubt that the data rate is causing it.


Can you please show us more examples of the error?

JW


What if the PCINT comes slightly before the timer overflows, and the timer wraps around in the time between vectoring, entering the ISR and saving the context? The value read will be wrong, as timex will not have been updated yet.

Looking at the numbers in hex...
01 FA07
01 E0AE
02 04CD

... it might well be that this is what actually happens.

Also, I'd use the input capture feature of the 16 bit timer for this application. As it stands, the timestamping accuracy depends on interrupt latency, which can vary considerably. With ICP the uncertainty is only ~2 system clock cycles, due to the synchronizer on the input pin. You would still need to correct for an overflow flag that has not been handled yet. Jack Ganssle wrote an article about this:

http://www.ganssle.com/articles/asynchf.htm


Yes I can, although I won't be able to post them until this evening.

I can say that I have spent some time checking the erroneous time stamps, and the only pattern I could see was that the out of sequence stamp was always two slots away from where it should have been. For example, in the following trace the out of sequence time stamp should have been between the 100 and the 150:

100
150
200
125 << Out of sequence
250
300

This aspect is consistent.

To verify that the samples were being sent in order I replaced the pin sample data being sent with the time stamp with the data sample array index. The received indexes were always in sequence and the out of sequence time stamps occurred randomly relative to the index.


You could check in the PCINT ISR whether the new stamp is higher than the previous one (toggle an output when it is not, or something similar, and watch with a scope). That allows you to nail down where the discrepancy occurs.

I'm fairly sure it's not on the output side.

edit: you could also toggle a pin in the overflow ISR and see if the two pulses correlate somehow.
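
One way such an in-ISR check could be written is sketched below as a plain function so it can be tried on a PC first. The name stamp_went_backwards and the 24-bit mask are assumptions for illustration; the idea is only that the comparison is done modulo 2^24, so a genuine roll-over of the three-byte stamp is not flagged.

```c
#include <stdbool.h>
#include <stdint.h>

#define STAMP_MASK 0xFFFFFFUL  /* stamps are 24 bits wide */

/* Hypothetical check: true if cur is "earlier" than prev. The difference
 * is taken modulo 2^24, so a real roll-over of the 24-bit stamp is not
 * reported. Assumes successive samples are less than 2^23 ticks apart. */
static bool stamp_went_backwards(uint32_t prev, uint32_t cur)
{
    uint32_t diff = (cur - prev) & STAMP_MASK;
    return diff >= 0x800000UL;
}
```

In the PCINT handler, a spare output pin could be toggled whenever this returns true, which gives the scope something to trigger on.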


Jayjay, good call, I'll try that as soon as I get back and post the results.


A race condition between writing and reading data[]. Reading is not properly guarded.

Stealing Proteus doesn't make you an engineer.


ArnoldB wrote:
A race condition between writing and reading data[]. Reading is not properly guarded.
This is true, but the observed result is quite unlikely to be caused by it.

We still don't know what the "output sequence" means. It may be a PC printout after scaling, and then the "jump" might quite well be the 0x10000 predicted by ezharkov, as a result of OVF occurring while the external interrupt is being serviced (which is indeed one of the many shortcomings of the presented piece).

JW


ArnoldB, please explain further.

Read and write use separate indexes and an overflow should be detected. If a PCINT1 occurs while data[] is being read it's not clear to me how that could affect the data being read via the read index? I'm confident that this is not related to overflows as I've seen it happen using a 1/2Hz signal.


It's a little bit tricky to extend a timer in software.
You must check whether an overflow occurred close to the read.
Here is some working code:

http://www.mikrocontroller.net/a...

Peter


Here are some real samples of the output captured on the PC:

time:666016
time:696140
time:726241
time:756350
time:720922 <<< Out of sequence
time:816561
time:846682
time:876789
time:906885
......
time:8464820
time:8494954
time:8525075
time:8555161
time:8519704 <<< Out of sequence
time:8615333
time:8645428
time:8675533
time:8705622
time:8735742
time:8765884
.........
time:13319890
time:13349990
time:13380096
time:13410168
time:13440222
time:13470323
time:13434886 <<< Out of sequence
time:13530523
time:13560606
time:13590707
time:13620809
time:13650925

After looking at these it does indeed look like the 16 bit counter has overflowed while in the PCINT1 handler. I will make some changes based on your suggestions and post the results.
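
The arithmetic behind that conclusion can be checked with a small host-side sketch. The helper name below is made up for this example; it simply tests whether adding one 16-bit counter period (0x10000) to a backwards stamp puts it back between its neighbours.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical analysis helper: if stamps[i] is lower than its predecessor
 * but adding one 16-bit counter period (0x10000) would place it strictly
 * between its neighbours, treat it as a missed overflow and repair it.
 * Returns the number of stamps repaired. */
static size_t repair_missed_overflows(uint32_t *stamps, size_t n)
{
    size_t repaired = 0;
    for (size_t i = 1; i + 1 < n; ++i) {
        uint32_t fixed = stamps[i] + 0x10000UL;
        if (stamps[i] < stamps[i - 1] &&
            fixed > stamps[i - 1] && fixed < stamps[i + 1]) {
            stamps[i] = fixed;
            ++repaired;
        }
    }
    return repaired;
}
```

Applied to the first trace, 720922 becomes 786458, which is 30108 ticks after 756350 and 30103 ticks before 816561, in line with the roughly 30100-tick spacing of the other samples.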


Thanks for the advice. Adding the following prior to saving the timex value solved the problem.

      //
      // Catch timer overflow that may have occurred
      // during sample
      //
      tifr1 = TIFR1;
      if (tifr1 & _BV(TOV1))
          timex++;

Again many thanks!

Pete


pspreadb wrote:

      tifr1 = TIFR1;
      if (tifr1 & _BV(TOV1))
          timex++;


I do not think this completely solves the problem.

Firstly, timex will get incremented twice: here in PCINT1, and again in OVF (the OVF interrupt will fire as soon as you leave PCINT1). I would not touch timex in PCINT1 and would leave it to OVF. Instead, in PCINT1, I would define a local variable "Uint8 timexLocal = timex" and use it in the rest of the PCINT1 code.

Secondly, if you look at danni's example, he has "&& !(val & 0x80)" there, which takes care of the case where the timer overflows after TCNT is read but before TIFR is read. You need something similar, probably something like "&& !(timeh & 0x80)".
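
Put together, the two points might look roughly like the sketch below. It is written as a pure function so it can be exercised on a PC; capture_stamp and its parameter names are invented for this example, and on the AVR the three inputs would come from TCNT1, TIFR1 and timex, read in that order inside the PCINT1 handler.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pure function modelling the capture step. On the AVR the
 * inputs would be sampled in this order inside the PCINT1 handler:
 *   tcnt        = TCNT1
 *   tov_pending = (TIFR1 & _BV(TOV1)) != 0
 *   ovf_count   = timex (only ever modified by the overflow ISR)
 * Assumes the handler runs with interrupts disabled and that interrupt
 * latency is well below half a timer period (0x8000 ticks). */
static uint32_t capture_stamp(uint16_t tcnt, bool tov_pending, uint8_t ovf_count)
{
    uint8_t local_ovf = ovf_count;   /* local copy; timex itself is untouched */

    /* A pending overflow only applies to this reading if the counter has
     * already wrapped, i.e. its top bit is clear. If the top bit is set,
     * the wrap happened after TCNT1 was read and must not be counted. */
    if (tov_pending && !(tcnt & 0x8000u)) {
        local_ovf++;
    }
    return ((uint32_t)local_ovf << 16) | tcnt;
}
```

Because only the local copy is adjusted, the overflow ISR still performs the one real increment of timex when it runs after PCINT1 returns.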

Eugene Zharkov


Thank you Eugene, the changes you suggested helped.

Pete


When I want to extend an AVR timer, I keep the old value around.
If the new value is less than the old value, an overflow occurred.
The new value becomes the old value.
If necessary, ISRs and main can use separate timer extension routines and data.
Neither timer interrupts nor disabling interrupts is necessary.

Iluvatar is the better part of Valar.


skeeve wrote:
When I want to extend an AVR timer, I keep the old value around.
If the new value is less than the old value, an overflow occurred.

Yes.
But how do you get the right time stamp in this case?

I check only two bits, the interrupt flag and the MSB of the timer, and then I add the timer overflow value.

I have never seen an easier or more efficient way than this simple two bit check.

Peter


danni wrote:
skeeve wrote:
When I want to extend an AVR timer, I keep the old value around.
If the new value is less than the old value, an overflow occurred.

Yes.
But how do you get the right time stamp in this case?

I check only two bits, the interrupt flag and the MSB of the timer, and then I add the timer overflow value.

I have never seen an easier or more efficient way than this simple two bit check.

Peter


uint32_t time1_main(void)
{
    // time returned is the time as of the read of TCNT1L
    static uint16_t overflows, old_count;
    // needed only if interrupt might use high byte buffer
    uint8_t sreg = SREG;
    cli();
    uint16_t new_count = TCNT1;
    SREG = sreg;
    if (new_count < old_count) ++overflows;
    uint32_t retval = (((uint32_t)overflows) << 16) + new_count;
    old_count = new_count;
    return retval;
} // time1_main

If no ISR accesses TCNT1 or ICR1,
you won't need to worry about the atomicity of reading TCNT1.

extern uint16_t overflows_isr, old_count_isr;

static inline uint32_t time1_isr(void)
{
    // time returned is the time as of the read of TCNT1L
    uint16_t new_count = TCNT1;
    if (new_count < old_count_isr) ++overflows_isr;
    uint32_t retval = (((uint32_t)overflows_isr) << 16) + new_count;
    old_count_isr = new_count;
    return retval;
} // time1_isr

Iluvatar is the better part of Valar.


Contrary to my prior belief, overflows_isr and old_count_isr do not need to be extern.
They can be static inside their function.
They do not need the _isr suffix.

In case it wasn't obvious, time1_main should never be called,
directly or otherwise, from an interrupt handler and
time1_isr should only be called with interrupts off.

Iluvatar is the better part of Valar.