Capturing and decoding a fast (but short) Manchester signal


Gents,

First, I'm not 100% sure where this thread needs to be, so admins feel free to move it if I screwed this up!

Ok. I am TRYING to decode data from a Kawasaki Industrial M21 absolute serial encoder from an AC servo motor. Of course, the protocol is proprietary. The goal is to be able to interrogate these encoders with a 20MHz atmega of some kind. Here is what I DO know about the protocol from various sources:

      1 Mbaud data rate
      48 bits sent in total
      RS485 link (dealt with)
      Half duplex
      Data is polled from slave encoder
      13 bit M-pattern data, 8192 counts per revolution absolute position
      15 bits signed for shaft revolution portion, +/-32,768 revolutions max
      2 bit magnetic incremental encoder for armature phasing
      3 bits of alarm data

With it hooked up to a PC via RS485 I have been able to get it to send out a pulse stream to me that I have captured via logic analyzer. The stream "looks" like this:

From where the flags are located, that is 48us between the start and stop positions, so that does match the timing specs I have found. If I assume that there is a 2us "mark" period at the start (AND that it is counted as 2 bits of "data"), that will give me the 48 bits of data IF the data is Manchester encoded (which it does appear to be). If you add up the list of bits sent above, there are only 33 bits of actual data, which leaves 15 bits of pad/sync/other stuff.

I've done quite a bit of research into Manchester coding and the basics are not that difficult. The problem I am having is dealing with the speed of this signal with a 20MHz atmega. As I am only trying to figure out this protocol at this point, I don't have any issues in capturing the stream and then decoding it in post processing (either on atmega or on a PC).

Where I have hit a bit of a road block is how can I capture a stream of this speed and buffer everything with only 10 clocks between each ts/2 period? The USART is out as it cannot accept a 48 bit frame of any kind (not even the AT90PWM3 which CAN decode Manchester). Does this need to be bit banged? Can a different peripheral do the job (such as the SPI)? Anyone have any experiences capturing bit streams such as this?
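For reference, the 10-clock figure comes from dividing the CPU clock by the half-bit rate. A minimal sketch of that arithmetic, assuming a 20 MHz clock and a 1 Mbaud Manchester stream (the helper name `cycles_per_half_bit` is made up for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Clock cycles available per Manchester half-bit period (ts/2).
 * Each Manchester bit is sent as two half-bit symbols, so the line
 * can toggle at up to 2 * baud. */
uint32_t cycles_per_half_bit(uint32_t f_cpu_hz, uint32_t baud)
{
    assert(baud != 0);
    return f_cpu_hz / (2u * baud);
}
```

With `f_cpu_hz = 20000000` and `baud = 1000000` this gives 10 cycles, so whatever handles each half-bit has to fit in that.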

At this point, I'm not really sure how to attack it. My gut says to throw a faster MCU at it (high clock rates) such as an xMega or perhaps a PIC32 or ARM. But I suspect there has to be a way to do this with an atmega as well and as I have a bit more experience programming those guys, that is what I would like to stick with if possible. So any help would be greatly appreciated!

 

Clint


If you had a CPLD to recover the clock, you could shift the data stream in via SPI. A BeagleBone Black with its PRU will do it easily.


When sampling, you'll be doing this asynchronously, so you really want to sample at several times the signalling rate. From the looks of your trace, you're looking at about 2MHz signalling rate. You won't be able to bit-bang quickly enough.

The SPI might work - the SPI data register will shift in 8 samples at a time. You simply need to read the data register and store it somewhere in memory for each 8 samples.

One more possible approach - use the input capture of one of the timers.

- S


Quote:
with a 20MHz atmega of some kind.

If you want to waste a lot of time then use some ATMega that has USART capable of working in Master SPI mode. Then you can clock it at F_CPU/2 (10Msps at F_CPU==20MHz).
You can also use a regular tiny in polling mode (20Msps at F_CPU==20MHz) but for triggering you need some external delay line ($0.1 shifter will do).

Otherwise - if it is Manchester then buy a chip that understands Manchester. Or a CPLD.

No RSTDISBL, no fun!


That beaglebone black is interesting. It definitely should be fast enough to do what I need and the price is not bad.

 

Clint


If it is connected to PC via RS485, is it in any way possible that the serial transmission is UART compatible? If you invert what you see on the screen, does it look like UART data? You say something about sync bytes, does it look like there are always start and stop bits between some amount of bits? UART frames can have 5 to 9 bits of data plus optional parity, between start and stop bits.

If just a raw capture for later use is needed, SPI at some high clock rate could work. You could even use a timer to generate the SPI clock in software, if you connect an OC pin to the SPI clock input pin.

Other possibility is to measure time between changes with a timer. Use input capture feature, pin change interrupt or just normal external interrupt pin on both edges. But you have to be pretty fast decoding each bit in 20 clock cycles.

Perhaps you could dedicate the AVR (another AVR?) to just sit in a loop counting the time between edges and decoding the bits, then transmitting them somewhere else over some other interface.


[edit] Important edits below. [/edit]

RRRoamer wrote:
Where I have hit a bit of a road block is how can I capture a stream of this speed and buffer everything with only 10 clocks between each ts/2 period?
Use a mega with ICP, like the 328P.

Connect your signal to ICP1.

Something like this is a good start:

#include <avr/io.h>

#define CAPTURE_EDGE_PAIRS 64

int __attribute__ ((__OS_main__)) main(void) {

  uint8_t capture_buf[CAPTURE_EDGE_PAIRS*2];
  uint8_t capture_idx = 0;
  uint8_t edge;
  uint8_t i;

  // Mode 0, no prescaler (50 ns resolution @ 20 MHz),
  // ICF1 set on rising edge
  TCCR1A = 0;
  TCCR1B = (1<<ICES1) | (1<<CS10);

  // Clear pending flag
  TIFR1 = (1<<ICF1);

  // Loop and capture
  while (capture_idx < (sizeof(capture_buf)/sizeof(*capture_buf))) {
    // Wait for rising edge
    while (!(TIFR1 & (1<<ICF1)));
    // Latch
    edge = ICR1L;
    // Clear flag
    TIFR1 = (1<<ICF1);
    // Store timestamp
    capture_buf[capture_idx++] = edge;
    // Configure for falling edge
    TCCR1B = (1<<CS10);
    // Wait for falling edge
    while (!(TIFR1 & (1<<ICF1)));
    // Latch
    edge = ICR1L;
    // Clear flag
    TIFR1 = (1<<ICF1);
    // Store timestamp
    capture_buf[capture_idx++] = edge;
    // Configure for rising edge
    TCCR1B = (1<<ICES1) | (1<<CS10);
  }

  // Prevent elimination of capture_buf[] by optimiser, so we can examine
  // .lss file (or output of avr-objdump -S)
  for (i=0; i<(sizeof(capture_buf)/sizeof(*capture_buf)); i++) {
    GPIOR1 = capture_buf[i];
  }

  while(1);
  
}

However, no matter what optimisation options I threw at it (mind you, I'm no expert), the best I could get was minimum 12 cycles for the 1st half of loop (post-falling-edge code), and 15 cycles for the 2nd half of loop (post-rising-edge code);

Using the compiler's best as a template, I whittled it down:

#include <avr/io.h>

#define CAPTURE_EDGE_PAIRS 64

int __attribute__ ((__OS_main__)) main(void) {

  uint8_t capture_buf[CAPTURE_EDGE_PAIRS*2];
  uint8_t capture_edge_pairs_remaining;
  uint8_t edge;
  uint8_t i;

  // Prepare to fill buffer
  capture_edge_pairs_remaining = CAPTURE_EDGE_PAIRS;

  // Mode 0, no prescaler (50 ns resolution @ 20 MHz),
  // ICF1 set on rising edge
  TCCR1A = 0;
  TCCR1B = (1<<ICES1) | (1<<CS10);
  TIFR1 = (1<<ICF1);

  // Loop
  do {

    __asm__ __volatile__ (
    
    /*               */ "wait_rise%=:                              \n\t" // 
    /* Wait for rise   */ "sbis   %[tifr1],          %[icf1]       \n\t" // 1/2
    /*                 */ "rjmp   wait_rise%=                      \n\t" // 2
    /* Latch ICR1L     */ "lds    %[edge],           %[icr1l]      \n\t" // 2
    /* Clear ICF1      */ "out    %[tifr1],          %[clricf1]    \n\t" // 1
    /* Store           */ "st     %a[ptr]+,%[edge]                 \n\t" // 2
    /* Config for fall */ "sts    %[tccr1b],         %[falling]    \n\t" // 2
    // = 9 cycles minimum
                                                                           
    /*               */ "wait_fall%=:                              \n\t" // 
    /* Wait for fall   */ "sbis   %[tifr1],          %[icf1]       \n\t" // 1/2
    /*                 */ "rjmp   wait_fall%=                      \n\t" // 2
    /* Latch ICR1L     */ "lds    %[edge],           %[icr1l]      \n\t" // 2
    /* Clear ICF1      */ "out    %[tifr1],          %[clricf1]    \n\t" // 1
    /* Store           */ "st     %a[ptr]+,%[edge]                 \n\t" // 2
    /* Config for rise */ "sts    %[tccr1b],         %[rising]     \n\t" // 2
    // = 9 cycles minimum

                        :
    /* Working reg     */ [edge]    "=d" (edge)

                        :
    /* TIFR1           */ [tifr1]    "I" (_SFR_IO_ADDR(TIFR1)),
    /* ICF1            */ [icf1]     "I" (ICF1),
    /* Clear ICF1      */ [clricf1]  "r" ((uint8_t)(1<<ICF1)),
    /* ICR1L           */ [icr1l]    "M" (_SFR_MEM_ADDR(ICR1L)),
    /* TCCR1B          */ [tccr1b]   "M" (_SFR_MEM_ADDR(TCCR1B)),
    /* capture_buf     */ [ptr]      "x" (capture_buf),
    /* Falling         */ [falling]  "r" ((uint8_t)(1<<CS10)),
    /* Rising          */ [rising]   "r" ((uint8_t)((1<<ICES1) | (1<<CS10)))

                         );

  } while (--capture_edge_pairs_remaining);
  
  // Prevent elimination of capture_buf[] by optimiser, so we can examine
  // .lss file (or output of avr-objdump -S)
  for (i=0; i<(sizeof(capture_buf)/sizeof(*capture_buf)); i++) {
    GPIOR1 = capture_buf[i];
  }

  while(1);
  
}

This gets us down to 9 cycles for each half of the loop, but the loop mechanism adds 3 cycles to the 2nd half, for a total of 21 cycles for every 2 edges. That's just 1 cycle too slow. However, due to the nature of the capture mechanism, this code would work just fine if you could guarantee no more than 8 or 9 short (500 ns) pulses in a row. It's unlikely you'll be able to ensure that, so we need better.

[edit] There's actually a flaw in this code anyway. The [ptr] operand should be an input/output operand. It compiled correctly anyway, but that was dumb luck. The operand is fixed in the code below [/edit]

Compiling with -funroll-loops unrolls the loop 16-fold, giving each unrolled iteration a total of 18 cycles minimum, 9 cycles per pulse (low or high), 1 less than our budget, just what we needed. Every 16 iterations (32 edges), the loop mechanism adds 3 cycles. This is fine, as it results in an average of 9.09375 cycles per pulse. Again, because of the nature of the capture mechanism, this is tolerable.

I tried compiling with max-unroll-times, max-unrolled-insns, and max-average-unrolled-insns, but the result was always only 16-fold unrolling. I don't know why, but it's not important as 16 is enough for your purposes.

Be aware that unrolling by 16 adds about 512 bytes to the loop code.

Note also that since we're latching only the lower 8 bits of ICR1, that we can only measure pulse-widths from 10 cycles to 264 cycles, or 500 ns to 13.2 µs. Yes, I said 264. That's because a capture value of 0 through 8 implies a roll-over of ICR1L, since our code has a minimum execution time of 9 cycles. You could account for this in post-processing. It might simply be safer to limit your max width to 255 cycles. Sounds like the widest you expect to see is 20 cycles anyway.
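A rough sketch of that post-processing step, run on the PC or on the AVR after the capture. It assumes the buffer holds successive ICR1L values and that no real pulse is shorter than the 9-cycle loop minimum; the function name and the `MIN_LOOP_CYCLES` constant are mine, not from any library:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MIN_LOOP_CYCLES 9u  /* shortest interval the capture loop can record */

/* Turn n low-byte timestamps into n-1 pulse widths (in timer ticks).
 * The uint8_t subtraction wraps modulo 256, which absorbs one roll-over of
 * ICR1L. A result below MIN_LOOP_CYCLES cannot be a real width, so it is
 * taken to mean 256 + result. Returns the number of widths written. */
size_t timestamps_to_widths(const uint8_t *stamps, size_t n, uint16_t *widths)
{
    assert(stamps != NULL && widths != NULL);
    if (n < 2) {
        return 0;
    }
    for (size_t i = 1; i < n; i++) {
        uint8_t delta = (uint8_t)(stamps[i] - stamps[i - 1]);
        widths[i - 1] = (delta < MIN_LOOP_CYCLES) ? (uint16_t)(256u + delta)
                                                  : delta;
    }
    return n - 1;
}
```

For example, timestamps 250 and 4 yield a width of 10 ticks even though the low byte rolled over in between. Anything wider than 264 ticks still aliases and can't be recovered this way.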

There are other limitations. Since capture_edge_pairs_remaining is 8-bits, we can only capture at most 255 pairs, or 510 single edges. This is enough to capture 255 bits minimum, which seems to fit your requirement.

If you really need to be able to capture more than 255 bits, you could declare capture_edge_pairs_remaining as uint16_t. This will add only 1 cycle to the loop mechanism, bringing our average capture time to 9.125 cycles, still acceptable. Note that as you change CAPTURE_EDGE_PAIRS, the loop unrolling will change. Make sure to study the .lss file to confirm the timing constraints are still met. Powers of two are good choices for CAPTURE_EDGE_PAIRS, as the loop unrolling won't involve a remainder.
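The averages quoted above are easy to re-check when you change the unroll factor or the counter width. A small sketch of the arithmetic (the function name is hypothetical, and the cycle counts you feed it should come from your own .lss file):

```c
#include <assert.h>

/* Average cycles per captured edge for an unrolled capture loop.
 * unroll        : number of rise/fall pairs per pass through the loop
 * pair_cycles   : minimum cycles for one rise/fall pair
 * loop_overhead : cycles added once per pass by the loop mechanism */
double avg_cycles_per_edge(unsigned unroll, unsigned pair_cycles,
                           unsigned loop_overhead)
{
    assert(unroll > 0);
    return (double)(unroll * pair_cycles + loop_overhead) / (2.0 * unroll);
}
```

With 16-fold unrolling, 18 cycles per pair and 3 cycles of loop overhead this gives 9.09375; with 4 cycles of overhead (the uint16_t counter) it gives 9.125. Without unrolling it gives 10.5, which is the 21 cycles per 2 edges that was too slow.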

But what if the bitstream contains fewer edges than our loop is configured to capture? In fact, that will almost always be the case. Since we want to be prepared to capture as many edges as we might see, it stands to reason that many if not most messages will contain fewer.

One way might be to use the WDT with WDIE to implement a timeout. However, if our capture code has to monitor WDIF to keep from getting caught in an endless loop, it will push us over our 10-cycle budget.

However, this:

#include <avr/io.h>
#include <avr/interrupt.h>
#include <avr/wdt.h>
#include <avr/pgmspace.h>
#include <stdio.h>

#define CAPTURE_EDGE_PAIRS 64

volatile uint8_t *capture_buf_address;
volatile uint8_t captured_edges;

// Main
int __attribute__ ((__OS_main__)) main(void) {

  uint8_t capture_buf[CAPTURE_EDGE_PAIRS*2];
  uint8_t *capture_buf_ptr;
  uint8_t capture_edge_pairs_remaining;
  uint8_t captured_edges_local;
  uint8_t i;

  // Global interrupts
  sei();

  // Debug serial (call whatever is appropriate for your debugging,
  // but make sure you don't use any other interrupts)
  serial_configure();
  printf_P(PSTR("\033cRESET\r\n"));

  // Communicate capture buffer address to WDT ISR
  capture_buf_address = capture_buf;

  // Prepare to fill buffer
  capture_buf_ptr = capture_buf;
  capture_edge_pairs_remaining = CAPTURE_EDGE_PAIRS;

  // Preset captured edges count for a full buffer, WDT ISR will change
  // if captures fall short.
  captured_edges = CAPTURE_EDGE_PAIRS * 2;

  // Configure TIMER 0 for a 50% duty PWM, for testing.
  // Don't forget to connect OC0B/PD5 and ICP1/PB0.
  DDRD |= (1<<PD5);
  TCCR0A = (1<<COM0B1) | (1<<WGM01) | (1<<WGM00);
  TCCR0B = (1<<WGM02) | (1<<CS00);
  OCR0A = 23; // 24 cycles period
  OCR0B = 11; // 12 cycles half-period

  // Mode 0, no prescaler (50 ns resolution @ 20 MHz),
  // ICF1 set on rising edge
  TCCR1A = 0;
  TCCR1B = (1<<ICES1) | (1<<CS10);

  // Wait for line to become idle LOW
  while (PINB & (1<<PB0));

  // Clear capture flag
  TIFR1 = (1<<ICF1);

  // Wait for rising edge denoting mark before bitstream.
  // Note that first high pulse should be longer than normal pulse width
  // to accommodate WDT configuration and loop setup code.
  while (!(PINB & (1<<PB0)));

  // Configure WDT for timeout, 15 ms
  wdt_reset();
  WDTCSR = (1<<WDCE) | (1<<WDE);
  WDTCSR = (1<<WDIF) | (1<<WDIE);

  // Loop
  do {

    __asm__ __volatile__ (

    /*               */ "wait_rise%=:                              \n\t" //
    /* Wait for rise   */ "sbis   %[tifr1],          %[icf1]       \n\t" // 1/2
    /*                 */ "rjmp   wait_rise%=                      \n\t" // 2
    /* Clear ICF1      */ "out    %[tifr1],          %[clricf1]    \n\t" // 1
    /* Latch ICR1L     */ "lds    __tmp_reg__,       %[icr1l]      \n\t" // 2
    /* Store           */ "st     %a[ptr]+,__tmp_reg__             \n\t" // 2
    /* Config for fall */ "sts    %[tccr1b],         %[falling]    \n\t" // 2
    // = 9 cycles minimum

    /*               */ "wait_fall%=:                              \n\t" //
    /* Wait for fall   */ "sbis   %[tifr1],          %[icf1]       \n\t" // 1/2
    /*                 */ "rjmp   wait_fall%=                      \n\t" // 2
    /* Clear ICF1      */ "out    %[tifr1],          %[clricf1]    \n\t" // 1
    /* Latch ICR1L     */ "lds    __tmp_reg__,       %[icr1l]      \n\t" // 2
    /* Store           */ "st     %a[ptr]+,__tmp_reg__             \n\t" // 2
    /* Config for rise */ "sts    %[tccr1b],         %[rising]     \n\t" // 2
    // = 9 cycles minimum

                        :
    /* capture_buf > X */ [ptr]     "+&x" (capture_buf_ptr)

                        :
    /* TIFR1           */ [tifr1]    "I" (_SFR_IO_ADDR(TIFR1)),
    /* ICF1            */ [icf1]     "I" (ICF1),
    /* Clear ICF1      */ [clricf1]  "r" ((uint8_t)(1<<ICF1)),
    /* ICR1L           */ [icr1l]    "M" (_SFR_MEM_ADDR(ICR1L)),
    /* TCCR1B          */ [tccr1b]   "M" (_SFR_MEM_ADDR(TCCR1B)),
    /* Falling         */ [falling]  "r" ((uint8_t)(1<<CS10)),
    /* Rising          */ [rising]   "r" ((uint8_t)((1<<ICES1) | (1<<CS10)))

                         );

  } while (--capture_edge_pairs_remaining);

  // Disable timeout interrupts
  cli();
  wdt_disable();
  TIMSK1 = 0;

  // Debug output of captures
  captured_edges_local = captured_edges;
  printf_P(PSTR("captured_edges = %03u\r\n"), captured_edges);
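  // Print each pulse width as the difference between successive timestamps.
  // (Loop bound and format string are assumed; the original line was truncated.)
  for (i=1; i<captured_edges_local; i++) {
    printf_P(PSTR("%03u\r\n"), (unsigned int)(uint8_t)(capture_buf[i] - capture_buf[i-1]));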
  }

  while(1);

}

// Timeout
ISR(WDT_vect) {

  uint8_t *ptr;

  // Save state of pointer register used by capture code
  __asm__ __volatile__ (
  /* Save X          */ "movw   %[ptr],            r26           \n\t" // 1
                      :
  /* Working reg     */ [ptr]     "=d" (ptr)
                      :
                       );

  // Store the number of edges that were actually captured
  captured_edges = (ptr - capture_buf_address) / sizeof(*capture_buf_address);

  // Reconfigure timer for fast interrupt
  TCCR1A = 0;
  TCCR1B = (1<<WGM12) | (1<<CS10);
  OCR1A = 0x01;
  TIMSK1 = (1<<OCIE1A);

  // Configure analog comparator to trigger ICF1, connect AIN0 to Vbg
  ACSR = (1<<ACBG) | (1<<ACIC);
  DDRD |= (1<<PD7) | (1<<PD6);

  // Disable WDT interrupt
  wdt_disable();

}


// Forces ICF1 to set
ISR(TIMER1_COMPA_vect, ISR_NAKED) {

  // Toggle both AIN0 and AIN1 back and forth, which will set ICF1
  __asm__ __volatile__ (
  /* Toggle          */ "sbi    %[pind],           %[ain1]       \n\t" // 2
  /* Toggle          */ "sbi    %[pind],           %[ain1]       \n\t" // 2
  /* RETI            */ "reti                                    \n\t" // 4
                      :
                      :
  /* PIND            */ [pind]     "I" (_SFR_IO_ADDR(PIND)),
  /* AIN1            */ [ain1]     "I" (PD7)

                       );

}

... should do the trick.

Basically, after the WDT timeout, the WDT ISR configures a timer interrupt that repeatedly arranges a means of triggering ICF1 by using the analog comparator, the internal bandgap voltage reference, and AIN1/PD7. If a 15 ms timeout is too long, a separate timer can be used for the timeout.

Note that although one might suspect that simply manipulating ICES1 in the timer ISR would be enough to trigger false edge detections and captures like you can do with some other edge-detection-based flags, the capture mechanism doesn't work that way. Using the analog comparator is a neat workaround.

This is a software solution to the problem of how to break out of the capture loops without needing to monitor WDIF within the loop. A hardware solution might use a high-frequency signal, either generated externally, or with a PWM output from another timer connected to AIN1/PD7. This would eliminate the need for the timer interrupt, and would speed exit from the capture loops after the timeout (no ISR overhead).

In reality we never actually break out of the loops, but we force them all to complete. capture_buf[] is always filled. The number of meaningful captures before the WDT timeout kicked in is saved in the global volatile captured_edges. The remaining captured timestamps are the result of the action of the timer ISR and the analog comparator, and are not meaningful.

You should test this with a 1 MHz signal, to ensure that it can indeed capture 500 ns pulses. Perhaps configure another timer for PWM and connect a PWM output to ICP1.

[edit] Fixed some coding errors and added PWM to test the capture. Seems it doesn't work properly after all. See my next post. [/edit]

You will no doubt need to make significant changes to adapt this to your needs.

A better solution might be to go with an XMEGA device. With its event system and DMA, this should all be a very simple matter.

JJ

///////////////////////////////////////////////////////////////////////////////

"Experience is what enables you to recognise a mistake the second time you make it."

"Good judgement comes from experience.  Experience comes from bad judgement."

"Wisdom is always wont to arrive late, and to be a little approximate on first possession."

"When you hear hoofbeats, think horses, not unicorns."

"Fast.  Cheap.  Good.  Pick two."

"We see a lot of arses on handlebars around here." - [J Ekdahl]

 

Last Edited: Tue. Oct 8, 2013 - 06:20 AM

10/20 clocks should be plenty with a bit of ASM trickery.

Just remember with Manchester you are looking for a state change to adjust your clock.

Don't try to count the time between state changes. Just look for either a H/L or a L/H transition in the middle of the bit period.

You'll be right :)


So I took a look at my previous post again, and added some code to use timer 0 as a PWM test source, with a 10-cycle half period.

It doesn't work :(

Apart from a few coding errors, which I've corrected in my previous post, it appears as though 9 cycles is not short enough to capture a 10-cycle-wide pulse. Damned if I know why. I've taken pencil to paper and drawn a number of charts, they all say it should work. Best I can reliably do is 12 or 13 cycles. Ah well...

You could try it with an AVR that has two 16-bit timers with capture to get things down to 7 cycles, which should be good for a 10 cycle pulse. However, I don't know of any that have analog comparators that can trigger ICFn on more than one timer, so breaking out of the loop would be challenging.

As suggested by @andrewm1973 you could simply sample in the middle of every bit period, every 10 cycles, but the challenge there is knowing where the bit periods begin and end. If the two clocks aren't exactly synchronised, you will get errors.

Oversampling would solve that. You could simply take 256 samples in rapid succession:

#include <avr/io.h>
#include <avr/pgmspace.h>
#include <stdio.h>

// Main
int __attribute__ ((__OS_main__)) main(void) {

  uint8_t capture_buf[256];
  uint16_t i = 0;

  // Debug serial (call whatever is appropriate for your debugging,
  // but make sure you don't use any interrupts)
  serial_configure();
  printf_P(PSTR("\033cRESET\r\n"));
    
  // Configure TIMER 0 for a 50% duty PWM, for testing.
  // Don't forget to connect OC0B/PD5 and PB0.
  DDRD |= (1<<PD5);
  TCCR0A = (1<<COM0B1) | (1<<WGM01) | (1<<WGM00);
  TCCR0B = (1<<WGM02) | (1<<CS00);
  OCR0A = 19; // 20 cycles period
  OCR0B = 9; // 10 cycles half-period
  
  // Wait for line to become idle LOW
  while (PINB & (1<<PB0));

  // Wait for line to begin mark HIGH
  while (!(PINB & (1<<PB0)));

  // Sample as fast as we can
  __asm__ __volatile__ (
  
  /*               */ "sample_loop%=:                            \n\t" // 
  /* Sample PINB     */ "in     __tmp_reg__,       %[pinb]       \n\t" // 1
  /* Store           */ "st     %a[ptr]+,__tmp_reg__             \n\t" // 2
  /* Count samples   */ "dec    __zero_reg__                     \n\t" // 1
  /* Count samples   */ "dec    __zero_reg__                     \n\t" // 1
  /* Sample PINB     */ "in     __tmp_reg__,       %[pinb]       \n\t" // 1
  /* Store           */ "st     %a[ptr]+,__tmp_reg__             \n\t" // 2
  /* Loop            */ "brne   sample_loop%=                    \n\t" // 1/2
  // = 10 cycles per 2 samples (5 cycles per sample)

                      :
  
                      :
  /* PINB            */ [pinb]     "I" (_SFR_IO_ADDR(PINB)),
  /* X, Y, or Z      */ [ptr]      "e" (capture_buf)

                       );

  // Debug output of captures
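  // Print the PB0 bit of each sample.
  // (Everything from the loop condition onward is assumed; the original
  // post was truncated at this point.)
  for (i=0; i<sizeof(capture_buf); i++) {
    printf_P(PSTR("%u\r\n"), (unsigned int)(capture_buf[i] & (1<<PB0)));
  }

  while(1);

}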

This is a far simpler solution than my well-meaning but ill-fated attempt to capture edge timestamps.

A sample is taken exactly every 5 cycles. Should be a simple matter to post-process.
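As a starting point for that post-processing, here is a sketch that collapses the raw PINB samples into runs of equal level. At 5 cycles per sample and 20 MHz each sample is 250 ns, so a 500 ns half-bit should show up as a run of about 2 samples and a 1 µs pulse as about 4. The function name and signature are my own invention, and it runs just as well on a PC as on the AVR:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Collapse raw port samples into runs of identical level.
 * mask selects the pin of interest (e.g. 1 << 0 for PB0).
 * levels[k] is the level (0 or 1) of run k, lengths[k] its length in samples.
 * Stops early if max_runs is reached. Returns the number of runs written. */
size_t samples_to_runs(const uint8_t *samples, size_t n, uint8_t mask,
                       uint8_t *levels, uint16_t *lengths, size_t max_runs)
{
    assert(samples != NULL && levels != NULL && lengths != NULL);
    size_t runs = 0;
    for (size_t i = 0; i < n; i++) {
        uint8_t level = (samples[i] & mask) ? 1 : 0;
        if (runs > 0 && levels[runs - 1] == level) {
            lengths[runs - 1]++;          /* same level: extend current run */
        } else {
            if (runs == max_runs) {
                break;                    /* output arrays are full */
            }
            levels[runs] = level;         /* level changed: start a new run */
            lengths[runs] = 1;
            runs++;
        }
    }
    return runs;
}
```

The first and last runs are usually truncated by the capture window, so treat their lengths with suspicion.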

JJ


andrewm1973 didn't suggest just sampling in the middle of the bit period.

He said "you are looking for a state change to adjust your clock"

One of Manchester encoding's reasons for existing is to be self-clocking.

You should look for the transition H/L or L/H. If it is sooner than you expect you are too slow and should speed up. If it is later than you expect you are too fast and should slow down.

How about if you try this

    blah
    blah
    blah

    clr   r16         ; clear the working register        (t/2)-5clocks
    sbrc  PINn, m     ; skip next instruction if pin low  (t/2)-4clocks
    sbr   r16, (1<<7) ; Set a bit in the working register (t/2)-3clocks
    sbrc  PINn, m     ; skip next instruction if pin low  (t/2)-2clocks
    sbr   r16, (1<<6) ; Set a bit in the working register (t/2)-1clocks
    sbrc  PINn, m     ; skip next instruction if pin low  (t/2) 0clocks
    sbr   r16, (1<<5) ; Set a bit in the working register (t/2)+1clocks

    blah
    blah
    blah

That has taken 7 of your total 20 clocks allowed and you have now got one of 8 possible values in r16

000xxxxx A L/H and you are slow
001xxxxx A L/H and you are on time
010xxxxx Something has gone wrong
011xxxxx A L/H and you are fast
100xxxxx a H/L and you are fast
101xxxxx Something has gone wrong 
110xxxxx A H/L and you are on time
111xxxxx A H/L and you are slow

You now have 14 clocks left (5 before and 9 after) to do your decision, bit shifting and loop counting.

Should be enough for anyone. Especially seeing as you can actually be up to 4 (or more) clocks slower on the (n*8)th bit and you will catch up before your next byte change at ((n+1)*8)
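To make the case analysis concrete, here is a C sketch of the same idea for offline experimentation. It only reports where in the three-sample window the edge fell. Bit 7 is taken as the earliest sample and bit 5 as the latest, as in the listing above, and "on time" means the edge fell between the last two samples, matching the 001 and 110 rows of the table. The enum and function names are made up, and how each case maps onto a clock correction is left to the table.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    EDGE_NOT_SEEN,        /* 000 or 111: no transition inside the window   */
    EDGE_RISING_ON_TIME,  /* 001: L/H between the last two samples         */
    EDGE_RISING_EARLY,    /* 011: L/H between the first two samples        */
    EDGE_FALLING_ON_TIME, /* 110: H/L between the last two samples         */
    EDGE_FALLING_EARLY,   /* 100: H/L between the first two samples        */
    EDGE_GLITCH           /* 010 or 101: two transitions, something is off */
} edge_class_t;

/* Classify the three samples held in bits 7..5; bits 4..0 are ignored. */
edge_class_t classify_samples(uint8_t r16)
{
    switch (r16 >> 5) {
    case 0: return EDGE_NOT_SEEN;         /* 000 */
    case 1: return EDGE_RISING_ON_TIME;   /* 001 */
    case 2: return EDGE_GLITCH;           /* 010 */
    case 3: return EDGE_RISING_EARLY;     /* 011 */
    case 4: return EDGE_FALLING_EARLY;    /* 100 */
    case 5: return EDGE_GLITCH;           /* 101 */
    case 6: return EDGE_FALLING_ON_TIME;  /* 110 */
    case 7: return EDGE_NOT_SEEN;         /* 111 */
    default:
        assert(0);                        /* unreachable for a uint8_t */
        return EDGE_GLITCH;
    }
}
```

On the AVR itself this would more likely be a jump table or an LPM lookup than a switch, but the mapping is the same.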


The other technique for decoding Manchester is by fat and skinny pulses. You have four possibilities: fat/skinny and high/low. Feed these into an FSM and out pops the data. If you can classify these fast enough, you can then post-process.
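A minimal sketch of such a state machine, for post-processing already-classified pulses. Two things are assumed here and would need checking against the real encoder: that the first pulse starts on a bit boundary, and that a low-to-high mid-bit transition means 1 (the IEEE 802.3 convention; the other common convention is the inverse). The type and function names are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t level; /* 0 = low, 1 = high */
    uint8_t fat;   /* 0 = skinny (half a bit period), 1 = fat (a full one) */
} pulse_t;

/* Decode classified pulses into bits.
 * The only state is whether the first half of the current bit has been
 * seen, and at what level. A skinny pulse contributes one half-bit, a fat
 * pulse two. Two equal halves in one bit cell is a coding violation.
 * Returns the number of bits written (at most max_bits), or -1 on a
 * violation. */
int manchester_decode_pulses(const pulse_t *pulses, size_t n,
                             uint8_t *bits, size_t max_bits)
{
    assert(pulses != NULL && bits != NULL);
    int have_first_half = 0;
    uint8_t first_half = 0;
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        uint8_t level = pulses[i].level ? 1 : 0;
        int halves = pulses[i].fat ? 2 : 1;
        for (int h = 0; h < halves; h++) {
            if (!have_first_half) {
                first_half = level;       /* first half of a bit cell */
                have_first_half = 1;
            } else {
                if (first_half == level) {
                    return -1;            /* no mid-bit transition */
                }
                if (count == max_bits) {
                    return (int)count;    /* output buffer full */
                }
                /* Assumed convention: low then high decodes as 1 */
                bits[count++] = (first_half == 0) ? 1 : 0;
                have_first_half = 0;
            }
        }
    }
    return (int)count;
}
```

If the decoder reports a violation straight away, the usual cause is starting half a bit out of step; dropping the first half-bit and retrying is a cheap fix.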


andrewm1973 wrote:
andrewm1973 didn't suggest just sampling in the middle of the bit period.

He said "you are looking for a state change to adjust your clock"

Talking about ourselves in the 3rd person, are we? ;)

The OP isn't asking how to decode Manchester:

RRRoamer wrote:
I've done quite a bit of research into Manchester coding and the basics are not that difficult.
He's asking how to capture a high-speed bitstream:
RRRoamer wrote:
The problem I am having is dealing with the speed of this signal with a 20MHz atmega. As I am only trying to figure out this protocol at this point, I don't have any issues in capturing the stream and then decoding it in post processing (either on atmega or on a PC).

Where I have hit a bit of a road block is how can I capture a stream of this speed and buffer everything with only 10 clocks between each ts/2 period?


I daresay, however, that your sbrc/sbr approach is a very nice start on a PLL. I had not imagined it would be possible to do the clock recovery on the fly, and was simply trying to capture state for post processing as the OP had suggested.

Nicely done!

JJ


joeymorin wrote:


The OP isn't asking how to decode Manchester:

He's asking how to capture a high-speed bitstream:

Hence why I started by reminding that Manchester is about capture of an EDGE-stream rather than a BIT-stream. When you have that hint the rest gets a lot easier.

joeymorin wrote:

I daresay, however, that your sbrc/sbr approach is a very nice start on a PLL. I had not imagined it would be possible to do the clock recovery on the fly, and was simply trying to capture state for post processing as the OP had suggested.

Nicely done!

JJ

In reality I would not use sbrc/sbr to do this. I would probably use the SPI in master mode to read the state of the pin. That would leave enough spare clocks you could do a look up rather than a code table.

    out  SPDR, r16        ; Do this at T-9

    blah
    blah
    blah
                          ; T-0
    blah
    blah
    blah

    in   r17, SPDR        ; T+9

    *jmp back_to_start

I was just showing the idea that you can clock recover and read the edge in one logical step. SBRC/SBR seemed easier to use to explain this.

Using SPI you only use 2 clocks rather than 7. This is at the expense of having to use an LPM to decode result. It should save flash by not needing a code-table for an FSM though.

The OP's question was not of the

"Please Sirs, my company i work for has asked me to complete
. I urgently need complete code for this by 2:30pm"

So I was just nudging in the direction rather than doing it all.


RRRoamer wrote:

At this point, I'm not really sure how to attack it. My gut says to throw a faster MCU at it (high clock rates) such as an xMega or perhaps a PIC32 or ARM. But I suspect there has to be a way to do this with an atmega as well and as I have a bit more experience programming those guys, that is what I would like to stick with if possible. So any help would be greatly appreciated!

Small micros tend to be more nimble.
I would follow the SPI capture approach some have suggested.
Especially as this seems to be burst in nature, with good pauses.

We have done continual SPI capture (logic probe style) at up to Fsys/2 using an AT89LP51Rx2 into XDATA, so that can capture 8192 samples per 1K of RAM.

The 'engine room' is just 2 lines in the LP51RB2

  XCH     A,SPDAT     ;2 XCH -> Capture _and_ reload SPI 
  MOVX    @DPTR,A     ;2 DPTR++ Enabled 

and the rest is counting & time packers.

So it can be done on a small uC and with plenty of margin.
[eg @ 10MHz sample rate is 10 samples per bit, 480 samples in 60 bytes, an overkill for your needs]

I think AVR SPI ports have some quirk going gapless; from memory, they may need 9 clocks per byte frame? *

Even with that effect, you can still get enough samples as you just need to resolve wide/narrow.

If you choose any of these
20MHz/4 -> 5 samples per us
20MHz/6 -> 3.33 samples per us
20MHz/8 -> 2.5 samples per us
or maybe
16MHz/4 -> 4 samples per us (easy to check)

or if the AVR needs 9 clocks/byte, you might use
18MHz/4 -> one byte per 2 bit-slots.
(the missing sample is 444ns once between captures, others are 222ns, and 444ns once is still ok)

You probably cannot rely on aligning the samples, so code that tolerates a little baud creep would be best, but using a nibble per us would make things easy to check.
- your whole 48 slot frame then captures in 24 bytes.
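To show what "easy to check" could look like with 4 samples per us, here is a sketch that treats each nibble as one bit slot. It assumes the first sample of a slot lands in the more significant bit (MSB-first shifting), that the capture happens to be aligned to the bit slots, and that low-then-high means 1. All three are assumptions to verify against real captures, and the function names are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One nibble = four samples of one bit slot, earliest sample in bit 3.
 * 0011 (low then high) is taken as 1, 1100 (high then low) as 0.
 * Anything else means the samples are misaligned or have crept. */
int8_t nibble_to_bit(uint8_t nib)
{
    switch (nib & 0x0F) {
    case 0x3: return 1;
    case 0xC: return 0;
    default:  return -1;
    }
}

/* Decode nbytes of packed samples into 2 * nbytes slot values
 * (0, 1, or -1 for an unrecognised pattern). Returns the slot count. */
size_t decode_nibbles(const uint8_t *bytes, size_t nbytes, int8_t *bits)
{
    assert(bytes != NULL && bits != NULL);
    size_t n = 0;
    for (size_t i = 0; i < nbytes; i++) {
        bits[n++] = nibble_to_bit((uint8_t)(bytes[i] >> 4));
        bits[n++] = nibble_to_bit((uint8_t)(bytes[i] & 0x0F));
    }
    return n;
}
```

A run of -1 results is the hint that the samples have slipped against the bit slots, at which point a run-length approach that tolerates creep is the better tool.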

* I found this in another thread:

Quote:
The fastest serial bus on the AVR is the USART in SPI mode. 8 bits per byte, max 2 cycles per bit so a 16MHz bus is possible (or 25MHz if overclocked).

It's faster than the SPI peripheral because it doesn't require an extra "dead" cycle between bytes thanks to buffering. You can also DMA data to it.

Last Edited: Thu. Oct 10, 2013 - 07:32 PM

Quote:
One of Manchester encoding's reasons for existing is to be self-clocking.
That was never in dispute.
Quote:
You should look for the transition H/L or L/H.
Quote:
Hence why I started by reminding that Manchester is about capture of an EDGE-stream rather than a BIT-stream. When you have that hint the rest gets a lot easier.
Gee, I wonder why I started by trying to capture edges... ;)

"Experience is what enables you to recognise a mistake the second time you make it."

"Good judgement comes from experience.  Experience comes from bad judgement."

"Wisdom is always wont to arrive late, and to be a little approximate on first possession."

"When you hear hoofbeats, think horses, not unicorns."

"Fast.  Cheap.  Good.  Pick two."

"We see a lot of arses on handlebars around here." - [J Ekdahl]

 


joeymorin wrote:
Gee, I wonder why I started by trying to capture edges... ;)

Yes, you did start by capturing edges with the ICP.

However you then said that _I_ recommended sampling the bit mid-period (every 10 cycles). That was what I was correcting. I said you should look for a transition and adjust the clock.

If OP read the hint I gave and then read your text saying

joeymorin wrote:
the challenge there is knowing where the bit periods begin and end. If the two clocks aren't exactly synchronised, you will get errors.

Then the OP may have walked away ignoring the semi useful hint. You can get the value AND sync the clocks with MAGIC.

If you look for the transition and adjust your clock then you have enough time to decode and store the 48 bits on the fly. No storing 100s of samples and decoding later. Just save 48bits/6Bytes to RAM.

That is something you would be hard pressed to do with ICP, as the ICP can only look for one edge polarity at a time. With ICP you need to look for EVERY edge AND reconfigure the ICP on every edge.

My suggestion lets you look for a H/L or a L/H with the same code so you only ever have to look at the important transition (in the middle of the bit period)


andrewm1973 wrote:
However you then said that _I_ recommended sampling the bit mid-period (every 10 cycles). That was what I was correcting. I said you should look for a transition and adjust the clock.
My mistake. Will you forgive me? :)
Quote:
You can get the value AND sync the clocks with MAGIC.
Hardly magic. No need to be patronising...
Quote:
My suggestion...
[sigh]
joeymorin wrote:
... my well-meaning but ill-fated attempt to capture edge timestamps...
joeymorin wrote:
Nicely done!
Where was I not clear ;)

Wow! While I was trying to finish up a theoretical fluids homework set, a lot has been going on here! Thanks to everyone who took the time to look into this and provide advice (and a lot more code than I ever expected!)

At this point, I need to put together a bit of hardware (I don't have a 20MHz board put together) and do more digging into everything that has been presented here. You have all given me a LOT to think about and definitely better ideas on how to attack this problem of mine.

Of course, that is going to have to wait until after the digital controls midterm next week... (masters mechanical engineering student)

And just to clarify, this isn't for any class or job. I picked up several AC servos with these Kawasaki M21 encoders on them a few years ago and I have been working (in fits and starts) on talking to the encoder ever since.

 

Clint


I guess at this stage you are just trying to decode and analyse the stream so any AVR will do. I would say that when you actually want to start driving 3-half-bridges to drive the motor you should probably step up to an XMega or an AT90PWM chip.

It certainly sounded in your first post that you were not just asking for an answer, but rather for some help in the right direction :)

If you decide you want to try going with the idea of using the SPI to measure the transition and have any problems, pipe up and I will lend a hand with code. It's not too hard so you should be OK :)

Read 8 bits with the SPI (possibly discard 5 of them)
Have a look up table that tells you
    Is this a H/L or a L/H
    Am I too-slow, too-fast, on time
Decide if this is a 1 or a 0
Save this bit to the buffer
Act on the clock sync by
    Add an extra TICK if you are too fast (loop = 21 ticks)
    Skip a TICK if you are too slow (loop = 19 ticks)
    Don't do anything if you are on time.

If you can spare the 256 bytes for the look up table then 8 bits will give you some noise immunity. If you are short on flash then use 3 bits.
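
To make the look-up idea concrete, here is a minimal sketch in C rather than ASM. It uses a 4-sample window (16 entries) purely for illustration, since that is the smallest window that can tell early, on-time and late apart; the 8-bit and 3-bit tables mentioned above would follow the same pattern. The names `m_table` and `m_classify`, the window layout, and the choice that a rising mid-bit edge means 1 are assumptions, not something taken from the encoder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { M_ERR = -1, M_ZERO = 0, M_ONE = 1 };

struct m_entry {
    int8_t bit;    /* M_ZERO, M_ONE or M_ERR */
    int8_t adjust; /* ticks to add to the nominal 20-tick loop: -1, 0 or +1 */
};

/*
 * Index = 4 consecutive samples, oldest sample in bit 3, nominally centred
 * on the mid-bit edge. An edge seen late in the window means our loop is
 * running fast (stretch to 21 ticks); an edge seen early means we are slow
 * (shorten to 19 ticks). Windows with no edge or with a glitch are errors.
 */
static const struct m_entry m_table[16] = {
    /* 0000 */ { M_ERR,   0 }, /* 0001 */ { M_ONE,  +1 },
    /* 0010 */ { M_ERR,   0 }, /* 0011 */ { M_ONE,   0 },
    /* 0100 */ { M_ERR,   0 }, /* 0101 */ { M_ERR,   0 },
    /* 0110 */ { M_ERR,   0 }, /* 0111 */ { M_ONE,  -1 },
    /* 1000 */ { M_ZERO, -1 }, /* 1001 */ { M_ERR,   0 },
    /* 1010 */ { M_ERR,   0 }, /* 1011 */ { M_ERR,   0 },
    /* 1100 */ { M_ZERO,  0 }, /* 1101 */ { M_ERR,   0 },
    /* 1110 */ { M_ZERO, +1 }, /* 1111 */ { M_ERR,   0 },
};

/* Classify one window: returns the bit (or M_ERR) and the next loop length in ticks. */
int m_classify(uint8_t window, int *next_ticks)
{
    const struct m_entry *e = &m_table[window & 0x0F];

    assert(next_ticks != NULL);
    *next_ticks = 20 + e->adjust;
    return e->bit;
}
```

In a real cycle-counted loop the table would live in flash and the "adjust" would select a 19, 20 or 21 cycle branch; the C here only shows what the table has to contain.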


andrewm1973 wrote:
I guess at this stage you are just trying to decode and analyse the stream so any AVR will do.

Exactly! The encoders are very good encoders (these servos were originally used on $100K industrial robots, and mine came from one that sat in a university research lab for over a decade collecting dust before they decided to part it out and reclaim the space...). For some strange reason, Kawasaki just doesn't want to tell me ANYTHING about these encoders, so I have to figure it out for myself. Which takes a lot longer, but it is also a lot more fun!

andrewm1973 wrote:
It certainly sounded in your first post that you were not just asking for an answer, but rather for some help in the right direction :)

That was the plan. I'm not a professional programmer by any stretch, but I DO know a lot more about AVRs than your average ME, and about electronics in general for that matter... Of course, that just puts me at the point where I know enough to know I don't know what I need to know to do this, so it was time to ask for a bit of guidance.

 

Clint


Clint,
I reached out to a client of mine who knows about the encoders. Here is what he emailed me earlier today:

Quote:
Jim,
I have done some asking around and didn't come up with anything that could help on the Kawasaki M21 encoder. I am pretty sure Mitchell Electronics can talk to them but no detailed info could be found. Trying one more source that repairs them to see if they will give up any info.

So I have one more avenue from him to see what comes back. Otherwise try Mitchell Electronics:
http://www.mitchell-electronics....

Apparently information on these things is quite scarce. Should you be able to decode the stream you could become very popular. ;)

I would rather attempt something great and fail, than attempt nothing and succeed - Fortune Cookie

 

"The critical shortage here is not stuff, but time." - Johan Ekdahl

 

"Step N is required before you can do step N+1!" - ka7ehk

 

"If you want a career with a known path - become an undertaker. Dead people don't sue!" - Kartman

"Why is there a "Highway to Hell" and only a "Stairway to Heaven"? A prediction of the expected traffic load?"  - Lee "theusch"

 

Speak sweetly. It makes your words easier to digest when at a later date you have to eat them ;-)  - Source Unknown

Atmel Studio6.2/AS7, DipTrace, Quartus, MPLAB, RSLogix user


RRRoamer wrote:

That was the plan. I'm not a professional programmer by any stretch, but I DO know a lot more about AVRs than your average ME, and about electronics in general for that matter... Of course, that just puts me at the point where I know enough to know I don't know what I need to know to do this, so it was time to ask for a bit of guidance.

If you are comfortable with ASM, you should be fine.

Using the SPI to sample (or USART in SPI mode) can slow down the decision process to something practical.

There, you have two broad choices.

andrewm1973 has given an on-the-fly framework, but the time budget may be tight. (all code branches need to be 20cy, with +/- choice for phase tracking )
If you align the SPI samples to be 16 cycle wide, centered on a 20cy bit-window, you can extract data and phase, either in a Table, or with some IF ELSEIF tests.

If that does not fit, you can capture the stream into a sampled array, and scan that later to decode.
Coding for that is less critical, but works best with moderate refresh rates on the Sensor bursts.
The Array approach is easier to debug, as you can create test arrays.
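
For example, a test array can be generated on a PC from a known bit pattern and fed to whatever decoder you write. This is only a sketch: the function name, the MSB-first packing, and the convention that a 1 is low-then-high and a 0 is high-then-low are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Build a packed sample array (MSB first) for the given bits.
 * bits  = one bit per byte, nbits of them.
 * half  = samples per half-bit (2 for 4 samples per us at 1 Mbaud).
 * Assumed convention: 1 = low then high, 0 = high then low.
 * Returns the number of samples written, or 0 if buf is too small.
 */
size_t manchester_test_array(const uint8_t *bits, size_t nbits, unsigned half,
                             uint8_t *buf, size_t buf_bytes)
{
    size_t total = nbits * 2u * half;
    size_t pos = 0;

    assert(half > 0);
    if (total > buf_bytes * 8u)
        return 0;

    memset(buf, 0, buf_bytes);
    for (size_t b = 0; b < nbits; b++) {
        for (unsigned h = 0; h < 2; h++) {
            /* First half is the inverse of the bit, second half equals the bit. */
            int level = bits[b] ? (h == 1) : (h == 0);
            for (unsigned s = 0; s < half; s++, pos++) {
                if (level)
                    buf[pos / 8] |= (uint8_t)(0x80u >> (pos % 8));
            }
        }
    }
    return total;
}
```

Deliberately stretching or shrinking individual runs in the generated array is an easy way to check how much baud creep a decoder tolerates before touching the hardware.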

How often do those frames arrive, or do you have to ask for each one?


This thread was sticking in the back of my mind as a fun challenge, especially since people were saying that a 20 MHz AVR might need hardware assistance. So I whipped up a decoder for the example stream and tested it in simavr using a CPU clock of 20 MHz and a Manchester clock period of 1.0 us and the code properly decodes the stream as below (errors are probably me transcribing it into the simulator):

46: 0010111101111110110111011111100010000111100011

I then varied the clock period. It worked down to 0.8 us and up to 1.2 us, failing any lower/higher. The code re-synchronizes to the stream on each transition. It currently writes each bit as a byte in the output buffer.

#define __SFR_OFFSET 0
#include <avr/io.h>

#define M_BIT PINB,0

    .section .text
    
    .global read_manchester

#define zero r1
#define one r18

read_manchester:
    movw r26,r24
    clr zero
    ldi one,1
while_low:
    sbis M_BIT      ; wait for sync pulse
    rjmp while_low
while_high:         ; wait for end of sync pulse
    sbic M_BIT
    rjmp while_high
    
low_mid:
                    ; -0.35
    st  X+,zero
                    ; -0.25
    sbic M_BIT      ; Wait for high on clock
    rjmp high
    sbic M_BIT
    rjmp high
    sbic M_BIT
    rjmp high
    sbic M_BIT
    rjmp high
    sbic M_BIT
    rjmp high
    
                    ; +0.25
low:
    sbic M_BIT      ; Wait for high between clocks
    rjmp high_mid
    sbic M_BIT
    rjmp high_mid
    sbic M_BIT
    rjmp high_mid
    sbic M_BIT
    rjmp high_mid
    sbic M_BIT
    rjmp high_mid
    sbic M_BIT
    rjmp high_mid
low_low:            ; +0.85

done:
    sub r26,r24     ; X - buf = number of bits stored (low byte)
    sbc r27,r25     ; high byte, with borrow
    movw r24,r26
    ret

high_mid:
                    ; -0.35
    st  X+,one
                    ; -0.25
    sbis M_BIT      ; Wait for low on clock
    rjmp low
    sbis M_BIT
    rjmp low
    sbis M_BIT
    rjmp low
    sbis M_BIT
    rjmp low
    sbis M_BIT
    rjmp low
                    ; +0.25
high:
    sbis M_BIT      ; Wait for low between clocks
    rjmp low_mid
    sbis M_BIT
    rjmp low_mid
    sbis M_BIT
    rjmp low_mid
    sbis M_BIT
    rjmp low_mid
    sbis M_BIT
    rjmp low_mid
    sbis M_BIT
    rjmp low_mid
error:              ; +0.85
    sub r24,r26     ; negative number of bits received
    sbc r25,r27     ; high byte, with borrow
    ret

#include <avr/io.h>
#include <stdio.h>

extern int read_manchester( char buf [] );

int main( void )
{
    DDRB = 0;
    
    char buf [256];
    int len = read_manchester( buf );
    printf( "%d: ", len );
    for ( int i = 0; i < len; i++ )
        printf( "%d", (int) buf [i] );
    printf( "\n" );
}
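
Since the routine above stores one bit per byte, a small helper can pack the result into 6 bytes afterwards, outside the timing-critical loop. This is a sketch only: `pack_bits` is a made-up name, and putting the first received bit in the MSB is an assumption, since the encoder's bit order is not known.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Pack nbits values (each 0 or 1, one per byte) into out,
 * first bit in the MSB of out[0]. Returns the number of bytes used.
 */
size_t pack_bits(const char *bits, size_t nbits, uint8_t *out)
{
    size_t nbytes = (nbits + 7) / 8;

    assert(bits != NULL && out != NULL);

    for (size_t i = 0; i < nbytes; i++)
        out[i] = 0;
    for (size_t i = 0; i < nbits; i++) {
        if (bits[i])
            out[i / 8] |= (uint8_t)(0x80u >> (i % 8));
    }
    return nbytes;
}
```

For a 48-bit frame this gives 6 bytes; unused bits in a partial last byte are left at zero.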

I had talked to an engineer at Mitchell Electronics some time ago. Basically, I was told the information was proprietary and they would not be able to help. But they WOULD be willing to sell me $3000 in hardware and license (for one year...) so I could test the encoders.

Also, I've been up to my eyebrows working on my Master's for the last several weeks, so I have not been able to dig into all of this lately. Hopefully, after school is out I'll have some time to work on this over Christmas and New Year's. I hate being this close to a solution and not being able to scratch that itch!

 

Clint


I'm new to this thread, but I will add this:
Just as a test, I would do something like this:
make a kind of logic analyzer with the mega:

init code
wait for trigger (loop on portpin) 
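IN R24,PINB     ; read the pins (PINB), not the output latch (PORTB)
ST Z+,R24       ; IN = 1 cycle, ST = 2 cycles, so 3 cycles per sample
IN R24,PINB
ST Z+,R24
IN R24,PINB
ST Z+,R24
....
repeat 400 times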
send the data.

This will sample at 6 2/3 MHz (3 cycles per sample at 20 MHz) for 60 us.
Then send it to your PC and look at it.
If this is all your AVR needs to do, it's not a problem to decode on the fly if you can't live with any delays, but you will need to write it in ASM.
If you are in sync and know there are no errors in the data, the decoder only needs to sample at 1 MHz.
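
Once the capture is on the PC, a quick way to "look at it" is to reduce the port bytes to run lengths on the one pin you care about. A minimal sketch, assuming the capture is simply the sequence of stored port bytes and that `mask` selects the encoder pin (both the function name and the mask are hypothetical):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Convert captured port bytes into run lengths for one pin.
 * cap/n   = captured bytes, one per sample.
 * mask    = bit mask of the pin of interest (e.g. 0x01 for bit 0).
 * runs    = output array of run lengths in samples, alternating levels.
 * Returns the number of runs written (at most max_runs).
 */
size_t run_lengths(const uint8_t *cap, size_t n, uint8_t mask,
                   uint16_t *runs, size_t max_runs)
{
    size_t count = 0;
    size_t i = 0;

    assert(cap != NULL && runs != NULL);

    while (i < n && count < max_runs) {
        uint8_t level = cap[i] & mask;
        size_t start = i;
        while (i < n && (cap[i] & mask) == level)
            i++;
        runs[count++] = (uint16_t)(i - start);
    }
    return count;
}
```

At 3 cycles per sample on a 20 MHz part, a 0.5 us half-bit should show up as runs of 3 or 4 samples and a full 1 us as runs of 6 or 7, so the two widths should be easy to tell apart.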