Fast 6 Digit BCD Counter Code

I needed a fast BCD counter for a project I'm working on. And by fast I mean that I wanted it to be able to increment at 1MHz on a system with only a 32MHz clock. I achieved my goal; the worst case is when the value is 999999, which takes 26 clocks excluding jump/return. I thought I'd share the code in hopes that someone else might find it useful. This is set up for the Xmega, but it's a simple task to put this on another µC.

;===============================================================================
; BCDCounter.asm
; (C) Phillip Slawinski 2011
; 6-Digit Packed-BCD Counter
;===============================================================================

.def LNI = R16                           ; low nibble increment value
.def HNI = R17                           ; high nibble increment value
.def MAX = R18                           ; max value for packed BCD
           
.def LOB = R20                           ; low byte
.def MDB = R21                           ; middle byte
.def HIB = R22                           ; high byte

.include "ATxmega128A1def.inc"

main:

ldi R16, 0x01                           ; set up the sleep control register
sts SLEEP_CTRL, R16                     ;

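ldi LNI, 0x0A                           ; lower nibble increment value
ldi HNI, 0x10                           ; upper nibble increment value
ldi MAX, 0xA0                           ; max value for packed BCD

clr LOB                                 ; start the count at 000000; the
clr MDB                                 ; registers are not guaranteed to be
clr HIB                                 ; zero after reset on real hardware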

; Initialize pointer registers to the addresses of the output ports.
ldi XL, LOW(PORTA_OUT)                  ; load address of port A
ldi XH, HIGH(PORTA_OUT)
ldi YL, LOW(PORTB_OUT)                  ; load address of port B
ldi YH, HIGH(PORTB_OUT)
ldi ZL, LOW(PORTC_OUT)                  ; load address of port C
ldi ZH, HIGH(PORTC_OUT)

; debug code for counting how many clocks the routine takes ....................
rjmp bcd_counter
nop

;*******************************************************************************
; bcd_counter
; Purpose:
;           This subroutine increments a 6 digit / 3 byte BCD counter
; Functional Description:
;           The three byte routines are basically repeated with only changes
;           to the registers that are used.
;           Upon entry into the subroutine the count values are written to 
;           the mapped ports.  This ensures that zero is set on the first run.
;           
;           After writing the values we start computing the new count for the
;           first byte.  We first subtract 9 from the count, then we check to
;           see if the half carry flag is set.  If it isn't, then the lower
;           nibble was 9, so we need to increment the second nibble.
;           Otherwise we add 0x0A, for a net increment of one, and exit the
;           routine.
;
;           When we get to the code to increment the second nibble, the 9 in the
;           lower nibble has already been cleared thanks to the subtraction.  So
;           we add 0x10 to the byte to increment the upper nibble.  After adding
;           we check to make sure the byte isn't equal to 0xA0.  If it is, then
;           we passed 0x99 (max for packed BCD byte) and we need to increment
;           the next byte.  If this happens on the last byte we need to reset
;           it. Otherwise we exit the routine.
; Registers Used:
;           X, Y, Z - Used to hold address of output ports.
;           LNI -  low nibble increment value       (R16)
;           HNI -  high nibble increment value      (R17)
;           MAX -  max value for packed BCD         (R18)
;                 
;           LOB - low byte                          (R20)
;           MDB - middle byte                       (R21)
;           HIB - high byte                         (R22)
;*******************************************************************************
bcd_counter:
    ; Store with ST rather than STS: STS takes two clock cycles, ST takes one
    st X, LOB                           ; store port A
    st Y, MDB                           ; store port B
    st Z, HIB                           ; store port C

    ; Lowest Digits ############################################################
    subi LOB, 0x09                      ; has lower nibble reached 9 already?
    brhc bcd_counter_tens               ; if so, go to next
    add LOB, LNI                        ; add increment amount
    rjmp bcd_counter_exit               ; jump to exit
   
    bcd_counter_tens:
    add LOB, HNI                        ; add increment amount
    cpse LOB, MAX                       ; have we reached max? if so skip next
    rjmp bcd_counter_exit               ; jump to exit

    ; Middle Digits ############################################################
    bcd_counter_hundreds:
    clr LOB                             ; clear lower digits
    subi MDB, 0x09                      ; has lower nibble reached 9 already?
    brhc bcd_counter_thousands          ; if so, go to next
    add MDB, LNI                        ; add increment amount
    rjmp bcd_counter_exit               ; jump to exit
   
    bcd_counter_thousands:
    add MDB, HNI                        ; add increment amount
    cpse MDB, MAX                       ; have we reached max? if so skip next
    rjmp bcd_counter_exit               ; jump to exit

    ; Highest Digits ###########################################################
    bcd_counter_ten_thousands:
    clr MDB                             ; clear lower digits
    subi HIB, 0x09                      ; has lower nibble reached 9 already?
    brhc bcd_counter_hundred_thousands  ; if so, go to next
    add HIB, LNI                        ; add increment amount
    rjmp bcd_counter_exit               ; jump to exit
   
    bcd_counter_hundred_thousands:
    add HIB, HNI                        ; add increment amount
    cpse HIB, MAX                       ; have we reached max? if so reset
    rjmp bcd_counter_exit               ; jump to exit
    clr HIB                             ; reached max count, so reset
    bcd_counter_exit:
reti

busy_loop:                              ; sleep while we're not doing anything
    sleep
    rjmp busy_loop

Quote:
I needed a fast BCD counter for a project I'm working on. And by fast I mean that I wanted it to be able to increment at 1MHz on a system with only a 32MHz clock.
I'm curious as to exactly why you needed a BCD counter this fast. Usually a BCD counter would be for human readable situations, and humans certainly can't read at 1MHz.

Regards,
Steve A.

The Board helps those that help themselves.


This is for a high precision timer. I wanted µs resolution, hence the 1MHz. The timer isn't supposed to be viewed in real time by a human, but rather through high speed video. I plan to tie the subroutine you see above to a timer ISR. This way I can count at any time increment down to 1µs. The idea is to have a signal trigger the timer at the start of an event, and it will count until the signal goes low.


Quote:

but rather through high speed video.

Does even high-speed video go at 1M frames/second? Perhaps, according to a quick Google search.

What display technology is used that will change fast enough to make the low digits something other than a blur?

Quote:

I wanted it to be able to increment at 1MHz on a system with only a 32MHz clock. I achieved my goal; the worst case is when the value is 999999, which takes 26 clocks excluding jump/return.

Ummmm--26 clocks worst case for the "routine", perhaps.

Why aren't the cycles for RETI important?
How long does it take to awaken from sleep?
Then you have to take the vector, and get to your routine.

In apps like this, you can achieve somewhat better max rate by sitting in a tight polling loop during the critical operation.

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.


The display doesn't have to run at 1MHz, just the timer. When it stops, display the current value. While it's running, display every tenth second or so.

Chuck Baird

"I wish I were dumber so I could be more certain about my opinions. It looks fun." -- Scott Adams

http://www.cbaird.org


theusch wrote:
Quote:

but rather through high speed video.

Does even high-speed video go at 1M frames/second? Perhaps, according to a quick Google search.

What display technology is used that will change fast enough to make the low digits something other than a blur?

Quote:

I wanted it to be able to increment at 1MHz on a system with only a 32MHz clock. I achieved my goal; the worst case is when the value is 999999, which takes 26 clocks excluding jump/return.

Ummmm--26 clocks worst case for the "routine", perhaps.

Why aren't the cycles for RETI important?
How long does it take to awaken from sleep?
Then you have to take the vector, and get to your routine.

In apps like this, you can achieve somewhat better max rate by sitting in a tight polling loop during the critical operation.

The code listed above is not the fully functioning timer (obviously). Cycles from JMP/RETI are certainly important! I just don't remember how many cycles they add. In my test above with the jmp/reti the counter would be at 30/31 clocks (I can't remember exactly) after the RETI (which would put the PC at "main:").

I don't plan to use sleep mode on the final project. I am going to tie this routine to a timer overflow interrupt. Upon exiting the interrupt it will return to a wait loop.

As for the display technology, I plan to use seven segment LED displays. LEDs have fast turn-on times; every one I've seen has a turn-on time well under 1µs.


Quote:
As for the display technology, I plan to use seven segment LED displays. LEDs have fast turn-on times; every one I've seen has a turn-on time well under 1µs.

But the point is that the human eye doesn't.

Chuck Baird

"I wish I were dumber so I could be more certain about my opinions. It looks fun." -- Scott Adams

http://www.cbaird.org


Quote:
In my test above with the jmp/reti the counter would be at 30/31 clocks

Your 1µs ISR is not optimized for speed (worst case), but for flash size. I wonder why, especially if that is a 128k chip.
Quote:
The code listed above is not the fully functioning timer (obviously).

Another thing is, your chip will not be able to do anything more, because you hogged all the pointers. Not to mention that you do not store SREG on entering the ISR. There is no time for that.

The third remark is ISR jitter. In your case that is 1 clk. Not needed.
The fourth one is skew between digits. You write multibyte data to three ports, so a snapshot (single frame) is useless when you do not know the adjacent snapshots. But perhaps you have some latching LED driver device? I cannot see you driving it from code.

Are you aware of CPLDs? That is what these are used for.

No RSTDISBL, no fun!


Quote:
But the point is that the human eye doesn't.

It's not for human eyes. It's for high speed video.

Quote:
Your 1µs ISR is not optimized for speed (worst case), but for flash size. I wonder why, especially if that is a 128k chip.

That's news to me... but hey, I'll bite. Any suggestions to make it faster?

Quote:
Another thing is, your chip will not be able to do anything more, because you hogged all the pointers. Not to mention that you do not store SREG on entering the ISR. There is no time for that.

A) The pointers are hogged when performing this function only... the code above is not finished, by any means, but I would have set the pointers to the ports in an initialization routine. For other functions the pointers can still be used, because the other functions are not going to be happening while this is running. Also, this routine does not have a fixed execution time. Usually the execution time is going to be very short; only in the worst case (999999 overflow) does the execution time approach 1µs. That won't happen very often ... given the fact that this timer isn't meant to time things over a long interval.

B) Of course I'm not saving the CPU status register. Why would I? When this routine is running, it's the only thing that will be happening at the time.

Quote:
The fourth one is skew between digits. You write multibyte data to three ports, so a snapshot (single frame) is useless when you do not know the adjacent snapshots. But perhaps you have some latching LED driver device? I cannot see you driving it from code.

I'll be using a BCD to seven segment driver. Don't hold me to it, but I'm pretty sure it is latching.

Quote:
Are you aware of CPLDs? That is what these are used for.

Yes, I'm aware. I contemplated using one, but it's far easier to use a µC. The Xmega should be more than capable.


I'll chime in with the others; why do you think your display needs to update so fast? I'm unfamiliar with how high "high speed" video is, but the first couple Google hits turned up a maker of some cameras with an impressive (well, to me anyway) speed of 1500 frames/sec. But that's still about 670µs per frame.

So let an 8-bit timer count away at whatever prescaler gives you 1MHz increment rate, hook an ISR to it that fires when it overflows, and, in the ISR, increment some variables to track the higher-numbered bits; tack on a snapshot of the timer value, and decompose the resulting binary number to BCD to update your display with.

Or, if this is for some sort of timestamping of the video record that the highspeed camera incomprehensibly doesn't offer as a built-in feature (bearing in mind that they're developed for doing measurements with), why do you need a decimal display? Make the researcher do his own bally conversion from hexadecimal. Tons easier to decode.

Other points:

    When performance is so obviously at a premium, why are you bothering to do packed BCD arithmetic? What AVR did you find where saving three registers or RAM locations would be worth the extra pain of the instructions?
    I've not used the most modern AVRs or (Xmega?) yet, but in all the AVRs I'm familiar with, ST takes two clock cycles just like STS.


I'd use either a CPLD, or simply six 4510 BCD counter ICs of $0.56 each.

The 4510 can easily outperform your xMEGA solution by 5 to 10 times, with a few thousand times fewer transistors too.

All this technology abuse :roll:

Last Edited: Sat. Sep 24, 2011 - 09:59 AM

To count fast to 999999, it's easier to count 2 bytes with the 16 bit timer and a third byte inside the timer interrupt.
And convert it into decimal only to display the result.
A fast bin-dec conversion uses the subtraction method.

Peter


I think the Double dabble might be faster.

But the OP wants to update the display every count, so directly counting in BCD is better then.


Like the others, I don't see the point.
And I'm not an XMEGA expert, but here are some general thoughts:
1:
I would try 3 counters that each count to 100, and make a LUT from binary to 2 BCD digits.
For speed, place the LUT in RAM (perhaps not needed on an XMEGA).
2:
To save compares, I would count down from 99 so you can branch on the dec instruction.
3:
To speed up the worst case (but slow down the average), I would set a flag to indicate that (last time)
the last four digits were 9999, and at the beginning of the interrupt check the flag; if it is set, update the high 2 digits and zero the last 4 (and clear the flag).

Have fun
Jens


Quote:

I don't plan to use sleep mode on the final project. I am going to tie this routine to a timer overflow interrupt. Upon exiting the interrupt it will return to a wait loop.


You do realize that this may even be worse for timing, as with the SLEEP approach you have predictable cycles before hitting the first instruction of the ISR.

But that assumes you have enough time to get to sleep. Consider that your spending so far is 30 cycles out of a budget of 32 after considering the RETI. But you need to do at least one instruction between each ISR, so budget for the RJMP+SLEEP.

When the ISR hits, you have 4 cycles plus the JMP/RJMP to the ISR.

As you see, you are way over worst-case budget. Now, that isn't necessarily the end of the world if you can "catch up" next time. But I don't think you have that luxury.

As I mentioned, if you invert so that the fast app is a polling loop then you might get the overhead down to SBIS/RJMP loop and then a CBI when it hits.

Quote:

LEDs have fast turn-on times; every one I've seen has a turn-on time well under 1µs.

OK--but what about the turn-OFF times? After all, most of us here commonly "feather" LEDs by varying the duty cycle, and in addition run multiplexed displays. Now, I indeed realize that some of the effect is the human eye persistence.

And won't you have to hit the LED pretty hard with current to get the low turn-on times? And that increases the turn-off time, right?

I guess you can prove me wrong with a simple experiment. Rig up a single digit. Hook to an AVR port. Don't even worry about the BCD part. Count at your 1MHz or some reasonable rate.

Use the fastest camera you can get your hands on. Beg, borrow, steal. I suspect that the transitions will be blurred much before 1MHz. As an uneducated guess, more like 1kHz?

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.


Quote:
I've not used the most modern AVRs or (Xmega?) yet, but in all the AVRs I'm familiar with, ST takes two clock cycles just like STS.

The datasheet specified 1 clock cycle. I noticed only one clock per ST in AVR sim.

Quote:
I'd use either a CPLD, or simply six 4510 BCD counter ICs of $0.56 each.

The 4510 can easily outperform your xMEGA solution by 5 to 10 times, with a few thousand times fewer transistors too.

I looked at the 4510 before writing this code. What made me choose the code over the 4510 is that this device isn't going to be limited solely to counting. I'd like to be able to do other things with it as well. The point is, I want to be able to write a number directly to the display at times, and just using a counter alone does not give this capability.

Okay, so I could just use the disable line on the counters and have some BCD->7 Seg drivers wired to the displays. I'd like to keep the amount of hardware down though. Also, it's more fun to do this with a little AVR.


Can you answer the question above about this video camera? Does it really record 1 million frames per second? I guess we all want to know, as the whole premise of this discussion may be based on something that sounds hugely unlikely.

What are you doing? Recording the travel of a rifle bullet or something? Even if you were does that really require 1M frames per second? They used to say that Concorde flew faster than a rifle bullet and it flew at something like 1400 miles per hour. I'm almost moved to work out how far a bullet travels in 1 millionth of a second.
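
The distance asked about above is quick to work out. Taking the 1400 miles per hour quoted for Concorde as a stand-in for the bullet's speed (an assumption; real muzzle velocities vary widely):

```latex
\begin{align*}
v &= 1400\ \tfrac{\text{mi}}{\text{h}} \times \frac{1609.344\ \text{m/mi}}{3600\ \text{s/h}} \approx 625.9\ \text{m/s} \\
d &= v \times 1\ \mu\text{s} \approx 625.9\ \text{m/s} \times 10^{-6}\ \text{s} \approx 0.63\ \text{mm}
\end{align*}
```

At that speed the object moves well under a millimetre per microsecond.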


theusch wrote:
as with the SLEEP approach you have predictable cycles before hitting the first instruction of the ISR.

Brutte wrote:
The third remark is ISR jitter. In your case that is 1 clk. Not needed.

I think it is enough to put "rjmp pc" in main.

I attached a BCD up-counter code for t2313. It is an example - only three BCD digits implemented, just to give the idea of optimization for speed. Can be easily extended to 6 digits. Or more.

Anyway, it ticks every 3 clocks - over 10 times faster than in your example. It does not have jitter.

Tested on Simulator2.
I have no idea how to translate it into XMegas but with some modifications - doable I think.

Considering some chips(like t2313) have spi or USART, you can run first two bits of the least significant nibble with it. bit0 with CLK (F_CPU/2), bit1 with data and the rest of the bits in software.


No RSTDISBL, no fun!


Quote:
As I mentioned, if you invert so that the fast app is a polling loop then you might get the overhead down to SBIS/RJMP loop and then a CBI when it hits.

I'm certainly not committed to using an ISR. All I've written so far is the routine to do the BCD counting. A polling loop would be fine. Let me also point out that the worst case will almost never happen the way this is intended to be used. The timer only rolls over after a second of running, by that time any event I'd be timing with the timer will be over.

Quote:
Use the fastest camera you can get your hands on. Beg, borrow, steal. I suspect that the transitions will be blurred much before 1MHz. As an uneducated guess, more like 1kHz?

I was thinking about using a photo diode or similar to test the turn on/off time of the 7-segment display. I'm not really that worried though. I have fiber optic TXs that utilize the same kind of red LED you'd find in a 7-segment display. I frequently use those to send signals with rather short durations (2-300µs).

I plan for this timer to have a selectable precision, and most of the time I'm not going to be using the µs setting, but it's nice to have that option available.


clawson wrote:

What are you doing? Recording the travel of a rifle bullet or something? Even if you were does that really require 1M frames per second? They used to say that Concorde flew faster than a rifle bullet and it flew at something like 1400 miles per hour. I'm almost moved to work out how far a bullet travels in 1 millionth of a second.

It's for high speed videos of spark propagation. Sparks propagate much faster than bullets.


clawson wrote:
Can you answer the question above about this video camera? Does it really record 1 million frames per second?

Exactly this was the question!

To solve a task, you must first determine the requirements.
And in this case the counting speed was not relevant.
With the timer, up to 16MHz can be achieved easily.
No need for heavy CPU load.

But you first need to determine the needed display speed. :!:

Peter


Quote:
The pointers are hogged when performing this function only... the code above is not finished, by any means, but I would have set the pointers to the ports in an initialization routine.
But how could the rest of the program use them without changing them? And what if the ISR happens while the rest of the program is using them? Surely you will have to save them off at the start of the ISR, load them with the values you need for the ISR, then restore the original values at the end of the ISR.

Regards,
Steve A.

The Board helps those that help themselves.


Hi!

Koshchi wrote:
And what if the ISR happens while the rest of the program is using them?

OP can and will disable this ISR, when main() plays with pointers and SREG. When the BCD counter is running, main() can be "rjmp PC" - and nothing bad happens.

No RSTDISBL, no fun!