Help! Lowering clock cycles for sequential/random led controller

Go To Last Post
12 posts / 0 new
Author
Message
#1
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Hi,

 

I am attempting to make a 8 channel (for now) sequential LED controller code. On atmega328p running at 16MHz. I am using bit angle modulation (BAM) for 8 pwm outputs. So far using timer0 with overflow interrupt dedicated to BAM code, I am getting about 240hz refresh rate for leds, with nothing else running. BAM is taking about 10% of CPU usage (measured with oscilloscope). Interrupt routine takes 1.6us time and the time spent in empty while loop is 14.4us. Measuring with pin toggle.

 

 

Here is the spaghetti code that I wrote. This is taking too much time about 30-52us. I certainly can't squeeze 52us worth of code execution time into 14.4us window to maintain 240hz. The minimum refresh I am looking for is 200hz. I don't know if it's possible or not with this code. I would just like to know like where my big pitfalls are, like i don't really have great sense for what could take lot of cycle and what could not. Just pointing towards it would be of great help.

 

Thanks.

 

 

#include <avr/io.h>

#define MAXSTEPS 2 // Total number of Led patterns or steps.

volatile uint8_t Led[] = {0,0,0,0,0,0,0,0}; // Used to store duty cycle of 8 pwm channels running on bit angle modulation. 

volatile uint8_t stepCounter = 0;	// Used to know which step of pattern is currently running.
volatile uint8_t pwmStep = 0;		// To count the pwm resolution, 0 to 255, 8 bit resolution.
volatile uint8_t pwmState = 0;		// To check if a channel is already on the given duty cycle.

const __flash uint8_t ledEffect[MAXSTEPS][3] = {
	{0b11111111,0b11111111,0},	// For storing N number of patterns. 2 bits are dedicated to 1 channel, so 2 bytes (16 bits) used for 8 channel. 3rd byte for speed haven't implemented yet.
	{0b00000000,0b00000000,0},
	/*......................
	........................
	........................
	........................
	........................
	........................
	........................
	........................
	........................
	........................
	could be as high as 1000
	patterns or more
	*/
};

const __flash uint8_t ledMask[] = {	// Mask to read 2 bits at a time for the above LedEffect.
	0b00000011,
	0b00001100,
	0b00110000,
	0b11000000
};

const __flash uint8_t a[] = {	// Duty cycle table to bring led to 20 % duty cycle from 100 % duty cycle or vice versa.
	255,	254,	253,	252,	251,	251,	250,	249,	248,	247,	247,	246,	245,	244,	243,	243,
	242,	241,	240,	239,	239,	238,	237,	236,	235,	235,	234,	233,	232,	231,	231,	230,
	229,	228,	227,	227,	226,	225,	224,	223,	223,	222,	221,	220,	219,	219,	218,	217,
	216,	215,	215,	214,	213,	212,	211,	211,	210,	209,	208,	207,	207,	206,	205,	204,
	203,	203,	202,	201,	200,	199,	199,	198,	197,	196,	195,	195,	194,	193,	192,	191,
	191,	190,	189,	188,	187,	187,	186,	185,	184,	183,	183,	182,	181,	180,	179,	179,
	178,	177,	176,	175,	175,	174,	173,	172,	171,	171,	170,	169,	168,	167,	167,	166,
	165,	164,	163,	163,	162,	161,	160,	159,	159,	158,	157,	156,	155,	155,	154,	153,
	152,	151,	151,	150,	149,	148,	147,	147,	146,	145,	144,	143,	143,	142,	141,	140,
	139,	139,	138,	137,	136,	135,	135,	134,	133,	132,	131,	131,	130,	129,	128,	127,
	127,	126,	125,	124,	123,	123,	122,	121,	120,	119,	119,	118,	117,	116,	115,	115,
	114,	113,	112,	111,	111,	110,	109,	108,	107,	107,	106,	105,	104,	103,	103,	102,
	101,	100,	99,	99,	98,	97,	96,	95,	95,	94,	93,	92,	91,	91,	90,	89,
	88,	87,	87,	86,	85,	84,	83,	83,	82,	81,	80,	79,	79,	78,	77,	76,
	75,	75,	74,	73,	72,	71,	71,	70,	69,	68,	67,	67,	66,	65,	64,	63,
	63,	62,	61,	60,	59,	59,	58,	57,	56,	55,	55,	54,	53,	52,	51,	51
};

const __flash uint8_t b[] = {	// Duty cycle table to bring led to 20 % duty cycle from 0 % duty cycle or vice versa.
	51,	50,	50,	50,	50,	50,	49,	49,	49,	49,	49,	48,	48,	48,	48,	48,
	47,	47,	47,	47,	47,	46,	46,	46,	46,	46,	45,	45,	45,	45,	45,	44,
	44,	44,	44,	44,	43,	43,	43,	43,	43,	42,	42,	42,	42,	42,	41,	41,
	41,	41,	41,	40,	40,	40,	40,	40,	39,	39,	39,	39,	39,	38,	38,	38,
	38,	38,	37,	37,	37,	37,	37,	36,	36,	36,	36,	36,	35,	35,	35,	35,
	35,	34,	34,	34,	34,	34,	33,	33,	33,	33,	33,	32,	32,	32,	32,	32,
	31,	31,	31,	31,	31,	30,	30,	30,	30,	30,	29,	29,	29,	29,	29,	28,
	28,	28,	28,	28,	27,	27,	27,	27,	27,	26,	26,	26,	26,	26,	25,	25,
	25,	25,	25,	24,	24,	24,	24,	24,	23,	23,	23,	23,	23,	22,	22,	22,
	22,	22,	21,	21,	21,	21,	21,	20,	20,	20,	20,	20,	19,	19,	19,	19,
	19,	18,	18,	18,	18,	18,	17,	17,	17,	17,	17,	16,	16,	16,	16,	16,
	15,	15,	15,	15,	15,	14,	14,	14,	14,	14,	13,	13,	13,	13,	13,	12,
	12,	12,	12,	12,	11,	11,	11,	11,	11,	10,	10,	10,	10,	10,	9,	9,
	9,	9,	9,	8,	8,	8,	8,	8,	7,	7,	7,	7,	7,	6,	6,	6,
	6,	6,	5,	5,	5,	5,	5,	4,	4,	4,	4,	4,	3,	3,	3,	3,
	3,	2,	2,	2,	2,	2,	1,	1,	1,	1,	1,	0,	0,	0,	0,	0
};

void startEffect(void)
{

	uint8_t ledNum = 0; // For counting led number
	uint8_t index = 0;	// For reading pattern stored in array elements.
	do
	{
		uint8_t shift = 0; // LED mask
		do
		{
                        switch ( (ledEffect[stepCounter][ index]) & (ledMask[shift]) ) // To read 2 bits at a time to know whether its 00, 11 , or anything else To determine which duty cycle to bring led to.
			{
				case 0:	// bring led towards 0% duty cycle

				if(Led[ledNum] != 0)
				{
					if(pwmStep == 0)
					{
						if(Led[ledNum] != 255)
						{
							pwmState |= (1 << ledNum);
						}
					}
					else
					{
						if(pwmState &= ~(1 << ledNum))
						{
							Led[ledNum] = b[pwmStep];
						}
						else
						{
							Led[ledNum] = (255-pwmStep);
						}
					}
				}

				break;

				case 3: case 12: case 48: case 192:	// bring led towards 100% duty cycle

				if(Led[ledNum] != 255)
				{
					if(pwmStep == 0)
					{
						if(Led[ledNum] != 0)
						{
							pwmState |= (1 << ledNum);
						}
					}
					else
					{
						if(pwmState &= ~(1 << ledNum))
						{
							Led[ledNum] = a[255-pwmStep];
						}
						else
						{
							Led[ledNum] = pwmStep;
						}
					}
				}

				break;

				default: // bring led towards 20% duty cycle

				if(Led[ledNum] != 51)
				{
					if(pwmStep == 0)
					{
						if(Led[ledNum] != 255)
						{
							pwmState |= (1 << ledNum);
						}
					}
					else
					{
						if(pwmState &= ~(1 << ledNum))
						{
							Led[ledNum] = b[255-pwmStep];
						}
						else
						{
							Led[ledNum] = a[pwmStep];
						}
					}
				}
			}
			ledNum++;	// Used to go to next LED
						// Loop runs 8 times for 8 leds, incrementing or decrementing one led at a time.
		}while (++shift != 4);
	}while (++index != 2);

	if(++pwmStep == 0)	// For going to next LED pattern.
	{
		stepCounter++;
	}

}

int main(void)
{

	while(1)
	{
		if(stepCounter >= MAXSTEPS) // Reset pattern to 0
		{
			stepCounter = 0;
			pwmStep = 0;
			pwmState = 0;
		}
		startEffect();
	}
}

 

 

 

 

This topic has a solution.
Last Edited: Mon. Mar 9, 2020 - 10:53 PM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

The first thing to try is remove the volatile qualifier from your global variables. It's presence will negatively affect execution speed by preventing the optimiser from doing it's magic.

 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

N.Winterbottom wrote:
remove the volatile qualifier

Indeed - there don't seem to be any interrupts involved.

 

Do they even need to be global at all?

Reducing scope can also help optimisation ...

 

N.Winterbottom wrote:
the optimiser

What optimisation setting is currently being used?

Top Tips:

  1. How to properly post source code - see: https://www.avrfreaks.net/comment... - also how to properly include images/pictures
  2. "Garbage" characters on a serial terminal are (almost?) invariably due to wrong baud rate - see: https://learn.sparkfun.com/tutorials/serial-communication
  3. Wrong baud rate is usually due to not running at the speed you thought; check by blinking a LED to see if you get the speed you expected
  4. Difference between a crystal, and a crystal oscillatorhttps://www.avrfreaks.net/comment...
  5. When your question is resolved, mark the solution: https://www.avrfreaks.net/comment...
  6. Beginner's "Getting Started" tips: https://www.avrfreaks.net/comment...
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Heisen wrote:

...atmega328p running at 16MHz...I am getting about 240hz refresh rate for leds, with nothing else running. BAM is taking about 10% of CPU usage (measured with oscilloscope). Interrupt routine takes 1.6us time and the time spent in empty while loop is 14.4us. Measuring with pin toggle.

 

Something feels wrong with your numbers. 240Hz is nearly 4200us but you only have 14.4us left in main()?

 

Perhaps you can explain more?  Plus, you say you are running with interrupts but don;t show any code.

#1 Hardware Problem? https://www.avrfreaks.net/forum/...

#2 Hardware Problem? Read AVR042.

#3 All grounds are not created equal

#4 Have you proved your chip is running at xxMHz?

#5 "If you think you need floating point to solve the problem then you don't understand the problem. If you really do need floating point then you have a problem you do not understand."

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

"1 << N" is a relatively expensive operation. At the very least consider doing it with a lookup table:

uint8_t masks[] = { 0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80 };

int main(void) {
    int n = rand() % 8;
    
    PORTB = (1 << n);
    PORTB = masks[n];
}

I think you'll find the second is quicker than the first.

 

Even if you aren't going to do it like that then at least calculate "1 << ledNum" once at the top of the ledNum++ loop and hold the shifted value in a variable. (also do that if you are using the masks[] lookup too!)

This reply has been marked as the solution. 
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

In-for-a-penny; in-for-a-pound.

For blistering speed why not unroll the LED loop. It will be painful to write, but all those 1<<N statements will become constants. Hey! The compiler may even do the inherent pgm-read-byte() for you if it the constants allow the lookup table read to be determined at compile time.

 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

N.Winterbottom wrote:

The first thing to try is remove the volatile qualifier from your global variables. It's presence will negatively affect execution speed by preventing the optimiser from doing it's magic.

 

Okay, I removed volatile from these statements.

 

volatile uint8_t stepCounter = 0;
volatile uint8_t pwmStep = 0;
volatile uint8_t pwmState = 0;	

It helped, time is reduced to 49.6us from 52us.

 

awneil wrote:

there don't seem to be any interrupts involved.

 

Interrupt is involved I forget to mention, this variable is used in interrupt and as well as in main.

 

volatile uint8_t Led[] = {0,0,0,0,0,0,0,0};

awneil wrote:
What optimisation setting is currently being used?

 

The default one, Optimize (-O1)

 

 

Brian Fairchild wrote:
Something feels wrong with your numbers. 240Hz is nearly 4200us but you only have 14.4us left in main()?

 

Ah, I see where the confusion is, I forgot to mention the interrupt code, here it is.

 

#include <avr/io.h>
#include <avr/interrupt.h>

volatile uint8_t Led[] = {0,0,0,0,0,0,0,0};
volatile uint8_t Interval = 0;
volatile uint8_t Toggle[256];

ISR (TIMER0_OVF_vect)
{
	PORTB = 0b00000001; // For measuring time
	/*-------------------------------------------------------------*/

          //BAM CODE

	/*-------------------------------------------------------------*/
	PORTB = 0b00000000;
}

void BamEnable(void)
{
	TCCR0B = (1 << CS00);
	TIMSK0 = (1 << TOIE0);
	sei();
}

int main(void)
{
	DDRD = 0b11111111;
	DDRB = 0b00000001;
	BamEnable();		// Start timer0 with overflow interrupt

	Led[0] = 128;		// Set PD0 pin to 50% duty cycle

    /* Replace with your application code */
    while (1)
    {

    }
}

 

Here is the refresh rate at pin PD0 for Led[0] running at 244hz

 

 

And here is me probing pin PB0 for ISR time and empty while loop.

 

 

ISR time 1.6us and While loop time 14.35us.

 

clawson wrote:
At the very least consider doing it with a lookup table

 

I used the following lookup table.

 

uint8_t masks[] = { 0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80 };

 

and replaced

 

pwmState |= (1 << ledNum);

 

with this

 

pwmState |= masks[ledNum];

 

This didn't help, it increased time for some reason. Checked in simulator. Weird this shouldn't happen I guess. I must be doing something wrong.

 

N.Winterbottom wrote:
For blistering speed why not unroll the LED loop.

 

I will try this now and see what happens.

 

Last Edited: Tue. Mar 10, 2020 - 11:08 AM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

N.Winterbottom wrote:

 

For blistering speed why not unroll the LED loop.

 

 

It worked. Thanks. execution time dropped to 19us(worst case),  from 52us just by unrolling the loop. Had no idea this would effect this much.

Thanks everyone else for their replies. 

Last Edited: Mon. Mar 9, 2020 - 11:05 PM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Just wanted to say I love blue bimmers.  (I once had an E46 330Ci in Topaz Blue Metallic.)

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Heisen wrote:
I once had an E46 330Ci in Topaz Blue Metallic

Man I love E46 tail lights, such a timeless design. Looks so good in blue.

Last Edited: Tue. Mar 10, 2020 - 11:13 AM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

clawson wrote:
"1 << N" is a relatively expensive operation. At the very least consider doing it with a lookup table:

 

And it seems to be used in all conditions of the switch...case, so I would calculate it only once before the switch...case and place it on a temporary variable.

 

edit: it's possible the compiler is already doing this automatically, in which case there will be no difference.

Last Edited: Tue. Mar 10, 2020 - 12:23 PM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

El Tangas wrote:
edit: it's possible the compiler is already doing this automatically, in which case there will be no difference.
yeah when I said:

clawson wrote:
Even if you aren't going to do it like that then at least calculate "1 << ledNum" once at the top of the ledNum++ loop and hold the shifted value in a variable.
I had the same thought: "smart compiler will hopefully spot that and do it anyway" - but it doesn't hurt to give it a hint!

 

BTW I am very surprised a mask loop costs more than a shift calculation - perhaps that is because it did NOT cache the value in that case but really did do the lookup every time? Though I did also say:

clawson wrote:
(also do that if you are using the masks[] lookup too!)