using LSS output in ASM project


Hey all,

 

This thread is a splinter from this one:

https://www.avrfreaks.net/forum/...

 

In that thread a solution was had to get the LED to blink based on the 'pressure' variable.  And that all works just fine.

 

Here's the Rub....

 

The actual application was written in assembler many years ago.  In my warped way of thinking I did 'blinky' in C, as it was pretty easy to do >>with the help of Joeymorin and the Freaks<<.  Now comes the task of massaging that into the original program.

 

I am thinking of simply using the .lss file output, but in looking at it I can see where I may have some issues.  For example:

			tick = 0;	//clear tick
 1ae:	10 92 08 01 	sts	0x0108, r1	; 0x800108 <tick+0x1>
 1b2:	10 92 07 01 	sts	0x0107, r1	; 0x800107 <tick>
			//run while tick is less than 15 seconds long
			while(tick <= 15000)

The code is storing something in RAM locations 0x0108 and 0x0107.  The original code may already have things stored at those locations.  Would it be better if I declared a variable in the .DSEG, like the ones I already have:

.dseg
dserial:			.byte	 10
dpasscode:			.byte 	  6		;passcode for configuration
dpassent:			.byte     6		;passcode entered
dnodeaddr:			.byte  	  1
rxbuffer:			.byte 	 13
dsetmode:			.byte	  4		;module setmode values
dsensor:			.byte	  2		;16bit sensor data
dgetstatus:			.byte	  1		;sensor status
pingmask:			.byte	  1		;ping mask byte
checksum:			.byte	  2		;ping checksum
bytecount:			.byte	  2		;readburst byte counter
byteccntref:		        .byte	  2		;readburst reference to compare against
tick:				.byte	  2		;tick counter

 

So the assembler can decide where in RAM 'tick' belongs.

Then when I need to store to tick (in this case, clear it), I would write:

		ldi		YH,high(tick)
		ldi		YL,low(tick)
		st		Y+,r1		;clear low byte of tick (assumes r1 holds zero)
		st		Y,r1		;clear high byte of tick

or something along those lines.
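
For what it's worth, the two sts instructions in the .lss are just the low and high bytes of the 16-bit tick being written separately, low byte at the lower address. A minimal host-side sketch of that layout in C follows. The sram array, store_u16 and load_u16 are made-up names for illustration, and 0x0107 is simply the address the C build happened to pick; in the ASM project the assembler would choose its own.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the AVR data space; not real hardware access. */
static uint8_t sram[0x0200];

/* Address taken from the .lss listing; an assumption for this sketch only. */
enum { TICK_ADDR = 0x0107 };

/* Write a 16-bit value as two bytes, low byte at the lower address,
 * which is what the pair of sts instructions does. */
static void store_u16(uint16_t addr, uint16_t value)
{
    assert(addr < sizeof sram - 1u);
    sram[addr]     = (uint8_t)(value & 0xFFu);
    sram[addr + 1] = (uint8_t)(value >> 8);
}

/* Read the two bytes back into one 16-bit value. */
static uint16_t load_u16(uint16_t addr)
{
    assert(addr < sizeof sram - 1u);
    return (uint16_t)(sram[addr] | ((uint16_t)sram[addr + 1] << 8));
}
```

So clearing tick means clearing both bytes, whichever instructions end up doing it.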

 

Jim

 

EDIT:

While wading through the .LSS file I noticed something about the interrupts:

In the C version I have the following for the Timer0 overflow ISR:

ISR(TIMER0_OVF_vect)
{
	// Reinitialize Timer 0 value
	TCNT0=0x44;
	tick++;		//increase tick counter by one
	// Place your code here

}

 

But in the .LSS file I have this for the ISR:

00000090 <__vector_16>:

It's empty.

I have the variable 'tick' declared volatile.  So how is the ISR empty?

 

 

I would rather attempt something great and fail, than attempt nothing and succeed - Fortune Cookie

 

"The critical shortage here is not stuff, but time." - Johan Ekdahl

 

"If you want a career with a known path - become an undertaker. Dead people don't sue!" - Kartman

"Why is there a "Highway to Hell" and only a "Stairway to Heaven"? A prediction of the expected traffic load?"  - Lee "theusch"

 

Speak sweetly. It makes your words easier to digest when at a later date you have to eat them ;-)  - Source Unknown

Please Read: Code-of-Conduct

Atmel Studio6.2/AS7, DipTrace, Quartus, MPLAB user

Last Edited: Thu. Feb 14, 2019 - 06:06 PM

jgmdesign wrote:
The actual application is written in assembler many years ago. In my warped way of thinking I did 'blinky' in C as it was pretty easy to do ... Now the task of massaging that into the original program.
Ouch.  Well, having to shoehorn that into an existing assembler project is going to be a different kettle of fish I expect.

 

Does the existing code employ a systick already?  If so, what's the time base?

 

By the way, I noticed in your test code from the other thread that you were reloading TCNT0 in the ISR instead of using CTC mode.  How old is this project? ;-) ... are you trying to cram this into an AT90S1200? ;-)  ... I suppose not, since you're referring to __vector_16.  That vector as the overflow vector for timer0 suggests something like the ubiquitous ATmega48 family, or a smattering of others.
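
For reference, reloading TCNT0 in the overflow ISR and running the timer in CTC mode can give the same nominal period. The arithmetic can be sketched like this, ignoring interrupt latency and leaving the prescaler and clock out of it since neither is stated here. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Timer counts between overflows when an 8-bit timer is reloaded
 * with `reload` in its overflow ISR (interrupt latency ignored). */
static uint16_t counts_per_overflow(uint8_t reload)
{
    return (uint16_t)(256u - reload);
}

/* TOP value (e.g. for OCR0A) giving the same period in CTC mode,
 * where the timer counts 0..TOP, i.e. TOP + 1 counts per period. */
static uint8_t ctc_top_for_counts(uint16_t counts)
{
    assert(counts >= 1u && counts <= 256u);
    return (uint8_t)(counts - 1u);
}
```

With a reload value of 0x44 that works out to 188 counts per overflow, or a TOP of 187 in CTC mode.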

 

jgmdesign wrote:
But in the .LSS file I have this for the ISR: 00000090 <__vector_16>: Its empty.
Are you sure?  Post the .lss here if you like.
jgmdesign wrote:
I have the variable 'tick' declared volatile. So how is the ISR empty?
It can't be.  Even if tick were completely optimised away, TCNT0 is volatile.

 

jgmdesign wrote:
So the assembler can decide where in RAM 'tick' belongs. Then when I need to store to tick(in this case clear it, I would write:
One of a dozen ways to be sure.  I think you might have a harder time porting the functionality provided by map().

"Experience is what enables you to recognise a mistake the second time you make it."

"Good judgement comes from experience.  Experience comes from bad judgement."

"Wisdom is always wont to arrive late, and to be a little approximate on first possession."

"When you hear hoofbeats, think horses, not unicorns."

"Fast.  Cheap.  Good.  Pick two."

"We see a lot of arses on handlebars around here." - [J Ekdahl]

 


joeymorin wrote:
Does the existing code employ a systick already? If so, what's the time base?

No time base needed in the primary app.  In fact it would only be used for the LED sub-app.

 

joeymorin wrote:
That vector as the overflow vector for timer0 suggests something like the ubiquitous ATmega48 family, or a smattering of other.

Bingo!  Mega48, and this sub-app will probably bring the FLASH to 80% full, so the Mega168 will not be far behind.

 

joeymorin wrote:
Are you sure? Post the .lss here if you like. jgmdesign wrote: I have the variable 'tick' declared volatile. So how is the ISR empty? It can't be. Even if tick were completely optimised away, TCNT0 is volatile.

I have attached the .lss file from the other thread for review

 

joeymorin wrote:
One of a dozen ways to be sure. I think you might have a harder time porting the functionality provided by map().

From what I see in the .lss it's pretty straightforward.

 

 


Doesn't look empty:

00000090 <__vector_16>:

int32_t map(const int32_t x, const int32_t in_min, const int32_t in_max, const int32_t out_min, const int32_t out_max)
{
	uint16_t y = x*16;
	return (y - in_min) * (out_max - out_min) / (in_max - in_min) + out_min;
}
  90:	1f 92       	push	r1
  92:	0f 92       	push	r0
  94:	0f b6       	in	r0, 0x3f	; 63
  96:	0f 92       	push	r0
  98:	11 24       	eor	r1, r1
  9a:	8f 93       	push	r24
  9c:	9f 93       	push	r25
  9e:	84 e4       	ldi	r24, 0x44	; 68
  a0:	86 bd       	out	0x26, r24	; 38
  a2:	80 91 07 01 	lds	r24, 0x0107	; 0x800107 <tick>
  a6:	90 91 08 01 	lds	r25, 0x0108	; 0x800108 <tick+0x1>
  aa:	01 96       	adiw	r24, 0x01	; 1
  ac:	90 93 08 01 	sts	0x0108, r25	; 0x800108 <tick+0x1>
  b0:	80 93 07 01 	sts	0x0107, r24	; 0x800107 <tick>
  b4:	9f 91       	pop	r25
  b6:	8f 91       	pop	r24
  b8:	0f 90       	pop	r0
  ba:	0f be       	out	0x3f, r0	; 63
  bc:	0f 90       	pop	r0
  be:	1f 90       	pop	r1
  c0:	18 95       	reti

Prologue (0x90 to 0x9c)

TCNT0 = 0x44; (0x9e to 0xa0)

tick++; (0xa2 to 0xb0)

Epilogue (0xb4 to 0xc0)

 

Although I grant that the source mapping into the LSS is a bit wonky.  What were the build options, i.e. the full command line?

jgmdesign wrote:
From what I see in the .lss its pretty straight forward.
Just be aware of the GCC ABI.  Called functions will be using call-used registers without regard for preserving/restoring their contents.

00000256 <__divmodsi4>:
00000274 <__divmodsi4_neg2>:
00000282 <__divmodsi4_exit>:
00000284 <__negsi2>:
00000294 <__muluhisi3>:
000002aa <__udivmodsi4>:
000002b6 <__udivmodsi4_loop>:
000002d0 <__udivmodsi4_ep>:
000002ee <__umulhisi3>:
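
As a rough guide to what that means when lifting routines like these into hand-written assembler, here is a small C sketch of how avr-gcc classifies the registers. The enum and function names are mine, purely for illustration; the register groups are the ones given in the avr-gcc ABI documentation, so check them against the toolchain version you build with.

```c
#include <assert.h>

typedef enum {
    REG_FIXED,      /* r0 = scratch (__tmp_reg__), r1 = zero (__zero_reg__) */
    REG_CALL_USED,  /* callee may clobber; the caller saves it if it cares */
    REG_CALL_SAVED  /* callee must restore it before returning */
} reg_class_t;

/* Classify AVR register number 0..31 according to the avr-gcc ABI. */
static reg_class_t avr_gcc_reg_class(unsigned reg)
{
    assert(reg <= 31u);
    if (reg <= 1u) {
        return REG_FIXED;
    }
    if ((reg >= 18u && reg <= 27u) || reg >= 30u) {
        return REG_CALL_USED;   /* r18..r27, r30, r31 */
    }
    return REG_CALL_SAVED;      /* r2..r17, r28, r29 */
}
```

Anything the existing assembler keeps in a call-used register across a call to one of the library routines would need saving first.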


joeymorin wrote:
Although I grant that the source mapping into the LSS is a bit wonky.

I was under the impression that the text you highlighted was used for the MAP function!  Wonky indeed!  Now I have to look at how the MAP function is laid out. - UGH!!

 

joeymorin wrote:
What were the build options i.e. full command line

Here's where my naivete shows.  I don't know, as all I have ever done is hit F7 in Studio and sit back.  Whatever the defaults are.

 

Jim


jgmdesign wrote:
Heres where my naivete shows.  I don't know as all I have ever done is hit F7 in studio and sit back.  Whatever the defaults are.

And here's where my abandonment of Windows a decade ago shows.  I haven't the faintest idea how to guide you towards digging out the build commands from the depths of AS7.

 

Given that your goal is to adapt compiled C into ASM, you might rethink the C code a bit to help minimise your work.

 

For example, the map() function as it is won't optimise much.  Although it has been inlined by the compiler (even though we didn't declare it as 'static'), it will call 32-bit multiply and divide routines.  We can do a bit better.  Powers of two can be a friend, especially when dividing.

 

If we change the range between in_min and in_max to be a power of two, then division can be a simple matter of a few shifts.  We're nearly there as is, since the range is 0-16,383.  I'll work with my own test code for now.  With 16,383:

    pressure = ADC << 4;
 156:	c0 91 78 00 	lds	r28, 0x0078	; 0x800078 <__TEXT_REGION_LENGTH__+0x7e0078>
 15a:	d0 91 79 00 	lds	r29, 0x0079	; 0x800079 <__TEXT_REGION_LENGTH__+0x7e0079>
    half_period_ms = map(pressure, 0, 16383, 500, 50);

    now = millis();
 15e:	cf df       	rcall	.-98     	; 0xfe <millis>
 160:	6b 01       	movw	r12, r22
 162:	7c 01       	movw	r14, r24
    if ((now - last_toggle_ms) > half_period_ms) {
 164:	de 01       	movw	r26, r28
 166:	a2 95       	swap	r26
 168:	b2 95       	swap	r27
 16a:	b0 7f       	andi	r27, 0xF0	; 240
 16c:	ba 27       	eor	r27, r26
 16e:	a0 7f       	andi	r26, 0xF0	; 240
 170:	ba 27       	eor	r27, r26
 172:	2e e3       	ldi	r18, 0x3E	; 62
 174:	3e ef       	ldi	r19, 0xFE	; 254
 176:	4f ef       	ldi	r20, 0xFF	; 255
 178:	5f ef       	ldi	r21, 0xFF	; 255
 17a:	2e d0       	rcall	.+92     	; 0x1d8 <__muluhisi3>
 17c:	a5 01       	movw	r20, r10
 17e:	94 01       	movw	r18, r8
 180:	0f d0       	rcall	.+30     	; 0x1a0 <__divmodsi4>
 182:	ba 01       	movw	r22, r20
 184:	a9 01       	movw	r20, r18
 186:	4c 50       	subi	r20, 0x0C	; 12
 188:	5e 4f       	sbci	r21, 0xFE	; 254
 18a:	6f 4f       	sbci	r22, 0xFF	; 255
 18c:	7f 4f       	sbci	r23, 0xFF	; 255
 18e:	c6 01       	movw	r24, r12
 190:	80 1b       	sub	r24, r16
 192:	91 0b       	sbc	r25, r17
 194:	48 17       	cp	r20, r24
 196:	59 07       	cpc	r21, r25
 198:	a8 f6       	brcc	.-86     	; 0x144 <main+0x2e>

 

If we force 16,384:

    pressure = ADC << 4;
    half_period_ms = map(pressure, 0, 16384, 500, 50);

... then the size of the (my) executable drops by 110 bytes.  That's because the calls to __div*** have been dropped:

    pressure = ADC << 4;
 14a:	00 91 78 00 	lds	r16, 0x0078	; 0x800078 <__TEXT_REGION_LENGTH__+0x7e0078>
 14e:	10 91 79 00 	lds	r17, 0x0079	; 0x800079 <__TEXT_REGION_LENGTH__+0x7e0079>
    half_period_ms = map(pressure, 0, 16384, 500, 50);

    now = millis();
 152:	d5 df       	rcall	.-86     	; 0xfe <millis>
 154:	6b 01       	movw	r12, r22
 156:	7c 01       	movw	r14, r24
 158:	f6 2f       	mov	r31, r22
 15a:	ed 2d       	mov	r30, r13
    if ((now - last_toggle_ms) > half_period_ms) {
 15c:	d8 01       	movw	r26, r16
 15e:	a2 95       	swap	r26
 160:	b2 95       	swap	r27
 162:	b0 7f       	andi	r27, 0xF0	; 240
 164:	ba 27       	eor	r27, r26
 166:	a0 7f       	andi	r26, 0xF0	; 240
 168:	ba 27       	eor	r27, r26
 16a:	2e e3       	ldi	r18, 0x3E	; 62
 16c:	3e ef       	ldi	r19, 0xFE	; 254
 16e:	4f ef       	ldi	r20, 0xFF	; 255
 170:	5f ef       	ldi	r21, 0xFF	; 255
 172:	1d d0       	rcall	.+58     	; 0x1ae <__muluhisi3>
 174:	97 fd       	sbrc	r25, 7
 176:	16 c0       	rjmp	.+44     	; 0x1a4 <main+0x8e>
 178:	dc 01       	movw	r26, r24
 17a:	cb 01       	movw	r24, r22
 17c:	2e e0       	ldi	r18, 0x0E	; 14
 17e:	b5 95       	asr	r27
 180:	a7 95       	ror	r26
 182:	97 95       	ror	r25
 184:	87 95       	ror	r24
 186:	2a 95       	dec	r18
 188:	d1 f7       	brne	.-12     	; 0x17e <main+0x68>
 18a:	8c 50       	subi	r24, 0x0C	; 12
 18c:	9e 4f       	sbci	r25, 0xFE	; 254
 18e:	af 4f       	sbci	r26, 0xFF	; 255
 190:	bf 4f       	sbci	r27, 0xFF	; 255
 192:	cc 1a       	sub	r12, r28
 194:	dd 0a       	sbc	r13, r29
 196:	8c 15       	cp	r24, r12
 198:	9d 05       	cpc	r25, r13
 19a:	70 f6       	brcc	.-100    	; 0x138 <main+0x22>
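
The reason the divide disappears can be sketched on a host in C. This is only an illustration under my own assumptions: map_div and map_shift are hypothetical names, the input is taken to be the 14-bit pressure (0 to 16383), and the output range is fixed at 500 down to 50.

```c
#include <assert.h>
#include <stdint.h>

/* General linear map; needs a true 32-bit divide. */
static int32_t map_div(int32_t x, int32_t in_min, int32_t in_max,
                       int32_t out_min, int32_t out_max)
{
    assert(in_max != in_min);
    return (x - in_min) * (out_max - out_min) / (in_max - in_min) + out_min;
}

/* Special case: input span of exactly 16384 (2^14), output 500 down to 50.
 * Working on the positive "drop" lets a plain right shift stand in for
 * the divide without worrying about rounding of negative values. */
static int32_t map_shift(uint16_t pressure)
{
    uint32_t drop = ((uint32_t)pressure * 450u) >> 14;  /* divide by 16384 */
    return 500 - (int32_t)drop;
}
```

Signed division rounds toward zero while an arithmetic shift rounds down, which is presumably why the compiled code above tests the sign bit before its shift loop. Doing the arithmetic on the positive difference sidesteps that.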

 

However, given the actual requirements, i.e. cue the user with speed of flashing vs. super-accurate mapping of pressure onto half-period, this may be a case where a series of if() statements, or a LUT, will simplify matters on the assembler side of things.

 

Attached is an Excel file which maps in the reverse sense.  It yields the following:

 

Which lets me do this:

half_period = 50;
if (pressure <= 15290) { half_period += 30; }
if (pressure <= 14198) { half_period += 30; }
if (pressure <= 13106) { half_period += 30; }
if (pressure <= 12014) { half_period += 30; }
if (pressure <= 10922) { half_period += 30; }
if (pressure <= 9829) { half_period += 30; }
if (pressure <= 8737) { half_period += 30; }
if (pressure <= 7645) { half_period += 30; }
if (pressure <= 6553) { half_period += 30; }
if (pressure <= 5461) { half_period += 30; }
if (pressure <= 4368) { half_period += 30; }
if (pressure <= 3276) { half_period += 30; }
if (pressure <= 2184) { half_period += 30; }
if (pressure <= 1092) { half_period += 30; }
if (pressure <= 0) { half_period += 30; }

Using that instead of the expression in map() in my own test code:

static uint16_t map_pressure_to_half_period(uint16_t pressure)
{
  uint16_t half_period = 50;
  if (pressure <= 15290) { half_period += 30; }
  if (pressure <= 14198) { half_period += 30; }
  if (pressure <= 13106) { half_period += 30; }
  if (pressure <= 12014) { half_period += 30; }
  if (pressure <= 10922) { half_period += 30; }
  if (pressure <= 9829) { half_period += 30; }
  if (pressure <= 8737) { half_period += 30; }
  if (pressure <= 7645) { half_period += 30; }
  if (pressure <= 6553) { half_period += 30; }
  if (pressure <= 5461) { half_period += 30; }
  if (pressure <= 4368) { half_period += 30; }
  if (pressure <= 3276) { half_period += 30; }
  if (pressure <= 2184) { half_period += 30; }
  if (pressure <= 1092) { half_period += 30; }
  if (pressure <= 0) { half_period += 30; }
  return half_period;
}

... and then:

    pressure = ADC << 4;
    half_period_ms = map_pressure_to_half_period(pressure);

This has actually increased flash usage to more than even our very first attempt, even with -Os (I think the compiler is trying to be a bit too clever), but in assembler we can do a lot better.  Note that I've also included pressure/256 in the Excel file, so you need only compare the high byte.  The inaccuracy shouldn't matter to the naked ape:

; r25/r24: half-period
; r17: pressure MSB
	ldi	r25, 0	; quarter-period of 25 ms
	ldi	r24, 25	; we work with quarter period to stay under 16-bits

	cpi	r17, 63
	brsh	skip_0
	subi	r24, -15
skip_0:
	cpi	r17, 59
	brsh	skip_1
	subi	r24, -15
skip_1:
	cpi	r17, 55
	brsh	skip_2
	subi	r24, -15
skip_2:
	cpi	r17, 51
	brsh	skip_3
	subi	r24, -15
skip_3:
	cpi	r17, 46
	brsh	skip_4
	subi	r24, -15
skip_4:
	cpi	r17, 42
	brsh	skip_5
	subi	r24, -15
skip_5:
	cpi	r17, 38
	brsh	skip_6
	subi	r24, -15
skip_6:
	cpi	r17, 34
	brsh	skip_7
	subi	r24, -15
skip_7:
	cpi	r17, 29
	brsh	skip_8
	subi	r24, -15
skip_8:
	cpi	r17, 25
	brsh	skip_9
	subi	r24, -15
skip_9:
	cpi	r17, 21
	brsh	skip_A
	subi	r24, -15
skip_A:
	cpi	r17, 17
	brsh	skip_B
	subi	r24, -15
skip_B:
	cpi	r17, 12
	brsh	skip_C
	subi	r24, -15
skip_C:
	cpi	r17, 8
	brsh	skip_D
	subi	r24, -15
skip_D:
	cpi	r17, 4
	brsh	skip_E
	subi	r24, -15
skip_E:
	cpi	r17, 0
	brsh	skip_F
	subi	r24, -15
skip_F:
	lsl	r24	; double the quarter-period for the half-period
	rol	r25

I think I got the comparisons the right way round.
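
One way to check the comparisons without a target is to model the ladder in C on a host and run it against the if() chain. This is a sketch, not the code from my test project: the table and function names are invented here, and the final "cpi r17, 0" rung is left out because an unsigned byte is never lower than 0, so it can never add anything.

```c
#include <assert.h>
#include <stdint.h>

/* Thresholds on the pressure high byte, copied from the cpi ladder. */
static const uint8_t msb_rungs[15] = {
    63, 59, 55, 51, 46, 42, 38, 34, 29, 25, 21, 17, 12, 8, 4
};

/* 16-bit thresholds copied from the if() chain. */
static const uint16_t full_rungs[15] = {
    15290, 14198, 13106, 12014, 10922, 9829, 8737, 7645,
    6553, 5461, 4368, 3276, 2184, 1092, 0
};

/* Model of the assembler: quarter-period in 8 bits, doubled at the end. */
static uint16_t half_period_from_msb(uint8_t msb)
{
    uint8_t quarter = 25;
    for (unsigned i = 0; i < 15; i++) {
        if (msb < msb_rungs[i]) {            /* brsh skips, so add when lower */
            quarter = (uint8_t)(quarter + 15);
        }
    }
    return (uint16_t)(quarter * 2u);
}

/* Model of the C if() chain on the full 16-bit pressure. */
static uint16_t half_period_from_pressure(uint16_t pressure)
{
    uint16_t half = 50;
    for (unsigned i = 0; i < 15; i++) {
        if (pressure <= full_rungs[i]) {
            half = (uint16_t)(half + 30);
        }
    }
    return half;
}
```

By my reckoning the high-byte version lands on the same step as the if() chain, or one 30 ms step slower, across 0 to 16383. Both give 50 at the top of the range and 500 at zero.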

 

The entirety of code to compute the half-period from the pressure is 104 bytes.  Of course, that doesn't count the systick, nor the I2C code.  That's not necessarily better than just using the linear transformation in map(), with the power-of-two tweak above, but it is possibly easier to deal with, and certainly uses fewer registers.

 

EDIT:  fixed code


 

Last Edited: Fri. Feb 15, 2019 - 12:34 AM

Joey,

Thanks for all the effort but I think you are over thinking....

 

I just looked at the build output and as-is, with the ADC code >which is not needed<, the test program takes up less than 800 bytes.  That's not a problem, so no need to go nutz trying to reduce code size.

 

As far as time-to-pressure accuracy goes, again that's not mission critical either.  If it's off by 150 milliseconds NO ONE will notice.

 

So using map() is fine with me.  I just gotta get things in the right spots - smiley

 

Jim


jgmdesign wrote:
Thanks for all the effort
Minimal effort.

 

jgmdesign wrote:
I just looked at teh build output and as-is, with the ADC code >which is not needed< the test program takes up less than 800 bytes. Thats not a problem so no need to go nutz trying to reduce code size.
Fair enough.

 

jgmdesign wrote:
As far as time to pressure accuracy, again thats not mission critical either. If its off by 150 milliseconds NO ONE will notice.
I figured.  That's why binning the transformation into 16 steps seemed good enough.

 

jgmdesign wrote:
I just gotta get things in the right spots
Have fun!

 

Last Edited: Fri. Feb 15, 2019 - 12:36 AM

Instead of trying to shoehorn things together, why not just take the concept & add it to the existing assembler code?  Setting up the timers in asm is very straightforward (especially if you have a scope or logic analyzer): just a few values to the registers, maybe a few control bits to enable interrupts (which could update a tick count, or toggle the LED).  The timer can run at a constant ("fast", like 0.5 ms) tick rate, then the desired amount of time can be counted off in some register (toggle blink in xx ticks)... changing xx then changes the blink rate.  Or the hardware timer rate itself (OCR values) can be adjusted, if this remains the timer's sole purpose.
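
The counting idea can be sketched in a few lines of C before committing it to assembler. This is a host-side model only; blinker_t and blinker_tick are names invented for this sketch, and the tick is assumed to come from whatever timer interrupt is set up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint8_t ticks;      /* ticks counted since the last toggle */
    uint8_t toggle_at;  /* "xx": number of ticks per half period */
    bool    led_on;     /* stand-in for the LED pin state */
} blinker_t;

/* Call once per timer tick, e.g. from the timer ISR. */
static void blinker_tick(blinker_t *b)
{
    assert(b != 0);
    b->ticks++;
    if (b->ticks >= b->toggle_at) {
        b->led_on = !b->led_on;   /* toggle the LED */
        b->ticks = 0;             /* start counting the next half period */
    }
}
```

Changing toggle_at from the main loop changes the blink rate, and the timer setup itself never has to be touched.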

When in the dark remember-the future looks brighter than ever.


avrcandies wrote:
Or the hardware timer rate itself (OCR values) can be adjusted, if this remains the timer's sole purpose.
That was the topic of Jim's other thread.  The problem with a hardware PWM approach is that the compare register is double-buffered, so feedback to the user for an adjustment made to the pressure could be delayed by a full period or half period.  At 1 Hz, that's as much as a second, which was deemed too slow for the application.  Using a non-PWM mode to toggle introduces other issues i.e. glitches due to missing a compare match.

I ginned up this code... seems to be "ok" for a quickie. It fakes steady ON by toggling super fast at the setpoint, which actually gives a slight shift in brightness, so you can tell you hit the mark!

As you vary the pot (ADC3 vs Vcc), the blink rate varies nicely (though it might need a rough re-scale for a different setpoint).

 

note:  rjmp TIM1_COMPA ;NOT USED  ....Timer1 Compare A Handler IS used, the comment is outdated

 

This code blinks the "red" colored led on PB0
 

;by Hoyt Clagwell 2/2019
;set for tiny44a, with adc on
;freely usable by anyone

.equ my_setpoint = 600  ;desired setpoint (0-1023 adc counts)
.equ my_setpoint_window = 20 ;exceeding this delta, start flashing
.equ tick_period=4 ;4ms tick timer, lower is faster blinking (5ms, 4, 3.7, 2.5, etc)

		 ; usage: InReg reg, addr 

.macro InReg 

    .if @1 < 0x40
        in @0, @1
    .elif ((@1 >= 0x60) && (@1 < SRAM_START))
        lds @0,@1
    .else
       .error "InReg: Invalid I/O register address"
    .endif 

.endmacro 

; usage: OutReg addr, reg 

.macro OutReg 

    .if @0 < 0x40
        out @0, @1
    .elif ((@0 >= 0x60) && (@0 < SRAM_START))
        sts @0,@1
    .else
       .error "OutReg: Invalid I/O register address"
    .endif 

.endmacro

.cseg			;code segment @ 0x0000
.org 0x0000		;set program address to ISR vector table

		rjmp RESET ; Reset Handler
		rjmp INT0IRQ ;NOT USED IRQ0 Handler
		rjmp PCINT0IRQ ;NOT USED  PCINT0 Handler
		rjmp PCINT1IRQ ;NOT USED  PCINT1 Handler
		rjmp WDT ;NOT USED  Watchdog Interrupt Handler
		rjmp TIM1_CAPT ;NOT USED  Timer1 Capture Handler
		rjmp TIM1_COMPA ;NOT USED  Timer1 Compare A Handler
		rjmp TIM1_COMPB ;NOT USED  Timer1 Compare B Handler
		rjmp TIM1_OVF ;NOT USED  Timer1 Overflow Handler
		rjmp TIM0_COMPA ;NOT USED  Timer0 Compare A Handler
		rjmp TIM0_COMPB ;NOT USED  Timer0 Compare B Handler
		rjmp TIM0_OVF ;NOT USED  Timer0 Overflow Handler
		rjmp ANA_COMP ;NOT USED  Analog Comparator Handler
		rjmp ADCIRQ ;NOT USED  ADC Conversion Handler
		rjmp EE_RDY ;NOT USED  EEPROM Ready Handler
		rjmp USI_STR ;NOT USED  USI STart Handler
		rjmp USI_OVF ;NOT USED  USI Overflow Handler

INT0IRQ: ; IRQ0 Handler
PCINT0IRQ: ; PCINT0 Handler
PCINT1IRQ: ; PCINT1 Handler
WDT: ; Watchdog Interrupt Handler
TIM1_CAPT: ; Timer1 Capture Handler
;TIM1_COMPA: ; Timer1 Compare A Handler
TIM1_COMPB: ; Timer1 Compare B Handler
TIM1_OVF: ; Timer1 Overflow Handler
TIM0_COMPA: ; Timer0 Compare A Handler
TIM0_COMPB: ; Timer0 Compare B Handler
TIM0_OVF: ; Timer0 Overflow Handler
ANA_COMP: ; Analog Comparator Handler
ADCIRQ: ; ADC Conversion Handler
EE_RDY: ; EEPROM Ready Handler
USI_STR: ; USI STart Handler
USI_OVF: ; USI Overflow Handler

here:  rjmp here	;wait for the train if invalid irq

;**************Register Usage**********************
.def zero=r2			;used everywhere as zero as needed
.def led_ticks=r16		;led blink timer
.def led_rate=r17		;sets blink rate 0=const on...255=slowest blink
.def temp=r20			;general register
;NOTE XL=r26 XH=r27 YL=r28 YH=r29 ZL=r30 ZH=r31
;======================================
reset:		clr zero
			ldi ZL, high(ramend)
			out SPH, ZL
			ldi ZL, low(ramend)
			out SPL, ZL

		 	ldi ZL, $D0
            out DDRA, ZL          ;set porta data pin directions
			ldi ZL, (1<<PA_SWITCH)   ;turn on pullup

			out PORTA, ZL 

			ldi ZL, (0<<ADC7D)|(0<<ADC6D)|(0<<ADC5D)|(0<<ADC4D)|(1<<ADC3D)|(0<<ADC2D)|(0<<ADC1D)|(0<<ADC0D)
			outreg DIDR0, ZL  ;set pin for analog mode

.equ PA_AREF=PORTA0		;0 PA0=AREF, ADC ref filter
.equ PA_SWITCH=PORTA1	;0 PA1=USER SWITCH
.equ PA_SENSE3=PORTA2	;0 PA2=SENSE3, NOT USED
.equ PA_SENSE1=PORTA3	;0 ADC3=SENSE1, SENSOR INPUT
;
.equ PA_SCK=PORTA4		;1 PA4=SPARE/SCK/SCL
.equ PA_EXTCAL=PORTA5	;0 PA5=EXTERNAL CAL, MISO
.equ PA_xxx=PORTA6		;1 PA6=xxxx CONTROL, MOSI
.equ PA_yyy=PORTA7		;1 PA7=yyy PWM, OC0B

		 	ldi ZL, $07
            out DDRB, ZL          ;set portb data pin directions
			out PORTB, zero ;turn all off

.equ PB_LEDRED0=PORTB0	;1 PB0=LED INDICATOR, D7 inner edge of pcb		

.equ PB_LEDYEL1=PORTB1	;1 PB1=LED INDICATOR, D8 middle
.equ PB_LEDGRN2=PORTB2	;1 PB2=LED INDICATOR, D9 front corner of pcb

.equ PB_RESET=PORTB3	;0 PB3=RESET

;=========set up peripherals===================
		;ADC
			ldi ZL, (0<<REFS1)|(0<<REFS0)|(0<<MUX5)|(0<<MUX4)|(0<<MUX3)|(0<<MUX2)|(1<<MUX1)|(1<<MUX0) ;use Vcc ref, cha PA3,  arrangement for tiny44a
			OUTREG ADMUX, ZL
			ldi ZL, (0<<BIN)|(0<<ACME)|(0<<ADLAR)|(0<<ADTS2)|(0<<ADTS1)|(0<<ADTS0)
			OUTREG ADCSRB, ZL

		;TIMER1
			ldi ZL, (0<<COM1A1)|(0<<COM1A0)|(0<<COM1B1)|(0<<COM1B0)|(0<<WGM11)|(0<<WGM10) ;MODE: 4 CTC mode
			OUTREG TCCR1A, ZL
			ldi ZL, (0<<ICNC1)|(0<<ICES1)|(0<<WGM13)|(1<<WGM12)|(0<<CS12)|(0<<CS11)|(1<<CS10) ;clock timer at CLK/1
			OUTREG TCCR1B, ZL
			OUTREG TCCR1C, zero
			ldi ZL, (0<<ICIE1)|(0<<OCIE1B)|(1<<OCIE1A)|(0<<TOIE1)	;IRQs enable: Output compare A
			OUTREG TIMSK1, ZL
			OUTREG TCNT1H, zero  ;MUST write high byte first, must read LOW byte first
			OUTREG TCNT1L, zero
			ldi ZL, high(tick_period*1000)
			OUTREG OCR1AH, ZL	;MUST write high byte first, must read LOW byte first
			ldi ZL, low(tick_period*1000)
			OUTREG OCR1AL, ZL   ;set up tick

			clr led_ticks
			clr led_rate; gives constant on appearance
			sei			;enable global IRQ

loop:		rcall read_adc ;0-1023  returned in ZH:ZL
			ldi XL, high(my_setpoint)
			cpi ZL, low(my_setpoint)
			cpc ZH, XL
			brlo no_flash ;if < setpoint, quit flashing

			ldi XL, high(my_setpoint+my_setpoint_window)
			cpi ZL, low(my_setpoint+my_setpoint_window)
			cpc ZH, XL
			brlo loop ;if within window, maintain (flash or no flash) 

			subi ZL, low(my_setpoint)  ;otherwise update a flash rate
			sbci ZH, high(my_setpoint) ;by calculating the delta

			com ZH  ;reverse the subtraction (delta)
 			neg ZL
			sbci ZH, 255   ;need bigger delta==>smaller number

			lsr ZH ;optional, scale the error into a freq (divide numb by 2)
			ror ZL  ;optional, scale the error into a freq (divide numb by 2)

			mov led_rate, ZL ;send result to update blink rate (bigger#-->slower blink)
			rjmp loop

no_flash:	clr led_rate ; flashes so fast appears steady on
			rjmp loop

;timer IRQ tick (approx every 4ms)
TIM1_COMPA:	push ZL		;save ZL register during interrupt
			in ZL, sreg	;save CPU status bits during interrupt

handle_leds:inc led_ticks
			cp led_ticks, led_rate
			brlo exit_irq	

			sbi PINB, PB_LEDRED0  ;toggle red led
			clr led_ticks

exit_irq:	out sreg, ZL	;return with pre-interrupt status bits
			pop ZL			;return ZL to pre-interrupt value
			reti			;return from timer1 compareA interrupt routine

;returns 10 bit single ADC reading in ZH:ZL
read_adc:	ldi ZL, (1<<ADEN)|(1<<ADSC)|(0<<ADATE)|(0<<ADIF)|(0<<ADIE)|(1<<ADPS2)|(1<<ADPS1)|(0<<ADPS0);start conv, divide clk by 64
			outreg ADCSRA, ZL

wait_conv:	inreg ZL, ADCSRA
			sbrc ZL, ADSC
			rjmp wait_conv ;stay here until conversion complete
			inreg ZL, ADCL ;low byte must be read first
			inreg ZH, ADCH
			ret
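
For anyone who would rather read the main loop's arithmetic in C, here is a rough host-side model of how led_rate is derived from the ADC reading. It is a sketch for illustration only: next_led_rate is a made-up name, and the constants simply mirror my_setpoint and my_setpoint_window above.

```c
#include <assert.h>
#include <stdint.h>

enum { SETPOINT = 600, WINDOW = 20 };  /* my_setpoint, my_setpoint_window */

/* Returns the new led_rate for a 10-bit ADC reading.
 * current_rate is what led_rate holds before the update. */
static uint8_t next_led_rate(uint16_t adc, uint8_t current_rate)
{
    assert(adc <= 1023u);
    if (adc < SETPOINT) {
        return 0;                    /* no_flash: toggles on every tick */
    }
    if (adc < SETPOINT + WINDOW) {
        return current_rate;         /* inside the window: leave it alone */
    }
    uint16_t delta   = (uint16_t)(adc - SETPOINT);   /* subi/sbci */
    uint16_t negated = (uint16_t)(0u - delta);       /* com/neg/sbci */
    return (uint8_t)(negated >> 1);  /* lsr/ror, then keep the low byte */
}
```

With these constants a reading of 620 comes out at 246 (slow blink) and a full-scale 1023 at 44 (fast blink), so a bigger delta gives a smaller number, as the comment in the loop says.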

 

 


Last Edited: Fri. Feb 15, 2019 - 03:56 AM

BTW can I make a shameless self plug and mention https://spaces.microchip.com/gf/... which is a possibly "better" way to steal C generated Asm than trying to hack opcodes out of LSS files !


clawson wrote:

BTW can I make a shameless self plug and mention https://spaces.microchip.com/gf/... which is a possibly "better" way to steal C generated Asm than trying to hack opcodes out of LSS files !

 

This looks promising, but I am a little confused at this part:

The utility presented here can be run as a post-build step on each .s file that is created by -g -save-temps and it then embeds the C source lines referred to by .loc's into a new file it generated (for file.s it produces file.source.s). It also strips out a lot of internal labels and other "housekeeping" to make the .source.s file much easier to read than the plain .s file.

I looked in my project debug directory and see no .s file.  I am guessing that I need to tell Studio to do this "-g -save-temps" along with some other items, but it's not something I have done.

 

Any tips?

 

Jim

 

EDIT:

BTW can I make a shameless self plug and mention

By all means!

 

EDIT_EDIT:

I think I got it....I had to go into Project Properties>Toolchain>AVR/GNU C Compiler>Miscellaneous and click the "Do not delete temporary files (-save-temps)" option.

 

Thanks

 


Last Edited: Wed. Feb 20, 2019 - 06:53 PM

Cliff's little app seems to work well, but I am a little confused.  I have the .s file, ran the utility on it, and compared the output of the utility with the .lss.  Much of it mimics the .lss, but it's missing some of the subfunctions called from the map() function.  They appear in the .lss, but not in the utility output.  I have attached both files if someone could take a look and maybe point to what I am missing.

 

Jim


jgmdesign wrote:
  In much of it, it mimics the .lss, but its missing some of the subfunctions called out in the MAP function. 
Remember that your original LSS:

00000056 <__vector_16>:

int32_t map(const int32_t x, const int32_t in_min, const int32_t in_max, const int32_t out_min, const int32_t out_max)
{
	uint16_t y = x*16;
	return (y - in_min) * (out_max - out_min) / (in_max - in_min) + out_min;
}
  56:	1f 92       	push	r1
  58:	0f 92       	push	r0
  5a:	0f b6       	in	r0, 0x3f	; 63
  5c:	0f 92       	push	r0
  5e:	11 24       	eor	r1, r1

Here the source for map() appears to sit inside __vector_16, but that is just avr-objdump (the disassembler that reads the ELF and produces the LSS) having a "bad hair day". That piece of C source does not relate to the __vector_16 code and those lines of annotation are placed in error. The actual code of __vector_16 in the LSS is:

  56:	1f 92       	push	r1
  58:	0f 92       	push	r0
  5a:	0f b6       	in	r0, 0x3f	; 63
  5c:	0f 92       	push	r0
  5e:	11 24       	eor	r1, r1
  60:	8f 93       	push	r24
  62:	9f 93       	push	r25
  64:	84 e4       	ldi	r24, 0x44	; 68
  66:	86 bd       	out	0x26, r24	; 38
  68:	80 91 07 01 	lds	r24, 0x0107	; 0x800107 <tick>
  6c:	90 91 08 01 	lds	r25, 0x0108	; 0x800108 <tick+0x1>
  70:	01 96       	adiw	r24, 0x01	; 1
  72:	90 93 08 01 	sts	0x0108, r25	; 0x800108 <tick+0x1>
  76:	80 93 07 01 	sts	0x0107, r24	; 0x800107 <tick>
  7a:	9f 91       	pop	r25
  7c:	8f 91       	pop	r24
  7e:	0f 90       	pop	r0
  80:	0f be       	out	0x3f, r0	; 63
  82:	0f 90       	pop	r0
  84:	1f 90       	pop	r1
  86:	18 95       	reti

And the code in the annotated .s file is:

	.section	.text.__vector_16,"ax",@progbits
.global	__vector_16
	.type	__vector_16, @function
__vector_16:
//==> {
	push r1
	push r0
	in r0,__SREG__
	push r0
	clr __zero_reg__
	push r24
	push r25
/* prologue: Signal */
/* frame size = 0 */
/* stack size = 5 */
.L__stack_usage = 5
//==> 	TCNT0=0x44;
	ldi r24,lo8(68)
	out 0x26,r24
//==> 	tick++;		//increase tick counter by one
	lds r24,tick
	lds r25,tick+1
	adiw r24,1
	sts tick+1,r25
	sts tick,r24
/* epilogue start */
//==> }
	pop r25
	pop r24
	pop r0
	out __SREG__,r0
	pop r0
	pop r1
	reti
	.size	__vector_16, .-__vector_16

Unless my eyes fail me, that is 100% the same code. (Sure, "clr r1" is disassembled as "eor r1,r1", but that's simply because "clr" is a "fake instruction": CLR Rd is just another name for EOR Rd,Rd, so the two share one opcode.)
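
To make the clr/eor point concrete, here is a minimal C sketch (the function names are made up for illustration) that builds the opcode for EOR Rd,Rr from the bit layout given in the AVR instruction set manual, 0010 01rd dddd rrrr. CLR has no encoding of its own, so the sketch simply assembles it as EOR Rd,Rd.

```c
#include <assert.h>
#include <stdint.h>

/* Encode "EOR Rd,Rr" using the layout 0010 01rd dddd rrrr.
   Hypothetical helper, not part of any toolchain. */
uint16_t avr_encode_eor(unsigned rd, unsigned rr)
{
    assert(rd < 32u && rr < 32u);             /* AVR has r0..r31 */
    return (uint16_t)(0x2400u
                      | ((rr & 0x10u) << 5)   /* bit 4 of Rr -> opcode bit 9   */
                      | ((rd & 0x1Fu) << 4)   /* Rd          -> opcode bits 8..4 */
                      | (rr & 0x0Fu));        /* low nibble of Rr -> bits 3..0 */
}

/* "CLR Rd" is assembled as "EOR Rd,Rd": same opcode, different spelling. */
uint16_t avr_encode_clr(unsigned rd)
{
    return avr_encode_eor(rd, rd);
}
```

avr_encode_clr(1) comes out as 0x2411, which is the "11 24" byte pair (low byte first) at address 5e in the LSS. A disassembler only sees the opcode, so it has no way to know the source said clr.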

 

The annotated source is also (IMAO anyway) much more "readable" than the disassembly. What's more, it can be used directly as avr-as source and, with a little massage, as Atmel-Asm source. Clearly you have:

ISR(whatever) {
    TCNT0 = 0x44;
    tick++;
}

(that does raise the question of whether timer0 might be better in CTC mode by the way)
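
On the CTC question, a rough sketch of the arithmetic. It assumes Timer0 is in normal mode with the overflow ISR reloading TCNT0 as shown, and it ignores the few cycles of interrupt latency before the reload takes effect. The helper names are hypothetical.

```c
#include <stdint.h>

/* Timer ticks between overflows when an 8-bit timer is reloaded with
   `reload` in its overflow ISR (interrupt latency ignored). */
unsigned ticks_per_overflow(uint8_t reload)
{
    return 256u - reload;
}

/* Compare value that gives the same period in CTC mode, where the
   counter runs 0..compare inclusive, i.e. compare + 1 ticks. */
uint8_t ctc_compare_for_reload(uint8_t reload)
{
    return (uint8_t)(ticks_per_overflow(reload) - 1u);
}
```

For the 0x44 reload this works out to 188 ticks per interrupt, so a compare value of 187. In CTC mode the hardware clears the counter itself, so the ISR would shrink to just tick++ (and the interrupt would come from the compare match rather than the overflow).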

 

The true use of map() in your code actually appears to be in:

				period_ms = map(pressure, 0, 16383, 500, 50);
 18a:	a0 91 03 01 	lds	r26, 0x0103	; 0x800103 <pressure>
 18e:	b0 91 04 01 	lds	r27, 0x0104	; 0x800104 <pressure+0x1>
 192:	a2 95       	swap	r26
 194:	b2 95       	swap	r27
 196:	b0 7f       	andi	r27, 0xF0	; 240
 198:	ba 27       	eor	r27, r26
 19a:	a0 7f       	andi	r26, 0xF0	; 240
 19c:	ba 27       	eor	r27, r26
 19e:	2e e3       	ldi	r18, 0x3E	; 62
 1a0:	3e ef       	ldi	r19, 0xFE	; 254
 1a2:	4f ef       	ldi	r20, 0xFF	; 255
 1a4:	5f ef       	ldi	r21, 0xFF	; 255
 1a6:	59 d0       	rcall	.+178    	; 0x25a <__muluhisi3>
 1a8:	a7 01       	movw	r20, r14
 1aa:	96 01       	movw	r18, r12
 1ac:	3a d0       	rcall	.+116    	; 0x222 <__divmodsi4>
 1ae:	da 01       	movw	r26, r20
 1b0:	c9 01       	movw	r24, r18
 1b2:	8c 50       	subi	r24, 0x0C	; 12
 1b4:	9e 4f       	sbci	r25, 0xFE	; 254
 1b6:	af 4f       	sbci	r26, 0xFF	; 255
 1b8:	bf 4f       	sbci	r27, 0xFF	; 255
 1ba:	90 93 0a 01 	sts	0x010A, r25	; 0x80010a <period_ms+0x1>
 1be:	80 93 09 01 	sts	0x0109, r24	; 0x800109 <period_ms>

which would appear to be this sequence in the annotated asm source:

//==> 				period_ms = map(pressure, 0, 16383, 500, 50);
	lds r26,pressure
	lds r27,pressure+1
	swap r26
	swap r27
	andi r27,0xf0
	eor r27,r26
	andi r26,0xf0
	eor r27,r26
	ldi r18,lo8(62)
	ldi r19,lo8(-2)
	ldi r20,lo8(-1)
	ldi r21,lo8(-1)
	rcall __muluhisi3
	movw r20,r14
	movw r18,r12
	rcall __divmodsi4
	movw r26,r20
	movw r24,r18
	subi r24,12
	sbci r25,-2
	sbci r26,-1
	sbci r27,-1
	sts period_ms+1,r25
	sts period_ms,r24

On this occasion, as there's a lot of asm for one line of C it's a pretty close call as to which of these two is the more "readable". Personally I think I still prefer the avr-as source.
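
If the swap/andi/eor run at the top of that sequence looks cryptic, it is the compiler's way of doing the 16-bit "x*16" from map() without four separate shifts. Below is a small C model of those six instructions, one statement per instruction. The function names are mine, and r26/r27 in the comments refer to the listing above.

```c
#include <stdint.h>

/* Model of the AVR SWAP instruction: exchange the two nibbles of a byte. */
uint8_t swap_nibbles(uint8_t v)
{
    return (uint8_t)((v << 4) | (v >> 4));
}

/* Step-by-step model of the six instructions that shift r27:r26 left by 4. */
uint16_t shl4_like_listing(uint16_t x)
{
    uint8_t lo = (uint8_t)(x & 0xFFu);  /* r26 */
    uint8_t hi = (uint8_t)(x >> 8);     /* r27 */

    lo = swap_nibbles(lo);              /* swap r26      */
    hi = swap_nibbles(hi);              /* swap r27      */
    hi &= 0xF0u;                        /* andi r27,0xf0 */
    hi ^= lo;                           /* eor  r27,r26  */
    lo &= 0xF0u;                        /* andi r26,0xf0 */
    hi ^= lo;                           /* eor  r27,r26  */

    return (uint16_t)(((uint16_t)hi << 8) | lo);
}
```

The two eor steps move the top nibble of the low byte into the bottom nibble of the high byte, so the result matches (uint16_t)(x << 4) for every 16-bit input.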

 

BTW in both files the actual use of map() is split. While the above is the main invocation, earlier in both the LSS and the .s file you find (LSS first, then the annotated source):

				period_ms = map(pressure, 0, 16383, 500, 50);
 142:	0f 2e       	mov	r0, r31
 144:	cc 24       	eor	r12, r12
 146:	ca 94       	dec	r12
 148:	ff e3       	ldi	r31, 0x3F	; 63
 14a:	df 2e       	mov	r13, r31
 14c:	e1 2c       	mov	r14, r1
 14e:	f1 2c       	mov	r15, r1
 150:	f0 2d       	mov	r31, r0
//==> 				period_ms = map(pressure, 0, 16383, 500, 50);
	mov __tmp_reg__,r31
	clr r12
	dec r12
	ldi r31,lo8(63)
	mov r13,r31
	mov r14,__zero_reg__
	mov r15,__zero_reg__
	mov r31,__tmp_reg__

Which just shows how the optimiser works. Sometimes it will take a complex C statement and split it into two or more separate code blocks. It obviously suited it to get r12, r13, r14 and r15 set up in advance.
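
To see what those registers are for, it helps to fold the constants by hand. The sketch below puts map() from the thread next to a hand-folded version (map_folded is my own name) in which each line is tagged with the instructions I believe it corresponds to. It assumes pressure is a 16-bit unsigned value, which is what the two-byte lds suggests.

```c
#include <stdint.h>

/* map() as posted in the thread. */
int32_t map(const int32_t x, const int32_t in_min, const int32_t in_max,
            const int32_t out_min, const int32_t out_max)
{
    uint16_t y = x * 16;
    return (y - in_min) * (out_max - out_min) / (in_max - in_min) + out_min;
}

/* map(pressure, 0, 16383, 500, 50) with the constants folded by hand. */
int32_t map_folded(uint16_t pressure)
{
    uint16_t y = (uint16_t)(pressure << 4);  /* swap/andi/eor sequence          */
    int32_t product = (int32_t)y * -450;     /* r21:r18 = 0xFFFFFE3E, __muluhisi3 */
    int32_t quotient = product / 16383;      /* r15:r12 = 0x00003FFF, __divmodsi4 */
    return quotient + 500;                   /* subi/sbci by 0xFFFFFE0C (= -500) */
}
```

So the early block is just loading the divisor, in_max - in_min = 16383 = 0x3FFF, into r12..r15 ahead of time, and the final subi/sbci run adds 500 by subtracting -500.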

 

(interesting to note that it had to save/restore r31 simply so it could use it (LDI) in the setup of r13 - that doesn't look very "optimal" to me  - but presumably register pressure meant that none of the other LDI'able registers were available at the time?)
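
The reason r13 cannot be loaded directly is visible in the LDI opcode itself: 1110 KKKK dddd KKKK has only four bits for the register, so only r16..r31 are reachable. A minimal C sketch of the encoding follows. The function name is made up, and returning 0 for an unencodable register is just a convention for this example.

```c
#include <stdint.h>

/* Encode "LDI Rd,K" using the layout 1110 KKKK dddd KKKK.
   Returns 0 when Rd is outside r16..r31 and so cannot be encoded. */
uint16_t avr_encode_ldi(unsigned rd, uint8_t k)
{
    if (rd < 16u || rd > 31u) {
        return 0u;                                   /* e.g. r12..r15 */
    }
    return (uint16_t)(0xE000u
                      | ((unsigned)(k & 0xF0u) << 4) /* high nibble of K -> bits 11..8 */
                      | ((rd - 16u) << 4)            /* Rd - 16 -> bits 7..4           */
                      | (k & 0x0Fu));                /* low nibble of K -> bits 3..0   */
}
```

avr_encode_ldi(31, 0x3F) gives 0xE3FF, the "ff e3" at address 148, while asking for r13 fails. Hence the detour through r31 followed by a mov.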

Last Edited: Thu. Feb 21, 2019 - 09:07 AM

Cliff
__divmodsi4 and __muluhisi3, which are called, exist in the .lss but not in the output from the utility.

Jim


Correct. 

 

They are prebuilt in the library (libgcc.a) in binary form. The only place you can see their source is by getting a copy of the GCC source tree. The map file tells us that mega48 is "avr4" architecture so here:

C:\Program Files (x86)\Atmel\Studio\7.0\toolchain\avr8\avr8-gnu-toolchain\lib\gcc\avr\5.4.0\avr4>avr-nm libgcc.a | grep __muluhi
avr-nm: _mulqi3.o: no symbols
avr-nm: _mulhi3.o: no symbols
avr-nm: _mulqihi3.o: no symbols
avr-nm: _umulqihi3.o: no symbols
avr-nm: _load_3.o: no symbols
         U __muluhisi3
00000000 T __muluhisi3
avr-nm: _load_4.o: no symbols
         U __muluhisi3
avr-nm: _lshrdi3.o: no symbols
avr-nm: _fractsfsq.o: no symbols
avr-nm: _fractsfusq.o: no symbols
avr-nm: _mulqq3.o: no symbols

In fact:

_muluhisi3.o:
00000000 T __muluhisi3
         U __umulhisi3

So somewhere in the GCC source tree there is source for __muluhisi3 (which in turn references __umulhisi3). In fact, googling about a bit, I hit:

 

https://github.com/gcc-mirror/gc...

 

But whether that is the code that is specifically in the libgcc.a that comes with the avr-gcc 5.4 in AS7 I couldn't be sure of without more research.

 

You'd find the same if you'd used other library functions like printf() or strcpy(). You wouldn't find the source of those in .s files in your local build because those ones come pre-built in libc.a. But once the .s files from your C are built then the .o files they create are linked with libXXX.a files from the installation to produce the ELF. The LSS is then a disassembly of everything that went into the ELF irrespective of whether you built it or the toolchain builder built it several years ago.