CPSE instruction

Feel like a Friday question?

As gcc is slowly winning me over, I don't do as much with assembly anymore, but this past week I've had to build some fanatically optimized routines, and am coding them as separate .S gnu assembly sources. For about the umpty-umpth time in my AVR assembler experience, I found myself saying, "Boy, a 'Compare and skip if NOT equal' instruction would sure be ideal here". And, just as on every other occasion where I've felt that way, I stopped to ask myself if I could remember ever having a use for the 'Compare and skip if equal' instruction that the AVR *does* have. Nope. I don't think I've ever had a use for it.

Has anyone out there found CPSE useful for anything?

Quote:

Has anyone out there found CPSE useful for anything?

One way to find examples would be to grep all the .lss files you may have and see if the compiler ever chose to use it and, if so, why.

CPSE leaves the flags in the status register unaffected.
The compare and the skip happen in a single instruction, so no interrupt can come between them (atomic).

RES

No RSTDISBL, no fun!

CPSE zero

(since I always have a dedicated zero register in my projects) saves an instruction here and there.

It's about the only time I use CPSE.

-carl

Here is a small artificial C program that shows how avr-gcc uses CPSE:

char volatile x;
#define I0 x;
#define I1 I0 I0 I0 I0 I0 I0 I0 I0 I0 I0
#define I2 I1 I1 I1 I1 I1 I1 I1 I1 I1 I1

int cpse (char c)
{
    if (!c) { I2 }
    I1
}

The jump offset is too big to be reached by a BR** instruction, so avr-gcc 4.6 compiles this to a jumpity-jump:

cpse:
	tst r24
	breq .+2
	rjmp .L2
	;; { I2 }
.L2:

whereas 4.7 uses a micro-optimization with CPSE:

cpse:
	cpse r24,__zero_reg__
	rjmp .L2
	;; { I2 }
.L2:
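
To make the range argument concrete, here is a rough sketch in C. It is not taken from avr-gcc; the names `pick_strategy`, `USE_BRXX` and so on are made up for illustration. It only assumes the ranges from the AVR instruction set manual: a BR** branch takes a 7-bit signed word offset (-64..+63) and RJMP a 12-bit one (-2048..+2047). In the example above, { I2 } is 100 reads of x, which is roughly 200 words if each read is a two-word LDS, so a BR** cannot hop over it.

```c
#include <stdbool.h>

/* Hypothetical names, for this sketch only. Offsets are in words,
   counted from the instruction following the branch. */
enum strategy { USE_BRXX, USE_SKIP_RJMP, USE_LONG_JMP };

static bool fits_brxx(int offset_words)
{
    return offset_words >= -64 && offset_words <= 63;
}

static bool fits_rjmp(int offset_words)
{
    return offset_words >= -2048 && offset_words <= 2047;
}

/* How to get over a block that is offset_words long. */
static enum strategy pick_strategy(int offset_words)
{
    if (fits_brxx(offset_words))
        return USE_BRXX;        /* tst + brne is enough */
    if (fits_rjmp(offset_words))
        return USE_SKIP_RJMP;   /* cpse + rjmp, or tst + breq + rjmp */
    return USE_LONG_JMP;        /* out of RJMP reach as well */
}
```

In the middle case the CPSE form is one word shorter than TST/BREQ/RJMP, which is the whole micro-optimization.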

avrfreaks does not support Opera. Profile inactive.

Brutte wrote:
Some op-codes are rare as hen's teeth:
https://www.avrfreaks.net/index.p...

Many of the instructions mentioned there are synonyms of each other and just syntactic sugar:

CBR   is ANDI
SBR   is ORI
BRB*  is typically used through a different BR** mnemonic
BRSH  is BRCC
BRLO  is BRCS
BCLR  is some CL*, * in { C, Z, V, H, I, T, N, S }
BSET  is some SE*, * in { C, Z, V, H, I, T, N, S }

CLR x  is EOR x, x
LSL x  is ADD x, x
ROL x  is ADC x, x
TST x  is AND x, x
SER x  is LDI x, -1
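
That these really are the same opcodes can be checked by encoding them. The following is a small sketch in C, assuming the bit layouts from the AVR instruction set manual (two-register format `oooo oord dddd rrrr`, register-immediate format `oooo KKKK dddd KKKK` with d = 16..31). The helper names (`enc_rr`, `enc_ri`, `op_*`) are invented for this example, and CBR is taken as ANDI with the complemented mask.

```c
#include <stdint.h>

/* Two-register format: oooo oord dddd rrrr */
static uint16_t enc_rr(uint16_t base, unsigned d, unsigned r)
{
    return (uint16_t)(base | ((r & 0x10u) << 5) | ((d & 0x1Fu) << 4) | (r & 0x0Fu));
}

/* Register-immediate format: oooo KKKK dddd KKKK, d in 16..31 */
static uint16_t enc_ri(uint16_t base, unsigned d, unsigned k)
{
    return (uint16_t)(base | ((k & 0xF0u) << 4) | (((d - 16u) & 0x0Fu) << 4) | (k & 0x0Fu));
}

/* The "real" instructions */
static uint16_t op_add(unsigned d, unsigned r)  { return enc_rr(0x0C00, d, r); }
static uint16_t op_adc(unsigned d, unsigned r)  { return enc_rr(0x1C00, d, r); }
static uint16_t op_and(unsigned d, unsigned r)  { return enc_rr(0x2000, d, r); }
static uint16_t op_eor(unsigned d, unsigned r)  { return enc_rr(0x2400, d, r); }
static uint16_t op_ori(unsigned d, unsigned k)  { return enc_ri(0x6000, d, k); }
static uint16_t op_andi(unsigned d, unsigned k) { return enc_ri(0x7000, d, k); }
static uint16_t op_ldi(unsigned d, unsigned k)  { return enc_ri(0xE000, d, k); }

/* The synonyms from the list above */
static uint16_t op_clr(unsigned d)             { return op_eor(d, d); }
static uint16_t op_lsl(unsigned d)             { return op_add(d, d); }
static uint16_t op_rol(unsigned d)             { return op_adc(d, d); }
static uint16_t op_tst(unsigned d)             { return op_and(d, d); }
static uint16_t op_ser(unsigned d)             { return op_ldi(d, 0xFFu); }
static uint16_t op_sbr(unsigned d, unsigned k) { return op_ori(d, k); }
static uint16_t op_cbr(unsigned d, unsigned k) { return op_andi(d, ~k & 0xFFu); }
```

For instance `op_clr(1)` gives 0x2411, the familiar `clr r1` (really `eor r1,r1`) from the avr-gcc startup code.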

What's actually a waste are the CL* and SE* with * not in C, I or T. It is hard to imagine a sensible use case for the remaining Z, V, H, N, S.

And the H flag is also a waste; a second T flag to move bits around would be more helpful.

What's also a waste are IN and OUT because they are backed up by LDS and STS. IN/OUT can reduce the code size of small programs, but for big programs I/O access only makes a small contribution to overall code size and the small size increase would not matter.

The 2*64*32 = 4096 opcodes could be used differently, for example to implement X+offset addressing to render X, Y, and Z orthogonal.
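
The count is easy to double-check by brute force. A quick sketch in C, assuming the encodings from the AVR instruction set manual (IN is `1011 0AAd dddd AAAA`, OUT is `1011 1AAr rrrr AAAA`); the function names are just for this example. Together the two instructions fill the whole 0xBxxx block, i.e. one sixteenth of the 16-bit opcode space.

```c
#include <stdint.h>

/* IN  Rd, A : 1011 0AAd dddd AAAA
   OUT A, Rr : 1011 1AAr rrrr AAAA
   Every word with 1011 in the top nibble is one of the two. */
static int is_in_or_out(uint16_t opcode)
{
    return (opcode & 0xF000u) == 0xB000u;
}

/* Walk all 65536 possible first words and count IN/OUT encodings. */
static unsigned count_in_out_opcodes(void)
{
    unsigned n = 0;
    for (uint32_t w = 0; w <= 0xFFFFu; ++w)
        if (is_in_or_out((uint16_t)w))
            ++n;
    return n;
}
```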

Another instruction that would be nice for assembler programming is atomic. For example, to atomically load a 32-bit value from memory there could be

atomic 4
LDS  r22, x
LDS  r23, x+1
LDS  r24, x+2
LDS  r25, x+3

This would also make it much easier to atomically change SP in function pro/epilogue, e.g. by means of an imaginary atomic 2.

To add to RES's points:
if you want a very fast interrupt routine, you can avoid pushing anything if you use CPSE for the test, since it leaves SREG alone (you then only have to push if you actually want to do something).

Compare and skip if equal or NOT equal, either one is easy to write:

cpse r16,r17
rjmp to_if_not_equal
rjmp to_if_equal

An easy way to make a CPSNE is this:

cpse r16,r17
cpse r0,r0 ; skip
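
Why the pair behaves like a CPSNE can be traced with a toy model. This is only a sketch in C with made-up names (`run`, `OP_MARK`, `cpsne_runs_x`); it assumes every instruction is one word, and `OP_MARK` stands in for whatever instruction X follows the pair.

```c
#include <stdbool.h>
#include <stdint.h>

enum op { OP_CPSE, OP_MARK, OP_END };

struct insn { enum op op; int d, r; };

/* Run a tiny program and return how many OP_MARK instructions executed. */
static int run(const struct insn *prog, const uint8_t *reg)
{
    int marks = 0;
    bool skip = false;
    for (const struct insn *p = prog; p->op != OP_END; ++p) {
        if (skip) {             /* previous CPSE asked to skip this one */
            skip = false;
            continue;
        }
        if (p->op == OP_CPSE)
            skip = (reg[p->d] == reg[p->r]);
        else if (p->op == OP_MARK)
            ++marks;
    }
    return marks;
}

/* cpse r16,r17 / cpse r0,r0 / X : does X execute? */
static bool cpsne_runs_x(uint8_t a, uint8_t b)
{
    uint8_t reg[32] = {0};
    const struct insn prog[] = {
        { OP_CPSE, 16, 17 },    /* equal: skip the second cpse     */
        { OP_CPSE, 0, 0 },      /* r0 == r0 always: skip X         */
        { OP_MARK, 0, 0 },      /* X                               */
        { OP_END, 0, 0 },
    };
    reg[16] = a;
    reg[17] = b;
    return run(prog, reg) == 1;
}
```

X runs when the registers are equal and is skipped when they differ, which is exactly "compare and skip if not equal".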

Yet another useless "rjmp pc+2". Or actually 32 of those.

Yes, but no label is needed!
And nothing has to change if the skipped instruction is a two-word one, where an rjmp pc+4 would be needed instead.

Quote:
yes but no label needed!

That is especially important when you run out of labels :)

    AVR Memory Usage
      ----------------
      Device: atmega2560
      Program:  144098 bytes (55.0% Full)
      (.text + .data + .bootloader)
      Data:         24 bytes (0.3% Full)
      (.data + .bss + .noinit)
      Labels:   156443 labels (100.6% Full)
      (.here + .there) 

No pipeline flush.

Quote:
No pipeline flush.

You are talking about timings for "jump not taken" v/s "jump taken" here, right Jay?

Yes. Probably not too interesting on an AVR but on more complex processors with longer pipelines it might be significant. Some processors execute the instruction after the jump while taking the jump (branch delay slot).

I always found the conditional execution of every instruction on the ARM very nifty.

sparrow2 wrote:
an easy way to make a CPSNE is this way:

cpse r16,r17
cpse r0,r0 ; skip

That's quite clever; thanks.

Another way to use CPSE: here it decides whether the preset carry gets cleared, so the carry ends up reflecting the state of the filtered bit.

 

 sec                  ;preset carry
 lpm  temp, Z+        ;fetch (next) byte
 and  temp, bitMask   ;filter bit
 cpse temp, bitMask   ;if result bit = 0,
 clc                  ;clear carry, else bit = 1
 ror  data            ;state carry is (next) bit
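
For anyone who wants to trace it, here is a C model of that sequence, one statement per instruction. The loop over eight source bytes is my assumption about how it would be used (the snippet above shows only a single pass), and the names `shift_in_bit` and `gather` are made up.

```c
#include <stdbool.h>
#include <stdint.h>

/* One pass of the sequence: returns the new value of "data". */
static uint8_t shift_in_bit(uint8_t data, uint8_t byte, uint8_t bitMask)
{
    bool carry = true;              /* sec                              */
    uint8_t temp = byte;            /* lpm  temp, Z+                    */
    temp &= bitMask;                /* and  temp, bitMask (C untouched) */
    if (temp != bitMask)            /* cpse temp, bitMask               */
        carry = false;              /* clc  (skipped when equal)        */
    /* ror data: carry goes into bit 7, the rest moves right */
    return (uint8_t)((data >> 1) | (carry ? 0x80u : 0x00u));
}

/* Assumed usage: collect the same bit from 8 consecutive bytes.
   The bit from src[0] ends up in bit 0 of the result. */
static uint8_t gather(const uint8_t src[8], uint8_t bitMask)
{
    uint8_t data = 0;
    for (int i = 0; i < 8; ++i)
        data = shift_in_bit(data, src[i], bitMask);
    return data;
}
```

The trick relies on AND leaving the carry flag alone, so the SEC at the top survives until the CPSE.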

 

RES

Last Edited: Wed. Nov 15, 2017 - 11:18 AM

I hope the OP wasn't holding his/her breath while waiting for that ...

 

I assume that bitMask is a byte with only one bit set.

In that case it could also be done with something like this:

lpm  temp, Z+        ;fetch (next) byte
and  temp, bitMask   ;filter bit
eor  temp, bitMask   ;flip bit
sub  temp, bitMask   ;calc C
ror  data            ;state carry is (next) bit
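
A quick way to convince yourself that the two variants agree is to model the carry in C and try every byte against every single-bit mask. This is a sketch with invented names; it assumes the documented SUB behaviour, where C is set when a borrow occurs (Rd < Rr, unsigned).

```c
#include <stdbool.h>
#include <stdint.h>

/* Carry after: sec / and / cpse / clc */
static bool carry_cpse(uint8_t byte, uint8_t bitMask)
{
    bool carry = true;                      /* sec                 */
    uint8_t temp = byte & bitMask;          /* and                 */
    if (temp != bitMask)                    /* cpse                */
        carry = false;                      /* clc                 */
    return carry;
}

/* Carry after: and / eor / sub */
static bool carry_sub(uint8_t byte, uint8_t bitMask)
{
    uint8_t temp = byte & bitMask;          /* and                 */
    temp ^= bitMask;                        /* eor                 */
    return temp < bitMask;                  /* sub: C = borrow     */
}

/* Count disagreements over all bytes and all single-bit masks. */
static unsigned mismatches_single_bit(void)
{
    unsigned n = 0;
    for (unsigned bit = 0; bit < 8; ++bit) {
        uint8_t mask = (uint8_t)(1u << bit);
        for (unsigned b = 0; b < 256; ++b)
            if (carry_cpse((uint8_t)b, mask) != carry_sub((uint8_t)b, mask))
                ++n;
    }
    return n;
}
```

With a single-bit mask the two never disagree, and the SUB version is one instruction shorter. The single-bit assumption does matter: with a mask like 0x03 the CPSE version sets carry only if all masked bits are set, the SUB version if any of them is.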