How to enter opcodes in inline assembly?


I am new to GCC for AVR, so my question may seem a bit trivial, but it is stumping me. I am using an ATtiny5 and I am trying to access 1 byte of SRAM, which according to the datasheet I can do with STS/LDS. However, the compiler is not generating the correct opcodes. BTW, most of my code is in assembler which is called from C code.

I use the following code to save a byte to $40
ldi r24,0x16
sts 0x40,r24

or in my c function:
asm("ldi r24,0x16");
asm("sts 0x40,r24");

The opcode generated for sts is:
80 93 40 00, which is not correct: it should be 16 bits and more like 80 a8

I have tried using .db 0x80,0x0a8 but the compiler does not seem to understand the .db.

So my question is - how can I directly enter opcodes inline in either C or asm - that way I can work around the compiler bug?


It sounds more like an avr-as bug in binutils. Are you sure the version you are using has full support for the brain dead tinys?


I am using AVR Studio 5 - which I guess uses gcc. I have selected attiny5. Everything else works.


The release notes say something like the brain dead support is "tentative".


Quote:

Are you sure the version you are using has full support for the brain dead tinys?

Indeed-- see https://www.avrfreaks.net/index.p... and linked threads.

Perhaps dig into the new 4.7 and what it contains to see if the support is improved for that family?
https://www.avrfreaks.net/index.p...

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.


I will give 4.7 a try. If it still fails does anyone have suggestions on entering opcodes directly inline?


Can you .dw the opcode?


Tried that; I get an "unknown pseudo-op: `.dw'" error.


Quote:

Can you .dw the opcode?

If it were me, I'd think "Hmmm--I can hand-assemble this LDS, and make a macro. And for STS, too. But only for the one address space? And how many times will I need to repeat this as I find other gotchas?"
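
Hand-assembling the 16-bit forms can be sketched on the host in C. This is only a sketch under assumptions that are not from this thread: it takes the reduced-core encodings as I read them in the AVR instruction set manual (LDS is `1010 0kkk dddd kkkk`, STS is `1010 1kkk dddd kkkk`, registers r16..r31 only, and 7 scattered address bits that reach 0x40..0xBF). All function and macro names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Base opcodes of the 16-bit (reduced-core) forms. Assumed from the
   AVR instruction set manual, not taken from this thread. */
#define LDS16_BASE 0xA000u /* 1010 0kkk dddd kkkk */
#define STS16_BASE 0xA800u /* 1010 1kkk dddd kkkk */

/* Scatter a data address (0x40..0xBF) and a register (r16..r31)
   into the k and d fields shared by both instructions.
   Address bit 7 is not stored; it is implied as the inverse of bit 6. */
static uint16_t tiny_fields(uint8_t addr, uint8_t reg)
{
    assert(addr >= 0x40 && addr <= 0xBF);
    assert(reg >= 16 && reg <= 31);
    return (uint16_t)((addr & 0x0Fu)                    /* addr[3:0] -> insn[3:0]  */
                      | ((uint16_t)(addr & 0x30u) << 5) /* addr[5:4] -> insn[10:9] */
                      | ((uint16_t)(addr & 0x40u) << 2) /* addr[6]   -> insn[8]    */
                      | ((uint16_t)(reg - 16u) << 4));  /* reg - 16  -> insn[7:4]  */
}

/* Opcode word for: sts addr, reg */
static uint16_t encode_sts16(uint8_t addr, uint8_t reg)
{
    return (uint16_t)(STS16_BASE | tiny_fields(addr, reg));
}

/* Opcode word for: lds reg, addr */
static uint16_t encode_lds16(uint8_t reg, uint8_t addr)
{
    return (uint16_t)(LDS16_BASE | tiny_fields(addr, reg));
}
```

By this reading, `sts 0x40, r24` comes out as 0xA980 and `lds r24, 0x40` as 0xA180. That is one bit away from the rough "80 a8" guess in the first post, so it is worth double-checking against the datasheet before hard-coding either value.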

Quote:

brain dead support is "tentative ".

Might be good enough, if the problem areas have been addressed but not yet "proven".

[There are other toolchains with support.]

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.


Tried 4.7; it will not compile for the ATtiny5.
As I only access the memory in 2 places (one to write, one to read), I guess it will be easiest to just insert 2 nops at each place and patch the hex file before writing to the chip.
I will look into the other tools when I get some time.


gkarlsson wrote:
I use the following code to save a byte to $40
ldi r24,0x16
sts 0x40,r24

or in my c function:
asm("ldi r24,0x16");
asm("sts 0x40,r24");

the opcode generated for sts is:
80 93 40 00 which is not correct it should be 16 bit and more like: 80 a8

I have tried using .db 0x80,0x0a8 but the compiler does not seem to understand the .db.

The compiler should not reject any inline assembly.
Does it matter whether you have a leading blank?
A workaround would seem to be *((char*)0x40)=0x16.
Also, your inline assembly should be a single statement and should indicate that it clobbers r24.
'Tis better to let the compiler pick the register.
asm(" sts 0x40, %0" : : "r" ((char)0x16));

Moderation in all things. -- ancient proverb


Setting the pointer directly does work, though 12 bytes is expensive when you only have 512 bytes to work with. You need to set up the Z register and then transfer indirectly. I could set the register up once, but then I might as well use it to store my variable.

*((char*)0x40)=0x16;
  40:	80 e4       	ldi	r24, 0x40	; 64
  42:	90 e0       	ldi	r25, 0x00	; 0
  44:	26 e1       	ldi	r18, 0x16	; 22
  46:	e8 2f       	mov	r30, r24
  48:	f9 2f       	mov	r31, r25
  4a:	20 83       	st	Z, r18

And of course the inline asm still uses the 32bit sts instead of the 16bit.


That code looks like -O0 which would be very unwise on a 512 byte chip in C!


WrightFlyer wrote:
That code looks like -O0 which would be very unwise on a 512 byte chip in C!
It's gross even for -O0.
I'd have expected four instructions at most.

Moderation in all things. -- ancient proverb


Quote:
Tried [.dw to generate the instruction] and I get an "unknown pseudo-op: `.dw'" error.

The gnu assembler uses ".word" rather than ".dw"
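
With `.word`, a raw opcode can be dropped into a C function through inline asm. A minimal sketch follows. The macro name `RAW_OPCODE` and the helper names are invented here, and the value 0xA980 is my own hand-encoding of `sts 0x40, r24` for the reduced core, so treat it as an assumption to check against the datasheet. The `#ifdef __AVR__` guard only lets the same file build on a host compiler so the generated text can be inspected.

```c
#include <assert.h>
#include <string.h>

/* Two-step stringification so the argument is macro-expanded before quoting. */
#define STR_(x) #x
#define STR(x) STR_(x)

/* Assembler text that emits one 16-bit opcode word verbatim. */
#define RAW_OPCODE(word) ".word " STR(word) "\n\t"

/* Hypothetical helper: store 0x16 to SRAM address 0x40 on an ATtiny4/5/9/10.
   On a non-AVR host the body is empty. */
static inline void store_byte_0x40(void)
{
#ifdef __AVR__
    __asm__ __volatile__(
        "ldi r24, 0x16\n\t"
        RAW_OPCODE(0xA980)   /* assumed encoding of: sts 0x40, r24 */
        :                    /* no outputs */
        :                    /* no inputs */
        : "r24", "memory");  /* r24 is overwritten and memory is written */
#endif
}

/* Host-side check of the assembler text the macro produces. */
static void check_raw_opcode_text(void)
{
    assert(strcmp(RAW_OPCODE(0xA980), ".word 0xA980\n\t") == 0);
}
```

Keeping both instructions in one asm statement with an explicit r24 clobber follows the advice given earlier in the thread.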


Quote:
The gnu assembler uses ".word" rather than ".dw"

Perfect, that works.

Thank you.


Your ticket was forwarded to the tools team this morning, so hopefully they'll be able to figure out a fix for the next release or so.

- Dean :twisted:

Make Atmel Studio better with my free extensions. Open source and feedback welcome!


Dean,

Is this a support ticket about making it easier to enter op-codes as explicit hexadecimal constants inside an inline assembly block (addressing the thread's initial topic)? Or is it a more fundamental support ticket about the fact that the GNU tool chain doesn't know how to automatically handle the different size/encoding of the LDS and STS op-codes used in reduced-core AVRs such as the ATtiny4/5/9/10 and the ATtiny20/40?

- Luke


Quote:
Is this a support ticket about making it easier to enter op-codes as explicit hexadecimal constants inside an inline assembly block (addressing the thread's initial topic)? Or is it a more fundamental support ticket about the fact that the GNU tool chain doesn't know how to automatically handle the different size/encoding of the LDS and STS op-codes used in reduced-core AVRs such as the ATtiny4/5/9/10 and the ATtiny20/40?

I guess IMHO the latter as I opened a ticket for the wrong opcodes being generated for the ATtiny 4/5/9/10. If I had RTFM I would have seen that .byte and .word are used instead of .db and .dw, and this thread would never have existed.
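
When `.byte` is used instead of `.word`, the two bytes go in low-byte-first order, since AVR opcode words are stored little-endian (which is also why the 32-bit STS in the first post is listed as 80 93 40 00). A small host-side sketch with made-up helper names:

```c
#include <assert.h>
#include <stdint.h>

/* First byte to emit for an opcode word (the low byte). */
static uint8_t flash_byte0(uint16_t word) { return (uint8_t)(word & 0xFFu); }

/* Second byte to emit for an opcode word (the high byte). */
static uint8_t flash_byte1(uint16_t word) { return (uint8_t)(word >> 8); }

/* Rebuild the word from its two bytes and sanity-check the round trip. */
static uint16_t word_from_flash_bytes(uint8_t b0, uint8_t b1)
{
    uint16_t word = (uint16_t)(b0 | ((uint16_t)b1 << 8));
    assert(flash_byte0(word) == b0 && flash_byte1(word) == b1);
    return word;
}
```

So the word 0x9380 would be written as `.byte 0x80, 0x93`, or equivalently as `.word 0x9380`.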

However I think there is a deeper problem with supporting these chips.
Take changing the clock registers. We all know that we have to do this:

CCP=0xD8;
CLKMSR=0x00;

The caveat is that the second instruction needs to occur within 4 clock cycles. The problem is that each command generates 6 ASM instructions (12 bytes), so there is no way the 2nd instruction will be executed in time. The compiler never generates an 'out' instruction; everything moved to a register memory position uses the Z register and st. Don't even get me started on what happens when you define ISR routines.
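
The timing problem can be put in numbers with a small host-side model. It assumes every instruction involved (ldi, mov, st Z, out) costs one cycle on the reduced core and that the protected write has to land no more than 4 cycles after the CCP write. Both are my reading of the datasheet rather than something stated in this thread, and all names are invented for the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed length of the window that opens after writing CCP. */
enum { CCP_WINDOW_CYCLES = 4 };

/* Sum the cycle costs of everything executed after the CCP write,
   up to and including the write to the protected register. */
static unsigned cycles_until_protected_write(const unsigned char *costs, size_t n)
{
    unsigned total = 0;
    assert(costs != NULL);
    for (size_t i = 0; i < n; i++)
        total += costs[i];
    return total;
}

/* Nonzero if the protected write still falls inside the window. */
static int within_ccp_window(const unsigned char *costs, size_t n)
{
    return cycles_until_protected_write(costs, n) <= CCP_WINDOW_CYCLES;
}

/* Compiler output as described above: ldi, ldi, ldi, mov, mov, st Z. */
static const unsigned char generated_c[] = {1, 1, 1, 1, 1, 1};

/* Hand-written asm with the value preloaded: a single out remains. */
static const unsigned char handwritten[] = {1};
```

Under these assumptions the six-instruction sequence needs 6 cycles and misses the window, while a preloaded register followed by a single `out` fits easily.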

I rewrote all my functions in ASM to ensure that I was getting the correct code. Someone pointed me towards CodeVisionAVR. I can say that it at least generated what I expected, but it had a different annoying habit of using the Y register to pass parameters to functions. It would also not compile my asm file; it only wants C source files, and I was not feeling up to taking my optimized ASM and converting it back to C.

I do know however what I will be using for my next ATtiny 5 project.

Thanks to everyone for their input, I have managed to complete the project I was working on. I do hope the ATtiny 4-10 chips are supported soon by gcc.


gkarlsson wrote:
The caveat is that the second instruction needs to occur within 4 clock cycles.
In this particular case, it's quite unreasonable not to write asm.

JW


Quote:

In this particular case, it's quite unreasonable not to write asm.

Which is how AVR-LibC manages to guarantee most of the 4 cycle sequences. If there's an error here then it's probably actually that AVR-LibC should have code added to cater for this particular sequence.


Anybody care to add a feature request to avr-libc's tracker?


I don’t think the issue is just this one sequence. As I pointed out it appears as if every assignment of a register in I/O space is done with a store indirect instead of using an out. Point me in the right direction and I will write up the feature request.


When you're talking about an explicit I/O register assignment in vanilla C code, the correct bug tracker would be for the compiler, GCC. Bug: GCC is using sub-optimal rules for code generation.

When you're talking about special timed sequences that are encapsulated within an existing avr-libc hardware abstraction API, then the correct bug tracker would be for avr-libc. Bug: The hardware abstraction API shouldn't depend upon compiler optimization, and therefore should have been written in assembly to begin with, not plain vanilla C.

My crystal ball tells me that in this case, there was no appropriate abstraction API available in avr-libc, so the OP was writing the code directly in vanilla C. So, in all likelihood we're probably dealing with the former case, not the latter.


Quote:

My crystal ball tells me that in this case, there was no appropriate abstraction API available in avr-libc, so the OP was writing the code directly in vanilla C. So, in all likelihood we're probably dealing with the former case, not the latter.

Luke,

I think it's two layered. Yes, there may be a code generation fault in the C compiler, and if so it deserves to be reported/fixed. But if the sequence being attempted is something that many users of that chip would want to do on a regular basis, then it also deserves a hand-crafted piece of asm in AVR-LibC. That saves the same wheel being reinvented N thousand times, along with all the associated issues of folks attempting it at -O0. Even if the code generation fault is eventually fixed, the asm version could probably still be made to work at -O0.


gkarlsson wrote:
The compiler never generates an 'out' instruction, everything moved to a register memory position uses the z register and st.

Um, that's not true:

void setup() {                
  // initialize the digital pin as an output.
  // Pin 13 has an LED connected on most Arduino boards:
  PORTB = 0;
 12a:   15 b8           out     0x05, r1        ; 5
  pinMode(13, OUTPUT);
 12c:   8d e0           ldi     r24, 0x0D       ; 13
 12e:   61 e0           ldi     r22, 0x01       ; 1
 130:   0e 94 79 01     call    0x2f2   ; 0x2f2 
  PORTB = 0x12;  
 134:   82 e1           ldi     r24, 0x12       ; 18
 136:   85 b9           out     0x05, r24       ; 5
}
 138:   08 95           ret

I'm not sure what you need to do to get C to recognize that some things are usable via the OUT instruction, and certainly inline assembler is a better way to be SURE that things are done within the required cycles. I suspect that your construct:

*((char*)0x40)=0x16; 

was NOT what the compiler would have wanted (though it does still generate an OUT instruction in the same atmega328 environment as the above example...).
This can probably be chalked up to "half-Tiny chip support is a bit shaky still."


Quote:
This can probably be chalked up to "half-Tiny chip support is a bit shaky still."

I agree, I was only referring to the code being generated for the ATTiny 4 thru 10.

As I have not used any of the other processors yet I cannot make any comments about code being generated for them.


I'd say, if lds/sts generates wrong binary, the assembler needs to be fixed first (i.e. it's a binutils bug and should go to binutils's (binutils' ?) tracker).

If I understand it correctly, the compiler deliberately does not generate lds/sts to avoid this bug; so that's the second to be fixed. That should go to gcc's tracker, but there's little point to do that until the first gets fixed.

Then there's an additional issue, as Cliff said, the library should provide prechewed asm code for the critical sequences; however, it also depends on the fixed assembler (OK it may use .byte/.word, but that's crude). That then should go to avr-libc's tracker.

And then there's Atmel's internal tracker, where you already "got a ticket" - except that we don't know for what exactly.

Welcome to the wonderfully organized world of GNU AVR tools.

JW

PS. IIRC Eric Weddington used to maintain one more list of bugs... :-) see sticky at top of this forum


gkarlsson wrote:
I don’t think the issue is just this one sequence. As I pointed out it appears as if every assignment of a register in I/O space is done with a store indirect instead of using an out. Point me in the right direction and I will write up the feature request.
You mean This? http://gcc.gnu.org/PR50448

avrfreaks does not support Opera. Profile inactive.


Quote:
You mean This? http://gcc.gnu.org/PR50448

Not really. That code would use an STS, which would also need to be fixed. I did try the 4.7 version and it did not resolve the sts issue; I did not check it against the other issues.


gkarlsson wrote:
Quote:
You mean This? http://gcc.gnu.org/PR50448
Not Really - that code would use a STS which would also need to be fixed. I did try the 4.7 version and it did not resolve the sts issue, I did not check it against the other issues.
So why is STS wrong (with respect to compiler proper) when accessing RAM?

What is the sample code where compiler proper generates wrong code?

avrfreaks does not support Opera. Profile inactive.


Quote:
So why is STS wrong (with respect to compiler proper) when accessing RAM?

Because the STS it is generating is 32bit not 16 bit. ATtiny 4-10 can only use a 16 bit STS. I was answering in relation to the thread not in general.
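
The bytes from the first post can be checked by decoding them. This sketch assumes the classic 32-bit STS layout, `1001 001d dddd 0000` followed by a 16-bit address word, and the reduced-core 16-bit layout `1010 1kkk dddd kkkk`. Both are taken from my reading of the AVR instruction set manual, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* True if the word is the first half of a classic 32-bit STS. */
static int is_sts32(uint16_t w)
{
    return (w & 0xFE0Fu) == 0x9200u;
}

/* True if the word has the prefix of the reduced-core 16-bit STS. */
static int is_sts16(uint16_t w)
{
    return (w & 0xF800u) == 0xA800u;
}

/* Source register number held in bits 8..4 of a 32-bit STS. */
static unsigned sts32_reg(uint16_t w)
{
    assert(is_sts32(w));
    return (w >> 4) & 0x1Fu;
}
```

The listing bytes 80 93, read low byte first, form the word 0x9380. That decodes as a 32-bit STS from r24, with the following word 0x0040 as the address, which matches what the assembler was asked for but in the encoding the ATtiny4/5/9/10 cannot execute.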


gkarlsson wrote:
Quote:
So why is STS wrong (with respect to compiler proper) when accessing RAM?

Because the STS it is generating is 32bit not 16 bit. ATtiny 4-10 can only use a 16 bit STS. I was answering in relation to the thread not in general.
So it's not a compiler issue, it's an issue of binutils and thus of the supplier of the distribution (Atmel?).

AFAIK this is a well-known error in Atmel's fork of GNU AVR tools.

The issue is straightforward to fix for anyone familiar with binutils, so I wonder why Atmel has not supplied a fixed version up to now. Or did they?

avrfreaks does not support Opera. Profile inactive.


SprinterSB wrote:
gkarlsson wrote:
Quote:
So why is STS wrong (with respect to compiler proper) when accessing RAM?

Because the STS it is generating is 32bit not 16 bit. ATtiny 4-10 can only use a 16 bit STS. I was answering in relation to the thread not in general.
So it's not a compiler issue, it's issue of binutils and thus of supplier of the distribution (Atmel?)

AFAIK this is a well known error in Atmel fork of GNU AVR tools.


... Keep in mind, though, that there is no solution to this problem at the head of the source tree for the official GNU version of binutils either. In fact, the GNU version of binutils simply doesn't acknowledge the existence of the reduced AVR CPU core in the ATtiny4/5/9/10 and ATtiny20/40 families at all.

Judging by the current state of the latest avr-binutils patches from the FreeBSD ports system, I would venture to guess that it knows about the ATtiny10 (and its brethren), but it does not correctly implement LDS/STS for those devices either.


lfmorrison wrote:
SprinterSB wrote:
gkarlsson wrote:
Because the STS it is generating is 32bit not 16 bit. ATtiny 4-10 can only use a 16 bit STS.
[...] it's issue of binutils and thus of supplier of the distribution (Atmel?)

AFAIK this is a well known error in Atmel fork of GNU AVR tools.

... Keep in mind, though, that there is no solution to this problem at the head of the source tree for the official GNU version of binutils either.
Why should there be a solution in binutils? There is not even support for Tiny, so how is anyone supposed to fix bugs that never occur?

Tiny is implemented in Atmel's private port, so it's up to them to make bug fixes and release tools of reasonable quality. Atmel is independent of binutils' or GCC's release cycles and can provide an upgrade at any time; they don't even need to have changes approved by someone else.

avrfreaks does not support Opera. Profile inactive.


Hi, I patched the 16-bit LDS/STS bug. Hope it helps :-)
https://www.avrfreaks.net/index.p...


theusch wrote:
Perhaps dig into the new 4.7
This is obviously a binutils bug and not a compiler bug. Moreover, neither avr-gcc 4.7 nor binutils 2.23 has support for Tiny. It's all the fun of Atmel's private port.

avrfreaks does not support Opera. Profile inactive.