AVR Freaks Forum Index

alank2
PostPosted: Feb 03, 2012 - 04:13 AM
Posting Freak


Joined: Jul 16, 2009
Posts: 1579


Hi,

I'm trying to understand how the assembly it generates works. What do the 3 "rcall .+0" commands do? Put bytes on the stack? Why?

Code:


void timeout_disp()
  {
    if (levels.current[0])
    dsprintf_P(lcd_chars,MsgPdMin,levels.current[0]);
    else strcpy_P(lcd_chars,MsgOff);
  }

compiles to:

00001b68 <timeout_disp.3332>:
    1b68:   20 91 26 02    lds   r18, 0x0226
    1b6c:   87 e1          ldi   r24, 0x17   ; 23
    1b6e:   91 e0          ldi   r25, 0x01   ; 1
    1b70:   22 23          and   r18, r18
    1b72:   e1 f0          breq   .+56        ; 0x1bac <timeout_disp.3332+0x44>
    1b74:   00 d0          rcall   .+0         ; 0x1b76 <timeout_disp.3332+0xe>
    1b76:   00 d0          rcall   .+0         ; 0x1b78 <timeout_disp.3332+0x10>
    1b78:   00 d0          rcall   .+0         ; 0x1b7a <timeout_disp.3332+0x12>
    1b7a:   ed b7          in   r30, 0x3d   ; 61
    1b7c:   fe b7          in   r31, 0x3e   ; 62
    1b7e:   31 96          adiw   r30, 0x01   ; 1
    1b80:   ad b7          in   r26, 0x3d   ; 61
    1b82:   be b7          in   r27, 0x3e   ; 62
    1b84:   12 96          adiw   r26, 0x02   ; 2
    1b86:   9c 93          st   X, r25
    1b88:   8e 93          st   -X, r24
    1b8a:   11 97          sbiw   r26, 0x01   ; 1
    1b8c:   89 e3          ldi   r24, 0x39   ; 57
    1b8e:   91 e0          ldi   r25, 0x01   ; 1
    1b90:   93 83          std   Z+3, r25   ; 0x03
    1b92:   82 83          std   Z+2, r24   ; 0x02
    1b94:   24 83          std   Z+4, r18   ; 0x04
    1b96:   15 82          std   Z+5, r1   ; 0x05
    1b98:   ad dd          rcall   .-1190      ; 0x16f4 <dsprintf_P.2591>
    1b9a:   8d b7          in   r24, 0x3d   ; 61
    1b9c:   9e b7          in   r25, 0x3e   ; 62
    1b9e:   06 96          adiw   r24, 0x06   ; 6
    1ba0:   0f b6          in   r0, 0x3f   ; 63
    1ba2:   f8 94          cli
    1ba4:   9e bf          out   0x3e, r25   ; 62
    1ba6:   0f be          out   0x3f, r0   ; 63
    1ba8:   8d bf          out   0x3d, r24   ; 61
    1baa:   08 95          ret
    1bac:   65 e3          ldi   r22, 0x35   ; 53
    1bae:   71 e0          ldi   r23, 0x01   ; 1
    1bb0:   80 c3          rjmp   .+1792      ; 0x22b2 <strcpy_P>


Thanks,

Alan
 
Kartman
PostPosted: Feb 03, 2012 - 06:46 AM
Raving lunatic


Joined: Dec 30, 2004
Posts: 8770
Location: Melbourne,Australia

Historically, C passes the variables on the stack. It looks like the code cleverly allocates 6 bytes by doing three rcalls then cleans it up before returning. I dare say the dsprintf vars are passed on the stack. Decoding compiler output is a good way to go crazy.

Why did it not use the registers to pass the 6 bytes? The compiler has rules on what regs can be touched between calls and ones that are preserved. It also has rules on how to pass variables. Once you know these rules, it becomes a bit clearer. Do I know the rules for gcc? No, but they're documented somewhere!
 
abcminiuser
PostPosted: Feb 03, 2012 - 07:39 AM
Moderator


Joined: Jan 23, 2004
Posts: 9826
Location: Trondheim, Norway

Quote:

Do I know the rules for gcc? No, but they're documented somewhere!


A summary of the AVR-GCC ABI (Application Binary Interface) for register usage can be found here:

http://www.nongnu.org/avr-libc/user-man ... _reg_usage

- Dean

SprinterSB
PostPosted: Feb 03, 2012 - 09:32 AM
Posting Freak


Joined: Dec 21, 2006
Posts: 1483
Location: Saar-Lor-Lux

alank2 wrote:
Put bytes on the stack? Why?
Look at the prototypes of your functions. We can only guess that dsprintf_P is varargs and not (const char*, int, int). As you can see from the code, just after the pushes the stack pointer is loaded into Z and some values are stored on the stack. You are seeing parameter passing in action for the call to dsprintf_P. That function is static in the module, right?

Moreover, the position of the prologue indicates that timeout_disp is inlined in its caller. Again, you get the answer by looking at how and where timeout_disp is defined and used.

Moreover, this is a bit misleading:
Quote:
compiles to:
because the GNU tools work as loosely coupled tools. From that standpoint: What you posted is not the compiler output, it's assembler output or disassembly of some object.

If you look at the compiler output of newer versions of avr-gcc with the -dp switch turned on, you see comments on what these instructions are good for. Suppose
Code:
extern void print (const char*, ...);

void call (int val)
{
    print ("%d", val);
}
This compiles with -mmcu=atmega128 -Os -dp to
Code:
call:
/* prologue: function */
/* frame size = 0 */
/* stack size = 0 */
.L__stack_usage = 0
   push r25
   push r24
   ldi r24,lo8(.LC0)
   ldi r25,hi8(.LC0)
   push r25
   push r24
   call print
    ; SP += 4
   pop __tmp_reg__
   pop __tmp_reg__
   pop __tmp_reg__
   pop __tmp_reg__
   ret
And with -mmcu=atmega128 -Os -dp -maccumulate-args
Code:
call:
    ; SP -= 4
   rcall .
   rcall .
/* prologue: function */
/* outgoing args size = 4 */
/* frame size = 0 */
/* stack size = 4 */
.L__stack_usage = 4
   ldi r18,lo8(.LC0)
   ldi r19,hi8(.LC0)
   in r30,__SP_L__
   in r31,__SP_H__
   std Z+2,r19
   std Z+1,r18
   std Z+4,r25
   std Z+3,r24
   call print
/* epilogue start */
    ; SP += 4
   pop __tmp_reg__
   pop __tmp_reg__
   pop __tmp_reg__
   pop __tmp_reg__
   ret
The -maccumulate-args is similar to the old-style stack management.

BTW, the stack-usage report yields
Code:
file.c:3:6:call   4   dynamic,bounded
resp.
Code:
file.c:3:6:call   4   static


Last edited by SprinterSB on Feb 03, 2012 - 09:38 AM; edited 1 time in total
 
sternst
PostPosted: Feb 03, 2012 - 09:33 AM
Raving lunatic


Joined: Jul 23, 2001
Posts: 2438
Location: Osnabrueck, Germany

Kartman wrote:
Why did it not use the registers to pass the 6 bytes?
Because dsprintf_P is apparently a variadic function.
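A minimal sketch of the difference, not alank2's actual code (the prototype of dsprintf_P is not shown in the thread; sum_ints and sum3 are hypothetical names). Under the avr-gcc ABI, all arguments of a variadic function are passed on the stack, which matches the stores through Z in the listing, whereas arguments of a fixed prototype can travel in registers:

```c
#include <stdarg.h>

/* Hypothetical variadic helper: sums `count` int arguments.
   The callee does not know the number or types of the extra
   arguments at compile time, so it walks them via a va_list. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++) {
        total += va_arg(ap, int);   /* fetch the next int argument */
    }
    va_end(ap);
    return total;
}

/* Fixed-argument counterpart: every argument has a known type and
   position, so the compiler is free to pass them in registers. */
int sum3(int a, int b, int c)
{
    return a + b + c;
}
```

Calling sum_ints(3, 1, 2, 3) and sum3(1, 2, 3) yields the same value; only the way the caller hands over the arguments differs.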

_________________
Stefan Ernst
 
clawson
PostPosted: Feb 03, 2012 - 09:38 AM
10k+ Postman


Joined: Jul 18, 2005
Posts: 62299
Location: (using avr-gcc in) Finchingfield, Essex, England

Just to note that the RCALL trick is pretty standard for creating temporary storage on the stack. It's quicker for small amounts than doing stack arithmetic:
Code:
#include <stdio.h>
#include <avr/io.h>

int x, y;

void foo(void) {
   char buff[4];
   sprintf(buff, "%d%d", x, y);
}

int main(void) {
   while(1) {
   }
}

buff[4] in this is created with:
Code:

void foo(void) {
  94:   df 93          push   r29
  96:   cf 93          push   r28
  98:   00 d0          rcall   .+0         ; 0x9a <foo+0x6>
  9a:   00 d0          rcall   .+0         ; 0x9c <foo+0x8>
  9c:   cd b7          in   r28, 0x3d   ; 61
  9e:   de b7          in   r29, 0x3e   ; 62

Another common thing here is that SP is then loaded into Y, which the compiler uses as a "stack frame pointer" so it can easily index the variables it has created on the stack.

The two RCALLs have created [4] bytes in this case. For [5] you will see:
Code:
  98:   00 d0          rcall   .+0         ; 0x9a <foo+0x6>
  9a:   00 d0          rcall   .+0         ; 0x9c <foo+0x8>
  9c:   0f 92          push   r0

The extra byte being created by a "PUSH R0". For [6] you see three RCALLs.

However when you reach [7] it now becomes quicker/more compact to mess with the stack:
Code:
  98:   cd b7          in   r28, 0x3d   ; 61
  9a:   de b7          in   r29, 0x3e   ; 62
  9c:   27 97          sbiw   r28, 0x07   ; 7
  9e:   0f b6          in   r0, 0x3f   ; 63
  a0:   f8 94          cli
  a2:   de bf          out   0x3e, r29   ; 62
  a4:   0f be          out   0x3f, r0   ; 63
  a6:   cd bf          out   0x3d, r28   ; 61

The 0x07 in the SBIW creates 7 bytes.
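The pattern in the listings above can be written down as a small model. This is only a sketch of what was observed for this compiler version, not avr-gcc's real heuristic; frame_alloc and model_frame_alloc are hypothetical names. It assumes each "rcall .+0" pushes a 2-byte return address, which holds for AVRs with a 16-bit program counter (devices with more than 128 KB of flash push 3 bytes):

```c
/* How the prologue reserves `nbytes` of stack, per the listings above. */
typedef struct {
    unsigned rcalls;    /* number of "rcall .+0", 2 bytes each       */
    unsigned pushes;    /* number of "push r0", 1 byte each          */
    int      adjust_sp; /* nonzero: SP rewritten via in/sbiw/out     */
} frame_alloc;

frame_alloc model_frame_alloc(unsigned nbytes)
{
    frame_alloc a = {0, 0, 0};

    if (nbytes <= 6) {
        /* Small frames: pairs of bytes via rcall, odd byte via push. */
        a.rcalls = nbytes / 2;
        a.pushes = nbytes % 2;
    } else {
        /* From 7 bytes on, the listing switches to SP arithmetic
           with interrupts briefly disabled around the update. */
        a.adjust_sp = 1;
    }
    return a;
}
```

For example, model_frame_alloc(5) gives two rcalls plus one push, and model_frame_alloc(7) selects the SBIW sequence.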

SprinterSB
PostPosted: Feb 03, 2012 - 09:46 AM
Posting Freak


Joined: Dec 21, 2006
Posts: 1483
Location: Saar-Lor-Lux

clawson wrote:
Just to note that the RCALL trick is pretty standard for creating temporary storage on the stack.
As you see from my code above, the standard is changing just now...

None of my examples above sets up a frame pointer, so Y need not be saved/restored (the second example uses Z instead, and the first example needs no pointer register at all), whereas in your code a frame is set up even though no frame is needed.

The transition happens from 4.6 to 4.7.
 
clawson
PostPosted: Feb 03, 2012 - 09:59 AM
10k+ Postman


Joined: Jul 18, 2005
Posts: 62299
Location: (using avr-gcc in) Finchingfield, Essex, England

Quote:

whereas in your code a frame is set up even though no frame is needed.

I guess it's often the case that a stack frame is needed though? So in the majority of code, apart from trivial examples like these, one probably would see Y being loaded from SP anyway?

But it's a nice optimisation.

By the way GCC is often ridiculed for the overhead in an ISR() when it's not always needed. Any plans to make that only come into play when needed?

alank2
PostPosted: Feb 03, 2012 - 01:37 PM
Posting Freak


Joined: Jul 16, 2009
Posts: 1579


Wow, I never cease to be surprised by the knowledge you guys have. Impressive replies!

It just seemed like a lot of assembly for what didn't look like that complicated a function. For flash space reasons it looks like I should go easy on the variadic functions...

The reason I ask is that I am working on a butterfly logger that has a complete UI on the butterfly and provides calibrated results for the ADC values it logs. I'm still building the UI and already at 8950 bytes used so I'm starting to get a little concerned I may run out of flash.

It is implemented as a huge state machine and I've tried to be size conscious. I've enclosed the project so far if anyone wants to look at the butterfly.c and give me any tips on how to optimize it.

Thanks,

Alan
 
clawson
PostPosted: Feb 03, 2012 - 01:55 PM
10k+ Postman


Joined: Jul 18, 2005
Posts: 62299
Location: (using avr-gcc in) Finchingfield, Essex, England

Quote:

. I'm still building the UI and already at 8950 bytes used so I'm starting to get a little concerned I may run out of flash.

Eh? It's a 16K chip - you are only just over half full. If you have lots of tables for the FSM or messages, perhaps consider spilling them over into EEPROM. You'll already be doing pgm_read()s of the data, so those might as easily be eeprom_read()s instead.
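A minimal sketch of that idea, assuming avr-libc's <avr/eeprom.h>. EEMEM and eeprom_read_block are avr-libc names; MsgOffEE and load_msg_off are hypothetical, and the host fallback exists only so the sketch can be compiled and tried off-target:

```c
#include <string.h>

#ifdef __AVR__
#include <avr/eeprom.h>
#else
/* Host fallback (assumption for off-target testing only):
   treat "EEPROM" as ordinary memory. */
#define EEMEM
static void eeprom_read_block(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
}
#endif

/* Message placed in the EEPROM section instead of flash. */
static const char MsgOffEE[] EEMEM = "Off";

/* Copy the message, including its terminating NUL, into a RAM buffer.
   The caller must provide at least sizeof MsgOffEE bytes. */
void load_msg_off(char *dst)
{
    eeprom_read_block(dst, MsgOffEE, sizeof MsgOffEE);
}
```

On the target, the EEPROM image has to be programmed into the device separately from the flash image, otherwise the reads return whatever the EEPROM happens to contain.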

alank2
PostPosted: Feb 03, 2012 - 07:02 PM
Posting Freak


Joined: Jul 16, 2009
Posts: 1579


Good point, clawson. I'm planning on storing settings in EEPROM so the flash is fully available for log entries, so hopefully it won't come to that.

OT from the original question, but here is a menu of what I plan it to support:



Thanks,

Alan
 
SprinterSB
PostPosted: Feb 03, 2012 - 08:02 PM
Posting Freak


Joined: Dec 21, 2006
Posts: 1483
Location: Saar-Lor-Lux

alank2 wrote:
dsprintf.c
Local functions are a really dreaded feature of GCC. I don't think you *really* want local functions?

avr-gcc does not support trampolines because on most AVR devices you have no executable stack.

So your code might work just by accident and because you have optimization on and the compiler can cope without generating trampolines...
 
alank2
PostPosted: Feb 03, 2012 - 08:32 PM
Posting Freak


Joined: Jul 16, 2009
Posts: 1579


Hi,

Thanks for looking at the code SprinterSB; I appreciate it.

I've never had any trouble with the dsprintf code, but do you think I should rewrite it to remove the local functions? I did it merely so I could have access to the local variables of dsprintf without having to make them global or passing them back and forth...

Thanks,

Alan
 
clawson
PostPosted: Feb 03, 2012 - 09:28 PM
10k+ Postman


Joined: Jul 18, 2005
Posts: 62299
Location: (using avr-gcc in) Finchingfield, Essex, England

Quote:

I've never had any trouble with the dsprintf code, but do you think I should rewrite it to remove the local functions? I did it merely so I could have access to the local variables of dsprintf without having to make them global or passing them back and forth...

Write that bit in C++ perhaps

SprinterSB
PostPosted: Feb 03, 2012 - 10:22 PM
Posting Freak


Joined: Dec 21, 2006
Posts: 1483
Location: Saar-Lor-Lux

alank2 wrote:
I've never had any trouble with the dsprintf code, but do you think I should rewrite it to remove the local functions?
If nested functions work for you and you like them, ... go ahead. Skimming the sources was, well, interesting.

clawson wrote:
Write that bit in C++ perhaps
Just a side note: C++ cannot do what local functions can do. Suppose
Code:
int foo (int var)
{
    extern void do_job (void(*)(void));

    void inc (void)
    {
        var++;
    }

    do_job (inc);

    return var;
}

And somewhere else you implement do_job:
Code:
void do_job (void(*job)(void))
{
    job();
}
Quite obviously, do_job will look something like
Code:
do_job:
   movw r30,r24
   ijmp
i.e. do_job knows nothing about var, not even its address. Thus, var's address is not passed to job, aka inc.

So the $64 question is: How to implement inc?

Notice that inc needs access to var and cannot be inlined because we need inc's address.

Also notice that the address of var might change depending on the context in which foo is called. foo and/or inc might even run more than once and still must produce correct results because foo is reentrant.
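For comparison, here is a sketch of the usual portable workaround. It sidesteps the question rather than answering it, because it changes the signature of do_job, which the example above does not allow. All names (do_job_ctx, inc_ctx, foo_ctx) are hypothetical. The callback receives an explicit context pointer, so no nested function and no trampoline are needed, and the function stays reentrant:

```c
/* Variant of do_job that forwards an opaque context pointer. */
void do_job_ctx(void (*job)(void *), void *ctx)
{
    job(ctx);
}

/* File-scope replacement for the nested function inc:
   the address of var arrives explicitly through ctx. */
static void inc_ctx(void *ctx)
{
    int *var = ctx;
    (*var)++;
}

int foo_ctx(int var)
{
    /* Each invocation passes the address of its own copy of var,
       so recursive or repeated calls do not interfere. */
    do_job_ctx(inc_ctx, &var);
    return var;
}
```

With the original do_job signature this route is closed, which is exactly why GCC needs extra machinery to hand the enclosing frame to a nested function called through a plain function pointer.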
 
alank2
PostPosted: Feb 03, 2012 - 10:35 PM
Posting Freak


Joined: Jul 16, 2009
Posts: 1579


Hi,

SprinterSB wrote:
Skimming the sources was, well, interesting.


I won't get my feelings hurt (well, prolly not too bad!) if you have any tips or improvements to suggest. If you have any comments about the code you want to post or PM to me, I'd appreciate the feedback!

Thanks,

Alan
 
SprinterSB
PostPosted: Feb 04, 2012 - 02:07 PM
Posting Freak


Joined: Dec 21, 2006
Posts: 1483
Location: Saar-Lor-Lux

clawson wrote:
Quote:
whereas in your code a frame is set up even though no frame is needed.
I guess it's often the case that a stack frame is needed though? So in the majority of code, apart from trivial examples like these, one probably would see Y being loaded from SP anyway?
As far as my AVR applications are concerned, no functions get a frame, except one printf-like function that I use in a project now and then for logging. But it's not an essential part of the production build.

If I saw a frame generated for majority of functions or even 10% of functions in my code, I would consider that a "bug". Either in my software layout (very likely) or in avr-gcc (unlikely).

Maybe my projects are considerably different to "majority of user code"
  • no variadic functions
  • no local arrays
  • no 64-bit arithmetic
  • no inlining like mad
  • no addresses of auto variables or parameters
  • etc.

I grepped my last project for the frame size: *No* function (94 C functions after inlining contributing to 13k .text) has a frame size > 0. The biggest stack size (# call-saved regs pushed in prologue) is 17 for an ISR with some 2000 lines of code. The next lower stack size is 10. And Registers 2...8 are fixed, i.e. are not available to avr-gcc for allocation!

I also scanned the temporary assembler files I found in some old project folder. There are 186 hits for "frame size=0" (must be divided by 2 because both prologue and epilogue reported frame size back then). Not a single function reports a frame != 0.


Quote:
By the way GCC is often ridiculed for the overhead in an ISR() when it's not always needed. Any plans to make that only come into play when needed?
If someone comes up with a striking concept of how this can be approached with a reasonable effort/gain ratio I am open for discussion.

And no. "don't save/restore unused registers" is not a concept. It's the goal. The concept should
  • Not change the ABI, so that existing (inline) assembler remains correct and need not be rewritten.
  • Ensure that code outside of ISRs does not grow. And if it does, what code growth would be acceptable?
  • Not need a complete rewrite of the AVR code in GCC. If a complete rewrite is needed, then supply resources for that.
But as always, discussion will lead nowhere if no one is willing to do the work.

So answer the question: Who will do it?

People waste hours and days and weeks complaining again and again and again about things that are pretty well known to everyone.

Complaints do not change anything. Changes will change things.

So if it *really* matters for your project, hire a contract GCC expert and let her implement whatever you desire. And yes, there are companies that support their own avr-gcc, for example to backport bugfixes to older, orphaned versions they are still working with.

At the moment I don't see a reason for a change. Just because people are aware of the 4 or 5 superfluous instructions in a particular place of generated code, completely rewrite the AVR part of GCC? And with no clue what the overall impact on code quality and stability will be? You won't know before you have actually done the work and benchmarked the two compilers against each other.
 
clawson
PostPosted: Feb 04, 2012 - 02:21 PM
10k+ Postman


Joined: Jul 18, 2005
Posts: 62299
Location: (using avr-gcc in) Finchingfield, Essex, England

Quote:

no local arrays

In my world that's fairly unusual - but I guess variety is the spice of life!
Quote:

Complaints do not change anything. Changes will change things.

I'm not complaining. It's generally those using OTHER compilers for AVR that point to this as a GCC deficiency. If I personally were writing an ISR that were that time critical I'd either do the whole thing in Asm or "naked" and a bit of asm() wrapping perhaps.

SprinterSB
PostPosted: Feb 04, 2012 - 03:53 PM
Posting Freak


Joined: Dec 21, 2006
Posts: 1483
Location: Saar-Lor-Lux

clawson wrote:
Quote:
no local arrays
In my world that's fairly unusual
Let me be more precise: I avoid local auto, i.e. alloca'ed arrays. There are functions that use local static arrays.
 
skeeve
PostPosted: Feb 04, 2012 - 09:52 PM
Raving lunatic


Joined: Oct 29, 2006
Posts: 2651


SprinterSB wrote:
Quote:
By the way GCC is often ridiculed for the overhead in an ISR() when it's not always needed. Any plans to make that only come into play when needed?
If someone comes up with a striking concept of how this can be approached with a reasonable effort/gain ratio I am open for discussion.

And no. "don't save/restore unused registers" is not a concept. It's the goal. The concept should
  • Not change the ABI, so that existing (inline) assembler remains correct and need not be rewritten.
  • Ensure that code outside of ISRs does not grow. And if it does, what code growth would be acceptable?
Is such growth even possible in an otherwise correct solution?
Quote:
  • Not need a complete rewrite of the AVR code in GCC. If a complete rewrite is needed, then supply resources for that.
But as always, discussion will lead nowhere if no one is willing to do the work.
Are the call-saved and call-used registers hard-coded for ordinary functions?
If not, supplying ISRs with different lists would seem an obvious thing to do.
Does the need to sometimes save SREG mess this up?

_________________
Michael Hennebry
Iluvatar is the better part of Valar.
 
SprinterSB
PostPosted: Feb 05, 2012 - 01:02 PM
Posting Freak


Joined: Dec 21, 2006
Posts: 1483
Location: Saar-Lor-Lux

skeeve wrote:
SprinterSB wrote:
Ensure that code outside of ISRs does not grow. And if it does, what code growth would be acceptable?
Is such growth even possible in an otherwise correct solution?
Yes, I think so. But you don't know before you have implemented it.

Quote:
Are the call-saved and call-used registers hard-coded for ordinary functions?
If not, supplying ISRs with different lists would seem an obvious thing to do.
Problem is not the ABI because an ISR must save/restore anything it might clobber, anyways.

The issue in the current implementation is that R0 and R1 are neither call-saved nor call-clobbered; they are fixed and used implicitly, i.e. the compiler machinery does not manage their contents, nor does it have any idea what their contents are or whether they are used at all. Same for T:

R0, T: fixed, used in an insn-clobbered way
R1: fixed, used in an insn-saved way

Moreover, whether R0, R1, T is actually needed or not, is not known before reload, i.e. before the non-strict RTL → strict RTL transition.

Soon after reload there is pro/epilogue generation. Afterwards, there are more optimization passes like text and RTL peephole, split passes, register renaming and propagations, machine dependent reorg, etc. That might change R0/R1/T usage.

How do you ship the information on R0/R1/T at all? As explicit RTL? As insn attributes? When do you generate that information? When and how do you use it? How do you ensure debug information is not corrupt if you change the prologue late after prologue generation? How do you cope with the needs of specific insns, like emitting different code depending on optimization switches and register classes used? You *really* don't want to quadruple the number of constraint alternatives...
 
clawson
PostPosted: Feb 05, 2012 - 01:44 PM
10k+ Postman


Joined: Jul 18, 2005
Posts: 62299
Location: (using avr-gcc in) Finchingfield, Essex, England

Georg-Johann,

Where do you get all this knowledge from? I know there's a GCC "internals" manual - is it all from that, or from studying the code, or from talking to other developers? I'm sure there are others like myself who would LIKE to contribute to GCC but don't have the foggiest idea where to start reading about its operation.

Cliff

SprinterSB
PostPosted: Feb 05, 2012 - 08:34 PM
Posting Freak


Joined: Dec 21, 2006
Posts: 1483
Location: Saar-Lor-Lux

clawson wrote:
Where do you get all this knowledge from? I know there's a GCC "internals" manual - is it all from that, or from studying the code, or from talking to other developers?
There are many ways how to get better understanding of GCC:

Communicate with experienced developers
The most valuable source of information. Unfortunately, expert GCC developers are very rare and very busy. If you don't already have an understanding of how things work, an expert's answer won't help you or may even confuse you more. And most likely, the expert won't explain the whole story and write novels because it's simply much too time-consuming...

Read the internals
If you are interested in AVR, most interesting parts are Machine Description and Target Description and therein Standard Names. You won't understand everything reading it the first time. Just skim and get an idea what parts are there to describe a target.

In particular, one of the first things you'll want to learn is what an insn is, what its components are, how they are named and when and where and how they are used. If you don't know what a "predicate" or a "constraint" is, an answer from an expert won't help you.

Read compiler dumps
If you are interested in a backend, -da (same as -fdump-rtl-all) lets RTL passes generate dumps. You can see how the compiled code evolves as the pass number increases. You can also dump the final RTL within asm with -save-temps -dP and use the assembler code side by side with the RTL as a Rosetta stone to decipher and learn RTL.

Similarly, the tree dumps can be generated with -fdump-tree-all. These dumps are C-like and you understand them immediately, but for backend they are not as important as RTL.

Learn more about passes
Get an understanding of what passes are run and what they are good for. Is a pass inevitable and run even at -O0, or is it "just" an optimization pass? What piece of information from the backend is used where and how?

Read the sources
Try to figure out how a trivial program gets compiled and transformed. Try to identify the respective parts in the backend (for the RTL passes). If you work for AVR, you'll most likely end up in avr.md or avr.c.

Vice versa, try to learn what piece of the backend is used/essential in what pass.

If you have to solve a problem for avr, you can look at other backends and how they address similar problems. Avoid complicated backends like x86 or rs6000 (PowerPC).

Browse the source and build tree
Try to find your way through the source files and the build directory: What is located where? Where is the AVR backend? And where is libgcc? What files are auto-generated? What happens to avr.md as it is transformed to C?

Change the compiler
Try to fix a problem or just make the compiler generate the code that you desire or print some log output like
Code:
avr_edump ("%?: I'm here!\n");
Try to work out if your change covers all cases imaginable. Observe how the intermediate representation and dumps change as you change the compiler.

Debug the compiler
Look at the compiler in action. Make sure you don't debug the driver xgcc but the compiler proper, cc1 resp. cc1plus. Learn how to display RTL and tree in gdb. Build the compiler with debug info and without optimization. Learn to run the compiler from the build directory without installing it.

Don't get frustrated
GCC is very complex and its learning curve is steep. If you have problems understanding something or following what's happening, it's not because you are too dumb. It's simply because it's really complex and new and intertwined and historically grown, and GCC is evolving faster than its documentation. GCC is not a compiler for educational purposes. It's a real-world compiler supporting more hardware than any other compiler.


Quote:
I'm sure there are others like myself who would LIKE to contribute to GCC but don't have the foggiest idea where to start reading about its operation.
Even if it's too time consuming or too complex to contribute to the compiler proper you can help avr-gcc. Just some examples:

File a Problem Report if you hit a problem
If a problem is not reported, it's unlikely it will be fixed.

Test new features
like new command line options, address spaces, built-in types, etc.

Help writing test programs
Due to lack of time, there is not a single test program for AVR-specific features like upcoming __flash or __memx in the GCC testsuite. Contributing to the testsuite does not require changing the compiler or understanding its internals. You just need to know some magic comments to advise the framework.

Extend avrtest
Similarly, there is not a single test program to test ISR code. One reason is that avrtest — the simulator used to run avr-gcc testsuite — has no IRQ support. One way could be to trigger soft-IRQ by special sequence like SEH SEH in the simulator when requested per option. Or use the I-flag and set IRQ frequency and burst by writing to some magic SFR.
 
clawson
PostPosted: Feb 05, 2012 - 09:23 PM
10k+ Postman


Joined: Jul 18, 2005
Posts: 62299
Location: (using avr-gcc in) Finchingfield, Essex, England

Great answer!

alank2
PostPosted: Feb 05, 2012 - 11:30 PM
Posting Freak


Joined: Jul 16, 2009
Posts: 1579


Hi SprinterSB,

I bow to your knowledge!!! Very impressive replies in this thread.

By the way, I've reworked my dsprintf using the state machine concept. I shaved a few hundred bytes of flash off of it, added hexadecimal output, added 64-bit integer support, and added tunable settings so you can include or exclude features you might need.

Please take a look at it and let me know if you like it better than what I had! Again if you see a way to improve it, please let me know!!

Thanks,

Alan
 