Posted: Oct 17, 2011 - 04:32 PM |
|


Joined: Dec 21, 2006
Posts: 1483
Location: Saar-Lor-Lux
|
|
|
TFrancuz wrote:
If we are talking about code size, why not change the ABI and move zero_reg (R1) to some other register, e.g. R2? [...] but it should be easy to implement, right?
- It's an ABI change, and it will render assembler programs and inline assembler incorrect. Thus you'd want an option like -mabi2.
- Because it's an ABI change, such an option must be a multilib option, at least for libgcc and avr-libc.
- I doubt it is easy to implement in avr-libc because a great deal of algorithms, like floating point, are hand-written assembler.
- In the compiler, you'd have to duplicate (or at least rewrite, if you break the old ABI):
- All patterns that involve multiplication, including multiply-add, multiply-subtract and the fmul* built-ins.
- All libgcc implementations of multiplications and fmul* built-ins.
- Change R0/R1 from fixed to call-clobbered.
- Rewrite ISR pro- and epilogue.
- Review the backend for places that change/clear zero_reg and find a replacement that does not temporarily change zero_reg.
- In (inline) assembler it's no longer sufficient to clear zero_reg if it was used temporarily. Instead, such code parts must be rewritten to be atomic.
Regarding all these points, I wouldn't call it "easy to implement": not for avr-gcc, not for avr-libc, and not for applications that use (inline) assembler with C/C++ and would have to parameterize it depending on some #ifdef __AVR_ABI2 built-in macro.
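To make the inline-assembler point concrete, here is a minimal sketch of the kind of code that would have to be parameterized. The first branch is the usual pattern under the current ABI (MUL writes to R1:R0, so zero_reg is cleared afterwards). The second branch is purely hypothetical: `__AVR_ABI2` and an ABI where R1 is an ordinary call-clobbered register do not exist in any avr-gcc release and are assumptions taken from the discussion above. The plain C fallback is only there so the snippet also builds off-target.

```c
#include <stdint.h>

/* 8 x 8 -> 16 bit unsigned multiply.
   MUL always writes its result to R1:R0, and R1 is zero_reg in the
   current avr-gcc ABI.  */
static inline uint16_t mul8x8 (uint8_t a, uint8_t b)
{
#if defined (__AVR_HAVE_MUL__) && !defined (__AVR_ABI2)
    /* Current ABI: restore zero_reg after MUL has overwritten R1.  */
    uint16_t result;
    __asm__ ("mul  %1, %2"       "\n\t"
             "movw %A0, r0"      "\n\t"
             "clr  __zero_reg__"
             : "=r" (result)
             : "r" (a), "r" (b)
             : "r0");
    return result;
#elif defined (__AVR_HAVE_MUL__)
    /* HYPOTHETICAL ABI (assumption, not a real avr-gcc feature):
       zero_reg lives elsewhere, R1 is call-clobbered, no CLR needed.  */
    uint16_t result;
    __asm__ ("mul  %1, %2"  "\n\t"
             "movw %A0, r0"
             : "=r" (result)
             : "r" (a), "r" (b)
             : "r0", "r1");
    return result;
#else
    /* Host or MUL-less device: let the compiler do the multiplication.  */
    return (uint16_t) ((uint16_t) a * b);
#endif
}
```

Every hand-written routine of this kind in avr-libc, libgcc and user code would need a second variant like this, which is where the effort goes.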
|
|
| |
|
|
|
|
|
Posted: Oct 17, 2011 - 06:49 PM |
|

Joined: Jan 10, 2003
Posts: 533
|
|
So it would be interesting to count how often R1 is reloaded in a real application because of mul; that would give us an idea of whether the effort you mentioned is balanced by some space savings.
Unfortunately my apps do not use complicated math, so mul is rarely used.
BTW, I noticed that you applied some patches needed to store C++ VTABLES in FLASH. Can you tell me if this feature will finally be implemented in the foreseeable future, or should I rather forget about it?
|
|
| |
|
|
|
|
|
Posted: Oct 17, 2011 - 07:29 PM |
|


Joined: Dec 21, 2006
Posts: 1483
Location: Saar-Lor-Lux
|
|
|
TFrancuz wrote:
So it is interesting to calculate how often R1 is reloaded in real application because of mul
In my application from above, which occupies about 14k bytes of flash, there are ~60 CLR instructions on zero_reg, as counted by
Code:
grep 'clr[ \t]*__zero_reg__' *.s | wc -l
It's not "complicated math", just (fixed-point) arithmetic using [F]MUL*.
Quote:
so it will give us an idea if the effort you mentioned, is balanced by some space savings.
There will of course be space savings, and anyone just a bit familiar with avr-gcc will guess that correctly.
Quote:
BTW, I noticed that you applied some patches needed to implement VTABLES in C++ to be stored in FLASH. Can you tell me if this feature will be finally implemented in reasonable future, or I should rather forget about that?
No, there are no changes "applied". I just built a version with some patches attached to the PR, but none of it is approved or upstream.
Named address spaces are not a feature specified for C++. They are C only.
I cannot say if a "hidden" address space that's not exposed to the user would do the trick by, e.g., tagging the respective pointers with AS information. But as AS is not a C++ feature, it's not very likely that this will work smoothly and that AS information is tracked consistently throughout the C++ part (if at all). GCC is far too complex for me to be familiar with that spot in the compiler world. I know my limits, and at the moment I don't think I should touch it. Besides that, it's much more effort than working in the avr sandbox because changing the front/middle-ends has side effects on every platform, not just avr.
Also, I am not interested in C++. The language is utterly ugly IMHO and has so many drawbacks that I avoid it and use Java or Python or whatever on the PC. And on AVR, C is completely reasonable for me. Thus, I am not inclined to get into the C++ mess.
|
|
| |
|
|
|
|
|
Posted: Oct 17, 2011 - 10:45 PM |
|

Joined: Jan 10, 2003
Posts: 533
|
|
It's a pity that Atmel doesn't actively support avr-gcc development (at least the support is not very big); instead they produce a messy toolchain and the completely crazy AS5. It seems that arm-gcc is more actively developed, so it's probably time to dive into the ARM world. To me, that is the conclusion of the whole topic.
|
|
| |
|
|
|
|
|
Posted: Oct 17, 2011 - 11:25 PM |
|


Joined: Dec 21, 2006
Posts: 1483
Location: Saar-Lor-Lux
|
|
At least in official GCC, ARM gets much more notice than AVR and there are many contributors, e.g. from codesourcery. But I don't know if ARM is still involved in codesourcery.
Alternatively, guys here could improve AVR support in GCC. I think there are many very experienced programmers here — not only with respect to µC but also to host programming — that have excellent programming and analytical skills. |
|
|
| |
|
|
|
|
|
Posted: Oct 18, 2011 - 02:36 PM |
|

Joined: Aug 06, 2008
Posts: 141
Location: Montréal, QC
|
|
|
SprinterSB wrote:
Alternatively, guys here could improve AVR support in GCC. I think there are many very experienced programmers here — not only with respect to µC but also to host programming — that have excellent programming and analytical skills.
I wrote some compilers years ago (Pascal to 68000, for instance) using lex/yacc or flex/bison. I understand the way it works, but optimization is another story.
Also, I do not know AVR assembly well enough; I should go deeper into it before attempting to modify the gcc/config/avr/ folder.
GCC 4.7.0 is the first one since 4.3.0 that improves my code size. If I can help you by doing some tests or whatever, just ask!
|
|
| |
|
|
|
|
|
Posted: Oct 20, 2011 - 09:56 PM |
|


Joined: Dec 21, 2006
Posts: 1483
Location: Saar-Lor-Lux
|
|
|
Magister wrote:
SprinterSB wrote:
Alternatively, guys here could improve AVR support in GCC. I think there are many very experienced programmers here — not only with respect to µC but also to host programming — that have excellent programming and analytical skills.
I wrote some compilers years ago (Pascal to 68000 for instance) using lex/yacc or flex/bison, I understand the way it works, but optimization is another story
There are two "flavours" of optimization. First, there are optimization algorithms that are already present in GCC; they might not produce the best results for unimportant targets like AVR that are not at the center of the developers' focus. Second, there are mini-optimizations in the AVR-only part, like printing instructions smarter, working out better cost functions, doing code cleanup, writing test cases for AVR-specific features, etc.
The overall compiler infrastructure is already there, so there is no need to bother with lexing/parsing/syntax/language specifications, except in the case one feels inclined to work in that area.
But no matter what field you pick: GCC is a real-world compiler and not a finger exercise from university. It's not easy to get started and to understand what goes on where and why, and what to change to achieve this or that. The AVR back-end is not linear code but instead a zoo of target hooks called from somewhere at some time.
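For readers who have not seen the "target hooks" style before, the following toy model may help. It is only a sketch of the general idea (generic code calling back into target code through a table of function pointers); every name in it is made up for illustration and none of it is GCC's actual hook interface.

```c
/* Toy model only: these types and names are invented for this example
   and are NOT GCC's real target hook definitions.  */
typedef struct
{
    int (*insn_cost) (int n_words);       /* cost estimate for one insn  */
    const char *(*output_clear) (void);   /* template for "clear a reg"  */
} target_hooks_t;

/* "Back-end" part: small functions that know about the target.  */
static int toy_avr_insn_cost (int n_words)
{
    return 2 * n_words;
}

static const char *toy_avr_output_clear (void)
{
    return "clr %0";
}

static const target_hooks_t toy_avr_hooks =
{
    toy_avr_insn_cost,
    toy_avr_output_clear
};

/* "Generic" part: knows nothing about the target and only calls the
   hooks, whenever and from wherever it needs an answer.  */
int generic_pass_cost (const target_hooks_t *t, int n_insns, int words_each)
{
    int total = 0;
    for (int i = 0; i < n_insns; i++)
        total += t->insn_cost (words_each);
    return total;
}
```

The control flow lives in the generic part, so reading the target functions from top to bottom tells you little about when each one runs.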
Quote:
Also I do not know the assembly enough on the AVR, I should go deeper into it before attempting to modify the gcc/config/avr/ folder
Yes, you need an idea of what good/bad/correct/incorrect code is and learn about GCC's internal representations of code. Assembler is only the very last step of "dumping" the internal information.
Quote:
GCC 4.7.0 is the first one since 4.3.0 that improves my code size. If I can help you by doing some tests or whatever, just ask!
If you like to hunt for errors, you will most likely find one in an area where there have been changes, e.g. some internal cleanup of progmem handling, using multiply-add-like instruction sequences to save some ticks/bytes, tweaked code expansion for switch/case, built-ins, ....
Above you find a list of fixed PRs that gives you an idea of what changed between 4.6 and 4.7. And the bugfixes for 4.6 mentioned above are part of 4.7, too, of course.
Also, there is no good test coverage for the __pgm qualifier. There is no code in the GCC test suite for that feature, so the respective lines in the AVR part are dead code from the test suite's perspective.
I have never run code generated with the __pgm feature, not even on a simulator, so you might be the first to try it out
|
|
| |
|
|
|
|
|
Posted: Oct 21, 2011 - 04:21 PM |
|

Joined: Aug 06, 2008
Posts: 141
Location: Montréal, QC
|
|
|
SprinterSB wrote:
The AVR back-end is not linear code but instead a zoo of target hooks called from somewhere at some time.
Saw this, I'm sure it takes weeks to understand how it works :-/ it's a full time job!
Quote:
If you like to hunt for errors, you will most likely find one in an area where there have been changes, e.g. some internal cleanup of progmem handling, using multiply-add-like instruction sequences to save some ticks/bytes, tweaked code expansion for switch/case, built-ins, ....
Trying some options, I managed to get the hex code smaller than with 4.3.0. It has been almost 3 years since I have seen this (it seems related to bug #49881). I uploaded the code to my board and ran every function for a couple of days; everything works fine, so this 4.7 is really not that bad. I have a lot of integer math in it and progmem (mainly for string printing).
I'll take a look at the bug list you mentioned and also at __pgm.
I saw that using -fmerge-all-constants makes the compiler stop, saying
Code:
confused by earlier errors, bailing out
I also have a couple of warnings like
Code:
warning: uninitialized variable 'blabla' put into program memory area [-Wuninitialized]
|
|
|
| |
|
|
|
|
|
Posted: Oct 21, 2011 - 05:35 PM |
|


Joined: Dec 21, 2006
Posts: 1483
Location: Saar-Lor-Lux
|
|
The first appears to be PR50739.
If the second is not PR50807, could you give an example? A one-liner will probably do.
The current implementation of argument passing using PUSHes is fine, but getting rid of these arguments again is tedious and might lead to unpleasant code if there are many functions that get stack arguments.
There is ACCUMULATE_OUTGOING_ARGS, which is not implemented and will take some time to do and test, in particular with -mcall-prologues but without frame pointer. Dunno if there is enough time to do it for 4.7.
|
|
| |
|
|
|
|
|
Posted: Oct 21, 2011 - 06:30 PM |
|

Joined: Aug 06, 2008
Posts: 141
Location: Montréal, QC
|
|
Yup 1st one seems to be PR50739.
For the second one, I have a mystring.h
extern const prog_char str1[];
extern const prog_char str2[];
extern const prog_char str3[];
then in mystring.c I have:
const prog_char str1[] ="bla1";
const prog_char str2[] ="bla2";
const prog_char str3[] ="bla3";
In main.c I include mystring.h, and at compile time it gives me the warning when str1 (or 2 or 3) is used. I guess the warning is valid, but it was not there before 4.7.0.
I also noticed that I cannot use -ffreestanding, or else delay() does not compile because of missing fabs and fceil, IIRC.
|
|
| |
|
|
|
|
|
Posted: Oct 21, 2011 - 06:52 PM |
|


Joined: Dec 21, 2006
Posts: 1483
Location: Saar-Lor-Lux
|
|
Following code:
Code:
#define PROGMEM __attribute__((progmem))
extern const char PROGMEM str1[];
const char PROGMEM str1[] = "bla1";
compiles correctly and without warning with both avr-gcc and avr-g++.
I did not look into prog_char because it's unspecified behaviour, so you use it at your own risk.
With freestanding you are left alone on the silicon. Most likely you do not want freestanding, e.g. because you like library support.
|
| |
|
|
|
|
|
Posted: Oct 21, 2011 - 07:02 PM |
|

Joined: Aug 06, 2008
Posts: 141
Location: Montréal, QC
|
|
|
SprinterSB wrote:
I never run code generated with __pgm feature — not even on a simulator — so you might be the first to try it out
EDIT: I misunderstood, it does not work as follows...
-----------------------------
It works, basic test shows it is working
Code:
const char* str;
if(a)
    str=(const __pgm char*) "0123";
else
    str=(const __pgm char*) "4567";
lcd_print(str);
It outputs the right string on my LCD. I have an lcd_print() and an lcd_print_P() function that I wrote. The second one uses pgm_read_byte() in it. Using the new address space means I can get rid of it, nice. Now to see whether duplicate strings will be stored only once in flash.
In my project, I changed all the progmem variables to __pgm and removed all the _P references; it seems to work fine. Only strnlen_P is still linked (used internally by vfprintf). One notable difference is the output of avr-size, which now reports about 1K less text and 1K more data (is this normal, or is it because all my data are copied into RAM?). avr-strings | sort | uniq -d shows no duplicates.
Last edited by Magister on Oct 22, 2011 - 02:14 PM; edited 1 time in total
|
| |
|
|
|
|
|
Posted: Oct 21, 2011 - 10:38 PM |
|


Joined: Dec 21, 2006
Posts: 1483
Location: Saar-Lor-Lux
|
|
|
Magister wrote:
Code:
const char* str;
str = (const __pgm char*) "0123";
lcd_print (str);
Maybe you misunderstood how it works. The code above will put the string literals into RAM and access them from RAM (as your lcd_print accesses RAM). The cast is superfluous and only causes confusion.
To access RAM, you write
Code:
char lcd_print (const char *str)
{
    return *str;
}
and to access flash
Code:
char lcd_print_P (const char __pgm *str)
{
    return *str;
}
Pointers are still 16 bits wide, but there is a second flavour of pointers.
Calling these functions is straightforward. But initializing pointers to flash with string literals that shall be located in flash, in particular, is not straightforward and is error-prone:
Code:
#include <avr/pgmspace.h>

#define PGM_STR(X) ((const __pgm char[]) { X })

char const __pgm *gstr = PGM_STR ("123");
char const __pgm text[] = "abc";

void foo (const char __pgm *str2)
{
    void lcd_print_P (const char __pgm*);
    const char __pgm *str = PSTR ("0123");
    static const char __pgm stext[] = "abc";
    str2 = PSTR ("abc");
    lcd_print_P (str);
    lcd_print_P (str2);
    lcd_print_P (gstr);
    lcd_print_P (text);
    lcd_print_P (stext);
}
All this will work.
However, the following does not work (or rather, it works as doomed by the standard):
Code:
const char __pgm *str = "foo";
The new standard extension condemns string literals like the one above to live in the default address space, which is .rodata, which is RAM.
String merging should work for strings in .rodata, but merging for strings in .progmem and string literals is not yet complete and is low priority.
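The working file-scope cases above can be condensed into a sketch that also builds on a host. It assumes a toolchain in which the qualifier is spelled `__flash` and announced by the built-in macro `__FLASH`, which to my knowledge is the spelling used by released avr-gcc versions, whereas the pre-release build in this thread uses `__pgm`. `FLASH_QUAL`, `FLASH_STR` and `flash_strlen` are hypothetical names for this example, and the empty fallback only exists so the code compiles without the address space.

```c
#include <stddef.h>

/* Assumption: __FLASH is defined exactly when the __flash address
   space is available; otherwise fall back to plain const data.  */
#if defined (__FLASH)
#  define FLASH_QUAL __flash
#else
#  define FLASH_QUAL
#endif

/* File-scope compound literal, same idea as PGM_STR above.  Intended
   for use outside of functions only.  */
#define FLASH_STR(X) ((const FLASH_QUAL char[]) { X })

/* Pointer to a string that lives in the qualified address space.  */
const FLASH_QUAL char *gstr = FLASH_STR ("123");

/* Array that lives in the qualified address space.  */
const FLASH_QUAL char text[] = "abc";

/* The pointer type carries the address space, so a plain dereference
   is enough; no pgm_read_byte() call is written out.  */
size_t flash_strlen (const FLASH_QUAL char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}
```

A bare string literal passed to `flash_strlen` would still end up in the default address space on AVR, which is exactly the trap described above.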
|
|
| |
|
|
|
|
|
Posted: Oct 22, 2011 - 02:12 PM |
|

Joined: Aug 06, 2008
Posts: 141
Location: Montréal, QC
|
|
*oops* yes I misunderstood, I will redo some tests, thanks for all the info!
|
|
| |
|
|
|
|
|
Posted: Oct 22, 2011 - 07:25 PM |
|


Joined: Dec 21, 2006
Posts: 1483
Location: Saar-Lor-Lux
|
|
Just compiled and ran my first example.
It ran on an ATmega168:
Code:
#include <stdio.h>

#define _PGM const __pgm
#define PGM_STR(X) ((_PGM char[]) { X })

typedef struct s_tree
{
    char _PGM * val;
    struct s_tree * left;
    struct s_tree _PGM * right;
} tree_t;

tree_t A = { PGM_STR ("a"), NULL, NULL };
tree_t _PGM B = { PGM_STR ("b"), NULL, NULL };
tree_t C = { PGM_STR ("c"), NULL, NULL };
tree_t _PGM D = { PGM_STR ("d"), NULL, NULL };
tree_t AB = { PGM_STR ("A"), &A, &B };
tree_t _PGM CD = { PGM_STR ("C"), &C, &D };
tree_t _PGM H = { PGM_STR ("*"), &AB, &CD };

void print_tree_P (tree_t _PGM*);
void print_tree (tree_t*);

void print_tree_P (tree_t _PGM * t)
{
    if (!t)
        return;
    printf ("[%c]", *t->val);
    print_tree (t->left);
    print_tree_P (t->right);
}

void print_tree (tree_t * t)
{
    if (!t)
        return;
    printf ("[%c]", *t->val);
    print_tree (t->left);
    print_tree_P (t->right);
}

void testit (void)
{
    printf ("\nStart\n");
    print_tree_P (&H);
    printf ("\nDone\n");
}
It's a tree structure that is scattered over flash and RAM: the left children (A, AB, C) are in RAM; the head and the right children (H, CD, B, D) are located in flash.
printf is set up to print to UART. The output is:
Code:
Start
[*][A][a][b][C][c][d]
Done
So at least this small example works fine
At first I got garbage because I wrote
Code:
printf ("[%s]", t->val);
However, printf's %s reads from RAM but gets a pointer into flash, so I changed it to
Code:
printf ("[%c]", *t->val);
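For strings longer than one character, one possible workaround is to copy the flash string into a small RAM buffer first and hand that buffer to %s. This is only a sketch under assumptions: the qualifier is spelled `__flash` and detected via `__FLASH` (the thread's build uses `__pgm` instead), and `FLASH_QUAL`, `flash_to_ram` and `print_flash_string` are hypothetical helper names, not part of avr-libc.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumption: __FLASH is defined when the __flash address space is
   available; the empty fallback lets the sketch build on a host.  */
#if defined (__FLASH)
#  define FLASH_QUAL __flash
#else
#  define FLASH_QUAL
#endif

/* Copy at most size-1 characters from a flash string into the RAM
   buffer dst and always terminate it.  Returns dst.  */
char *flash_to_ram (char *dst, const FLASH_QUAL char *src, size_t size)
{
    size_t i = 0;

    if (size == 0)
        return dst;

    while (i + 1 < size && src[i] != '\0')
    {
        dst[i] = src[i];
        i++;
    }
    dst[i] = '\0';
    return dst;
}

/* %s now sees a RAM pointer, so it reads what was intended.  Strings
   longer than the buffer are truncated.  */
void print_flash_string (const FLASH_QUAL char *s)
{
    char buf[16];
    printf ("[%s]", flash_to_ram (buf, s, sizeof buf));
}
```

The cost is a bit of stack for the buffer and the extra copy, so for single characters the `%c` approach above is cheaper.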
|
|
|
| |
|
|
|
|
|
Posted: Oct 24, 2011 - 02:54 PM |
|

Joined: Aug 06, 2008
Posts: 141
Location: Montréal, QC
|
|
| @SprinterSB: as this thread is about 4.6.1 and LTO I posted a new thread about 4.7 and PGM |
|
|
| |
|
|
|
|
|
Posted: Nov 02, 2011 - 11:18 AM |
|

Joined: Aug 18, 2006
Posts: 137
|
|
|
abcminiuser wrote:
LTO and other optimizations should make for smaller binaries, which can lead to faster programs and/or cheaper designs.
- Dean
At the risk of being repetitive: if you want smaller binaries through smarter linking, just use standard C.
That is, #include your source files into a single compilation unit.
This mechanism is standard, cross-platform, portable, and has worked for the past 30+ years: K&R included the mechanism in the first versions of C, and K&R wrote about it. It works with all versions of WinAVR.
Personally, I'm a little irritated when I see people using non-standard, platform-specific work-arounds for something that is already well supported in standard C: it makes my work as a cross-platform maintenance programmer just that extra bit more difficult.
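As a rough sketch of what such a single compilation unit looks like: the file names all.c, filter.c and app.c are hypothetical, and the two "files" are written out inline here only to keep the example self-contained. In a real project the body of all.c would consist of the two #include lines shown in the comment.

```c
/* all.c: hypothetical "everything in one compilation unit" file.
   In a real project this file would just contain

       #include "filter.c"
       #include "app.c"

   The contents are pasted inline below for the sake of the example.  */

/* ---- contents of the hypothetical filter.c ---- */

/* Because the compiler sees the only caller in the same unit, this
   helper can be static, which allows inlining and lets the compiler
   drop the out-of-line copy.  */
static int filter_scale (int x)
{
    return 3 * x + 1;
}

/* ---- contents of the hypothetical app.c ---- */

int app_step (int sample)
{
    return filter_scale (sample) - 1;
}
```

The price is that all file-local names now share one namespace, so two source files can no longer both define, say, a static helper called `init`.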
|
|
| |
|
|
|
|
|
Posted: Nov 02, 2011 - 11:43 AM |
|


Joined: Jul 18, 2005
Posts: 62228
Location: (using avr-gcc in) Finchingfield, Essex, England
|
|
|
Quote:
That is, #include your source files into a single compilation unit.
Why do that when --combine -fwhole-program exists? |
|
| |
|
|
|
|
|
Posted: Nov 02, 2011 - 12:06 PM |
|


Joined: Jul 23, 2001
Posts: 2437
Location: Osnabrueck, Germany
|
|
|
|
|
|
|
Posted: Nov 02, 2011 - 12:18 PM |
|


Joined: Jul 18, 2005
Posts: 62228
Location: (using avr-gcc in) Finchingfield, Essex, England
|
|
| The avr-gcc currently in widest use is the 4.3.3 in WinAVR so surely it's still there in that? No doubt folks will prefer the LTO solution but only when there's a general distribution widely available that contains it. |
|
| |
|
|
|
|
|
|
|
|