gcc copying anonymous literals to SRAM

When gcc-avr encounters an anonymous literal like printf("David"), the "David" is copied to SRAM so that printf() is always passed an SRAM address.

I would have assumed that:

printf("hello world"); // string in CODE
char string[] = "universe"; // string in DATA
char empty[10]; // string in BSS

The default behaviour means that using normal functions like strcmp() will always be looking at strings in SRAM.

The alternatives are having generic pointers or having different functions like strcmp() and strcmp_P().

However it does use up valuable SRAM.

Now I know that I can wrap literals with PSTR() etc.
And I can use modifiers to force stuff into CODE. And use pgm_read_byte() etc to access them.
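As a rough illustration of the PSTR() and pgm_read_byte() route mentioned above, here is a minimal sketch. The function names is_start_command and copy_from_flash are hypothetical, and the #else branch is only an assumed host-side fallback so the same file builds off-target; on an AVR the real definitions come from <avr/pgmspace.h>.

```c
#include <stddef.h>
#include <string.h>

#ifdef __AVR__
#include <avr/pgmspace.h>   /* real PSTR, pgm_read_byte, strcmp_P */
#else
/* Host fallback (assumption, for testing only): one flat address space. */
#define PSTR(s)          (s)
#define pgm_read_byte(p) (*(const unsigned char *)(p))
#define strcmp_P(a, b)   strcmp((a), (b))
#endif

/* Return 1 if the SRAM string cmd equals the flash-resident literal "start". */
int is_start_command(const char *cmd)
{
    return strcmp_P(cmd, PSTR("start")) == 0;
}

/* Copy a flash string into an SRAM buffer one byte at a time.
 * Always NUL-terminates when dst_size > 0; returns the number of
 * characters copied, not counting the terminator. */
size_t copy_from_flash(char *dst, size_t dst_size, const char *flash_src)
{
    size_t i = 0;

    if (dst_size == 0)
        return 0;

    while (i + 1 < dst_size) {
        char c = (char)pgm_read_byte(flash_src + i);
        if (c == '\0')
            break;
        dst[i++] = c;
    }
    dst[i] = '\0';
    return i;
}
```

The literal "start" never occupies SRAM here; only the caller's buffer does, and only while it is needed.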

My questions are:

1. Is there a gcc-avr switch to leave literals in FLASH, where they belong?

2. Is there a ld-avr switch to tell you when you have run out of SRAM?

3. Is there a tidy way of writing legible code that forces gcc-avr to put literals in CODE without switches?

4. Have I missed everything in the gcc-avr documentation?

David.


Well, it seems you found the documentation on *_P, but missed the *why* of it all.

Memory accesses to program flash require the compiler to generate different code. The C language specification requires particular semantics for char* parameter declarations (in particular the callee, in general, can modify the target of the char*). On separately compiled modules, the compiler has no knowledge of how data is declared in other modules. Stir those three facts together, and think for a while. When you have a more elegant solution than the _P functions and macros, let us know.

In the meantime, programming for AVRs requires thoughtful design. avr-gcc is very careful to implement standard C semantics -- a very good thing IMHO -- but it means that there will be warts because standard C is an awkward fit on Harvard architectures.


David,

Methinks you need to read this:

https://www.avrfreaks.net/index.p...


clawson wrote:
David,

Methinks you need to read this:

https://www.avrfreaks.net/index.p...

Thanks for your reply. Yes I have read both the tutorial and the threads.

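I have solved my porting problem by wrapping every function that takes anonymous literal parameters in macros, e.g.:

/* Call with double parentheses: D(("x=%d", x)); */
#define D(x) GNU_PRINTF x
/* ## is the GNU extension that drops the trailing comma when no arguments follow fmt */
#define GNU_PRINTF(fmt, ...) printf_P(PSTR(fmt), ##__VA_ARGS__)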

which is no great problem because I only use printf via the debugging macro. The macro is conditional to the platform and the source code is unaltered.
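To make the "conditional to the platform" idea concrete, here is a hedged sketch of what such a common header might contain. DBG_FORMAT and format_status are hypothetical names, and snprintf_P is used instead of printf_P only so the result lands in a buffer that can be inspected. This version uses plain __VA_ARGS__, so at least one argument must follow the format string.

```c
#include <stdio.h>

#ifdef __AVR__
#include <avr/pgmspace.h>
/* AVR build: the format string stays in flash and snprintf_P reads it there. */
#define DBG_FORMAT(buf, size, fmt, ...) \
    snprintf_P((buf), (size), PSTR(fmt), __VA_ARGS__)
#else
/* Any other platform: plain snprintf, format string wherever the compiler puts it. */
#define DBG_FORMAT(buf, size, fmt, ...) \
    snprintf((buf), (size), fmt, __VA_ARGS__)
#endif

/* The calling code is identical on every platform. */
int format_status(char *buf, size_t size, int code)
{
    return DBG_FORMAT(buf, size, "status=%d", code);
}
```

Only the header knows about PSTR(); the source files keep an ordinary-looking call.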

My other functions that expect to have pointers to CODE space have to be divided into macros to handle literals and pointers:

#define funclit(x)    func_P(PSTR(x))
#define funcptr(x)    func_P(x)
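A minimal sketch of how the two wrappers might be used, assuming a func_P that simply measures a flash string (the real func_P is not shown in this thread, so this body is purely illustrative). The #else branch is an assumed host fallback for building off-target.

```c
#include <stddef.h>

#ifdef __AVR__
#include <avr/pgmspace.h>
#else
/* Host fallback (assumption, for testing only). */
#define PROGMEM
#define PSTR(s)          (s)
#define pgm_read_byte(p) (*(const unsigned char *)(p))
#endif

/* Illustrative func_P: length of a NUL-terminated string held in flash. */
size_t func_P(const char *flash_str)
{
    size_t n = 0;
    while (pgm_read_byte(flash_str + n) != 0)
        n++;
    return n;
}

#define funclit(x) func_P(PSTR(x))   /* x is a string literal          */
#define funcptr(x) func_P(x)         /* x already points into flash    */

/* A named string placed in flash, for the pointer form. */
static const char banner[] PROGMEM = "ready";

size_t literal_length(void) { return funclit("hello"); }
size_t banner_length(void)  { return funcptr(banner); }
```

The split exists because PSTR() only accepts a literal; a pointer that already refers to flash must be passed through untouched.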

If I had more conventional code that called printf directly, this wrapping would not work, because printf has already been declared in <stdio.h>.
That is, it is not nice to #define printf GNU_PRINTF
just to be able to put the PSTR() wrapper around the fmt.

My main point is that anonymous literals have no address and certainly should not be altered. If you try this under Unix you generally get memory exceptions.

I have no problem with addressable const data being copied to SRAM by default. It is up to the user to put it somewhere else if she wants to.

I await the magic bullet. Just one switch on the compiler.

David.


dbc wrote:
Well, it seems you found the documentation on *_P, but missed the *why* of it all.

Memory accesses to program flash require the compiler to generate different code. The C language specification requires particular semantics for char* parameter declarations (in particular the callee, in general, can modify the target of the char*).

Where in the C language does it specify that you can alter const data? Especially const anonymous data.

I agree that there are ways of doing so, but this does not make it legal. It is also why you have memory protection schemes, the simplest of which is ROM.

Most popular microcontrollers have a Harvard architecture. They are also short of resources. Most compilers bear this in mind.

In my other reply, I explained how I have attempted to get round my problem.

I hope to get some insight into some better methods for doing this.

David.


Quote:
Where in the C language does it specify that you can alter const data? Especially const anonymous data.

void foo(char *s) { ... }        // compiler must assume foo can modify data at s
void foo(const char *s) { ... }  // foo cannot modify data at s

Whether or not a literal is passed into foo via s is not, in general, known when foo is compiled. Are you expecting the linker to do code selection? Remember, it takes different code to access ROM'ed constants.
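The "different code" point can be sketched with two versions of the same routine. On an AVR the first compiles to ordinary data loads (LD) and the second to program-memory loads (LPM), so a separately compiled callee has to be written for one space or the other. The names count_char and count_char_P are hypothetical, and the #else branch is an assumed host fallback.

```c
#include <stddef.h>

#ifdef __AVR__
#include <avr/pgmspace.h>
#else
/* Host fallback (assumption, for testing only): flash and RAM are one space. */
#define PSTR(s)          (s)
#define pgm_read_byte(p) (*(const unsigned char *)(p))
#endif

/* SRAM version: each byte is fetched with an ordinary data load. */
size_t count_char(const char *s, char target)
{
    size_t n = 0;
    size_t i;

    for (i = 0; s[i] != '\0'; i++) {
        if (s[i] == target)
            n++;
    }
    return n;
}

/* Flash version: same logic, but every byte goes through pgm_read_byte. */
size_t count_char_P(const char *flash_s, char target)
{
    size_t n = 0;
    size_t i;

    for (i = 0; ; i++) {
        char c = (char)pgm_read_byte(flash_s + i);
        if (c == '\0')
            break;
        if (c == target)
            n++;
    }
    return n;
}
```

Both take a const char *, so the type system cannot tell them apart; passing a flash address to count_char on an AVR would silently read unrelated SRAM.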

Quote:
If you try this under Unix you generally get memory exceptions.

That's an artifact of having page tables and the option of independent read, write, and execute protection flags in modern page table entries. It's a convenience, not part of the C spec.

What I hear you saying is that gcc doesn't compile C the way you want it to. Unfortunately for you, it compiles C the way it is supposed to (in most cases :)


dbc wrote:
Quote:
What I hear you saying is that gcc doesn't compile C the way you want it to. Unfortunately for you, it compiles C the way it is supposed to (in most cases :)

I take your point.

C for microcontrollers inevitably involves compromises. I have found my own workaround. I hope to receive better solutions from this forum.

Without the gcc workarounds, my application takes about 7000 words with CodeVision and 8000 words with gcc. But more importantly it will run on an 8051 with 1k internal XRAM or a Mega16 with 1k SRAM. Using gcc requires 2k SRAM and hence a Mega32. Which is overkill for code size.

David


The problem is, any workaround which you might use to make literals remain in Flash instead of being duplicated in SRAM will unavoidably cause those literals to be useless for any function that expects a char* argument type. Which clearly would be in contradiction of the C specification.

My typical workaround is to avoid the use of anonymous string literals in the first place. I just always use named string variables, and apply the PROGMEM attribute. This can be easily abstracted to some other form using macros if I ever need to port the code to some other compiler/architecture.
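A hedged sketch of that named-string approach, with the PROGMEM attribute hidden behind macros so the same source builds elsewhere. FLASH_STRING, FLASH_READ, and load_message are hypothetical names chosen for this example, not part of avr-libc.

```c
#include <stddef.h>

#ifdef __AVR__
#include <avr/pgmspace.h>
#define FLASH_STRING(name, text) static const char name[] PROGMEM = text
#define FLASH_READ(p)            pgm_read_byte(p)
#else
/* Other compilers/architectures: an ordinary const array and a plain read. */
#define FLASH_STRING(name, text) static const char name[] = text
#define FLASH_READ(p)            (*(const unsigned char *)(p))
#endif

/* Named strings instead of anonymous literals. */
FLASH_STRING(msg_ok, "OK");
FLASH_STRING(msg_error, "ERROR");

/* Copy the selected message into an SRAM buffer.
 * Returns the number of characters copied, not counting the terminator. */
size_t load_message(char *dst, size_t dst_size, int ok)
{
    const char *src = ok ? msg_ok : msg_error;
    size_t i = 0;

    if (dst_size == 0)
        return 0;

    while (i + 1 < dst_size) {
        char c = (char)FLASH_READ(src + i);
        if (c == '\0')
            break;
        dst[i++] = c;
    }
    dst[i] = '\0';
    return i;
}
```

Porting to another toolchain then means editing the two macro definitions rather than every string in the program.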


lfmorrison wrote:
The problem is, any workaround which you might use to make literals remain in Flash instead of being duplicated in SRAM will unavoidably cause those literals to be useless for any function that expects a char* argument type. Which clearly would be in contradiction of the C specification.

My typical workaround is to avoid the use of anonymous string literals in the first place. I just always use named string variables, and apply the PROGMEM attribute. This can be easily abstracted to some other form using macros if I ever need to port the code to some other compiler/architecture.

Thanks to everyone for your comments and advice.

The semantics of printf(const char *format, ...) can be handled in different ways by microcontroller C compilers.

Now the const should mean that format is NOT modified by the printf() function. 99% of the time this format string is an anonymous literal. Only about 1% of the time (in my experience) is the format a built-up string.

I agree that forcing the user to differentiate between printf() and printf_P() is the downside of not having generic pointers. The default position would always be to use the common name which requires a SRAM pointer.

The CodeVision compiler uses printf(flash char *format, ...) because 99% of calls use a literal format. And you have to use the putsf("David") variant of puts(var_david) as compared to gcc using puts_P(PSTR("David")).

For the 1% of printf(format) calls, CV requires you to write your own printf_S(char *sram_fmt, ...) function.

My conclusions for portable code:

1. Generic pointers would be the most transparent solution but inefficient.

2. Copying anonymous data to SRAM to ensure transparent use of library functions is very wasteful of SRAM.

3. The CodeVision behaviour that forces you to use the specific func_P() variant of func() takes some getting used to, but it is efficient in SRAM.

4. Gcc does not have a switch for literal storage.

5. Wrapping functions with macros is a practical solution. The macros can live in a common header file, leaving the source files looking conventional.
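To illustrate why conclusion 1 calls generic pointers transparent but inefficient, here is a minimal hand-rolled sketch. The types mem_space_t and generic_ptr_t and the functions below are hypothetical; a compiler with built-in generic pointers would do the equivalent internally. Every byte access pays for a run-time test of the address space, and the pointer itself grows by the size of the tag.

```c
#include <stddef.h>

#ifdef __AVR__
#include <avr/pgmspace.h>
#else
/* Host fallback (assumption, for testing only). */
#define PSTR(s)          (s)
#define pgm_read_byte(p) (*(const unsigned char *)(p))
#endif

/* Which memory the address refers to. */
typedef enum { SPACE_SRAM, SPACE_FLASH } mem_space_t;

/* A "generic" pointer: an address plus a tag naming its address space. */
typedef struct {
    const char *addr;
    mem_space_t space;
} generic_ptr_t;

/* Read one byte, choosing the access method at run time. */
static unsigned char generic_read(generic_ptr_t p, size_t offset)
{
    if (p.space == SPACE_FLASH)
        return pgm_read_byte(p.addr + offset);
    return (unsigned char)p.addr[offset];
}

/* One strlen that works for either space, at the cost of a branch per byte. */
size_t generic_strlen(generic_ptr_t p)
{
    size_t n = 0;
    while (generic_read(p, n) != 0)
        n++;
    return n;
}
```

A single generic_strlen replaces the strlen()/strlen_P() pair, which is the transparency; the per-byte branch and the wider pointer are the inefficiency.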

David.