[request for comment] zero-stripped clone of PSTR macro

This is, again, a suggestion for a minor enhancement.

In small embedded programs, fixed-length strings often have to be output, for example to an LCD. Some of the functions that do this take a string as an argument but ignore the trailing zero anyway.

To save FLASH, a string that is to be placed in FLASH can be defined and initialised like this:

char s[strlen("ABCD")] = "ABCD";

This occupies exactly 4 bytes in FLASH.

By analogy with the PSTR macro and the memcpy_P() function, and using this feature, I propose adding the following macros next to them (avr-libc defines both in <avr/pgmspace.h>):

# define PSNZ(s) (__extension__({static char __c[__builtin_strlen(s)] PROGMEM = (s); &__c[0];}))
# define PSNZcpy(m, s) memcpy_P((uint8_t *)(m), PSNZ(s), __builtin_strlen(s))

So, now, if one has a function printing 4 bytes from FLASH, it is as easy to use as Print4BytesFromFlashToLCD(PSNZ("ABCD"));

These can of course be defined also for the "far" memory versions along the outlines sketched here:

# define PFSNZ(s)  (__extension__({static char __c[__builtin_strlen(s)] PROGMEM_FAR = (s);  GET_FAR_ADDRESS(__c[0]);}))
# define PS1SNZ(s) (__extension__({static char __c[__builtin_strlen(s)] PROGMEM_SEG1 = (s); PROGMEM_SEG1_BASE + (uint16_t)&__c[0];}))
# define PS2SNZ(s) (__extension__({static char __c[__builtin_strlen(s)] PROGMEM_SEG2 = (s); PROGMEM_SEG2_BASE + (uint16_t)&__c[0];}))
# define PS3SNZ(s) (__extension__({static char __c[__builtin_strlen(s)] PROGMEM_SEG3 = (s); PROGMEM_SEG3_BASE + (uint16_t)&__c[0];}))

# define  PFSNZcpy(m, s) memcpy_PF((uint8_t *)(m), PFSNZ(s),  __builtin_strlen(s))
# define PS1SNZcpy(m, s) memcpy_PF((uint8_t *)(m), PS1SNZ(s), __builtin_strlen(s))
# define PS2SNZcpy(m, s) memcpy_PF((uint8_t *)(m), PS2SNZ(s), __builtin_strlen(s))
# define PS3SNZcpy(m, s) memcpy_PF((uint8_t *)(m), PS3SNZ(s), __builtin_strlen(s))

Comments, please.

Jan Waclawek

PS. For those wondering whether this is legal - in C99 it is: I've been taught recently too...
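
For readers who want to check the rule mentioned in the PS on a desktop compiler: C99 (6.7.8, paragraph 14) lets a string literal initialise a char array that has no room for the terminating zero, in which case the zero is simply dropped. The following is a minimal host-side sketch with no PROGMEM involved; `label` and `label_is_unterminated` are names made up for this illustration. It is valid C, but C++ rejects it.

```c
#include <assert.h>
#include <string.h>

/* Exactly four elements: the literal's terminating zero does not fit
   and is dropped, which C99 permits (C++ treats this as an error). */
static const char label[4] = "ABCD";

/* Returns 1 if the array is four bytes long and holds exactly "ABCD". */
static int label_is_unterminated(void)
{
    return sizeof label == 4 && memcmp(label, "ABCD", 4) == 0;
}
```

Because nothing marks the end of `label`, it must never be handed to strlen(), strcpy() or similar routines.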

This is interesting but I'm not sure how substantial the savings would be in an average program.

Jan said that it would be mostly for smaller AVRs with a lot of text stored. I could see it as a good tool for use in some cases.

My concern (and it is a small one) is that there is no protection against using the NZ strings with routines that expect null-terminated strings. Since this is all based on macros, there's no typing of any kind to help the newbies out.

Yeah, I know: that's heresy in some circles. C does seem to follow the attitude "throw 'em in the water and if the sharks get 'em, good". Ergo C++ and strong(er) typing.

On the whole, I see no problem with the routines.

Stu

Engineering seems to boil down to: Cheap. Fast. Good. Choose two. Sometimes choose only one.

Newbie? Be sure to read the thread Newbie? Start here!

stu_san wrote:
Jan said that it would be mostly for smaller AVRs with a lot of text stored. I could see it as a good tool for use in some cases.

Only in particular cases, namely if there are multiple strings of the same length to be processed by a common routine which knows that length.

The PSNZcpy macro as given above is a "good bad" example: it spends 4 FLASH bytes on loading the size in exchange for the 1 byte saved :-( . I'll rework it tomorrow, as "flash-string-copied-to-RAM" was the original incentive to start working on this.

I see the concern with these macros being of non-C-string-spirit, yet I believe they have some merit. It's late now here, but I have some ideas; I want/need to work on them anyway, will report back.

JW

Attached is a zero-stripping version of strcpy() that copies a string from near/far FLASH without the trailing zero.

Here, the string in FLASH *is* zero-terminated, so that the terminator provides the size information. This should produce shorter code when used on individual strings, because setting up the size parameter of the memcpy() that the zero-stripped strings would need costs much more than one byte. The zero-stripped strings are still better if they are all of the same length and are used in a function which KNOWS their size.

Usage example:

#define LCD_WIDTH 20
char buf[LCD_WIDTH];

memset(buf, ' ', LCD_WIDTH);
strcpynz_PF(buf, PFSTR("Hello world!"));
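
The attachment is not reproduced in this thread, so what follows is only a host-side sketch of what a zero-stripping copy might look like, not Jan's code. It works on plain pointers rather than FLASH addresses; `strcpynz_sketch` and `format_line` are hypothetical names. On an AVR the source byte would presumably be fetched with `pgm_read_byte()` or `pgm_read_byte_far()` instead of a plain dereference.

```c
#include <assert.h>
#include <string.h>

#define LCD_WIDTH 20

/* Copy the characters of a zero-terminated source into dst, but do not
   write the terminating zero. Returns a pointer just past the last
   byte written. */
static char *strcpynz_sketch(char *dst, const char *src)
{
    while (*src != '\0')
        *dst++ = *src++;
    return dst;
}

/* Fill one display line with blanks, then overlay the text.
   The caller must make sure the text fits in LCD_WIDTH characters. */
static void format_line(char buf[LCD_WIDTH], const char *text)
{
    assert(strlen(text) <= LCD_WIDTH);
    memset(buf, ' ', LCD_WIDTH);
    strcpynz_sketch(buf, text);
}
```

Since no terminator is written, the blanks laid down by memset() survive after the text, which is the point of the usage example above.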

JW

Jan, I am thinking of using this idea in my code. Not that I need to save space (I'm using a mega2560), but the waste of bytes irritates me a bit.

The case: Parse tables. My commands are built up from tokens (words):

#define TOKEN_TABLE(t) ctt_ ## t

#define TOKEN(t,s) \
	const char TOKEN_TABLE(t) [] PROGMEM = s; 

/* the following are in alphabetical order -- the token table is NOT */

TOKEN( ACCEL,    "acce" )	// acceleration
TOKEN( ACQUIRE,  "acqu" )	// acquire, acquisition, ...
TOKEN( ADCNV,    "adc " )
TOKEN( AGC,      "agc " )
TOKEN( ALL,      "all " )
...
TOKEN( GET,      "get " )
...
TOKEN( SAVE,     "save" )
...
TOKEN( SET,      "set " )

I currently have 169 "words" that the parser recognizes. Since the words are always 4 bytes long, a compare is pretty easy. Your copy (or, actually a size-limited compare) would work a treat in this case.
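
A size-limited compare over fixed-width, unterminated entries could look roughly like the sketch below. It is written for a desktop compiler with the table in ordinary memory; `tokens`, `TOKEN_LEN` and `find_token` are hypothetical names, and the table holds only a few of the words listed above. On an AVR with the table in FLASH, avr-libc's `memcmp_P()` would presumably take the place of `memcmp()`.

```c
#include <assert.h>
#include <string.h>

#define TOKEN_LEN 4

/* Every entry is exactly TOKEN_LEN bytes; the terminating zero of each
   literal is dropped, so the table costs 4 bytes per word. */
static const char tokens[][TOKEN_LEN] = {
    "acce", "acqu", "adc ", "agc ", "all "
};

enum { TOKEN_COUNT = sizeof tokens / sizeof tokens[0] };

/* Return the index of the matching entry, or -1 if there is none.
   word must point to at least TOKEN_LEN readable bytes. */
static int find_token(const char *word)
{
    assert(word != NULL);
    for (int i = 0; i < TOKEN_COUNT; i++) {
        if (memcmp(tokens[i], word, TOKEN_LEN) == 0)
            return i;
    }
    return -1;
}
```

With 169 words the saving over zero-terminated entries would be one byte per word, at the price of padding shorter words with blanks.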

On the other hand, I'm a strong proponent of "If it ain't broken, don't fix it". It ain't broken, so...

Stu

Stu-san,

4 bytes... I know it's perverse, but wouldn't that be a nice case for (ab)using a cast of the "strings" to uint32_t? ;-) Or better not.

Ain't broken but 'm'fraid it soon will... I am using a M2561 and today I've crossed the magic borderline of 128kB code (so you'll hear me ranting about that broken trampoline in linker - coz' it IS broken - soon), "segment" 3 is already occupied by config data and bootloader and the list of requested features discussed today was more than long... :-(

Jan

Jan wrote:
4 bytes... I know it's perverse, but wouldn't that be a nice case for (ab)using a cast of the "strings" to uint32_t? ;-) Or better not.
AAIIIEEE! (fingers in ears, eyes closed: La la la la la la!) :wink:

Jan wrote:
Ain't broken but 'm'fraid it soon will... I am using a M2561 and today I've crossed the magic borderline of 128kB code (so you'll hear me ranting about that broken trampoline in linker - coz' it IS broken - soon), "segment" 3 is already occupied by config data and bootloader and the list of requested features discussed today was more than long... :-(
Oh, it's not so bad here on the dark side. We have cookies! 8)

Send me a PM if you think I can help with something, although I've posted (and I think you've read) most of what I know.

Stu

stu_san wrote:
Jan wrote:
4 bytes... I know it's perverse, but wouldn't that be a nice case for (ab)using a cast of the "strings" to uint32_t? ;-) Or better not.
AAIIIEEE! (fingers in ears, eyes closed: La la la la la la!) :wink:
This cast?
(uch[0]+0x100*uch[1]+0x10000*uch[2]+0x1000000*uch[3])
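
The expression above assembles a little-endian 32-bit value from four bytes. One caveat: with avr-gcc's 16-bit int, `0x100*uch[1]` is evaluated in int and can overflow for byte values of 0x80 and above, which plain ASCII tokens never reach. A sketch that avoids the issue by widening each byte first might look like this; `pack4` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Combine four bytes into one little-endian 32-bit value.
   Each byte is widened to uint32_t before shifting, so the result
   does not depend on the width of int. */
static uint32_t pack4(const unsigned char *p)
{
    assert(p != 0);
    return  (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Unlike a pointer cast to `uint32_t *`, this form makes no assumption about alignment or byte order of the machine it runs on.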

Iluvatar is the better part of Valar.

Stu-san: DON'T LOOK NOW:

#define TOKEN_TABLE(t) ctt_ ## t

#define TOKEN(t,s) \
   const char TOKEN_TABLE(t) [4] PROGMEM = s;

/* the following are in alphabetical order -- the token table is NOT */

TOKEN( ACCEL,    "acce" )   // acceleration
TOKEN( ACQUIRE,  "acqu" )   // acquire, acquisition, ...
TOKEN( ADCNV,    "adc " )
TOKEN( AGC,      "agc " )

#define TOKEN_CMP(t, s) (pgm_read_dword((const PROGMEM uint32_t *)(TOKEN_TABLE(t))) == *(uint32_t *)(s))

s = xscanf("xyz");  // get user input
if (TOKEN_CMP(ACQUIRE, s)) {
  // action on ACQUIRE
}

:-)

Okay, frustration level lower now, thanks for the fun.

Jan

You're welcome, Jan!

The rest of the parse code is pedestrian. I've thought about parse trees, lexical analyzers, and such, but this is fast enough for now. As you'd guessed, the token matching looks like:

    for (token = 0; token <= (uint8_t) LAST_CMD_TOKEN; token++ )
    {
        memcpy_P( &p, &CmdTokens[(portSHORT)token], sizeof( PGM_P ) );
        if ( strcasecmp_P( pTstring, p ) == 0 )
        {
            return token;
        }
    }

But I like the idea of upper-casing the string on input, then comparing against the uint32_t. I may play with that idea.
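
One way the upper-case-then-compare idea might be sketched on a desktop compiler is shown below. It assumes the token table would be stored in upper case and padded with blanks to four characters, which is not how the table earlier in the thread is written; `token_key` and `TOKEN_LEN` are hypothetical names.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>

#define TOKEN_LEN 4

/* Build a 32-bit key from the first TOKEN_LEN characters of word:
   each character is upper-cased, a shorter word is padded with blanks,
   and the bytes are packed little-endian. */
static uint32_t token_key(const char *word)
{
    uint32_t key = 0;

    assert(word != 0);
    for (int i = 0; i < TOKEN_LEN; i++) {
        unsigned char c = ' ';              /* padding for short words */
        if (*word != '\0')
            c = (unsigned char)toupper((unsigned char)*word++);
        key |= (uint32_t)c << (8 * i);
    }
    return key;
}
```

Matching a command word then reduces to one 32-bit comparison per table entry, for example `token_key(input) == token_key("GET ")`.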

Thanks for the ideas, guys!

Stu