How useful is compressing contents in progmem section?

#1

On one of those long nights where the mind wanders aimlessly while you're endlessly counting sheep and desperately trying to sleep, this idea popped into my mind. Given that user code accesses progmem data through library routines (before GCC 4.7 at least), why not have the toolchain compress the contents and have the progmem routines decompress it on the fly?

So I hacked on the GNU Binutils linker and got a naive brute-force algorithm working, which basically builds a dictionary of repeated strings and replaces each occurrence of a string with its dictionary code. It only works with char data in the 0-127 range, and uses 128-255 for the dictionary codes. Having been forced to make a bunch of memory/time trade-offs, I thought I'd ask around before I go any further.
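
To make the idea concrete, the decode side could look roughly like this. It's only a sketch: the symbol names are invented for illustration and are not what my linker hack actually emits.

#include <avr/pgmspace.h>
#include <stdint.h>

/* Dictionary emitted by the (hypothetical) linker pass: dict_offsets[i]
 * is the start of entry i inside dict_data, each entry NUL-terminated. */
extern const uint16_t dict_offsets[] PROGMEM;
extern const uint8_t  dict_data[] PROGMEM;

/* Copy a compressed progmem string into RAM, expanding dictionary codes. */
char *decompress_strcpy_P(char *dst, const char *src)
{
    char *out = dst;
    uint8_t c;

    while ((c = pgm_read_byte(src++)) != 0) {
        if (c < 0x80) {
            *out++ = c;                          /* literal byte, 0-127 */
        } else {                                 /* 128-255: dictionary code */
            uint16_t off = pgm_read_word(&dict_offsets[c - 0x80]);
            uint8_t d;
            while ((d = pgm_read_byte(&dict_data[off++])) != 0)
                *out++ = d;
        }
    }
    *out = '\0';
    return dst;
}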

Would this actually be useful in practice? Is it the case that most progmem textual data is ASCII English text in the 0-127 range? Are there decent real-life examples somewhere that I can test against?

Regards

Senthil

 


#2

Rather depends on whether the data compression frees up more space than the decompression code takes. Otherwise your program gets the bends and takes more flash, not less.

Not sure it'd be much use for ASCII text, unless you put a lot of it in flash. Might do well for graphic splash screens and the like.
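
For a splash screen, even a trivial run-length scheme can pay for itself once the image is big enough. Something along these lines, with the names made up for the example:

#include <avr/pgmspace.h>
#include <stdint.h>

/* Flash image stored as (count, value) byte pairs; count == 0 terminates. */
static void draw_rle_P(const uint8_t *img, void (*put_byte)(uint8_t))
{
    for (;;) {
        uint8_t count = pgm_read_byte(img++);
        if (count == 0)
            break;                       /* end of image */
        uint8_t value = pgm_read_byte(img++);
        while (count--)
            put_byte(value);             /* e.g. push one byte to the display */
    }
}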

The largest known prime number: 2^82589933 - 1

It's easy to stop breaking the 10th commandment! Break the 8th instead. 

#3

An interesting thought certainly, but I would be surprised if, as a general solution, it was preferable. Depending upon the compressibility of the data it may even lead to greater flash usage, and it will also make data accesses take longer. In some applications it may be appropriate, but in others... I don't know.

Data needed.

Martin Jay McKee

As with most things in engineering, the answer is an unabashed, "It depends."

#4

This might be useful to you:
http://excamera.com/sphinx/artic...

Dr. David Harris OpenLCB Development Team openlcb.org

#5

I'd expect the progmem data to be anything, not just ASCII text. It could be fully ASCII text, fully binary data like graphics or tables, or a mixture of everything.

Nice idea though. Perhaps the data could be compressed using whichever algorithm produces the smallest result; the user could choose between them with algorithm-selection flags or something.

But in such cases, maybe the user would compress the data himself, instead of relying on the compiler to do it. Hard to say.

#6

The big eaters of flash are font tables, bitmap images and audio samples. I suppose there'd be some merit in compressing those, but an ASCII compressor is not going to be of much use. For audio I have successfully used ADPCM, and there was even time at 8 kHz playback to decompress on the fly, but the speed of decompression could be an issue. If the audio cannot be recovered in time for the playback rate, or the image decompression leads to a "slow wipe" rather than fast image presentation, it's not much use.
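
For reference, the core of a 4-bit IMA-style ADPCM decode step (not necessarily the exact flavour I used) is fairly small; a sketch, using the standard IMA tables and keeping the decoder state in a couple of globals:

#include <avr/pgmspace.h>
#include <stdint.h>

/* Standard IMA ADPCM tables, kept in flash. */
static const int8_t index_adjust[8] PROGMEM = {
    -1, -1, -1, -1, 2, 4, 6, 8
};
static const uint16_t step_table[89] PROGMEM = {
        7,     8,     9,    10,    11,    12,    13,    14,    16,    17,
       19,    21,    23,    25,    28,    31,    34,    37,    41,    45,
       50,    55,    60,    66,    73,    80,    88,    97,   107,   118,
      130,   143,   157,   173,   190,   209,   230,   253,   279,   307,
      337,   371,   408,   449,   494,   544,   598,   658,   724,   796,
      876,   963,  1060,  1166,  1282,  1411,  1552,  1707,  1878,  2066,
     2272,  2499,  2749,  3024,  3327,  3660,  4026,  4428,  4871,  5358,
     5894,  6484,  7132,  7845,  8630,  9493, 10442, 11487, 12635, 13899,
    15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794, 32767
};

static int16_t predictor;    /* decoder state, persists between samples */
static int8_t  step_index;

/* Decode one 4-bit ADPCM nibble into a 16-bit PCM sample. */
int16_t adpcm_decode(uint8_t nibble)
{
    uint16_t step = pgm_read_word(&step_table[step_index]);
    int32_t diff = step >> 3;

    if (nibble & 1) diff += step >> 2;
    if (nibble & 2) diff += step >> 1;
    if (nibble & 4) diff += step;
    if (nibble & 8) diff = -diff;

    int32_t p = (int32_t)predictor + diff;       /* clamp to 16-bit range */
    if (p > 32767)  p = 32767;
    if (p < -32768) p = -32768;
    predictor = (int16_t)p;

    step_index += (int8_t)pgm_read_byte(&index_adjust[nibble & 7]);
    if (step_index < 0)  step_index = 0;
    if (step_index > 88) step_index = 88;

    return predictor;
}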

#7

There is a trade-off between the flash size required and the MIPS available, when you fix the other factors (price, package size, functionality, etc.). In the plane, that relation would be a monotonic, falling line.
GCC has some switches to move along this line (packed structs, LUTs, call push/pops, etc.).

Now, if you would like to move beyond the reach of -Os, there should be a reason for it. One idea that comes to mind is that you do not need the MIPS, but would be interested in even higher "ROM density".
The reason it is (in general) not worth the effort is that slower chips (10 MIPS) are cheaper than 100 MIPS ones, so for the price you can buy them with much bigger flash/ROM.
Another real-life reason is that your code is built with -Os and still occupies 120% of ROM. Even then I think a better idea would be to explicitly put a (JPEG?) decompression library on board than to rely on the intelligence of the compiler*.

Another option, better/cheaper/more flexible in many cases, would be to stick the "strings" section from the printfs into an external ROM. External flash is cheaper than program flash.

*AFAIK GCC is not smart enough to notice, for example, two consts of identical content residing in ROM :(

EDIT: Monotonic, not monotonous.

No RSTDISBL, no fun!

#8

Quote:

AFAIK GCC is not smart enough to notice for example two consts of identical content residing in rom

That's fixed in the latest version I think.

#9

clawson wrote:
That's fixed in the latest version I think.

const char the_first[]="Ala ma kota.";   /* the_second is a tail of the_first */
const char the_second[]="ma kota.";

Cannot wait to test it :)

No RSTDISBL, no fun!

#10

Perhaps it's time for a contest :D
Take this text:

Quote:
On one of those long nights where the mind wanders aimlessly while you're endlessly counting sheep and desperately trying to sleep, this idea popped into my mind. Given that user code accesses progmem data through library routines (before GCC 4.7 atleast), why not have the toolchain compress the contents and have the progmem routines decompress it on the fly?

So I hacked on the GNU Binutils linker and got a naive brute force algorithm working, which basically builds a dictionary of repetitive strings and replaces the occurrences of the string with the dictionary code. It only works with char data in the range of 0 - 127, and uses 128-255 for the dictionary codes. Having been forced to make a bunch of tradeoffs regarding memory/time, I thought I'd ask around before I go any further.

Would this actually be useful in practice? Is it the case that most of progmem textual data is ASCII English characters in (0-127) range? Are there decent real-life examples somewhere that I can test against?


1009 bytes.
Have a 2313 write the text out at 100% speed at 115200 baud with an 8 MHz crystal, at boot.
Who can make the smallest footprint (the most 0xFFs at the end of the flash)?

Added: no use of EEPROM.
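
The fixed scaffolding of an entry might look something like this; the register names are for the t2313, the text[] placeholder stands in for the compressed blob, and the decompression itself is of course the contestant's part:

#define F_CPU 8000000UL
#define BAUD  115200UL
#define BAUD_TOL 4               /* accept the ~3.5% baud error at 8 MHz / 115200 */
#include <avr/io.h>
#include <avr/pgmspace.h>
#include <util/setbaud.h>

const char text[] PROGMEM = "...";   /* the 1009-byte quote, however it gets stored */

static void uart_putc(char c)
{
    while (!(UCSRA & (1 << UDRE)))   /* wait until the transmit buffer is free */
        ;
    UDR = c;
}

int main(void)
{
    UBRRH = UBRRH_VALUE;             /* values computed by <util/setbaud.h> */
    UBRRL = UBRRL_VALUE;
#if USE_2X
    UCSRA |= (1 << U2X);
#endif
    UCSRB = (1 << TXEN);             /* transmit only */

    const char *p = text;
    char c;
    while ((c = pgm_read_byte(p++)) != '\0')
        uart_putc(c);                /* replace with decompress-and-send */

    for (;;)
        ;
}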