On one of those long nights where the mind wanders aimlessly while you're endlessly counting sheep and desperately trying to sleep, this idea popped into my head. Given that user code accesses progmem data through library routines (before GCC 4.7 at least), why not have the toolchain compress the contents and have the progmem routines decompress them on the fly?
So I hacked on the GNU Binutils linker and got a naive brute-force algorithm working. It builds a dictionary of repeated strings and replaces each occurrence of a string with its dictionary code. It only works with char data in the range 0-127 and uses 128-255 for the dictionary codes. Since I've already been forced to make a bunch of memory/time tradeoffs, I thought I'd ask around before going any further.
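To make the encoding concrete, here is a rough sketch of what the decompression side could look like. It is not the code from my linker hack: the dictionary contents, the name `dict_expand`, and the use of plain RAM pointers are all assumptions for illustration. On a real AVR target the byte reads would go through something like `pgm_read_byte` instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical dictionary: entry i is the expansion of code 128 + i.
 * In the real scheme the linker would generate this table. */
static const char *const dict[] = { "the ", "ing", " error" };
#define DICT_SIZE (sizeof dict / sizeof dict[0])

/* Expand NUL-terminated compressed data from `in` into `out`.
 * Bytes 1-127 are copied literally; bytes 128-255 are replaced by
 * the matching dictionary string. `out_cap` includes room for the
 * terminating NUL. Returns the expanded length (without the NUL),
 * or (size_t)-1 on an unknown code or if `out` is too small. */
size_t dict_expand(const uint8_t *in, char *out, size_t out_cap)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);
    if (out_cap == 0)
        return (size_t)-1;

    for (; *in != 0; ++in) {
        if (*in < 128) {
            /* Literal ASCII character. */
            if (n + 1 >= out_cap)
                return (size_t)-1;
            out[n++] = (char)*in;
        } else {
            /* Dictionary code: look up and copy the expansion. */
            size_t idx = (size_t)(*in - 128);
            size_t len;

            if (idx >= DICT_SIZE)
                return (size_t)-1;
            len = strlen(dict[idx]);
            if (n + len >= out_cap)
                return (size_t)-1;
            memcpy(out + n, dict[idx], len);
            n += len;
        }
    }
    out[n] = '\0';
    return n;
}
```

With this made-up dictionary, the seven bytes `'r','e','a','d',129,130,0` expand to the 13-character string "reading error". A one-byte-per-code table like this keeps the decoder tiny, which matters on a small AVR, but it caps the dictionary at 128 entries and breaks on any input byte above 127.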
Would this actually be useful in practice? Is most textual progmem data really English ASCII in the 0-127 range? Are there decent real-life examples somewhere that I can test against?