Some of you who arguably speak English may have noticed that printable ASCII characters only require seven bits - the high bit is never set. Some may have also noticed that the AVR is fundamentally an eight-bit processor, and therefore, because 8x7 == 7x8, one could theoretically stuff eight ASCII characters into only seven bytes.
Nota bene: If you actually try to implement this in a functional gizmo, you are prematurely optimizing, and kindly Knock It Off Already.
With that out of the way, because I was bored, mostly, I have developed, implemented, and tested routines for compressing (and decompressing) eight (8) ASCII characters represented in seven bits into only seven (7) bytes. This might be marginally helpful for those of you who load up your memories with strings (which I do incessantly).
This particular application:
a) receives eight characters from the RS-232 serial port
b) stores them uncompressed in the EEPROM, consuming eight bytes
c) compresses them, and stores them again in the next seven bytes
d) stores a cheerful $00 in the sixteenth EEPROM byte (because why not?)
e) retrieves the seven compressed bytes from the EEPROM
f) decompresses them back into valid ASCII
g) and dumps them out the serial port, in reverse order.*
It also dumps the first sixteen bytes of the EEPROM, in ASCII hexadecimal, at pretty much every stage of the operation.
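For anyone who wants to see what such a dump looks like without wiring up a chip, here is a rough C sketch of the hex formatting. It is not the routine in the .zip; the name hex_dump16 and the space-separated layout are my own assumptions for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: format 16 bytes as two-digit uppercase hex,
   separated by single spaces, into out (48 chars including the NUL). */
void hex_dump16(const uint8_t bytes[16], char out[48])
{
    static const char digits[] = "0123456789ABCDEF";
    for (int i = 0; i < 16; i++) {
        out[3 * i]     = digits[bytes[i] >> 4];        /* high nibble */
        out[3 * i + 1] = digits[bytes[i] & 0x0F];      /* low nibble  */
        out[3 * i + 2] = (i < 15) ? ' ' : '\0';        /* separator or terminator */
    }
}
```

Fed a 16-byte image laid out as described above (eight raw characters, seven packed bytes, then $00), it gives one line of hex per dump.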
It's written in AVR assembler, and I have programmed and tested it on an ATmega8535. Converting it to other chips is left up to you. Those of you who consider yourselves C coder wizards are welcome to rewrite it in C - who knows, perhaps your code might be even more efficient!
* Fixing that, if you care, is left as an exercise for the student.
Now for the fun part!
I have included all the supporting files in the attached .zip, but the fun bits I shall paste into a code window here. Enjoy!
; Compress and store in EEPROM   <-- The fun bit starts here
; Load to registers r0-r7
    ldi ZL, low(SRAM_START)
    ldi ZH, high(SRAM_START)
    ld r0, Z+
    ld r1, Z+
    ld r2, Z+
    ld r3, Z+
    ld r4, Z+
    ld r5, Z+
    ld r6, Z+
    ld r7, Z+
; and compress to r0-r6
; Generally speaking, putting multiple instructions
; on one line is horrible practice. But in this
; case, I think it better shows what is actually going on.
    lsl r7                                                    ; Toss carry
    lsl r7  rol r6                                            ; Toss carry
    lsl r7  rol r6  rol r5                                    ; Toss the carry
    lsl r7  rol r6  rol r5  rol r4                            ; Toss the carry
    lsl r7  rol r6  rol r5  rol r4  rol r3                    ; Toss the carry
    lsl r7  rol r6  rol r5  rol r4  rol r3  rol r2            ; Toss the carry
    lsl r7  rol r6  rol r5  rol r4  rol r3  rol r2  rol r1    ; Toss the carry
    lsl r7  rol r6  rol r5  rol r4  rol r3  rol r2  rol r1  rol r0  ; And toss the final carry
;< At this point the registers r0-r6 contain the
;< compressed characters - 8 of them, in 7 bytes. I didn't
;< paste the boring storing stuff.
;< Next, once the seven bytes have been retrieved into
;< registers r0-r6, we decompress them...
    lsr r0  ror r1  ror r2  ror r3  ror r4  ror r5  ror r6  ror r7
    lsr r1  ror r2  ror r3  ror r4  ror r5  ror r6  ror r7
    lsr r2  ror r3  ror r4  ror r5  ror r6  ror r7
    lsr r3  ror r4  ror r5  ror r6  ror r7
    lsr r4  ror r5  ror r6  ror r7
    lsr r5  ror r6  ror r7
    lsr r6  ror r7
    lsr r7
; And lo, the registers r0-r7 contain the same eight valid ASCII characters.
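For those who want to check the bit layout on a PC before burning a chip, here is a minimal C model of the same packing. It is a sketch only, not a translation of the assembler: it uses a 64-bit accumulator instead of shift chains, and the names pack8to7 and unpack7to8 are made up for this example. It assumes, as the assembler does, that every input character has its high bit clear.

```c
#include <stdint.h>

/* Pack eight 7-bit characters into seven bytes, first character first.
   Character 0 ends up in the top seven bits of out[0], matching r0 above. */
void pack8to7(const uint8_t in[8], uint8_t out[7])
{
    uint64_t bits = 0;
    for (int i = 0; i < 8; i++) {
        bits = (bits << 7) | (uint64_t)(in[i] & 0x7F);  /* drop the unused high bit */
    }
    /* bits now holds 56 significant bits; emit them high byte first */
    for (int i = 0; i < 7; i++) {
        out[i] = (uint8_t)(bits >> (8 * (6 - i)));
    }
}

/* Reverse of pack8to7: expand seven bytes back into eight 7-bit characters. */
void unpack7to8(const uint8_t in[7], uint8_t out[8])
{
    uint64_t bits = 0;
    for (int i = 0; i < 7; i++) {
        bits = (bits << 8) | in[i];
    }
    for (int i = 0; i < 8; i++) {
        out[i] = (uint8_t)((bits >> (7 * (7 - i))) & 0x7F);  /* high bit comes back as 0 */
    }
}
```

Packing and then unpacking any eight characters in the range $00-$7F should return the original eight. On an actual AVR the 64-bit arithmetic would be expensive, so treat this as a desk-checking aid rather than a replacement for the shifts.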
Have fun!
S.