Caution: Code in this post uses @ instead of % because the forum has a problem with % symbols.
I have a 512-word bootloader that I use for my om128 and om644p microcontroller devices. It was compiled with an older version of the GCC compiler (3.4.5) and avr-libc (1.4.4) using WinAVR 20060125, which produced very compact code.
Later versions of GCC (i.e. v4) produce much larger code that does not fit. This hasn't been a problem up to now because I simply used the older version of GCC. However, I am now working on some new devices that require a later compiler to support the underlying AVRs (e.g. mega328p and mega1284p).
A further constraint is that the bootloader is actually in two parts: a 128-word "stub" and a 384-word main loader. The stub can be used to provide a self-updating feature. As an aside, I really mean to write this up someday for fellow freaks.
On further examination, the stub was too big to fit in 128 words for the 128K-byte flash devices (mega128, mega1284), and the main culprits were the boot.h macros that require 32-bit addresses:
- boot_page_erase(address)
- boot_page_fill(address, data)
- boot_page_write(address)
I resisted the temptation to completely rewrite the stub in assembly and looked at the boot.h macro definitions. For the 128K devices, the RAMPZ register may need to be set. It seemed like a good idea to separate this out from the address calculation and set the RAMPZ register separately.
Here is a snippet of the bootloader that shows the calculation of the address and RAMPZ register from a flash page number:
    /* 256 byte page size */
    uint16_t address = Page << 8;

    /* erase page and wait for completion */
    #if defined(RAMPZ)
      #if defined(__AVR_ATmega128__) || defined(__AVR_ATmega1284P__)
        /* 256 byte page size */
        boot_page_erase_extended(address, (Page >> 8));
      #endif
    #elif defined(__AVR_ATmega644__) || defined(__AVR_ATmega644P__)
        boot_page_erase_normal(address);
    #endif
    boot_spm_busy_wait();
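As a rough host-side sketch of that split (not part of the bootloader itself), the following C fragment derives the 16-bit Z address and the RAMPZ value from a page number and then recombines them to check that nothing is lost. The names page_target_t, page_to_target and target_to_byte_address are hypothetical and only for illustration; a 256-byte page and a 128K-byte flash are assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: the two values derived from a flash page number. */
typedef struct {
    uint16_t address; /* low 16 bits of the byte address, loaded into Z */
    uint8_t  ramp;    /* bits 16 and up of the byte address, loaded into RAMPZ */
} page_target_t;

/* Assumes a 256-byte (128-word) page and a 128K-byte flash (512 pages). */
static page_target_t page_to_target(uint16_t page)
{
    page_target_t t;

    assert(page < 512u);                 /* 131072 / 256 = 512 pages */
    t.address = (uint16_t)(page << 8);   /* keep the low 16 bits only */
    t.ramp    = (uint8_t)(page >> 8);    /* what overflowed out of 16 bits */
    return t;
}

/* Recombine the pair into the full 32-bit byte address, for checking only. */
static uint32_t target_to_byte_address(page_target_t t)
{
    return ((uint32_t)t.ramp << 16) | t.address;
}
```

For example, page 300 (0x12C) gives address 0x2C00 and ramp 1, which recombine to byte address 0x12C00. No 32-bit arithmetic is needed on the device itself, only on the host for the check.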
Obviously the generated code is quite nice for 128-word pages, since shifting a 16-bit value left or right by 8 bits amounts to a byte move. The two new macros are boot_page_erase_normal and boot_page_erase_extended, which are defined as follows:
    #define boot_page_erase_normal(address) __boot_page_erase_normal(address)

    #define boot_page_erase_extended(address, ramp) \
    (__extension__({ \
        __asm__ __volatile__ \
        ( \
            "sts @3, @4\n\t" \
            "sts @0, @1\n\t" \
            "spm\n\t" \
            : \
            : "i" (_SFR_MEM_ADDR(__SPM_REG)), \
              "r" ((uint8_t)__BOOT_PAGE_ERASE), \
              "z" ((uint16_t)address), \
              "i" (_SFR_MEM_ADDR(RAMPZ)), \
              "r" ((uint8_t)ramp) \
        ); \
    }))
The "extended" boot_page_erase macro is very similar to the normal macro for 16-bit addresses except that it also sets the RAMPZ register, so the generated code is only a few words larger. The corresponding macros for boot_page_fill and boot_page_write are:
    #define boot_page_fill_normal(address, data) __boot_page_fill_normal(address, data)

    #define boot_page_fill_extended(address, data, ramp) \
    (__extension__({ \
        __asm__ __volatile__ \
        ( \
            "movw r0, @3\n\t" \
            "sts @4, @5\n\t" \
            "sts @0, @1\n\t" \
            "spm\n\t" \
            "clr r1\n\t" \
            : \
            : "i" (_SFR_MEM_ADDR(__SPM_REG)), \
              "r" ((uint8_t)__BOOT_PAGE_FILL), \
              "z" ((uint16_t)address), \
              "r" ((uint16_t)data), \
              "i" (_SFR_MEM_ADDR(RAMPZ)), \
              "r" ((uint8_t)ramp) \
            : "r0" \
        ); \
    }))

    #define boot_page_write_normal(address) __boot_page_write_normal(address)

    #define boot_page_write_extended(address, ramp) \
    (__extension__({ \
        __asm__ __volatile__ \
        ( \
            "sts @0, @1\n\t" \
            "sts @3, @4\n\t" \
            "spm\n\t" \
            : \
            : "i" (_SFR_MEM_ADDR(__SPM_REG)), \
              "r" ((uint8_t)__BOOT_PAGE_WRITE), \
              "z" ((uint16_t)address), \
              "i" (_SFR_MEM_ADDR(RAMPZ)), \
              "r" ((uint8_t)ramp) \
        ); \
    }))
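To show how the three extended macros would typically be used together to program one page, here is a minimal host-side sketch. It does not touch real hardware: the sim_* functions are hypothetical stand-ins that take the same arguments as the macros above and act on a simulated flash array, and program_page is an illustrative name. It assumes a 256-byte page, a 128K-byte flash, and little-endian words in the page buffer. On a real device each erase and write would be followed by boot_spm_busy_wait().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_BYTES  256u      /* 128-word page (assumed) */
#define FLASH_BYTES 131072ul  /* 128K-byte part, e.g. mega128 (assumed) */

static uint8_t sim_flash[FLASH_BYTES];  /* simulated flash array */
static uint8_t sim_buffer[PAGE_BYTES];  /* simulated temporary page buffer */

/* Rebuild the page base address from Z (address) and RAMPZ bit 0 (ramp). */
static uint32_t sim_page_base(uint16_t address, uint8_t ramp)
{
    uint32_t full = ((uint32_t)(ramp & 1u) << 16) | address;
    return full & ~(uint32_t)(PAGE_BYTES - 1u);
}

/* Stand-in for boot_page_erase_extended: an erased page reads as 0xFF. */
static void sim_page_erase_extended(uint16_t address, uint8_t ramp)
{
    memset(&sim_flash[sim_page_base(address, ramp)], 0xFF, PAGE_BYTES);
}

/* Stand-in for boot_page_fill_extended: only the in-page word offset of
   the address selects the buffer slot, so ramp is unused here. */
static void sim_page_fill_extended(uint16_t address, uint16_t data, uint8_t ramp)
{
    uint16_t offset = (uint16_t)(address & (PAGE_BYTES - 1u) & ~1u);
    (void)ramp;
    sim_buffer[offset]     = (uint8_t)(data & 0xFFu);  /* low byte first */
    sim_buffer[offset + 1] = (uint8_t)(data >> 8);
}

/* Stand-in for boot_page_write_extended: programming can only clear bits. */
static void sim_page_write_extended(uint16_t address, uint8_t ramp)
{
    uint32_t base = sim_page_base(address, ramp);
    unsigned i;
    for (i = 0; i < PAGE_BYTES; i++) {
        sim_flash[base + i] &= sim_buffer[i];
    }
    memset(sim_buffer, 0xFF, PAGE_BYTES);  /* buffer is cleared after a write */
}

/* Erase, fill and write one page, splitting the page number as in the post. */
static void program_page(uint16_t page, const uint8_t *src)
{
    uint16_t address = (uint16_t)(page << 8);
    uint8_t  ramp    = (uint8_t)(page >> 8);
    unsigned i;

    assert(page < FLASH_BYTES / PAGE_BYTES);
    sim_page_erase_extended(address, ramp);      /* then wait on real hardware */
    for (i = 0; i < PAGE_BYTES; i += 2) {
        uint16_t word = (uint16_t)(src[i] | ((uint16_t)src[i + 1] << 8));
        sim_page_fill_extended((uint16_t)(address + i), word, ramp);
    }
    sim_page_write_extended(address, ramp);      /* then wait on real hardware */
}
```

The point of the sketch is that the caller only ever handles a 16-bit address plus a one-byte ramp value, which is what keeps the stub small.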
Here are the resultant code sizes for just the stub using the latest GCC compiler:
- mega1284p - 108 words
- mega128 - 96 words
- mega644p - 99 words
Please give me comments on this approach and whether you think it is worthwhile. You may want to consider some derivation of these macros for the next version of avr-libc.