Improved boot.h macros for AVRs with >64K Flash

Caution: Code in this post uses @ instead of % because the forum has a problem with % symbols. Replace each @ with % before compiling.

I have a 512-word bootloader that I use for my om128 and om644p microcontroller devices. These bootloaders were compiled with an older version of the GCC compiler (3.4.5) and avr-libc (1.4.4) from WinAVR 20060125, which produced very compact code.

Later versions of GCC (i.e. v4) produced much larger code that would not fit. This hasn't been a problem up to now because I simply used the older version of GCC. However, I am now working on some new devices that require a later compiler to support the underlying AVRs (e.g. mega328p and mega1284p).

A further constraint is that the bootloader is actually in two parts: a 128-word "stub" and a 384-word main loader. The stub can be used to provide a self-updating feature. As an aside, I really mean to write this up someday for fellow freaks.

On further examination, the stub was too big to fit in 128 words for the 128K byte flash devices (mega128, mega1284), and the main culprits were the boot.h macros that require 32-bit addresses:

  • boot_page_erase(address)
  • boot_page_fill(address, data)
  • boot_page_write(address)

The problem is that the 32-bit address needs 4 registers, plus another 4 for pointer addition, plus all the pushes and pops needed for the additional registers. The net cost is around 50 extra words.

I resisted the temptation to completely rewrite the stub in assembly and looked at the boot.h macro definitions. For the 128K devices, the RAMPZ register may need to be set. It seemed like a good idea to separate this out from the address calculation and set the RAMPZ register separately.
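To illustrate the split, here is a minimal host-side sketch (not the bootloader code itself). It assumes 256-byte pages and a 128K device with 512 pages. The names flash_target_t, page_to_target and target_to_byte_address are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 128K bytes of flash in 256-byte pages, i.e. 512 pages. */
#define FLASH_PAGES 512u

/* Hypothetical pair: what ends up in the Z pointer and in RAMPZ. */
typedef struct {
    uint16_t z;      /* low 16 bits of the byte address */
    uint8_t  rampz;  /* address bits 16 and up */
} flash_target_t;

/* Split a flash page number into a 16-bit address and a RAMPZ value
   without ever forming a 32-bit address. */
static flash_target_t page_to_target(uint16_t page)
{
    flash_target_t t;
    assert(page < FLASH_PAGES);
    t.z     = (uint16_t)(page << 8);  /* low byte of page becomes high byte of Z */
    t.rampz = (uint8_t)(page >> 8);   /* high byte of page goes to RAMPZ */
    return t;
}

/* For checking only: rebuild the full byte address the hardware would see. */
static uint32_t target_to_byte_address(flash_target_t t)
{
    return ((uint32_t)t.rampz << 16) | t.z;
}
```

For example, page 0x1FF (the last page on a 128K part under these assumptions) splits into Z = 0xFF00 and RAMPZ = 1, which corresponds to byte address 0x1FF00.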

Here is a snippet of the bootloader that shows the calculation of the address and RAMPZ register from a flash page number:

    /* 256 byte page size */
    uint16_t address = Page << 8;

    /* erase page and wait for completion */
    #if defined(RAMPZ)
        #if defined(__AVR_ATmega128__) || defined(__AVR_ATmega1284P__)
        /* RAMPZ takes the page-number bits above the 16-bit address */
        boot_page_erase_extended(address, (Page >> 8));
        #endif
    #elif defined(__AVR_ATmega644__) || defined(__AVR_ATmega644P__)
        boot_page_erase_normal(address);
    #endif
    boot_spm_busy_wait();

The generated code is quite compact for 128-word (256-byte) pages, because shifting left or right by 8 bits is just a byte move between registers. The two new macros are boot_page_erase_normal and boot_page_erase_extended, which are defined as follows:

#define boot_page_erase_normal(address) __boot_page_erase_normal(address)

#define boot_page_erase_extended(address, ramp)  \
(__extension__({                                 \
    __asm__ __volatile__                         \
    (                                            \
        "sts @3, @4\n\t"                         \
        "sts @0, @1\n\t"                         \
        "spm\n\t"                                \
        :                                        \
        : "i" (_SFR_MEM_ADDR(__SPM_REG)),        \
          "r" ((uint8_t)__BOOT_PAGE_ERASE),      \
          "z" ((uint16_t)(address)),             \
          "i" (_SFR_MEM_ADDR(RAMPZ)),            \
          "r" ((uint8_t)(ramp))                  \
    );                                           \
}))

The "extended" boot_page_erase macro is very similar to the normal macro for 16-bit addresses except that it also sets the RAMPZ register, so the generated code is only a few words larger. The corresponding macros for boot_page_fill and boot_page_write are:

#define boot_page_fill_normal(address, data) __boot_page_fill_normal(address, data)

#define boot_page_fill_extended(address, data, ramp) \
(__extension__({                                 \
    __asm__ __volatile__                         \
    (                                            \
        "movw  r0, @3\n\t"                       \
        "sts @4, @5\n\t"                         \
        "sts @0, @1\n\t"                         \
        "spm\n\t"                                \
        "clr  r1\n\t"                            \
        :                                        \
        : "i" (_SFR_MEM_ADDR(__SPM_REG)),        \
          "r" ((uint8_t)__BOOT_PAGE_FILL),       \
          "z" ((uint16_t)(address)),             \
          "r" ((uint16_t)(data)),                \
          "i" (_SFR_MEM_ADDR(RAMPZ)),            \
          "r" ((uint8_t)(ramp))                  \
        : "r0"                                   \
    );                                           \
}))


#define boot_page_write_normal(address) __boot_page_write_normal(address)

/* RAMPZ is written first so that spm directly follows the SPMCSR write, */
/* matching the erase and fill macros.                                   */
#define boot_page_write_extended(address, ramp)  \
(__extension__({                                 \
    __asm__ __volatile__                         \
    (                                            \
        "sts @3, @4\n\t"                         \
        "sts @0, @1\n\t"                         \
        "spm\n\t"                                \
        :                                        \
        : "i" (_SFR_MEM_ADDR(__SPM_REG)),        \
          "r" ((uint8_t)__BOOT_PAGE_WRITE),      \
          "z" ((uint16_t)(address)),             \
          "i" (_SFR_MEM_ADDR(RAMPZ)),            \
          "r" ((uint8_t)(ramp))                  \
    );                                           \
}))
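As a rough illustration of how the three extended macros might be combined to program one page, here is a sketch that can be traced on a host PC. Everything in it is an assumption for illustration: the macros are replaced by stand-ins that only record their arguments, program_page is a hypothetical helper, and a 256-byte page size is assumed. On a real AVR the stand-ins would be dropped in favor of the macros above and boot_spm_busy_wait() from avr/boot.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 256u  /* assumption: 256-byte (128-word) pages */

/* Host-side stand-ins: record each SPM operation instead of executing it. */
typedef struct { char op; uint16_t z; uint16_t data; uint8_t ramp; } spm_trace_t;
static spm_trace_t trace_log[2 + PAGE_SIZE_BYTES / 2];
static size_t trace_len = 0;

static void trace(char op, uint16_t z, uint16_t data, uint8_t ramp)
{
    assert(trace_len < sizeof trace_log / sizeof trace_log[0]);
    trace_log[trace_len].op = op;
    trace_log[trace_len].z = z;
    trace_log[trace_len].data = data;
    trace_log[trace_len].ramp = ramp;
    trace_len++;
}

#define boot_page_erase_extended(a, r)    trace('E', (a), 0, (r))
#define boot_page_fill_extended(a, d, r)  trace('F', (a), (d), (r))
#define boot_page_write_extended(a, r)    trace('W', (a), 0, (r))
#define boot_spm_busy_wait()              ((void)0)

/* Hypothetical helper: erase, fill and write one flash page. */
static void program_page(uint16_t page, const uint8_t *buf)
{
    uint16_t address = (uint16_t)(page << 8);  /* low 16 bits of byte address */
    uint8_t  ramp    = (uint8_t)(page >> 8);   /* value for RAMPZ */
    uint16_t i;

    assert(buf != NULL);

    boot_page_erase_extended(address, ramp);
    boot_spm_busy_wait();

    /* The page buffer is filled one little-endian word at a time. */
    for (i = 0; i < PAGE_SIZE_BYTES; i += 2) {
        uint16_t word = (uint16_t)(buf[i] | ((uint16_t)buf[i + 1] << 8));
        boot_page_fill_extended((uint16_t)(address + i), word, ramp);
    }

    boot_page_write_extended(address, ramp);
    boot_spm_busy_wait();
}
```

Calling program_page(0x1FF, buf) records one erase, 128 fills and one write, all with a RAMPZ value of 1. The avr-libc documentation example for programming a page also disables interrupts around the sequence and calls boot_rww_enable() afterwards; whether the stub needs either is not stated in this post.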

Here are the resultant code sizes for just the stub using the latest GCC compiler:

  • mega1284p - 108 words
  • mega128 - 96 words
  • mega644p - 99 words

The mega644p is 9 words smaller than the mega1284p (3 words per RAMPZ use). The mega128 is 12 words smaller than the mega1284p because it can use in/out instructions rather than the longer lds/sts instructions for I/O registers. This shows one of the benefits of keeping to C code as much as possible: the compiler can do a much better optimization job, although, as this post shows, it sometimes needs some help.

Please let me know what you think of this approach and whether it is worthwhile. You may want to consider some derivation of these macros for the next version of avr-libc.

--Mike

Please post this on the avr-libc-dev mailing list, or in the avr-libc Patch Tracker. These are the official ways to get patches included in avr-libc. Posts on this forum are soon forgotten.

Also, please go see the latest boot.h file in the avr-libc CVS repository.

EW wrote:
Please post this on the avr-libc-dev mailing list, or in the avr-libc Patch Tracker. These are the official ways to get patches included in avr-libc.
Eric - I can do all of those things. I first wanted to get some input from other freaks. From the tone of your posts, it would seem that there is some interest in what I have done.

--Mike