avr-nm and avr-size RAM use


I'm trying to get firmware designed for 8k RAM down to 4k RAM. avr-size reports:

 

				Program Memory Usage 	:	65160 bytes   46.8 % Full
				Data Memory Usage 	:	5679 bytes   69.3 % Full

 

I ran "avr-nm -t decimal firmware.elf" and added up all the "B" type objects, but they only come to 3828. How can I find where the rest of the RAM is being used?

 

This is going to be a real challenge...


Go on. The simple rules are:
1. Appropriate size and scope for variables.
2. Placing const data in Flash especially anonymous strings.
3. Using appropriate Optimisation settings
.
If your program was sensibly designed in the first place, I doubt if you can get from 5679 to 4096.
And most importantly, GCC makes no attempt at static analysis. So you have to guess at local variables and a likely stack depth.
.
If you manage to do the squeeze, I will give you a medal.
.
David.


I'm going to have to rip out some features and heavily optimize others. I realize that tweaking a few flags isn't going to cut it, and everything is already carefully scoped to be as minimal as possible. It actually needs to get down well under 4k to allow for some stack space. I've measured stack use at 422 bytes maximum, but I think I can improve that a bit.

 

I'm just trying to identify what is using the memory that is unaccounted for by avr-nm.


Well,  that is straightforward.   You can use avr-nm to identify the big functions that take up Flash space.

 

But for SRAM,  you have to identify the unnecessary uint32_t or too-big buffer.

Without an official tool for static analysis,  you can only guess about the 422 bytes for stack use.

 

Few apps use all of the EEPROM.   Can you swap some SRAM in and out of EEPROM?   Any swap will be painfully slow.

 

It sounds like you are trying to squeeze a mega1284 app into a mega644.

The mega32U4 has got 5k of SRAM (I think).   But this would need shrinking Flash.

 

I would either stick with mega1284 or go to an ARM chip.

 

David.

Data Memory Usage 	:	5679 bytes   69.3 % Full

That is .data and .bss. If you search nm output for "B" you are only getting .bss ?!?

 

For example:

C:\SysGCC\avr\bin>type avr.c
#include <avr/io.h>

char text[] = "hello world";
char buffer[17];

int main(void) {
        while(1) {
        }
}

C:\SysGCC\avr\bin>avr-gcc -mmcu=atmega16 -Os avr.c -o avr.elf

C:\SysGCC\avr\bin>avr-size avr.elf
   text    data     bss     dec     hex filename
    152      12      17     181      b5 avr.elf

C:\SysGCC\avr\bin>avr-nm -S avr.elf
0000007e t .do_clear_bss_loop
00000080 t .do_clear_bss_start
0000008e T __bad_interrupt
0080007d B __bss_end
0080006c B __bss_start
00000054 T __ctors_end
00000054 T __ctors_start
0080006c D __data_end
000000a4 A __data_load_end
00000098 A __data_load_start
00800060 D __data_start
00000076 00000010 T __do_clear_bss
00000060 00000016 T __do_copy_data
00000054 T __dtors_end
00000054 T __dtors_start
00810000 N __eeprom_end
00000000 W __heap_end
00000054 W __init
0000003e a __SP_H__
0000003d a __SP_L__
0000003f a __SREG__
0000045f W __stack
00000096 t __stop_program
00000000 a __tmp_reg__
00000054 T __trampolines_end
00000054 T __trampolines_start
0000008e W __vector_1
0000008e W __vector_10
0000008e W __vector_11
0000008e W __vector_12
0000008e W __vector_13
0000008e W __vector_14
0000008e W __vector_15
0000008e W __vector_16
0000008e W __vector_17
0000008e W __vector_18
0000008e W __vector_19
0000008e W __vector_2
0000008e W __vector_20
0000008e W __vector_3
0000008e W __vector_4
0000008e W __vector_5
0000008e W __vector_6
0000008e W __vector_7
0000008e W __vector_8
0000008e W __vector_9
00000000 W __vector_default
00000000 T __vectors
00000001 a __zero_reg__
0080006c D _edata
0080007d N _end
00000098 T _etext
00000094 T _exit
0080006c 00000011 B buffer
00000094 W exit
00000092 00000002 T main
00800060 0000000c D text

As you can see I have 12 (0x0000000C) bytes for "text" which is in "D" (.data) and 17 (0x00000011) bytes for "buffer" which is in "B" (.bss).
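The addresses in that listing also tell you which memory a symbol lives in. A rough sketch, assuming the usual avr-gcc convention that SRAM symbols are linked at 0x800000 plus the real data-space address and EEPROM starts at 0x810000 (the helper names are made up for illustration):

```c
#include <stdint.h>
#include <stdbool.h>

/* avr-gcc links SRAM symbols at 0x800000 + the real data-space address,
 * so .data/.bss symbols can be told apart from flash symbols by range.
 * EEPROM symbols start at 0x810000 (see __eeprom_end in the listing). */
#define AVR_DATA_OFFSET   0x800000UL
#define AVR_EEPROM_OFFSET 0x810000UL

/* True if an address printed by avr-nm lies in the SRAM range. */
static bool nm_addr_is_sram(uint32_t nm_addr)
{
    return nm_addr >= AVR_DATA_OFFSET && nm_addr < AVR_EEPROM_OFFSET;
}

/* Convert an avr-nm address to the address the CPU actually uses. */
static uint32_t nm_addr_to_sram(uint32_t nm_addr)
{
    return nm_addr - AVR_DATA_OFFSET;
}
```

So "text" at 00800060 sits at SRAM address 0x60, while "main" at 00000092 is in flash.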

Last Edited: Thu. Feb 9, 2017 - 11:45 AM

PS obviously build the code with -fdata-sections and -ffunction-sections, then link with -Wl,--gc-sections so any "dead" stuff is removed.


clawson wrote:
That is .data and .bss. If you search nm output for "B" you are only getting .bss ?!?

 

Apparently so... I have version 2.24:

 

>avr-nm -S --size-sort -t decimal firmware.elf | grep " B "
08400269 00000001 B ADC_conversion_complete_SIG
08400263 00000001 B ADC_dma_ch0_complete_SIG
08400266 00000001 B ADC_dma_ch1_complete_SIG
08398586 00000001 B CLK_has_xosc
08398585 00000001 B clk_xosc_failed
08399708 00000001 B COM_in_packet_source
08399709 00000001 B COM_reset_count
08398584 00000001 B COR_new_audio_available
08401552 00000001 B cs_wakeup_int_en
08400090 00000001 B DBG_bad_int_count
08400091 00000001 B DBG_bad_int_status
08398591 00000001 B HW_extosc_rtc
08398590 00000001 B HW_reset_confirmation
08398592 00000001 B HW_rev4_pcb
08398582 00000001 B IR_awake
08399713 00000001 B ir_comms_timeout
08399715 00000001 B IR_incoming_packet_ready_SIG
08399714 00000001 B ir_reset_rx_state_machine_SIG
08399711 00000001 B IR_sync_timeout_SIG
08399712 00000001 B IR_synced_SIG
08399710 00000001 B IR_wakeup_SIG
08398598 00000001 B NL_day_result_cache_idx
08398597 00000001 B NL_day_result_cache_valid
08402240 00000001 B NL_tc_overflow_SIG
08398626 00000001 B RF_awake
08398576 00000001 B rf_high_speed_mode
08398625 00000001 B RF_wakeup_SIG
08401658 00000001 B RTC_alarm_SIG
08401659 00000001 B RTC_day_tick_SIG
08401553 00000001 B rtc_event_alarm_hour_AT
08401660 00000001 B rtc_event_alarm_minute_AT
08398587 00000001 B RTC_failed
08401661 00000001 B RTC_hours_tick_SIG
08401554 00000001 B RTC_minutes_tick_SIG
08401657 00000001 B RTC_seconds_tick_SIG
08401560 00000001 B RTC_valid
08402276 00000001 B SCH_correlation_enabled
08402274 00000001 B SCH_correlation_hour
08402277 00000001 B SCH_correlation_min
08402273 00000001 B sch_drive_ran_today
08402272 00000001 B SCH_mode
08402275 00000001 B sch_nl_finished_today
08398583 00000001 B TERM_awake
08399810 00000001 B TERM_bad_param
08399811 00000001 B term_history_length
08399809 00000001 B TERM_index
08400009 00000001 B TERM_line_length
08399808 00000001 B TERM_param_count
08399716 00000001 B TERM_wakeup_SIG
08400267 00000002 B ADC_dma_frame_count_AT
08400264 00000002 B ADC_ts_offset
08400270 00000002 B ADCA_ext_offset
08400257 00000002 B ADCA_supply_offset
08402406 00000002 B errno
08398593 00000002 B HW_product_id
08401742 00000002 B LOG_last_log_id
08401744 00000002 B LOG_logs_since_last_sequential_read
08398595 00000002 B log_sequence
08401777 00000002 B NL_last_noise_log_id
08398574 00000002 B RF_drive_by_tx_count
08398588 00000002 B RTC_error_comp
08400007 00000002 B term_timeout_counter_seconds
08400088 00000002 B TERM_timeout_period_seoncds
08398627 00000003 B COM_destination_id
08401739 00000003 B HW_device_id
08400259 00000004 B ADC_ts_k_per_adc
08401673 00000004 B HW_discharge_count_mC
08401555 00000005 B RTC_last_reset_timestamp
08396800 00000005 B rtc_time_AT
08401662 00000011 B HW_mcu_id
08400206 00000013 B CFG_location_cache
08399864 00000013 B TERM_param_index
08399795 00000013 B TERM_param_length
08398605 00000020 B BL_jump_table
08401746 00000031 B NL_day_result_cache_cnv
08402241 00000031 B NL_day_result_cache_lcf_alarms
08400219 00000038 B CFG_radio_config_cache
08400092 00000050 B DBG_telemetry
08399812 00000052 B TERM_param_floats
08399877 00000052 B TERM_param_int32s
08401677 00000062 B HW_saved_diagnostics
08400142 00000064 B CFG_user_config_cache
08400010 00000078 B term_history
08399717 00000078 B TERM_line
08399929 00000078 B TERM_line_unformatted
08401561 00000096 B RTC_temperature_readings
08402278 00000128 B temp_string
08401779 00000200 B nl_bins
08401979 00000261 B nl_epoch_log
08398630 00000539 B COM_in_packet
08399169 00000539 B COM_out_packet
08400912 00000640 B ADC_stream_buffer_a
08400272 00000640 B ADC_stream_buffer_b

Obviously there are some big buffers there which will be first on the chopping block, but it still doesn't add up. Naturally I build with -fdata-sections and -ffunction-sections then later link with -gc-sections.


Oh, case insensitive search, gets me up to 3839.


You missed my point. If you grep " B " you only get HALF the global RAM usage. Every item in your report above is an uninitialized global (buffers and stuff). You also need to grep " D " for the initialized globals.

 

In my test:

C:\SysGCC\avr\bin>avr-nm -S --size-sort -t decimal avr.elf | grep " B "
08388716 00000017 B buffer

C:\SysGCC\avr\bin>avr-nm -S --size-sort -t decimal avr.elf | grep " D "
08388704 00000012 D text

So for my code:

char text[] = "hello world";
char buffer[17];

You get to see the 17 bytes of "buffer" but don't hear about the 12 bytes in "text".
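If you would rather not add the columns up by hand, something like this could total both classes in one pass. It is only a sketch: it assumes the input is "avr-nm -S -t decimal" output (address, size, type, name), and the function and struct names are invented for the example. Lowercase 'b' and 'd' are counted too, since those are file-local (static) symbols that still occupy RAM.

```c
#include <stdio.h>

/* Totals for the two RAM-resident symbol classes reported by avr-nm. */
struct ram_totals {
    unsigned long data; /* 'D'/'d': initialized globals (.data) */
    unsigned long bss;  /* 'B'/'b': zero-initialized globals (.bss) */
};

/* Read lines of "avr-nm -S -t decimal" output ("addr size type name")
 * and add up the sizes of .data and .bss symbols. Lines without a size
 * column (e.g. "08388716 D _edata") do not match and are skipped. */
static struct ram_totals sum_ram_symbols(FILE *in)
{
    struct ram_totals t = {0, 0};
    char line[256];
    unsigned long addr, size;
    char type;

    while (fgets(line, sizeof line, in) != NULL) {
        if (sscanf(line, "%lu %lu %c", &addr, &size, &type) != 3)
            continue;
        if (type == 'D' || type == 'd')
            t.data += size;
        else if (type == 'B' || type == 'b')
            t.bss += size;
    }
    return t;
}
```

Called from a small main() as sum_ram_symbols(stdin) with the avr-nm output piped in, the test program above should give data = 12 and bss = 17, which matches the avr-size columns.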


Thanks Cliff, that sorted it. I had mistakenly discarded all the "D" stuff because I glanced at the names and thought it was stuff stored in flash, but actually it is RAM. My dimly lit memory is telling me I optimized my code a little by moving some stuff into RAM because, hey, I've got 8k so why not. Nice big 1400 byte const lookup table I can move straight into flash.
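For anyone following along, moving a const table into flash with avr-libc usually looks roughly like the sketch below. The table and function names are hypothetical (this is not the actual 1400 byte table), and the #else branch is only a host fallback so the example also builds on a PC. On classic AVR and XMEGA targets a plain "const" array still ends up in .data, which is why it showed up as "D".

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/pgmspace.h>
#else
/* Host fallback so the sketch also compiles off-target (assumption:
 * on a PC there is no separate flash address space to worry about). */
#define PROGMEM
#define pgm_read_byte(addr) (*(const uint8_t *)(addr))
#endif

/* Hypothetical lookup table. With plain "const" avr-gcc places it in
 * .data (copied to SRAM at startup); PROGMEM keeps it in flash only. */
static const uint8_t squares[8] PROGMEM = { 0, 1, 4, 9, 16, 25, 36, 49 };

/* Flash data must be read through pgm_read_*(), not a plain dereference. */
static uint8_t square_lookup(uint8_t i)
{
    return pgm_read_byte(&squares[i & 7u]);
}
```

The cost is a slightly slower read (LPM instead of LD) at each access, which is normally a fair trade for the SRAM saved.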

 

Thanks!


david.prentice wrote:
Without an official tool for static analysis,  you can only guess about the 422 bytes for stack use.
Some lint tools report stack usage:

What's new in Gimpel Software's PC-lint/FlexeLint 9.00

http://www.gimpel.com/html/lint90.htm

...

Stack Usage -- We can report on the overall stack requirements of any program whose function calls are non-recursive and deterministic (i.e. calls not made through function pointers). This is very useful for embedded systems development where the amount of stack required can be mission critical. A complete detailed report of stack usage for each function is available as well.

...

 

"Dare to be naïve." - Buckminster Fuller


david.prentice wrote:
And most importantly, GCC makes no attempt at static analysis. So you have to guess at local variables and a likely stack depth.
It may be possible for GCC to list stack usage for each function:

Sysprogs

Introducing Dynamic Stack Verifier

by

August 25, 2016

https://sysprogs.com/w/introducing-dynamic-stack-verifier/

...

 

Enabling Stack Verifier

...

... and adding a GCC flag that dumps the stack usage by each function into text files.

...
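The article does not name the flag in the part quoted here; my assumption is that it means GCC's -fstack-usage, which writes a .su file next to each object with one line per function. A minimal sketch of reading such a line, assuming the tab-separated layout "file:line:col:function", bytes, qualifier (struct and function names are made up):

```c
#include <stdio.h>

/* One record from a GCC .su file (written when compiling with
 * -fstack-usage): "file:line:col:function<TAB>bytes<TAB>qualifier". */
struct su_entry {
    char where[128];     /* file:line:col:function */
    unsigned long bytes; /* stack bytes reported for this function */
    char qualifier[32];  /* e.g. "static" or "dynamic" */
};

/* Parse one line; returns 1 on success, 0 if the line is malformed. */
static int parse_su_line(const char *line, struct su_entry *out)
{
    return sscanf(line, "%127[^\t]\t%lu\t%31s",
                  out->where, &out->bytes, out->qualifier) == 3;
}
```

Note that each figure covers one function on its own. To get a worst case you still have to add the figures up along the deepest call chain, plus whatever interrupts can nest on top.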

 

"Dare to be naïve." - Buckminster Fuller


I fill the RAM with a canary value in an init function, and then monitor stack use at run time by checking what the highest address with that value is.
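Roughly like this, as a sketch rather than the actual firmware code. The canary value and function names are arbitrary, and the region bounds are passed in so the same code can be tried on a PC. On the target, lo would be the first free byte after .bss and hi would sit just below the stack pointer at the time of painting.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_CANARY 0xC5u /* arbitrary fill value (assumption) */

/* Fill the unused RAM region [lo, hi) with the canary. On an AVR this
 * would run early at startup, with lo just past .bss and hi below the
 * current stack pointer. */
static void stack_paint(volatile uint8_t *lo, volatile uint8_t *hi)
{
    while (lo < hi)
        *lo++ = STACK_CANARY;
}

/* The AVR stack grows downward, so it overwrites canaries from the top.
 * Counting intact canaries from the bottom gives the bytes never used. */
static size_t stack_unused(const volatile uint8_t *lo,
                           const volatile uint8_t *hi)
{
    size_t n = 0;
    while (lo < hi && *lo++ == STACK_CANARY)
        n++;
    return n;
}
```

Peak stack use is then the region size minus stack_unused(). It only reflects the code paths that actually ran, so it is a measured lower bound and not a guaranteed maximum.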


Go on.   Have you shrunk the SRAM down to less than 3764 bytes yet?

 

If you found an inappropriate 1400 byte buffer,   I am guessing that you will find many other opportunities to save SRAM.

 

In most practical programs,   buffers and large arrays are the only real SRAM consumers.

Regular variables whether global or local seldom add up to much.

 

My "wild assertion" applies to >=8kB AVRs.    The mega48 and smaller Tinys are much less endowed with SRAM.

 

David.


gchapman wrote:
It may be possible for GCC to list stack usage for each function:

 

I seem to recall some threads about GCC stack usage "tools" a couple of years ago.  Dunno if the same as what you linked.

 

david.prentice wrote:
In most practical programs, buffers and large arrays are the only real SRAM consumers.

 

And indeed, you mentioned e.g. Mega48-sized models.  Anyway, those that post here that like to see call-back functions in ISRs along with an SEI in there could be rudely surprised.  (and not obvious from .MAP and such)

 

Do we even have any idea of what processor family OP is working with?  Is it even an AVR8?

 

 

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.


Yeah, I got it down to under 4k. Need to do some careful testing of course, but managed to keep the streaming function in so I'm happy. It's a mature product that is part of a range, and at the bottom end there is a lot of price competition so we are looking at ways of shaving a few Euros off.

				Program Memory Usage 	:	65160 bytes   46.8 % Full
				Data Memory Usage 	:	5679 bytes   69.3 % Full

An easy guess:  Flash = 65160 / 0.468 = 139230 ≈ 0x22000,   SRAM = 5679 / 0.693 = 8194 ≈ 0x2000.
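The same back-calculation as a throwaway helper, using the 65160 and 5679 byte figures from the avr-size report (the function name is just for illustration):

```c
/* Back out total capacity from "used bytes" and "percent full". */
static double capacity_from_usage(unsigned long used, double percent_full)
{
    return (double)used * 100.0 / percent_full;
}
```

capacity_from_usage(65160, 46.8) comes out near 139230 and capacity_from_usage(5679, 69.3) near 8194. The small gaps to 139264 (0x22000) and 8192 (0x2000) are what you would expect from the percentages being rounded to one decimal place.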

 

So it might not be a Mega1284.   Possibly an Xmega128 with 128k Flash, 8k Boot, 8k SRAM, 2k EEPROM.

And the target is an Xmega64.

 

All the same,  4kB of SRAM should keep most applications happy.

 

David.

 


XMEGA 128 A3U, trying to drop down to a 64 A3U.

 

As you noticed, the size % calculation includes the bootloader. There is nothing to stop the application being in there, except for the NVM limitations or if you need an actual bootloader.