Dear All...
I don't understand why the ATmega328P datasheet shows two-byte elements in the interrupt vector table:
Since the JMP instruction takes 4 bytes, I would expect what the assembler shows for the same device: 4-byte elements.
I don't understand why the ATmega328P datasheet shows two-byte elements in the interrupt vector table:
Not bytes. Two words. Flash is addressed in 16-bit words, not in 8-bit bytes.
The assembler you're using is avr-as, part of the toolchain that ships with Studio. As it is basically the GNU assembler, all work done there is in bytes, not words. Hence the apparent discrepancy.
Before the table, it says:
Each Interrupt Vector occupies two instruction words in ATmega168A/168PA and ATmega328/328P, and one instruction word in ATmega 48A/48PA and ATmega88A/88PA.
As mentioned in #2, program addresses point to word size data. You can see the same thing in the Program Counter and targets of branches.
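To make the word/byte distinction concrete, here is a small Python sketch. It is not taken from the datasheet; the function names are made up for illustration, and it assumes two instruction words per vector (the ATmega168/328 case quoted above) with vector 1 being RESET at word address 0.

```python
# Flash is addressed in 16-bit words; GNU tools (avr-as, avr-objdump) show bytes.
WORDS_PER_VECTOR = 2   # ATmega168/328: each vector slot holds a two-word JMP
BYTES_PER_WORD = 2

def vector_word_address(vector_no: int) -> int:
    """Word address of a vector as the datasheet lists it (vector 1 = RESET)."""
    return (vector_no - 1) * WORDS_PER_VECTOR

def vector_byte_address(vector_no: int) -> int:
    """Byte address of the same vector, as a byte-oriented tool displays it."""
    return vector_word_address(vector_no) * BYTES_PER_WORD

for n in (1, 2, 3):
    print(f"vector {n}: word 0x{vector_word_address(n):04X}, "
          f"byte 0x{vector_byte_address(n):04X}")
```

The same slot therefore appears two addresses apart in the datasheet table and four addresses apart in a byte-oriented listing.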
Thx!!!
Isn't that an FAQ?
All of the ATmega48 and 88 program memory can be reached with an rjmp instruction, which only requires a single word. The ATmega168 and 328 have more memory and need a two word jmp instruction, although you can, of course use an rjmp to access the first 8 Kbytes.
although you can, of course use an rjmp to access the first 8 Kbytes.
It can reach all of flash. The limitation is that an rjmp (and rcall) can reach up to 4kB backwards, and 4kB (actually, 2047 words) forwards. The 'R' stands for 'relative', as in relative to the current PC.
I quote from page 117 of the Atmel AVR Instruction Set manual:
RJMP
Relative jump to an address within PC - 2K +1 and PC + 2K (words). For AVR microcontrollers with Program memory not
exceeding 4K words (8K bytes) this instruction can address the entire memory from every address location. See also JMP.
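A quick way to check the manual's claim is to enumerate every destination an RJMP can produce. The Python sketch below is only a model: it assumes the destination is PC + k + 1 taken modulo the flash size (the wrap-around is an assumption of the model, and `reachable` is an illustrative name), with k the 12-bit signed operand.

```python
RJMP_MIN, RJMP_MAX = -2048, 2047   # 12-bit two's-complement operand, in words

def encode_rjmp(k: int) -> int:
    """Return the 16-bit RJMP opcode (1100 kkkk kkkk kkkk) for word offset k."""
    if not RJMP_MIN <= k <= RJMP_MAX:
        raise ValueError(f"offset {k} does not fit in 12 bits")
    return 0xC000 | (k & 0x0FFF)

def reachable(flash_words: int, pc: int) -> set:
    """Word addresses an RJMP at word address pc can land on, assuming wrap-around."""
    return {(pc + 1 + k) % flash_words for k in range(RJMP_MIN, RJMP_MAX + 1)}

print(hex(encode_rjmp(-1)))                    # opcode of an rjmp to itself
print(len(reachable(4096, 100)) == 4096)       # 4K words: every address reachable
print(len(reachable(8192, 100)))               # 8K words: a 4096-word window only
```

With 4K words the 4096 possible operands cover the whole address space from any starting point; with 8K words they cover a 4096-word window around the current PC, wherever that PC is.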
As I understand it you can relative jump anywhere in flash, provided that you do not have more than 4K (words) altogether. This is because the addendum to the PC 'rolls round' if the resulting address exceeds the 4K limit in chips with only 4K of flash memory. But this behaviour is not there for chips (the 168 and 328) that have more than 4K of flash. In these chips you can only move +/- 2K as per the manual.
Tell me again the difference in operation of RJMP on e.g. Mega88 vs. Mega168.
Haven't you just said that it is +/- 2K words on each?
If I am on either model, and at word address 100, and do an RJMP of -200 words, where do I end up?
The corresponding example at 100 words from the end of flash: Same result?
Back when I was your age an ASM project in AVRstudio would have a check box for Wrap Relative Jumps. I cannot remember whether that was ASM1 or ASM2, and whether the intent was to warn you if you were wrapping.
Tell me again the difference in operation of RJMP on e.g. Mega88 vs. Mega168.
Haven't you just said that it is +/- 2K words on each?
Yes.
If you only have 4k words of flash, +/-2k is "everywhere." On 168 (with 8k words of flash), rjmp works exactly the same, but +/-2k is only part of the address space.
All of this is only loosely related to the OP, but the point I was trying to make was that rjmp/rcall are not limited to the first 4k-word (8k-byte) of flash on devices with greater than 8k-byte flash, as was implied in #6.
The limitation is +/- 2k-word (4k-byte) relative to the PC. If the absolute address is before the beginning of flash (0x0000) or past the end of flash (0x3FFF [word address] for a 32kB device), then the computed address will be 'wrapped'.
For example (as alluded to in #9) if the PC is at 0x0100, but the rjmp/rcall operand is -0x0200, then on a 32kB device the destination will be 0x3F00.
Likewise, if the PC is at 0x3C00, but the rjmp/rcall operand is +0x07FF (the largest forward offset that can be encoded), then on a 32kB device the destination will be 0x03FF.
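Examples like the two above can be checked with a few lines of Python. This is a sketch of the arithmetic only, assuming the PC is exactly as wide as the flash address so that overflow behaves as a modulo. `pc_next` is my own name for the address of the word following the rjmp (the instruction set manual writes the same thing as PC + k + 1 from the rjmp's own address).

```python
def rjmp_destination(pc_next: int, k: int, flash_words: int) -> int:
    """Destination word address; pc_next is the address of the word after the rjmp."""
    if not -2048 <= k <= 2047:
        raise ValueError("k must fit in a 12-bit signed operand")
    # Python's % always returns a non-negative result, which models the wrap.
    return (pc_next + k) % flash_words

FLASH_WORDS_32K = 0x4000  # 32 kB device = 16K words, word addresses 0x0000..0x3FFF

# Backward jump across the bottom of flash
print(hex(rjmp_destination(0x0100, -0x0200, FLASH_WORDS_32K)))
# Forward jump across the top of flash, using the largest encodable offset
print(hex(rjmp_destination(0x3C00, +0x07FF, FLASH_WORDS_32K)))
```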
I will just make it clear:
The wrap-around is a special case for 8K devices.
On all other AVRs the operation will be undefined!
Rubbish.
I don't have real AVR hardware here, but here's a simple test I ran using simulavr with an atmega328 as the target.
$ cat rjmp_test.S
.section .text
.org 0
reset:
    ldi r16, 'A'
    rjmp .-10
out:
    ldi r24, 'L'
    sts SIMULAVR_OUTPUT_ADDR, r24
    ldi r24, 'o'
    sts SIMULAVR_OUTPUT_ADDR, r24
    ldi r24, 'a'
    sts SIMULAVR_OUTPUT_ADDR, r24
    ldi r24, 'd'
    sts SIMULAVR_OUTPUT_ADDR, r24
    ldi r24, 'e'
    sts SIMULAVR_OUTPUT_ADDR, r24
    ldi r24, 'd'
    sts SIMULAVR_OUTPUT_ADDR, r24
    ldi r24, ':'
    sts SIMULAVR_OUTPUT_ADDR, r24
    ldi r24, ' '
    sts SIMULAVR_OUTPUT_ADDR, r24
    sts SIMULAVR_OUTPUT_ADDR, r16
    ldi r24, 0x0A
    sts SIMULAVR_OUTPUT_ADDR, r24
exit:
    rjmp exit
.org 0x7FF6
    jmp out
    ldi r16, 'B'
    jmp out
$ avr-gcc -mmcu=atmega328 -Wall -Wextra -nostartfiles -DSIMULAVR_OUTPUT_ADDR=0x20 -DSIMULAVR_EXIT_ADDR=0xFF rjmp_test.S -o rjmp_test.elf
$ avr-objdump -S rjmp_test.elf

rjmp_test.elf:     file format elf32-avr

Disassembly of section .text:

00000000 <__ctors_end>:
       0:   01 e4          ldi   r16, 0x41    ; 65
       2:   fb cf          rjmp  .-10         ; 0xfffffffa <__eeprom_end+0xff7efffa>

00000004 <out>:
       4:   8c e4          ldi   r24, 0x4C    ; 76
       6:   80 93 20 00    sts   0x0020, r24
       a:   8f e6          ldi   r24, 0x6F    ; 111
       c:   80 93 20 00    sts   0x0020, r24
      10:   81 e6          ldi   r24, 0x61    ; 97
      12:   80 93 20 00    sts   0x0020, r24
      16:   84 e6          ldi   r24, 0x64    ; 100
      18:   80 93 20 00    sts   0x0020, r24
      1c:   85 e6          ldi   r24, 0x65    ; 101
      1e:   80 93 20 00    sts   0x0020, r24
      22:   84 e6          ldi   r24, 0x64    ; 100
      24:   80 93 20 00    sts   0x0020, r24
      28:   8a e3          ldi   r24, 0x3A    ; 58
      2a:   80 93 20 00    sts   0x0020, r24
      2e:   80 e2          ldi   r24, 0x20    ; 32
      30:   80 93 20 00    sts   0x0020, r24
      34:   00 93 20 00    sts   0x0020, r16
      38:   8a e0          ldi   r24, 0x0A    ; 10
      3a:   80 93 20 00    sts   0x0020, r24

0000003e <exit>:
      3e:   ff cf          rjmp  .-2          ; 0x3e <exit>
        ...
    7ff4:   00 00          nop
    7ff6:   0c 94 02 00    jmp   0x4          ; 0x4 <out>
    7ffa:   02 e4          ldi   r16, 0x42    ; 66
    7ffc:   0c 94 02 00    jmp   0x4          ; 0x4 <out>
$ simulavr -d atmega328 -W 0x20,- -e 0xFF -T exit -f rjmp_test.elf
Loaded: B
SystemClock::Endless stopped
number of cpu cycles simulated: 37
The simulator knows exactly what to do. So does real hardware.
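For anyone wondering why the output is 'B' rather than 'A', the Python sketch below decodes the `fb cf` bytes from the disassembly and follows the jump. It is only a model of the arithmetic, assuming the PC wraps at the end of the 32 kB (0x8000-byte) flash; the function names are illustrative.

```python
def decode_rjmp(opcode: int) -> int:
    """Extract the signed 12-bit word offset k from an RJMP opcode."""
    if opcode & 0xF000 != 0xC000:
        raise ValueError("not an RJMP opcode")
    k = opcode & 0x0FFF
    return k - 0x1000 if k & 0x0800 else k   # sign-extend bit 11

def rjmp_target_byte(insn_byte_addr: int, opcode: int, flash_bytes: int) -> int:
    """Byte address reached, assuming the PC wraps at the end of flash."""
    flash_words = flash_bytes // 2
    pc = insn_byte_addr // 2                  # word address of the rjmp itself
    return ((pc + 1 + decode_rjmp(opcode)) % flash_words) * 2

# objdump shows bytes "fb cf" at address 0x2, i.e. the little-endian word 0xCFFB
opcode = int.from_bytes(bytes([0xFB, 0xCF]), "little")
print(decode_rjmp(opcode))                          # offset in words
print(hex(rjmp_target_byte(0x2, opcode, 0x8000)))   # wrapped byte address
```

The wrapped destination is byte address 0x7ffa, which in the listing is the `ldi r16, 0x42` ('B') just before the final `jmp out`.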
A specific assembler, on the other hand, may have trouble knowing what to do with a symbol that is 'out-of-range' of an rjmp. It may not know that it can use wrapping, even when wrapping would bring the symbol into range. Such is the case with the GNU assembler:
$ cat rjmp_test.S
.section .text
.org 0
    ldi r16, 'A'
    rjmp change
out:
    ldi r24, 'L'
    sts SIMULAVR_OUTPUT_ADDR, r24
    ldi r24, 'o'
    sts SIMULAVR_OUTPUT_ADDR, r24
    ldi r24, 'a'
    sts SIMULAVR_OUTPUT_ADDR, r24
    ldi r24, 'd'
    sts SIMULAVR_OUTPUT_ADDR, r24
    ldi r24, 'e'
    sts SIMULAVR_OUTPUT_ADDR, r24
    ldi r24, 'd'
    sts SIMULAVR_OUTPUT_ADDR, r24
    ldi r24, ':'
    sts SIMULAVR_OUTPUT_ADDR, r24
    ldi r24, ' '
    sts SIMULAVR_OUTPUT_ADDR, r24
    sts SIMULAVR_OUTPUT_ADDR, r16
    ldi r24, 0x0A
    sts SIMULAVR_OUTPUT_ADDR, r24
exit:
    rjmp exit
.org 0x7FF6
    jmp out
change:
    ldi r16, 'B'
    jmp out
$ avr-gcc -mmcu=atmega328 -Wall -Wextra -save-temps -nostartfiles -DSIMULAVR_OUTPUT_ADDR=0x20 -DSIMULAVR_EXIT_ADDR=0xFF rjmp_test.S -o rjmp_test.elf
rjmp_test.o: In function `reset':
(.text+0x2): relocation truncated to fit: R_AVR_13_PCREL against `no symbol'
collect2: error: ld returned 1 exit status
Let me try. This may or may not make it clear, but there is no RJMP disadvantage or penalty for having more than 8 kB of FLASH.
Edit: fixed FLASH
I'm not saying that it doesn't work, but there is no guarantee!
It's totally up to Atmel how the chip behaves when running undefined instructions.
Please point to the documentation which states that rjmp/rcall across the bottom/top flash boundary has undefined behaviour.
Nowhere, and that is why you can't count on it! (They have never given any information about the internal memory controller.)
But for the 8K devices they specifically say it will wrap around.
This would be the same as saying that it's ok to make a jump and call outside the memory, and expect that unused MSB(s) will be handled as zeros!
Add:
As I say, it will probably work, but when they sell 16K chips that are actually 32K dies programmed to behave as the 16K counterpart, that could easily not be the case.
But for the 8K devices they specifically say it will wrap around.
No, for 8K devices they say that all of flash is reachable from every location. That is not a prohibition against wrapping in the >8kB case.
The instruction set manual is as close to an authoritative document as there is. No mention of any prohibition against wrapping under any circumstances. While its safe use is not explicitly mentioned, that is not enough to infer the behaviour is undefined. Unmentioned != undefined.
In the absence of documentation otherwise, there is no reason to expect arithmetic on the PC to be anything but normal twos-complement arithmetic. Indeed, the fact that the rjmp/rcall instructions employ twos-complement operands implies this. For this to work anywhere, even without crossing the bottom/top boundary, the operand must be sign-extended to match the width of the PC before the arithmetic is performed.
As I say, it will probably work, but when they sell 16K chips that are actually 32K dies programmed to behave as the 16K counterpart, that could easily not be the case.
It is a specious argument. Any evidence that down-grading of this nature occurs? And if it does, any evidence that this supposed 'programming to behave as 16k' would omit the changes required to properly handle PC arithmetic? When programming and verifying such a down-graded 16k device, what happens when flash above 16k is programmed? What happens when it is verified? What about the extra EEPROM? What about the extra SRAM? What about the completely different fuse layout between, say, the 168 and the 328? What about the different BLS sizes? If Atmel had engineered their 32k devices to be factory-configurable to look like their 16k variant, why omit proper PC width adjustment during relative jumps/calls when so many other configuration changes would be required anyway?
I know flash wraps around even when you READ past the end of FLASH. Example: reading 0x8000 on a m328 will read location 0x0000.
Do you really expect that they produce as many different dies as chips?
Why do you think that the biggest chip in a series always comes first?
Back in 2000 I started on a project and decided to use an AVR4414, but before we got into production it became obsolete; I was to get 8515s instead, for the same price as the 4414.
When I asked why, the answer was that the 4414 was never made: they were all 8515s, and the demand was smaller than expected.
I would be surprised if the tiny4, 5 and 9 were not all tiny10s,
just as all 5E seem to have 32K flash (the new prices seem to indicate it).
Because this is a flash chip, you cannot know which things Atmel has programmed that can change the behaviour of a die.
For sure the m328 is a "real" chip ;)
The question would be if someone could do the test on a m164. (And best on one of the first chips sold!).
Funny small side note.
The NXP ARM 2101, 2102 and 2103 were all 2103s (at least in the beginning). I know some people who by accident used a 2101 on some PCBs where the code was too big, and everything worked fine.
From my memory, I think we confirmed that a Tiny5 was in fact Tiny10. i.e. it contained the extra Flash.
Apparently many 64kB STM32F103 contain 128kB Flash.
I suspect that different variants come from the same die. But any configuration is likely to look after the address mapping.
However you do anything off-spec at your own risk.
Any respectable tool will tell you if you are wrapping.
David.
The question would be if someone could do the test on a m164. (And best on one of the first chips sold!).
I challenge anyone with the 16k variant of a family which includes a 32k variant to demonstrate that bottom/top boundary PC arithmetic fails. A virtual cold one.
I cannot conceive of a reason that, having down-graded SRAM size, downgraded EEPROM size, downgraded flash size, adjusted BLS page size, re-arranged fuse bits, that the AVR designers would have overlooked shearing off the MSB of the PC for instructions involving PC arithmetic. As I've pointed out, the rjmp/rcall operand already undergoes sign extension to match the PC. If the PC width were not properly accounted for during this arithmetic, then it would fail everywhere, not just at the bottom/top boundary.
In the (so far hypothetical) case of a 32k device down-graded and sold as a 16k device, which bank would be used? Your assertion that wrapping is 'undefined' implies that it must be the lower bank. What if the 'bad' bank is the lower bank? In order for that to work, the MSB of the PC would be 'locked' as '1'. In effect, either bank could be selected by locking the MSB either to '0' or to '1' during the final stages of production. Regardless, PC arithmetic would remain unaffected, even at the (new) bottom/top boundary.
Atmel is quite clear in all other respects when it comes to undefined behaviour i.e. reserved bits in I/O registers, reserved I/O addresses, timing constraints on system clock and input signals, etc. Yet they have failed to make any statements of any kind w.r.t. a prohibition against PC wrapping. Why do you suppose that is?
Can anyone recall any processor they have ever used at any time in the history of computing where PC wrapping did not operate as expected?
Actually, yes :-)
Let me cite from wikipedia:
(...) the [80286] CPU was supposed to emulate an 8086's behavior in real mode, its startup mode, so that it could run operating systems and programs that were not written for protected mode. The 80286 had a bug: it failed to force the A20 line to zero in real mode. Therefore, the combination F800:8000 would no longer point to the physical address 0x00000000 but the correct address 0x00100000. As a result, some DOS programs would no longer work. To remain compatible with such programs, IBM decided to fix the problem on the motherboard.
The 80286 failed to do the wraparound when it should, maybe you recall the A20 gate from old PC's BIOS setup.
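The address arithmetic behind the quoted A20 example is easy to reproduce. The Python sketch below is illustrative only (the function and parameter names are mine): it computes the real-mode physical address for a segment:offset pair, with and without the 20-bit wrap.

```python
def real_mode_address(segment: int, offset: int, a20_enabled: bool) -> int:
    """Physical address for segment:offset; with A20 forced low it wraps at 1 MiB."""
    linear = (segment << 4) + offset           # can be up to 21 bits wide
    return linear if a20_enabled else linear & 0xFFFFF

# 8086-style behaviour: only 20 address lines, so the sum wraps to zero
print(hex(real_mode_address(0xF800, 0x8000, a20_enabled=False)))
# 80286 behaviour described in the quote: the 21st bit reaches the bus
print(hex(real_mode_address(0xF800, 0x8000, a20_enabled=True)))
```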
It depends what you mean by expected!
If the PC is 16 bits wide, I would expect some chips (not only AVRs) to mirror memory, as happens with incomplete address decoding.
I never checked whether the AVR4414 could work as an AVR8515, but since they have different IDs I guess that they (Atmel) could program the address decoder as well.
I cannot conceive of a reason that, having down-graded SRAM size, downgraded EEPROM size, downgraded flash size, adjusted BLS page size, re-arranged fuse bit ........................
It's called time to marked!
I know of 2 other chips, but they are under NDA and are not AVRs! (When you buy dies you know what you get ;) )
Actually, yes :-)
Fair enough. But as you go on to quote:
The 80286 had a bug:
That is a design/implementation error (and a well-documented one), not designed-in undefined behaviour which lacks any documentary evidence.
It's called time to marked(sic)!
I'm sorry, you're suggesting that the time saved during production by >>not<< locking the PC's MSB to a '0' or a '1' based on the tested and selected bank, while time >>is<< spent during production to burn in the device signature, >>and<< calibrate the RC and burn in the calibration value, >>and<< burn other production information into the signature row, >>and<< adjust BLS based on the reduced flash size, >>and<< twiddle fuses... is an important short-cut to market?
Not a compelling argument.
My challenge stands.
Although, this is now a ridiculous departure from the OP...
I just checked in the simulator. (mega164)
First, the assembler complains that the rjmp is out of reach, which kind of indicates that we are outside the defined area.
start:
    ldi r16,0x20
start2:
    push r16
    ; ldi r16,0x00
    ; ldi r16,0x20
    ldi r16,0x40
    push r16
    ret
.org 0x0020
    inc r16
    rjmp start2-1     ; this works
    ;rjmp start2-2    ; this gives an assembler error
And that behaves as you would expect (0x00, 0x20, 0x40 ... give the same result, and the high byte of the PC is 0x00), so the PC has the size of the flash.
First, the assembler complains that the rjmp is out of reach,
I mentioned this in #13. However, assembler != hardware.
outside the defined area.
... of the assembler, not the hardware.
Try my code in #13. You'll need to change the .org at the end to 0x3FF6 for a 16k device. You'll need also to change the mechanism for communicating with the simulator. My code relies on simulavr's I/O port-to-pipe facility. You can simply step through the code instead and see where the PC leads you.
Since we are assured by various people at Atmel that the simulator is based directly on the VHDL for the device, we can be equally assured that it will behave the same as real hardware.
No, one die can make different chips.
They have to calibrate osc, ADC gain etc. anyway.
But over and out about this
I have just tried out rjmp wrapping on a ATmega328p on my (shock, horror) Arduino board. Ignore the reference to a soft UART (I program entirely in assembler):
code:
main:
    jmp end
m1:
    ldi a,'A'
    rcall write_byte_96
self:
    rjmp self
;---------------------------------------------------------------------
.nolist
.include "uart96_328.inc"
.list
;----------------------------------------------------------------------
.org FLASHEND - 0x100    ; account for Arduino bootloader
end:
    .dw 0xC149           ; calculated wrap-around jump to m1
The jump was calculated by inspection of the listing file:
main:
C:000047 940c 3eff         jmp end
C:000049 e401          m1: ldi a,'A'
C:00004a d001              rcall write_byte_96
C:00004b cfff        self: rjmp self
;---------------------------------------------------------------------
.list
;----------------------------------------------------------------------
.org FLASHEND - 0x100 ; account for Arduino bootloader
end:
         .dw 0xC149 ; calculated wrap-around jump to m1
C:003eff c149
My assembler (AVRA) didn't like an rjmp at the very top of memory trying to access an address near the bottom, but the calculated jump worked perfectly.
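Instead of working the opcode out by hand from the listing, the calculation can be scripted. The Python sketch below is one way to do it, under the assumption that the PC wraps modulo the flash size; `wrapped_rjmp_opcode` is an illustrative name, not an AVRA feature. Addresses are word addresses, as in the listing.

```python
def wrapped_rjmp_opcode(src: int, dst: int, flash_words: int) -> int:
    """Opcode for an RJMP at word address src that lands on dst via wrap-around."""
    k = (dst - (src + 1)) % flash_words        # forward distance, modulo flash size
    if k >= flash_words // 2:
        k -= flash_words                        # the backward route is shorter
    if not -2048 <= k <= 2047:
        raise ValueError("destination is out of RJMP range even with wrapping")
    return 0xC000 | (k & 0x0FFF)

FLASH_WORDS = 0x4000   # ATmega328P: 32 kB = 16K words

# rjmp placed at 0x3EFF (FLASHEND - 0x100) jumping to m1 at 0x0049
print(hex(wrapped_rjmp_opcode(0x3EFF, 0x0049, FLASH_WORDS)))
```

For the listing's `self: rjmp self` at 0x004B the same function gives 0xCFFF, matching the assembler's own output.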
My first version didn't work:
.org FLASHEND
end:
    .dw 0xC049
I had forgotten that Arduinos come with a bootloader occupying 0x3F00 to 0x3FFF, so my initial jump from 'main:' was rejected by the mcu.
Yes, but the real test is doing that on an Arduino with the mega 168. I think I might have one somewhere...
;rjmp start2-2 ; this give a assembler error
Back when I was your age an ASM project in AVRstudio would have a check box for Wrap Relative Jumps.
It's still there, but it will still give an out of reach error!
I found one mega 168 Arduino and wrote a blink test program (adapted from #30):
#define __SFR_OFFSET 0x00
#include <avr/io.h>

.global main
main:
    jmp end
self:
    rjmp self
m1:
    /* blink LED if jump OK */
    ldi r16, _BV(PB5)
    out DDRB, r16
    /* init delay counter */
    clr r17
    clr r24
    clr r25
toggle:
    out PINB, r16
delay:
    adiw r24, 1
    adc r17, r1
    brhc delay
    rjmp toggle

.org 0x3700
end:
    rjmp 0x4000 + m1
It works, proving that the Mega 168 program counter wraps around (it shouldn't work in a 328 Arduino). I'll attach my Arduino sketch, fully written in assembly (I had never actually tried to do a sketch fully in assembly, but it's possible).
That may not be an unequivocal test. Can you post the resulting .elf (or .hex)?
Here is the elf file. I never guessed this wraparound issue would cause so much confusion...