Posted by avruser1523: Fri. Jan 14, 2022 - 03:46 AM
@clawson great explanation and information.
I noticed trying to write data to program memory with something like this:
int (*f)() = &main;
*(char*)f = 5;
generates this:
96: 25 e0 ldi r18, 0x05 ; 5
9a: 20 83 st Z, r18
So now I'm trying to figure out where the Z register is being populated, what it is mapped to, and how `st` figures out that the destination (Z) is program memory and not SRAM.
From googling it seems Z is specially used to access program memory.
main() is simply the address of a location in the flash-based code space. You cannot write to that simply by writing through a pointer to it. Sure, some code will be generated, because all that will happen is that the flash address (not a RAM address - it's in a different memory space!) of "main" will be determined and then a write to that location will be generated (you didn't actually show the bit where Z was loaded with the flash address of main()). In fact, if I repeat the exercise and show more code first, you will notice:
D:\junkavr\testGCC\testGCC\main.c(10,1): warning: accessing data memory with program memory address
The compiler has cleverly spotted what you are up to and does not like it either! But it does generate some code:
Perhaps because circumstances are different (I haven't seen the rest of your code) the compiler has chosen not to go indirect via Z in my case but to make a direct write. So you can see it's going to STS to some location 0x20E in the RAM space. Sure, the label in the disassembly shows that "main" is located at 0x20E in flash, but flash 0x20E != RAM 0x20E, and that is the important point. You cannot just pick the address of something in one address space and then use it for any kind of access in another address space. It's tantamount to just picking random numbers! I mean, who knows what might actually have been positioned at 0x20E in the RAM where this write is being made? I'm probably corrupting something with this pernicious code!
As it happens the AVR does allow flash to be written (it's how bootloaders work) but it's not done with an ST or STS. The opcode that does it is SPM (Self Program (flash) Memory). On Mega AVRs the SPM opcode is limited in that it cannot appear in the "normal" flash space; there is a region (towards the top of the flash space) called the "BootLoader Section" (BLS), and SPM can only be fetched and executed successfully from the BLS. But it's not as simple as that. In fact, if you read the opcode manual, you will see that the "minimal example" to operate one SPM is:
;This example shows SPM write of one page for devices with page write
;- the routine writes one page of data from RAM to Flash
; the first data location in RAM is pointed to by the Y-pointer
; the first data location in Flash is pointed to by the Z-pointer
;- error handling is not included
;- the routine must be placed inside the boot space
; (at least the do_spm sub routine)
;- registers used: r0, r1, temp1, temp2, looplo, loophi, spmcrval
; (temp1, temp2, looplo, loophi, spmcrval must be defined by the user)
; storing and restoring of registers is not included in the routine
; register usage can be optimized at the expense of code size
.equ PAGESIZEB = PAGESIZE*2 ;PAGESIZEB is page size in BYTES, not words
.org SMALLBOOTSTART
write_page:
;page erase
ldi spmcrval, (1<<PGERS) + (1<<SPMEN)
call do_spm
;transfer data from RAM to Flash page buffer
ldi looplo, low(PAGESIZEB) ;init loop variable
ldi loophi, high(PAGESIZEB) ;not required for PAGESIZEB<=256
wrloop: ld r0, Y+
ld r1, Y+
ldi spmcrval, (1<<SPMEN)
call do_spm
adiw ZH:ZL, 2
sbiw loophi:looplo, 2 ;use subi for PAGESIZEB<=256
brne wrloop
;execute page write
subi ZL, low(PAGESIZEB);restore pointer
sbci ZH, high(PAGESIZEB) ;not required for PAGESIZEB<=256
ldi spmcrval, (1<<PGWRT) + (1<<SPMEN)
call do_spm
;read back and check, optional
ldi looplo, low(PAGESIZEB) ;init loop variable
ldi loophi, high(PAGESIZEB) ;not required for PAGESIZEB<=256
subi YL, low(PAGESIZEB) ;restore pointer
sbci YH, high(PAGESIZEB)
rdloop: lpm r0, Z+
ld r1, Y+
cpse r0, r1
jmp error
sbiw loophi:looplo, 2 ;use subi for PAGESIZEB<=256
brne rdloop
;return
ret
do_spm:
;input: spmcrval determines SPM action
;disable interrupts if enabled, store status
in temp2, SREG
cli
;check for previous SPM complete
wait: in temp1, SPMCR
sbrc temp1, SPMEN
rjmp wait
;SPM timed sequence
out SPMCR, spmcrval
spm
;restore SREG (to enable interrupts if originally enabled)
out SREG, temp2
ret
And that is the ONLY way to write to flash memory space.
As you will read elsewhere if you dig into the details of it (if using C/C++ in GCC then this is perhaps a good place to start: https://www.nongnu.org/avr-libc/... ) you cannot simply pick some byte location in flash and write 0x05 to it. Flash is broken up into "SPM pages" which are some binary multiple in size (often 32, 64 or 128 bytes long). To write a single byte location you would need to read the entire 32/64/128 bytes of the page into the "SPM buffer" and, while doing so, locate where the 0x05 is to go and replace the current contents with it. You would then issue (using SPM and SPMCR) an "erase page" command which wipes all 32/64/128 bytes back to 0xFF (a high-voltage charge pump is applied to all the transistors in the NAND locations of the flash page to return any 0 bits to 1, so all the bytes read 0xFF). Then the contents of your SPM buffer are written (by issuing another SPM with the correct setting of SPMCR) back to the now erased page - so to change one byte you actually rewrite all 32/64/128 bytes of the page it sits in. If you look at the code above you can see that is pretty much what it is doing.
In the AVR-Libc example pretty much the same thing is moved from the Asm level to the C level (which may make it a little easier to see the logic of what is going on):
void boot_program_page (uint32_t page, uint8_t *buf)
{
    uint16_t i;
    uint8_t sreg;

    // Disable interrupts.
    sreg = SREG;
    cli();

    eeprom_busy_wait ();

    boot_page_erase (page);
    boot_spm_busy_wait ();      // Wait until the memory is erased.

    for (i=0; i<SPM_PAGESIZE; i+=2)
    {
        // Set up little-endian word.
        uint16_t w = *buf++;
        w += (*buf++) << 8;

        boot_page_fill (page + i, w);
    }

    boot_page_write (page);     // Store buffer in flash page.
    boot_spm_busy_wait();       // Wait until the memory is written.

    // Reenable RWW-section again. We need this if we want to jump back
    // to the application after bootloading.
    boot_rww_enable ();

    // Re-enable interrupts (if they were ever enabled).
    SREG = sreg;
}
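To tie that back to changing a single byte as described above, here is a rough sketch (just an illustration, assuming the boot_program_page() routine above is available and is itself running from the BLS; the helper name is made up) of the read-modify-erase-write sequence:

#include <avr/io.h>
#include <avr/pgmspace.h>
#include <stdint.h>

void boot_program_page (uint32_t page, uint8_t *buf);   // the avr-libc example above

// Hypothetical helper: change a single byte of flash by rewriting the
// whole SPM page that contains it.
void flash_write_byte (uint16_t addr, uint8_t value)
{
    uint8_t buf[SPM_PAGESIZE];
    uint16_t page = addr & ~((uint16_t)SPM_PAGESIZE - 1);   // start of containing page
    uint16_t i;

    // Copy the current page contents out of flash into RAM
    // (pgm_read_byte_far() would be needed on devices with more than 64K of flash).
    for (i = 0; i < SPM_PAGESIZE; i++)
        buf[i] = pgm_read_byte(page + i);

    // Patch the one byte we actually want to change...
    buf[addr - page] = value;

    // ...then erase and rewrite the whole page.
    boot_program_page(page, buf);
}

So "writing 0x05 to one address" really means erasing and reprogramming every byte of the page around it.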
From your previous posts you seem to be an intelligent engineer so I'm not sure how you are having such a problem grasping this concept of multiple (and quite separate!) memory spaces in a Harvard architecture CPU. I wonder if you are possibly "over thinking" this a bit?
By the way, one consequence of all this is that, unlike on a von Neumann machine, it is difficult (though I guess via SPM in the BLS just about possible) for an AVR to run self-modifying code. The way self-modifying code is usually done is to arrange for the opcodes being fetched by the CPU to be located in RAM; then some of those opcodes can make modifications to upcoming code that is just ahead of the code that does the modifying. It's a very dangerous thing to do but it's quite possible. I'm old enough to have "hacked" the protection systems on games like "Knightlore" on the Sinclair Spectrum and CPC464 to be able to "get in" and change the behaviour (infinite lives, map editor); on a machine like a Z80 (with games loaded into RAM from tape) such games had a tape loader that did protection by self-modifying code. Thankfully you just couldn't do something like this on an AVR, because the CPU will only fetch opcodes from flash, not RAM, and it is very, very difficult to do "on the fly" changes to that flash.
Also, BTW, you may have spotted it above, but SRAM is made up of fairly complex flip-flops that hold the state of each bit and can be easily changed. In flash the technology is quite different and far simpler - it is a technology called NAND flash and the stored bits are basically a stored charge on the isolated gate of a transistor. Using the high-voltage method the charge on each bit-location transistor can be restored to 1. Then some bits can be written to 0. There is also a sensing mechanism for reading that can sense the state of the stored charge. But this is NOTHING like SRAM!
Posted by avrcandies: Fri. Jan 14, 2022 - 11:21 AM
avruser1523 wrote:
So now I'm trying to figure out where the Z register is being populated, what it is mapped to, and how `st` figures out that the destination (Z) is program memory and not SRAM.
You do know that the Z register is simply a cute way of referring to registers R31:R30, right? Of course R31 down to R0 live in common SRAM (or at least we are told this)... note that otherwise we really have no way of knowing that they are in a common SRAM area, other than from the documents... the opcodes, instructions, etc. are there partly to make such matters invisible to us. You don't really care whether R14 is in an SRAM array of storage or is in some separate flip-flops tucked away in the corner of the AVR, apart from R9 & R21.
So Z is not in program memory, but in SRAM... program memory (flash, we are told) is accessed through the LPM & SPM instructions.
It might be beneficial to get thoroughly familiar with the AVR addressing modes & instruction set. That in itself would probably answer the bulk of your uncertainties.
avruser1523 wrote:
From googling it seems Z is specially used to access program memory.
That is not a good way to say it. Z (R31:R30), Y (R29:R28) and X (R27:R26) are simply different register pairs. Z can be used for anything that X & Y can. To access program memory you use the LPM & SPM instructions, and each of those must use Z as the address source (X & Y are not allowed). Not allowing X & Y is a rare non-orthogonal AVR design tradeoff.
By studying the instruction set, you should (or will) know that rather quickly.
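If it helps to see the same thing from C: avr-libc's pgm_read_byte() is essentially a wrapper around LPM, and the compiler has to get the flash address into Z before it can issue that instruction. A small sketch (the table name and contents here are made up purely for illustration):

#include <avr/pgmspace.h>
#include <stdint.h>

// Hypothetical lookup table placed in flash (program memory) rather than SRAM.
static const uint8_t table[4] PROGMEM = { 10, 20, 30, 40 };

uint8_t read_entry (uint8_t i)
{
    // pgm_read_byte() expands to an LPM, so the address of table[i]
    // has to end up in Z (R31:R30) - LPM cannot use X or Y.
    return pgm_read_byte(&table[i]);
}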
The AVR instruction set and its addressing modes need to look very familiar, or you will continue to scratch your head in an infinite loop.
When in the dark remember-the future looks brighter than ever. I look forward to being able to predict the future!
avrcandies wrote:
You don't really care whether R14 is in an SRAM array of storage or is in some separate flip-flops tucked away in the corner of the AVR
+1
But it is useful (important?) to understand that SFRs really are different...
(but I may have mentioned that before...?)
ADDENDUM
OP started with references to x86 for comparison.
A big difference between x86 and AVR (and other microcontrollers) is that the x86 has no internal RAM - so its CPU registers really are different from the rest of its SRAM
avruser1523 wrote:
From googling it seems Z is specially used to access program memory.
Just to point out that Z can be used for more than one thing! Sure, if you are doing SPM stuff then (as well as SPMCR) the Z pair (ZH:ZL = R31:R30) is involved in that, but in "normal running" Z (and Y and X) are used often, especially if you start doing anything that involves "offsets from a base address" - like array elements or struct{} members or, indeed, anything that is "indexed" (X, Y, Z are, indeed, "index registers"). While this may not be what the compiler actually chooses to do, one might envisage code such as:
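(The array below is made up purely for illustration - say an int8_t nums[] in RAM with a handful of elements being assigned:)

int8_t nums[10];

nums[3] = 17;
nums[5] = 26;
nums[8] = 34;

One way it could do this would be (say nums[] is based at 0x1C7) a direct store per element, sketched roughly as:

ldi r24, 17
sts 0x1CA, r24    ; nums[3] = 0x1C7 + 3
ldi r24, 26
sts 0x1CC, r24    ; nums[5]
ldi r24, 34
sts 0x1CF, r24    ; nums[8]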
But STS is a large and slow opcode, so it would be more efficient to use:
ldi r30, 0xC7
ldi r31, 0x01
ldi r24, 17
std Z+3, r24
ldi r24, 26
std Z+5, r24
ldi r24, 34
std Z+8, r24
Sure, there is a bit more "overhead" in this at the start, as you have to load the 0x1C7 base into Z, but then the STD ops are more efficient than STS. The code you saw for writing 0x05 to the "main" address (but in RAM, not flash) was using this "indirect/indexed through Z" technique.
For your info, when a function in GCC uses local variables on the stack, the way the compiler actually handles that is that it moves the stack pointer down to make room for the locals and then arranges for Y to point at the next location below them, so all the stack frame variables are at Y+offset, starting from Y+1 upwards. It's the same kind of technique as the Z thing, but it means Y is almost always tied up for use as the stack frame pointer, so it only leaves Z and X for the compiler to use for other things.
As an example in this fairly contrived program:
#include <avr/io.h>

void func() {
    volatile int arr[5];
    volatile long l;
    volatile char c;

    c = 'X';
    l = 0xBABEFACE;
    arr[3] = 12345;
}

int main()
{
    int (*f)() = &main;
    *(char*)f = 5;
    while(1) {
        func();
    }
}
The code of func() is:
0000020e <func>:
#include <avr/io.h>
void func() {
20e: cf 93 push r28
210: df 93 push r29
212: cd b7 in r28, 0x3d ; 61
214: de b7 in r29, 0x3e ; 62
216: 2f 97 sbiw r28, 0x0f ; 15
218: cd bf out 0x3d, r28 ; 61
21a: de bf out 0x3e, r29 ; 62
volatile int arr[5];
volatile long l;
volatile char c;
c = 'X';
21c: 88 e5 ldi r24, 0x58 ; 88
21e: 8f 87 std Y+15, r24 ; 0x0f
l = 0xBABEFACE;
220: 8e ec ldi r24, 0xCE ; 206
222: 9a ef ldi r25, 0xFA ; 250
224: ae eb ldi r26, 0xBE ; 190
226: ba eb ldi r27, 0xBA ; 186
228: 8b 87 std Y+11, r24 ; 0x0b
22a: 9c 87 std Y+12, r25 ; 0x0c
22c: ad 87 std Y+13, r26 ; 0x0d
22e: be 87 std Y+14, r27 ; 0x0e
arr[3] = 12345;
230: 89 e3 ldi r24, 0x39 ; 57
232: 90 e3 ldi r25, 0x30 ; 48
234: 8f 83 std Y+7, r24 ; 0x07
236: 98 87 std Y+8, r25 ; 0x08
}
238: 2f 96 adiw r28, 0x0f ; 15
23a: cd bf out 0x3d, r28 ; 61
23c: de bf out 0x3e, r29 ; 62
23e: df 91 pop r29
240: cf 91 pop r28
242: 08 95 ret
20e/210 starts by pushing the two halves of Y (R29:R28) to the stack to preserve them. 212/214 then reads in the stack pointer SPL/SPH from the 0x3D/0x3E I/O locations where it is made visible. The SBIW subtracts 0xF (15) from it before storing it back: that is 10 bytes for arr[], 4 bytes for l and 1 byte for c, so space has been opened up to hold the variables. Note that it very deliberately did the reading and subtracting using R29:R28 (Y), so by the end Y is holding the address of the next empty stack location. arr[] will be the 10 bytes from Y+1 to Y+10, the long variable will be in the four bytes from Y+11 to Y+14, and the char variable, c, will be at Y+15. You can then see exactly that when, for example, the value of c is written as 'X' by the STD Y+15 at 0x21E. The value for l is written by the STD Y+11 .. STD Y+14 at 0x228 .. 0x22E, and the 12345 is written to arr[3] by the stores at 0x234/0x236.
Finally, as the function unwinds (and remember that Y is still holding the "base of stack frame"), 0x238 adds the 0x0F that was "reserved" back to Y, so it now holds the SP position as it was after R28/R29 had been pushed on entry. That is written OUT to the SPH/SPL locations to change the stack pointer, then the last couple of POPs recover whatever Y was on entry to this function and it can finally RET.
So X+offset (rarely), Y+offset (always for stack frame variables) and Z+offset (for general read/write of indexed data) are seen a lot in the code generated by GCC.
(Note that other compilers do this in different ways: compilers such as CodeVision and ImageCraft don't try to "share space" on the single hardware stack, but only use the hardware stack for CALL/RET and PUSH/POP. Those compilers set up a separate "soft stack" for the stack frame variables, which does require them to reserve a special area for that "data stack".)
Posted by avruser1523: Fri. Jan 14, 2022 - 01:47 PM
@clawson
I actually did at first suspect my C code was changing something in SRAM and not program memory, but since my compiler didn't warn about anything (maybe I didn't use a proper flag) I thought it was smart enough to detect this and use a special instruction/offset to do what I want; turns out it doesn't. Of course this is not doable on x86 and ends up with a segfault, as the .text area is read-only, but I just wanted to see what the compiler generates.
Harvard is.... interesting. I should probably change the title of the thread to make it more relatable.
Where did you get your compiler? I forget when it was (maybe 4.6?) but the error reporting was greatly improved at one release, so the 5.4.0 that comes with Studio 7 definitely has this more thorough error reporting. I wonder if you are using something "really old" like "WinAVR"? (That is 4.3.0 - it dates from December 2009.)
avruser1523 wrote:
I'm on ubuntu, so `apt install gcc-avr avr-libc` and it's `5.4.0`, but no warnings.
Well that is odd. Just for the record, this is a typical invocation of the 5.4.0 in Studio 7...
I guess it must be the -Wall in there that is causing all warnings to be shown?...
... nope, just turned off -Wall and I still get the warning (it did turn off the ones I was getting about variables being set but not used, though).
I may have a go at this in one of my Ubuntu VMs, but in theory, given the same command line switches, if it's the same 5.4.0 it should behave the same whether Windows or Linux.
Posted by avruser1523: Fri. Jan 14, 2022 - 02:27 PM
avrcandies wrote:
You don't really care whether r14 is in an sram array of storage or is in some separated flip flops tucked away in the corner of the AVR
Thing is, I'm just trying to picture a bit of the details.
From the datasheet and previous discussions I figured registers (GPRs and SFRs) are separate from SRAM; as you say, they are separate flip-flops tucked away somewhere, but they are part of the address space of SRAM.
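(One concrete way to see that on the classic megaAVRs - where the GPRs are mapped at data addresses 0x00-0x1F - is to read a register back through an ordinary data-space load. This is just an illustrative sketch; on the newer AVR families the register file is not memory-mapped at all:)

#include <avr/io.h>
#include <stdint.h>

uint8_t peek_r16 (void)
{
    // Put a known value directly into r16...
    asm volatile ("ldi r16, 0xAB" ::: "r16");

    // ...then read data-space address 0x10, which on a classic megaAVR
    // is where r16 sits in the memory map (an LDS, not a register move).
    return *(volatile uint8_t *)0x10;
}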
Well it is odd. I got gcc-avr and avr-libc in a VM and set up a short test where "int n" is a deliberately unreferenced variable, and also with the code that makes a write to a code address, and I find...
Like you found, it does NOT object to the attempted use of a flash symbol as the target of a write. Not sure why this 5.4.0 is behaving differently to a Windows-based one???
EDIT: OK so doing exactly the same thing in Windows actually produces the SAME result...
So there's something else on the Studio 7 invocation that causes the warning to be thrown - let me experiment....
avruser1523 wrote:
It's actually the optimization flag `-Og` that shows it.
Oh yeah, I get that now - how very curious indeed. I guess when I didn't specify -O it was an implied -O0 by default. But it's still generating AVR code, so I wonder why the backend would not have spotted this?
I guess I'll just add this to my long list of reasons why -O0 should never be used!!
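For reference, a minimal command line that reproduces this with that toolchain might look something like the following (the MCU and file names are just placeholders); per the above, the "accessing data memory with program memory address" warning only shows up once an optimisation level such as -Og is used:

avr-gcc -mmcu=atmega328p -Og -Wall -c main.c -o main.o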
The IBM PC had DRAM not SRAM. Sure the registers in the CPU were probably static (is x86 CMOS?) but could you actually stop it clocking anyway? (can't remember anything like "SLEEP"?)
I have an idea that the original 8088/6 did not have static registers; the clock could not be stopped. Certainly the 8080 wasn't a static device and got quite upset if the clock went away. Ken Shirriff's blog is an excellent resource for gate-level analysis of processors and other parts (often by analysis of the actual silicon).
Posted by avrcandies: Fri. Jan 14, 2022 - 09:09 PM(Reply to #67)
avruser1523 wrote:
Thing is, I'm just trying to picture a bit of the details.
The best source for programming details is to simply assimilate the datasheet - it gives all the details that are exposed to the public, in the form they want us to visualize them.
They show some registers on a diagram as next to each other, or in an array, and then that is what we believe & go with.
What happens inside the chip (electronic details) could be very different, shown or unshown. The diagrams may generalize many aspects just to make it easier to understand, by filtering out details unrelated to programming needs.
When in the dark remember-the future looks brighter than ever. I look forward to being able to predict the future!
#5 "If you think you need floating point to solve the problem then you don't understand the problem. If you really do need floating point then you have a problem you do not understand."
Posted by Brian Fairchild: Sun. Jan 16, 2022 - 01:28 PM
clawson wrote:
So did 80C88 add instructions to sleep the CPU?
No. Remember, the '86 family is a microprocessor not a microcontroller so the concept of 'sleep' doesn't really fit. With all the peripherals external to the chip you'd somehow have to put those to sleep as well. Even the clock for the CPU usually came from an external chip, something like an 8284, with that clock signal being fed to all the peripheral chips.
#5 "If you think you need floating point to solve the problem then you don't understand the problem. If you really do need floating point then you have a problem you do not understand."
For the definitive answer, look in the AVR Instruction Set Manual:
http://ww1.microchip.com/downloads/en/devicedoc/atmel-0856-avr-instruction-set-manual.pdf
Z is used for indirect addressing in general - not just for program memory.
EDIT: look earlier in the code ...
I think it needs to be split at #42 ?
A Moderator can do that for you ...
Anyway, to me the whole thread seems to be about the same thing - the understanding of Harvard versus von Neumann.
@awneil
Which other SRAMs are you referring to? Does x86 use SRAM besides its registers and cache?
Go back to the AVR1200 etc. It had NO RAM, only the 32 registers (and a HW stack to store return addresses, which was not memory mapped at all).
The x86 neither knows nor cares what type(s) of memory is/are connected to its external buses - SRAM, DRAM, Flash, EPROM, memory-mapped IO, ...
Perhaps I should have said, "... so its CPU registers really are different from any other SRAM in the system"
I think the original 8086 was NMOS?
There were later 80C86 versions - also the 80C186.
I think it was certain 386 models that first introduced SRAM as cache?
From the 'Intel 8086 Family User's Manual', October 1979:
Correct. They were made in an HMOS process and the datasheet specifies both a minimum and maximum clock period.
The 80C86/8 came later - as a fully-static design.
https://datasheetspdf.com/pdf-file/843377/Harris/80C88/1
So did 80C88 add instructions to sleep the CPU?
I don't think so, but the hardware could. For a desktop processor it's more about how fast and how cheap than how slow and power-sipping.
Neil
Neil Barnes
www.nailed-barnacle.co.uk