2 Word AVR Instructions


I was wondering how the AVR microcontroller is able to distinguish between one-word and two-word instructions so that it knows how far to increment the program counter.

 

Since the program memory bus is 16 bits wide, would it take two clock cycles for the AVR processor to fetch a 32-bit instruction, with the instruction executed only after the entire instruction has been fetched?


Greetings Nouveaux_AVRIL, welcome to AVR Freaks!

 

I think that you will find that your assumption is not correct.

 

It will always read the first word. The op-code is in that word. In a two-word instruction, the second word is data (an address). That's the way RISC instructions work. So, interpreting the first word provides the information about an additional word. Further, there is an instruction pipeline so that the ALU knows a lot about the instruction before it ever reaches the point of execution.

 

You can actually get a good feel by looking at the AVR Instruction Set document. It can be found at: http://ww1.microchip.com/downloa...

 

Jim

Jim Wagner Oregon Research Electronics, Consulting Div. Tangent, OR, USA http://www.orelectronics.net

Last Edited: Sat. Mar 2, 2019 - 05:06 AM


Microchip (Atmel) is completely free in designing the instruction set of their processors.

One of the goals in designing an instruction set is to arrange the encoding so that the instructions are as easy as possible to decode.

For example, instructions which do a mathematical operation between two registers and then store the result in one of those registers will almost certainly have the 5-bit addresses of the 32 registers encoded in the same bit positions of the opcode, and the "mathematical operation" (for example add, or, subtract, xor) also in the same bit positions of the opcode.

 

If you compare the opcode of different instructions you will see a lot of overlap of bits between opcodes.
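To make that overlap concrete, here is a minimal Python sketch of a decoder for a few two-register ALU instructions. It assumes the layout given in the AVR Instruction Set document (`oooo oord dddd rrrr`: six operation bits on top, then the Rd and Rr fields); the names `ALU_OPS` and `decode_alu` are made up for this illustration.

```python
# Two-register ALU instructions share one layout: oooo oord dddd rrrr
# (o = operation, d = destination register Rd, r = source register Rr).
ALU_OPS = {
    0b000011: "ADD",
    0b000111: "ADC",
    0b000110: "SUB",
    0b000010: "SBC",
    0b001000: "AND",
    0b001001: "EOR",
    0b001010: "OR",
}

def decode_alu(word):
    """Return (mnemonic, Rd, Rr) for a known ALU opcode word, else None."""
    op = (word >> 10) & 0x3F                    # top six bits pick the operation
    rd = (word >> 4) & 0x1F                     # bits 8..4 hold Rd
    rr = ((word >> 5) & 0x10) | (word & 0x0F)   # bit 9 plus bits 3..0 hold Rr
    name = ALU_OPS.get(op)
    if name is None:
        return None
    return (name, rd, rr)

print(decode_alu(0x0F01))  # add r16, r17
print(decode_alu(0x2411))  # eor r1, r1
```

The same two shifts and masks recover Rd and Rr for every entry in the table, which is the kind of decoding shortcut the post is describing.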

 

When an AVR fetches the first 16 bits of a 32-bit instruction, that word almost certainly has a unique bit pattern that marks it as a 32-bit instruction.

It can only fetch 16 bits at a time, so the next 16 bits are fetched a clock cycle later, which means 32-bit instructions take longer to execute than 16-bit instructions.

Take for example a Long Call:

 

 

The opcode itself is encoded in 10 bits, and the other 22 bits define the address.

All bits that define the opcode are in the first 16 bits of the instruction.
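As a rough sketch of that split, the following Python packs and unpacks a CALL, assuming the encoding from the AVR Instruction Set document (`1001 010k kkkk 111k` followed by a second word `kkkk kkkk kkkk kkkk`). The function names are invented for the example, and the address is a word address, not a byte address.

```python
CALL_MASK = 0xFE0E   # the 10 bit positions that identify CALL in the first word
CALL_BITS = 0x940E   # 1001 010. .... 111.

def encode_call(addr):
    """Pack a 22-bit word address into the two 16-bit words of a CALL."""
    if not 0 <= addr < (1 << 22):
        raise ValueError("address does not fit in 22 bits")
    hi = addr >> 16                          # k21..k16 live in the first word
    first = CALL_BITS | ((hi & 0x3E) << 3) | (hi & 0x01)
    second = addr & 0xFFFF                   # k15..k0 are the second word
    return first, second

def decode_call(first, second):
    """Return the 22-bit target if `first` is a CALL opcode word, else None."""
    if (first & CALL_MASK) != CALL_BITS:
        return None
    hi = ((first >> 3) & 0x3E) | (first & 0x01)
    return (hi << 16) | second

first, second = encode_call(0x0100)
print(hex(first), hex(second))
print(bin(CALL_MASK).count("1"), "opcode bits, all in the first word")
```

Note that `decode_call` can reject a non-CALL word by looking at the first word alone, which is the point made above.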

Doing magic with a USD 7 Logic Analyser: https://www.avrfreaks.net/comment/2421756#comment-2421756

Bunch of old projects with AVR's: http://www.hoevendesign.com


Thanks for the quick response!

 

Sorry if this may sound like a silly question, but I am still new and learning about AVR processors.

 

So for example we take the LDS instruction which is 32 bits. On the fetch cycle the first 16 bits will be read. The op-code and target register are stored in the first word and are read, but the data space address is located in the second word of the instruction. How will the processor know what the data space address is without a second fetch cycle for the second word? Is there something in the op-code that tells the processor to quickly fetch the second word before executing?

 

Edit: So after reading your response, it looks like it will take 2 clock cycles to fetch a 32 bit instruction. After the instruction is completely read it will execute the instruction. But while that instruction is executing it will fetch the next instruction. Is my understanding correct?

Last Edited: Sat. Mar 2, 2019 - 06:10 AM

Yes, that is what a pipeline is all about. 

 

There can be hardware that analyzes the op-code several MCU clocks before it is ever used. 

 

Jim

 


Euhm, you probably posted before you read my post ( 2 minutes difference).

 

The AVR probably has a mechanism that normally just fetches 16-bit chunks from Flash and pushes them into the pipeline. That works for most 16-bit and 32-bit instructions. This mechanism is interrupted only for branches. For a branch, the last 16 bits fetched are worthless and a new address must be loaded, which is why such instructions also take longer to execute.

 

ka7ehk wrote:
There can be hardware that analyzes the op-code several MCU clocks before it is ever used. 

With a 2-stage pipeline, which executes an instruction each cycle ???

The AVR is not smart enough for that.

To my amazement, I read recently that some of the instructions for an 80486 took several thousand CPU cycles to execute.

PIC 16 has a 4 stage deep pipeline, but AVR has only 2 levels, and thus never looks further forwards.


Hmm, thought I read that the pipeline was more than two deep. Not the first or last time I'm wrong!

 

Jim


So, take a 32-bit instruction like LDS, which I used as an example before. The first 16 bits are read and sent through the pipeline. But the instruction is incomplete until the second word is fetched and sent down the pipeline, so does the first half of the instruction perform a NOP to wait for the second half of the instruction?


Pretty sure there is a diagram of the pipeline in most AVR spec sheets. For example, the Mega328P spec sheet available at the Microchip web site is at: 

 

http://ww1.microchip.com/downloa...

 

There is a brief statement just below Figure 6.1 in section 6.1.

 

A timing diagram for the pipeline is shown in Figure 6.4 in Section 6.6: Instruction Timing.

 

Jim


A long time ago (about 20 years, how time flies), I played for fun with an SRET-equivalent instruction for the AVR ("skip return", meaning that if a function returns false, it skips the instruction after the (r)call).

Basically, end with RET for true, and for false jump to a routine that adds one to the return address.

 

My problem was that if that instruction was an LDS or STS (the only 32-bit instructions on an 8515), I had to add 2 and not just 1!

So I had to read a byte from the next instruction and add an extra one if it was an LDS or STS instruction.

 

I mention this because it was simple to work out: those instructions are simple to decode, and the AVR must do it in a similar way. When it prefetches a word it gets the size of the instruction, so the next prefetched word can be handled as either data or an instruction.
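A minimal Python sketch of that size check, under some assumptions: flash is modelled as a list of 16-bit words indexed by word address, the encodings are the classic ones from the AVR Instruction Set document, and JMP/CALL are included for larger parts even though the 8515 only has LDS and STS. The names `is_two_word` and `skip_return_address` are hypothetical.

```python
def is_two_word(first_word):
    """True if this opcode word starts a 32-bit instruction (classic AVR)."""
    # LDS / STS:  1001 00xd dddd 0000
    if (first_word & 0xFC0F) == 0x9000:
        return True
    # JMP / CALL: 1001 010k kkkk 11xk
    if (first_word & 0xFE0C) == 0x940C:
        return True
    return False

def skip_return_address(flash, ret_addr):
    """Word address to return to so the instruction at ret_addr is skipped."""
    return ret_addr + (2 if is_two_word(flash[ret_addr]) else 1)

# Hypothetical program: lds r16, 0x0060 (two words), then nop (one word).
flash = [0x9100, 0x0060, 0x0000]
print(skip_return_address(flash, 0))  # skips over both words of the LDS
print(skip_return_address(flash, 2))  # skips over the single-word NOP
```

The decision needs only the first word of the instruction, so the same test is enough for the hardware to know whether the word it just prefetched is an opcode or an operand.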

 

Last Edited: Sat. Mar 2, 2019 - 09:54 AM

The processor is a finite state machine. As such, it does not wait for the rest of the instruction; fetching the second word is simply part of the sequence.


Paulvdh wrote:
AVR has only 2 levels
One, actually:

 

 

Although a semantic argument might suggest that 'single level pipelining' really means two levels:  1) current 16-bit word, and 2) 16-bit word 'on-deck'.

 

 

Nouveaux_AVRIL wrote:
so does the first half of the instruction perform a NOP to wait for the second half of the instruction?
Don't make it more complicated than it is.

 

Most 16-bit instructions take one cycle to execute.  >>All<< 32-bit instructions take >>at least<< 2 cycles to execute.  Guess why.

 

"Experience is what enables you to recognise a mistake the second time you make it."

"Good judgement comes from experience.  Experience comes from bad judgement."

"Wisdom is always wont to arrive late, and to be a little approximate on first possession."

"When you hear hoofbeats, think horses, not unicorns."

"Fast.  Cheap.  Good.  Pick two."

"We see a lot of arses on handlebars around here." - [J Ekdahl]

 

Last Edited: Sat. Mar 2, 2019 - 03:00 PM

If you're really interested in the lowest level stuff you might be interested in opencores.org.

They collect VHDL / Verilog / Whatever code for putting microcontrollers in FPGA chips.

I believe they've got at least 4 different versions of the AVR architecture over there.

 

https://opencores.org/projects/avr_core


You can probably assume that the prefetch logic goes along and reads (PC) from program memory every cycle.  For the two-word instructions, it will go ahead and use that prefetched word as part of the current instruction, and need another cycle to read the NEXT instruction.  Jump/call/branch/skip instructions may adjust PC, making the prefetched word "not useful", and have to use up another cycle waiting for the value to be fetched from the new PC address.
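That description can be turned into a toy cycle counter. This is only a sketch of the fetch side under loud assumptions: one program-memory word is read per cycle, a redirected PC wastes exactly one prefetched word, and nothing else (stack pushes in CALL/RET, data-memory access, skips) is modelled. The program table and the `run` function are invented for the example.

```python
def run(program, start=0, max_steps=100):
    """Count fetch cycles for a program given as
    {word_address: (name, size_in_words, jump_target_or_None)}."""
    pc, cycles, trace = start, 0, []
    while pc in program and len(trace) < max_steps:
        name, size, target = program[pc]
        cost = size              # one fetch cycle per instruction word
        if target is not None:
            cost += 1            # prefetched word is useless, refetch at target
            pc = target
        else:
            pc += size
        cycles += cost
        trace.append((name, cost))
    return cycles, trace

# Hypothetical program layout (word addresses).
PROGRAM = {
    0: ("ldi", 1, None),
    1: ("lds", 2, None),   # second word is the data address
    3: ("rjmp", 1, 6),
    6: ("jmp", 2, 9),      # second word is part of the target
    9: ("nop", 1, None),
}

total, trace = run(PROGRAM)
for name, cost in trace:
    print(name, cost)
print("total cycles:", total)
```

For these five instructions the toy counts match what the instruction set document lists for classic cores as far as I can tell, but CALL and RET would come out too low because pushing and popping the return address is not modelled.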