Optimizing atmega SPI for speed

I am working on an application where two ATmega324P processors exchange data over the SPI bus, with the SPI lines connected directly together. But after reading the data sheet and doing some tests, it doesn't seem easy to implement a fast and reliable data link. A particular problem arises when the ATmega SPI slave is busy with some task other than receiving from the master: the master does not know about the delay, and the slave loses bytes that have been sent.

To overcome this problem I used a separate I/O pin as an acknowledge from the slave to the master (5 lines total). Instead of polling SPIF ("byte sent") in the master, the master has to poll the I/O pin ("slave received byte") before it can continue.

The transfer is two bytes at a time, and the ack pin from the slave toggles state each time a byte is received.

Code for master:

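spi16:  ; Master: exchange var1, var2 with the slave over SPI
        cbi    PORTB, 4       ; Select slave (SS low)
        out    SPDR0, var1    ; Start sending first byte
spi161: sbic   PIND, 7        ; Wait until slave pulls ack low (first byte read)
        rjmp   spi161
        in     var1, SPDR0    ; Read byte received from slave
        out    SPDR0, var2    ; Start sending second byte
spi162: sbis   PIND, 7        ; Wait until slave drives ack high again
        rjmp   spi162
        in     var2, SPDR0    ; Read byte received from slave
        sbi    PORTB, 4       ; Deselect slave (SS high)
        ret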

Code for slave:

spi16:  ; Slave: exchange var1 and var2 with the SPI master
        out    SPDR0, var1    ; Preload first byte to send
        sbi    PORTD, 7       ; Ack high: ready for first byte
spi161: in     sys, SPSR0     ; Poll SPI status
        sbrs   sys, SPIF0     ; Byte received?
        rjmp   spi161         ; No, keep waiting
        in     var1, SPDR0    ; Yes, read it
        out    SPDR0, var2    ; Preload second byte to send
        cbi    PORTD, 7       ; Ack low: first byte read, ready for second
spi162: in     sys, SPSR0     ; Poll SPI status
        sbrs   sys, SPIF0     ; Byte received?
        rjmp   spi162         ; No, keep waiting
        in     var2, SPDR0    ; Yes, read it
        ret
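
The ack handshake in the two routines above can be modelled on a PC to see why it prevents lost bytes. The following is only a rough sketch in C, not AVR code: the `slave_model` struct, the tick-based timing and `run_transfer` are all made up for illustration, and the real SPI hardware differs in detail.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_BYTES 16

/* Hypothetical model of the slave side (illustration only). */
typedef struct {
    bool    ack;            /* level of the ack pin (PD7 in the thread)   */
    bool    spif;           /* a received byte is waiting to be read      */
    uint8_t spdr;           /* one-byte receive register                  */
    int     busy_ticks;     /* ticks the slave spends on other work first */
    uint8_t rx[MAX_BYTES];  /* bytes the slave actually picked up         */
    int     rx_len;
    int     lost;           /* bytes overwritten before being read        */
} slave_model;

/* The master clocks one byte into the slave. */
static void shift_in(slave_model *s, uint8_t b)
{
    if (s->spif)
        s->lost++;          /* previous byte was never read */
    s->spdr = b;
    s->spif = true;
}

/* One polling opportunity for the slave. */
static void slave_tick(slave_model *s)
{
    if (s->busy_ticks > 0) {
        s->busy_ticks--;    /* still busy with another task */
        return;
    }
    if (s->spif) {
        s->rx[s->rx_len++] = s->spdr;
        s->spif = false;
        s->ack = !s->ack;   /* toggle ack for every byte read */
    }
}

/* Send n bytes (n <= MAX_BYTES); returns the number of lost bytes.
 * With use_ack the master waits for the ack pin to toggle before
 * sending the next byte; without it the master sends on every tick. */
int run_transfer(slave_model *s, const uint8_t *data, int n,
                 int slave_delay, bool use_ack)
{
    bool waiting = false, level_at_send = false;
    int sent = 0;

    *s = (slave_model){0};
    s->busy_ticks = slave_delay;
    if (n > MAX_BYTES)
        n = MAX_BYTES;

    for (int tick = 0; tick < 1000 && (sent < n || s->spif); tick++) {
        bool blocked = use_ack && waiting && s->ack == level_at_send;
        if (sent < n && !blocked) {
            level_at_send = s->ack;     /* remember ack level at send time */
            shift_in(s, data[sent++]);
            waiting = true;
        }
        slave_tick(s);
    }
    return s->lost;
}
```

With a slave that stays busy for 3 ticks, `run_transfer(&s, data, 4, 3, false)` loses bytes in this model, while the same call with `use_ack` set to `true` delivers all four in order.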

Theoretically, with both MCUs running at 18.4 MHz and the SPI clock set to mclk/4, the transfer rate could reach 575 kB/s (4.6 Mbit/s). Testing the code above, I measured approx. 300 kB/s. Note that I haven't yet verified that the data was transferred without errors :)

After analyzing the execution order, I concluded that the master and slave instructions all add up (no overlap or parallel processing at all) and contribute overhead on top of the hardware shifting of the data. These roughly 30 instructions waste about 2 us, and the hardware shifting takes about 3.5 us, for a total of 5.5 us per 16-bit SPI transfer -> 364 kB/s. The measured value was 300 kB/s, so the polling/wait loops do not repeat many times.
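
The arithmetic above can be checked with a few lines of C run on a PC. This is only a sketch: the exact crystal frequency of 18.432 MHz is an assumption (the post only says 18.4 MHz), and the function names are made up for illustration.

```c
#include <stdint.h>

/* Time in ns for the SPI hardware to shift `bits` bits
 * when SCK = f_cpu_hz / divider. */
uint32_t spi_shift_ns(uint32_t f_cpu_hz, uint32_t divider, uint32_t bits)
{
    return (uint32_t)(((uint64_t)bits * divider * 1000000000ULL) / f_cpu_hz);
}

/* Time in ns taken by `cycles` CPU clock cycles. */
uint32_t cpu_cycles_ns(uint32_t f_cpu_hz, uint32_t cycles)
{
    return (uint32_t)(((uint64_t)cycles * 1000000000ULL) / f_cpu_hz);
}

/* Throughput in bytes/s when every `bytes`-byte transfer costs total_ns. */
uint32_t throughput_bytes_per_s(uint32_t bytes, uint32_t total_ns)
{
    return (uint32_t)(((uint64_t)bytes * 1000000000ULL) / total_ns);
}
```

Under these assumptions, `spi_shift_ns(18432000, 4, 16)` gives about 3.47 us for the 16 bits, and adding the estimated 2 us of overhead yields roughly 365 kB/s, in line with the figure quoted above. Thirty single-cycle instructions would take only about 1.63 us (`cpu_cycles_ns(18432000, 30)`), so the 2 us estimate presumably allows for some two-cycle instructions.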

(While the hardware is shifting out data, these two routines do nothing but poll, so at least 8*4 = 32 clock cycles' worth of code could be executed during this time, probably on both the master and the slave side.)

Some questions:
1. Is it actually necessary to use feedback from slave to master to make an AVR-AVR SPI link reliable?
2. Is it necessary to use the SS pin when there is only one slave? I believe the slave resets anyway once one character has been transferred.
3. Any ideas on how to optimize the above routines to save some cycles, or other code examples?

Last Edited: Fri. Jun 19, 2009 - 07:04 PM

As the two processors are nearby, have you considered using the USART instead to take advantage of a bit more buffering? (2.5 characters IIRC before you have to get to the RX ISR)

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

What transfer speed can you expect over the USART channel? The ATmega324P even has a USART that can run in SPI mode, but I haven't looked into it.

In the design I use the USART for another purpose, communicating with a terminal.

Maximum operating speed is Fclk/2 for the USART (so the datasheet tells me ;))

The communication to the terminal is duplex or are you just sending?

Quote:

What transfer speed can you expect over the USART channel?

Why would it be much different than SPI? SPI slave can get clk/4 IIRC, but you then need in a continuous burst to do a full ISR every 32 AVR clock cycles--you ain't gonna be able to do that with the latencies you described so you are not going to achieve those speeds anyway.

Quote:

In the design I use >>the<< USART for another purpose, communicating with a terminal.

Remember that you have two, not one.

Lee

jayjay1974 wrote:
Maximum operating speed is Fclk/2 for the USART (so the datasheet tells me ;))

The communication to the terminal is duplex or are you just sending?

The usart is used for full duplex command interface. Have designed a board with a RS232 connector and MAX3232 connected to the USART pins.

Fclk/2 for the USART is a surprise to me; I'll have to check that out. I routed the SPI pins (PORTB 4-7) plus the PORTD pin 7 ack line to a separate connector for the other MCU.

Quote:

The usart is used for full duplex command interface. Have designed a board with a RS232 connector and MAX3232 connected to the USART pins.

Repeat: There is more than one USART.

theusch wrote:
Quote:

What transfer speed can you expect over the USART channel?

Why would it be much different than SPI? SPI slave can get clk/4 IIRC, but you then need in a continuous burst to do a full ISR every 32 AVR clock cycles--you ain't gonna be able to do that with the latencies you described so you are not going to achieve those speeds anyway.

Quote:

In the design I use >>the<< USART for another purpose, communicating with a terminal.

Remember that you have two, not one.

Lee

Are the second USART's signals routed to PORTB, 4-7, same as SPI?

About my application: the slave MCU is connected to a CompactFlash card or PATA hard drive, using most of the I/O pins, and runs the FAT filesystem and file-processing routines. It is supposed to be connected to a host MCU (AVR32, AVR or other) via SPI. The host MCU will open, read or create files and store data (in bursts) over SPI. The idea behind SPI is to use the same standard as other MCUs and peripheral components. Another idea is that the AVR32 has "DMA" functionality on, e.g., SPI: it can read data as a slave without CPU intervention, so the ATmega can run at clk/2 as a master.

Quote:

Are the second USART's signals routed to PORTB, 4-7, same as SPI?

??? Wouldn't that be in the datasheet?

Think about your data throughput on your SPI slave during a "burst" or transaction or whatever. Take a short burst at modest speeds, say 4 bytes with a 1MHz SCK rate. That is one byte every 8 microseconds. If you don't grab the SPDR contents before another 8 microseconds it is lost when the second byte overwrites it.

On an AVR that basically implies "sitting" on the SPI during a transaction. Either a hard polling loop, or a skinny ISR with no latency from any other ISR.

It gets even worse if your slave is expected to respond with data to the master.
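
To put numbers on that deadline, here is a small C sketch (meant for a PC, not the AVR). The function names are hypothetical, and the 18.432 MHz CPU clock used in the note below is assumed from the 18.4 MHz mentioned earlier in the thread.

```c
#include <stdint.h>

/* CPU cycles between completed bytes when SCK = f_cpu / divider. */
uint32_t spi_cycles_per_byte(uint32_t divider)
{
    return 8u * divider;
}

/* Time in ns between completed bytes for a given SCK frequency. */
uint32_t spi_byte_period_ns(uint32_t sck_hz)
{
    return (uint32_t)(8ULL * 1000000000ULL / sck_hz);
}

/* Whole CPU cycles that fit into a window of `ns` nanoseconds. */
uint32_t cycles_in_ns(uint32_t f_cpu_hz, uint32_t ns)
{
    return (uint32_t)((uint64_t)f_cpu_hz * ns / 1000000000ULL);
}
```

At a 1 MHz SCK a byte completes every 8 us, which is about 147 CPU cycles at the assumed clock. At clk/4 the budget shrinks to 32 cycles per byte, and that has to cover the poll or ISR entry, the SPDR read and the write of any reply byte.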

As I mentioned, it is a little more forgiving with the USART, and with careful attention to ISR construction it might be able to do a highish bit rate. But your latency must still be low, say <20us at 1Mbps.
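
The USART figure can be sanity-checked the same way. The sketch below assumes 8N1 framing (10 bits per character) and that two complete characters can wait in the receive buffer before a third is lost. Both are assumptions, so check the datasheet for the exact buffering; the function names are made up.

```c
#include <stdint.h>

/* Time in ns to receive one character of bits_per_frame bits. */
uint32_t usart_frame_ns(uint32_t baud, uint32_t bits_per_frame)
{
    return (uint32_t)((uint64_t)bits_per_frame * 1000000000ULL / baud);
}

/* Rough latency budget in ns: how long the receiver can go unserviced
 * if frames_buffered characters can be held (assumed value). */
uint32_t usart_deadline_ns(uint32_t baud, uint32_t bits_per_frame,
                           uint32_t frames_buffered)
{
    return frames_buffered * usart_frame_ns(baud, bits_per_frame);
}
```

With 1 Mbps, 10 bits per frame and two buffered frames this gives 20 us, which matches the rough figure above.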

Lee

:) Hard to find, when looking at the pinout drawing (they omitted the USART1 TX/RX pins there!). But PORT descriptions says pins PC2 and PC3 next to USART0 pins. I'll try transferring data directly to disk over the USART0 I already have routed on the board (maybe the MAX3232 supports it too). If it works at a good speed, it actually eliminates the need for the SPI link, since the command interface is disabled anyway during file I/O.

For the SPI, I have tested using polling and no interrupts. The extra IO pin (ack) used in the routines above is supposed to make the master wait until the slave has read the data register to avoid overwriting.

Normally the master polls whether the byte has been sent (SPIF), but if the above routines work as desired, the master is actually polling a "copy" of the slave-side SPIF.

An idea is to synchronize this with other hardware I/O, such as a memory card in my application. As I understand it, it is possible to execute about 32 clock cycles' worth of instructions before each SPIF/ack poll without disturbing the SPI routines.

Quote:

Hard to find, when looking at the pinout drawing (they omitted the USART1 TX/RX pins there!). But PORT descriptions says pins PC2 and PC3 next to USART0 pins.

Do you have a broken datasheet? My 8011G Mega164P family datasheet searches out RXD1 in each pinout diagram. And they are not on PC2/PC3 but rather PD2/PD3. And my Pin Descriptions/alternate Functions says

PD3
INT1 (External Interrupt1 Input)
TXD1 (USART1 Transmit Pin)
PCINT27 (Pin Change Interrupt 27)
PD2
INT0 (External Interrupt0 Input)
RXD1 (USART1 Receive Pin)
PCINT26 (Pin Change Interrupt 26)
PD1
TXD0 (USART0 Transmit Pin)
PCINT25 (Pin Change Interrupt 25)
PD0
RXD0 (USART0 Receive Pin)
PCINT24 (Pin Change Interrupt 24)

Is it possible to use 6 DC motors with an ATmega16? If so, how?

Great faux pas: don't hijack a thread with an unrelated question.
In answer to your question: what DC motors? What do you want to do with these nondescript DC motors?

The quality of the answer you receive is directly proportional to the quality of the question.

I did this sort of thing using a lattice ISP1016 years ago. That is one way. Depending on board space you could look at deep FIFO UARTS.