I am using one ATmega3209 (master) to read 32 other ATmegas (slaves), each with its own SS pin, with all ATmegas running at 20 MHz.
I read the slaves via SPI, and then the master ATmega compiles a packet of 800 bytes (each slave sends 25 bytes per read) and sends it out via UART to an FTDI serial-to-USB cable (384 KBps), as many times per second as possible.
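I am running SPI at f_cpu/4 (5 MHz, which should give me 610 KBps) and UART at 921600 bps (112 KBps raw, or about 90 KBps of payload if the framing is 8N1, since each byte then costs 10 bits on the wire).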
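To make the link-rate numbers above easy to re-check, here is a minimal sketch of the arithmetic. It assumes 8N1 framing on the UART (10 bits on the wire per byte) and no gaps between SPI bytes; the function and variable names are only illustrative.

```python
def payload_bytes_per_second(bit_rate_hz, bits_per_byte):
    """Payload bytes per second for a link that spends bits_per_byte bits on each byte."""
    return bit_rate_hz / bits_per_byte


F_CPU_HZ = 20_000_000          # ATmega clock from the post
SPI_CLOCK_HZ = F_CPU_HZ / 4    # SPI prescaler of 4 -> 5 MHz
UART_BAUD = 921_600
PACKET_BYTES = 32 * 25         # 32 slaves x 25 bytes each

# SPI moves 8 clock bits per byte (ignoring any inter-byte gaps).
spi_rate = payload_bytes_per_second(SPI_CLOCK_HZ, 8)
# UART with assumed 8N1 framing: 1 start + 8 data + 1 stop = 10 bits per byte.
uart_rate = payload_bytes_per_second(UART_BAUD, 10)

print("SPI  bytes/s:", spi_rate, "packets/s:", spi_rate / PACKET_BYTES)
print("UART bytes/s:", uart_rate, "packets/s:", uart_rate / PACKET_BYTES)
```

Under those assumptions the SPI side tops out at about 781 packets per second and the UART side at about 115, so the UART is the tighter limit in the current setup.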
At the moment I am receiving 67 KBps on the COM port, using Tera Term, so around 84 packets/reads per second. I realize there is also some time wasted while the master sets the SS line of each individual slave and while the slave responds.
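A rough time budget for one read/send cycle can show where the measured 84 packets per second goes. This is only a sketch: it assumes the master reads all slaves first and then transmits (no overlap between SPI and UART), 8N1 UART framing, and back-to-back SPI bytes. All names are illustrative.

```python
PACKET_BYTES = 800
N_SLAVES = 32
SPI_CLOCK_HZ = 5_000_000        # f_cpu/4 at 20 MHz
UART_BAUD = 921_600
UART_BITS_PER_BYTE = 10         # assumed 8N1 framing
MEASURED_PACKETS_PER_S = 84


def cycle_budget(packets_per_s):
    """Split one measured cycle into SPI time, UART time, and whatever is left over."""
    cycle_s = 1.0 / packets_per_s
    spi_s = PACKET_BYTES * 8 / SPI_CLOCK_HZ
    uart_s = PACKET_BYTES * UART_BITS_PER_BYTE / UART_BAUD
    leftover_s = cycle_s - spi_s - uart_s
    return {
        "cycle_s": cycle_s,
        "spi_s": spi_s,
        "uart_s": uart_s,
        "leftover_s": leftover_s,
        "leftover_per_slave_s": leftover_s / N_SLAVES,
    }


budget = cycle_budget(MEASURED_PACKETS_PER_S)
for name, seconds in budget.items():
    print(f"{name}: {seconds * 1e3:.3f} ms")
```

With these assumptions the UART transfer accounts for roughly 8.7 ms of the 11.9 ms cycle and the SPI clocking for about 1.3 ms, leaving roughly 1.9 ms (about 60 µs per slave) for SS handling, slave response time, and other overhead.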
Now I wish to reach around 250-400 packets/reads per second (200-320 KBps), which should be possible with SPI but not with the serial connection. So I will change the master MCU from the ATmega3209 to a SAM D21 (running at 48 MHz); the D21 will then read all slaves via SPI and output the data directly to the PC via its USB peripheral (max 1464 KBps). And if I also push the SPI speed a little, at the cost of some data integrity, I may increase the packets per second greatly.
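As a sanity check on the 250-400 packets per second target, the sketch below estimates the minimum SPI clock needed once a fixed per-slave overhead (SS toggling, slave response) is taken out of each cycle. The overhead value is a hypothetical input to be replaced with a measured one, and the function name is illustrative. It assumes the USB side is never the limit and that SPI bytes are clocked back to back.

```python
def min_spi_clock_hz(packets_per_s, packet_bytes, n_slaves, overhead_per_slave_s):
    """Minimum SPI clock so that one full read of all slaves fits in one cycle."""
    cycle_s = 1.0 / packets_per_s
    # Time left for actually clocking bits after per-slave overhead is removed.
    transfer_budget_s = cycle_s - n_slaves * overhead_per_slave_s
    if transfer_budget_s <= 0:
        raise ValueError("per-slave overhead alone exceeds the cycle time")
    return packet_bytes * 8 / transfer_budget_s


# Hypothetical overhead values; measure the real ones with a scope or logic analyzer.
for overhead_us in (0, 20, 40):
    for target in (250, 400):
        clock = min_spi_clock_hz(target, 800, 32, overhead_us * 1e-6)
        print(f"{target} packets/s, {overhead_us} us/slave -> {clock / 1e6:.2f} MHz")
```

If the overhead per slave stays in the tens of microseconds, the required clock comes out below the current 5 MHz, which suggests the target depends more on cutting per-slave overhead than on a faster SPI clock. Whether the slaves can keep up at a given clock still needs to be checked against their datasheet.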
The ultimate solution would be to also change all slaves to the D21, in which case I could use 50-100% faster SPI and achieve up to 1000 packets per second (800 KBps).
Does anyone have any other advice on how to increase the data throughput? With the D21 as master and ATmega3209s as slaves, the bottleneck looks to be the SPI.
Thanks in advance for your opinions :D (and yes, I know 32 ATmega slaves is a lot :P )
Update 1: After checking SAM D21 prices and power consumption, I decided to change all ATmegas to SAM D21s running at 48 MHz, which should give around 15% lower power consumption and much higher data throughput.