sparrow2: There is no point in checking other received bytes in the SPI ISR. The issue is that the SPI does not have an Rx double buffer, so the span between ISRs is one byte long, but the latency has to be less than one bit time.
Yeah, I got that wrong (it's been more than a year since I worked on this): it is the Tx side that does not have a double buffer.
This means that the slave must not write a byte to SPI Tx while the "train is in motion". It must load the byte to be transmitted to the master in the gap between two consecutive bytes. In other words, before sending the next byte, the master must wait until the slave's SPI ISR has fired and reached the point where it writes to the outbound buffer (the inbound side is double-buffered, so the received byte can be read later).
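To make the "train in motion" rule concrete, here is a tiny host-side model in C. It is only a sketch: the struct, the function names and the collision flag are all made up for illustration and do not correspond to any particular chip's registers. The model also leaves the shift register unchanged after a transfer, which is a simplification.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a slave SPI port whose transmit path is
   single-buffered: one shift register, no separate Tx holding register. */
typedef struct {
    uint8_t shift_reg;  /* byte that will be clocked out to the master */
    bool    shifting;   /* true while the master is clocking a byte */
    bool    collision;  /* set when a Tx write arrived mid-transfer */
} spi_slave_model;

/* Slave firmware (typically its SPI ISR) writes the next outbound byte.
   Returns true if the byte was accepted, false if it came too late. */
static bool slave_write_tx(spi_slave_model *s, uint8_t byte)
{
    if (s->shifting) {
        s->collision = true;  /* the train is already in motion */
        return false;         /* the write is dropped in this model */
    }
    s->shift_reg = byte;
    return true;
}

/* Master starts clocking a byte. */
static void master_begin_byte(spi_slave_model *s)
{
    s->shifting = true;
}

/* Master finishes clocking; returns the byte the slave actually sent. */
static uint8_t master_end_byte(spi_slave_model *s)
{
    s->shifting = false;
    return s->shift_reg;
}
```

If the master calls master_begin_byte() before the slave has managed to call slave_write_tx(), the master simply gets the stale byte back. That is the failure the inter-byte gap has to prevent.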
If one strives for maximum throughput and does not want to spend an extra pin on handshaking, or to do some fancy signaling on the SS line, the master must be written with knowledge of the slave's SPI interrupt latency.
This is why counting cycles sometimes matters.
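As a rough illustration of what "written with knowledge of the slave's latency" could look like, the helper below converts a worst-case slave ISR latency into the minimum gap the master has to leave between bytes, expressed in master CPU cycles. The struct, its field names and the numbers in the usage note are assumptions for the sake of the example; the real cycle count has to be counted or measured on the actual slave, including the time the slave may be held off by other interrupts.

```c
#include <stdint.h>

/* All values are hypothetical inputs; count or measure them on real hardware. */
typedef struct {
    uint32_t slave_f_cpu_hz;    /* slave CPU clock in Hz */
    uint32_t slave_isr_cycles;  /* worst case, in slave cycles, from end of
                                   byte to the Tx write in the slave's ISR */
    uint32_t master_f_cpu_hz;   /* master CPU clock in Hz */
} spi_gap_params;

/* Minimum delay, in master CPU cycles, between the end of one byte and
   the start of the next. The division rounds up so the gap is never
   shorter than the slave's latency. */
static uint32_t min_gap_master_cycles(const spi_gap_params *p)
{
    uint64_t num = (uint64_t)p->slave_isr_cycles * p->master_f_cpu_hz;
    return (uint32_t)((num + p->slave_f_cpu_hz - 1u) / p->slave_f_cpu_hz);
}
```

For example, with a slave at 8 MHz needing 40 cycles and a master at 16 MHz, the master has to idle for at least 80 of its own cycles between bytes. Any extra margin on top of that is a judgment call.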