Data stream with interrupts through USART on Atmega328p

Go To Last Post
19 posts / 0 new
Author
Message
#1
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

I'm trying to implement a serial stream to send bytes from Arduino Uno to PC (Python code runs on PC using PySerial library). I carefully implemented a C code in a "GCC C Executable Project" using Microship Studio 7.0.2542 and used avrdude to upload the code. Everything seems to be right, but the results are different than expected. I hope you can help me.

 

The C code implements a circular buffer and uses the USART_RX_vect and USART_TX_vect to handle the USART interrupts.

 

The expected behavior is (on Arduino side):

  1. wait a startar signal (2 bytes, 0xff and 0xff) from Python code through USART;
  2. replies with an Ack signal (2 bytes, 0xff an 0xff) and
  3. send 10 bytes (for testing purpose) 0x00, 0x01, ..., 0x09.

 

but

 

The following happens (on Python side):

  1. I send the start signal;
  2. receive the Ack response;
  3. receive de bytes: 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09, 0x00, 0x00;

 

 

If I send the start signal again, this time I receive:  0x02, 0x00, 0x04, 0x00, 0x06, 0x00, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00;

and if I send start again: 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x02, 0x03, 0x04, 0x05

and again: 0x06, 0x07, 0x08, 0x09, 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07

 

I plan to use the arduino to controll a servo system end send state variables (like voltages, currents, control signal, reference error, etc.) to PC in real time, so this code is only the beginning.

 

The Avrdude arguments are -C "C:\Arduino\avrdude\avrdude.conf" -p atmega328p -c arduino -P COM4 -b 115200 -U flash:w:"$(TargetDir)$(TargetName).hex":i in the Microship Studio's "External Tools" configured.

 

The Build log:

------ Build started: Project: SerialStream, Configuration: Debug AVR ------
Build started.
Project "SerialStream.cproj" (default targets):
Target "PreBuildEvent" skipped, due to false condition; ('$(PreBuildEvent)'!='') was evaluated as (''!='').
Target "CoreBuild" in file "C:\Program Files (x86)\Atmel\Studio\7.0\Vs\Compiler.targets" from project "D:\Arduino\C-GCC\SerialStream\SerialStream.cproj" (target "Build" depends on it):
    Task "RunCompilerTask"
        Shell Utils Path C:\Program Files (x86)\Atmel\Studio\7.0\shellUtils
        C:\Program Files (x86)\Atmel\Studio\7.0\shellUtils\make.exe all --jobs 8 --output-sync 
        make: Nothing to be done for 'all'.
    Done executing task "RunCompilerTask".
    Task "RunOutputFileVerifyTask"
                Program Memory Usage     :    1102 bytes   3,4 % Full
                Data Memory Usage         :    50 bytes   2,4 % Full
                Warning: Memory Usage estimation may not be accurate if there are sections other than .text sections in ELF file
    Done executing task "RunOutputFileVerifyTask".
Done building target "CoreBuild" in project "SerialStream.cproj".
Target "PostBuildEvent" skipped, due to false condition; ('$(PostBuildEvent)' != '') was evaluated as ('' != '').
Target "Build" in file "C:\Program Files (x86)\Atmel\Studio\7.0\Vs\Avr.common.targets" from project "D:\Arduino\C-GCC\SerialStream\SerialStream.cproj" (entry point):
Done building target "Build" in project "SerialStream.cproj".
Done building project "SerialStream.cproj".

Build succeeded.
========== Build: 1 succeeded or up-to-date, 0 failed, 0 skipped ==========

 

The Avrdude log:

avrdude.exe: AVR device initialized and ready to accept instructions

Reading | ################################################## | 100% 0.00s

avrdude.exe: Device signature = 0x1e950f (probably m328p)
avrdude.exe: NOTE: "flash" memory has been specified, an erase cycle will be performed
             To disable this feature, specify the -D option.
avrdude.exe: erasing chip
avrdude.exe: reading input file "D:\Arduino\C-GCC\SerialStream\Debug\SerialStream.hex"
avrdude.exe: writing flash (1102 bytes):

Writing | ################################################## | 100% 0.21s

avrdude.exe: 1102 bytes of flash written
avrdude.exe: verifying flash memory against D:\Arduino\C-GCC\SerialStream\Debug\SerialStream.hex:

Reading | ################################################## | 100% 0.15s

avrdude.exe: 1102 bytes of flash verified

avrdude.exe: safemode: Fuses OK (E:00, H:00, L:00)

avrdude.exe done.  Thank you.

 

Some thing very weird is that the "Datam Memory Usage" in the Build log do not match the expected. For exaple if I set the buffer size to 300 (OUT_BUFF_SZ in the C code) and the "data" array size to 255 the Build log logs that the memory usage is only 320, I expected more then 550 (300 + 255). Or am I wrong?

 

Here is the code:

/*
 * main.c
 *
 * Created: 4/24/2022 2:59:01 PM
 *  Author: Maciel B Nunes
 */ 

#define F_CPU 16000000UL
#define __DELAY_BACKWARD_COMPATIBLE__

#define OUT_BUFF_SZ 30
#define IN_BUFF_SZ 8

#include <avr/io.h>
#include <util/delay.h>
#include <avr/interrupt.h>

#define set_bit(reg,bit) (reg |= (1<<bit))
#define clr_bit(reg,bit) (reg &= ~(1<<bit))
#define tst_bit(reg,bit) (reg &(1<<bit))
#define cpl_bit(reg,bit) (reg ^= (1<<bit))

#define true	0xff
#define false	0x00
typedef uint8_t bool;

// Types ===========================================================================================

typedef volatile struct {
	uint8_t		buff[OUT_BUFF_SZ];
	uint16_t	head;
	uint16_t	tail;
	uint16_t	count;
}OutBuff;

// Variables =======================================================================================

OutBuff out;

bool		volatile done			= true;
uint8_t		volatile terminator[]	= {255,255};
uint16_t	volatile in_head		= 0;
uint8_t		volatile in_buff[]		= {0,0,0,0,0,0,0,0};

// Functions Prototyes =============================================================================

void out_initialize(void);
void out_putByte(uint8_t);
uint8_t out_flushByte(void);
bool out_isEmpty(void);
bool out_isFull(void);
uint16_t out_getFilledSz(void);
void out_clearBuff(void);
void initialize(void);
void waitForStartSignal(void);
void sendStartAck(void);
void flush(void);
void sendBytes(uint8_t[], uint16_t, uint16_t);
void rxHandler(void);
void txHandler(void);

// Functions =======================================================================================

void out_initialize(){
	out_clearBuff();
}

void out_putByte(uint8_t b){
	out.buff[out.head++] = b; out.head %= OUT_BUFF_SZ; out.count++;
}

uint8_t out_flushByte(){
	uint8_t b = out.buff[out.tail++];
	out.tail %= OUT_BUFF_SZ; out.count--;
	return b;
}

bool out_isEmpty(){
	return out.count == 0;
}

bool out_isFull(){
	return out.count == OUT_BUFF_SZ;
}

uint16_t out_getFreeSz(){
	return OUT_BUFF_SZ - out.count;
}

uint16_t out_getFilledSz(){
	return out.count;
}

void out_clearBuff(){
	out.head = out.tail = out.count = 0;
}

void setup(){
	out_initialize();
	done = true;
	/* Baud rate 1Mb/s, 2 stop bits, no parity and TX and RX interruptions enabled. */
	UCSR0A |= (1<<U2X0);			// Enables double speed (prescaler 8).
	UBRR0 = 1;						// 1Mb/s.
	UCSR0B |= ((1<<TXEN0) | (1<<TXCIE0) | (1<<RXEN0) | (1<<RXCIE0));
	UCSR0C |= (1<<USBS0);			// 2 Stop bits.
	sei();
}

void waitForStartSignal(){
	while(true){
		in_head = 0;
		while(in_head < 1);
		if(in_buff[0] != terminator[0]) continue;
		while(in_head < 2);
		if(in_buff[1] != terminator[1]) continue;
		return;
	}
}

void sendStartAck(){
	while(!done);
	UDR0 = terminator[0]; UDR0 = terminator[1];
	//sendBytes(terminator, 0, 2);
}

void flush(){
	if(done) txHandler();
}

void sendBytes(uint8_t byte_array[], uint16_t start, uint16_t end){
	uint16_t free_sz = out_getFreeSz();
	uint16_t array_sz = end-start;
	if(free_sz >= array_sz){
		for(uint16_t i=start; i<end ; i++) out_putByte(byte_array[i]);
		flush(); return;
	}
	set_bit(PORTB, PB5);
	uint16_t i = start, turn_end;
	while(true){
		while(!out_isEmpty());
		if(array_sz - i > OUT_BUFF_SZ) turn_end = i + OUT_BUFF_SZ;
		else turn_end = end;
		for(; i<turn_end; i++) out_putByte(byte_array[i]);
		flush();
		if(i == end) return;
	}
}

void rxHandler(){
	in_buff[in_head++] = UDR0;
}

void txHandler(){
	done = false;
	uint16_t filled_sz = out_getFilledSz();
	if(filled_sz > 1){
		UDR0 = out_flushByte(); UDR0 = out_flushByte();
	}
	else if(filled_sz > 0){
		UDR0 = out_flushByte();
	}
	else{
		done = true;
	}
}

ISR(USART_RX_vect){
	rxHandler();
}

ISR(USART_TX_vect){
	cli();
	txHandler();
	sei();
}

int main(void){
	setup();
	set_bit(DDRB, PB5);
	clr_bit(PORTB, PB5);

	uint8_t data[255];
	for(uint16_t i=0; i<255; i++) data[i] = i;
	while(true){
		waitForStartSignal();
		sendStartAck();
		sendBytes(data, 0, 10);
	}
}

 

What is wrong?

 

 

Attachment(s): 

This topic has a solution.
Last Edited: Fri. May 6, 2022 - 02:58 PM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

What exactly is code like:

void sendStartAck(){
	while(!done);
	UDR0 = terminator[0]; UDR0 = terminator[1];
	//sendBytes(terminator, 0, 2);
}

trying to achieve? You cannot just do back to back writes to UDR0 like that. The second write will over-write the first before it's had a chance to move into the transmit holding register. When you write UDR0 you have to be sure (either by interrupt or polling) that the UDRE flag ("UDR is Empty") bit in the UCSRA register is set.

 

Also:

ISR(USART_TX_vect){
	cli();
	txHandler();
	sei();
}

What exactly are the cli() and sei() in this trying to achieve? The AVR already clears I in SREG on entry to the ISR anyway so the cli() is superfluous but what you should never do (unless you have a very specific reason to do so) is use sei() within an ISR. The fact is that on the normal exit from ISR (via RETI rather than RET) the I flag will be restored anyway - but you don't want that done INSIDE the ISR!

 

It is also VERY unwise to call a function from an ISR() on AVR (well when using avr-gcc anyway) as it will stack/unstack almost the entire context.

 

If I were you I'd work my way back to first principles and get that working to start with, Just write simple UART_sendchar() and UART_getchar() routines that can handle (with a polling wait on status flags) a single send/receive. Then build multi-byte send/receive on top of those. When you are familiar with all that is required then slowly start to add an interrupt driven solution to it. Start with RXC and just have that get one byte at first. When that works add a ring-buffer and allow it to receive and buffer multiple bytes.

 

Don't try to do everything in one go up front until you fully understand he basic steps.

Last Edited: Tue. Apr 26, 2022 - 10:08 AM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

No error/bounds checking for buffer overflow on the receive side?

#1 Hardware Problem? https://www.avrfreaks.net/forum/...

#2 Hardware Problem? Read AVR042.

#3 All grounds are not created equal

#4 Have you proved your chip is running at xxMHz?

#5 "If you think you need floating point to solve the problem then you don't understand the problem. If you really do need floating point then you have a problem you do not understand."

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Maciel B Nunes wrote:

  1. wait a startar signal (2 bytes, 0xff and 0xff) from Python code through USART;
  2. replies with an Ack signal (2 bytes, 0xff an 0xff) and

FF is a bad trigger byte, as a serial line idles high, any noise spike that looks like a start bit will look like your trigger! , better to use  ascii control character(s), oh I don't know maybe an ENQ SOH maybe, and for your ack, how about ACK STX maybe.  These characters may be better suited then idle line bytes.

 

I do commend you for breaking your larger project down to byte size functions and proving them out one at a time, better to solve one problem at a time!

Good luck

 

Jim

 

 

 

FF = PI > S.E.T

 

Last Edited: Tue. Apr 26, 2022 - 12:45 PM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Thank you all for your attention!

 

To clarify, this is my TCC (Control and Automation Engineering) project. Before I created this post I did other more basic projects:

  1. first, a simple character echo without using interrupt;
  2. then a simple send an array of bytes using interrupt;
  3. now streaming data using ring-buffer, something closer to what I really intend to do.

 

I intend to implement a communication protocol for sending and receiving boolean, int8, uint8, int16, uint16, int32, uint32 and float32, and I will use the CRC algorithm to check the data integrity. There's still a lot to do, the current code is just for concept checking and approach testing.

 

About hardware troubles: I didn't testes with other arudino board (I have only this one) but a have tested with another Atmega328p chip (a new one). I also tested it with a 9V supply, since this increased the ADC accuracy on another project, I thought it could be a problem with poor or noisy power from the PC's USB port.

 

So, Clawson:

The purpose of the sendStartAck function is to notify the Python code that the start signal has been received and the following data is the array data.

The cli() and sei() in ISR(USART_TX_vect)... was just for verification and I forgot to remove it.

And the 2 subsequentes UDR0 = 'some_bybe': I do this only when I'm sure there's nothing being sent (when UDR0 and the shift register are empty, this is one of the purposes of the done variable), so the first byte is shifted to the shift register before the second byte is loaded in UDR0 (I believe). I tested this in a interrupt driven code sending 255 bytes and it worked fine. Even so, I decided to follow your (and manual's) guidance of always checking the UDRE0 bit before loading a new byte.

About the calling functions inside the ISR I didn't know that this can lead to trouble. Can you point me to a reference that contains this kind of information? a good book would be perfect! I'm being guided, basically, by the Atmega328p manual, Articles and tutorials on the internet and Forums like this

 

Anyway
I followed the guidance of all of you, but the problem keeps happening. So I've simplified the code to make it easier to find the problem.

 

This is the new code:

/*
 * main.c
 *
 * Created: 4/26/2022 11:54:23 AM
 *  Author: Maciel B Nunes
 */ 

#define F_CPU 16000000UL

#define OUT_BUFF_SZ 30
#include <xc.h>
#include <util/delay.h>
#include <avr/interrupt.h>

#define set_bit(reg,bit) (reg |= (1<<bit))
#define clr_bit(reg,bit) (reg &= ~(1<<bit))
#define tst_bit(reg,bit) (reg &(1<<bit))
#define cpl_bit(reg,bit) (reg ^= (1<<bit))

#define true	0xff
#define false	0x00
typedef uint8_t bool;

// Types ===========================================================================================

typedef volatile struct {
	uint8_t		buff[OUT_BUFF_SZ];
	uint16_t	head;
	uint16_t	tail;
	uint16_t	count;
}OutBuff;

// Variables =======================================================================================

OutBuff out;

bool		volatile done			= true;
uint8_t		volatile terminator[]	= {0x01, 0x01};
uint16_t	volatile in_head		= 0;
uint8_t		volatile in_buff[]		= {0,0,0,0,0,0,0,0};

// Functions Prototyes =============================================================================

void out_putByte(uint8_t);
void waitForStartSignal(void);
void sendStartAck(void);
void flush(void);
void sendBytes(volatile uint8_t[], uint16_t, uint16_t);

// Functions =======================================================================================

void out_putByte(uint8_t b){
	out.buff[out.head++] = b; out.head %= OUT_BUFF_SZ; out.count++;
}

uint16_t out_getFreeSz(){
	return OUT_BUFF_SZ - out.count;
}

void setup(){
	out.head = out.tail = out.count = 0;
	done = true;
	/* Baud rate 1Mb/s, 2 stop bits, no parity and TX and RX interruptions enabled. */
	UCSR0A |= (1<<U2X0);			// Enables double speed (prescaler 8).
	UBRR0 = 1;						// 1Mb/s.
	UCSR0B |= ((1<<TXEN0) | (1<<TXCIE0) | (1<<RXEN0) | (1<<RXCIE0));
	UCSR0C |= (1<<USBS0);			// 2 Stop bits.
	sei();
}

void waitForStartSignal(){
	while(true){
		in_head = 0;
		while(in_head < 1);
		if(in_buff[0] != terminator[0]) continue;
		while(in_head < 2);
		if(in_buff[1] != terminator[1]) continue;
		return;
	}
}

void sendStartAck(){
	while(!done);
	while(!tst_bit(UCSR0A, UDRE0));
	UDR0 = terminator[0];
	while(!tst_bit(UCSR0A, UDRE0));
	UDR0 = terminator[1];
}

void flush(){
	if(!done) return;
	if(out.count > 1){
		while(!tst_bit(UCSR0A, UDRE0));
		UDR0 = out.buff[out.tail++];
		out.tail %= OUT_BUFF_SZ; out.count--;
		while(!tst_bit(UCSR0A, UDRE0));
		UDR0 = out.buff[out.tail++];
		out.tail %= OUT_BUFF_SZ; out.count--;
	}
	else if(out.count > 0){ 
                while(!tst_bit(UCSR0A, UDRE0));
		UDR0 = out.buff[out.tail++];
		out.tail %= OUT_BUFF_SZ; out.count--;
	}
	else done = true;
}

void sendBytes(uint8_t volatile byte_array[], uint16_t start, uint16_t end){
	if(out_getFreeSz() >= end-start){
		for(uint16_t i=start; i<end ; i++) out_putByte(byte_array[i]);
		flush();
	}
}

ISR(USART_RX_vect){
	in_buff[in_head++] = UDR0;
}

ISR(USART_TX_vect){
	done = false;
	if(out.count > 1){
		while(!tst_bit(UCSR0A, UDRE0));
		UDR0 = out.buff[out.tail++];
		out.tail %= OUT_BUFF_SZ; out.count--;
		while(!tst_bit(UCSR0A, UDRE0));
		UDR0 = out.buff[out.tail++];
		out.tail %= OUT_BUFF_SZ; out.count--;
	}
	else if(out.count > 0){
		while(!tst_bit(UCSR0A, UDRE0));
		UDR0 = out.buff[out.tail++];
		out.tail %= OUT_BUFF_SZ; out.count--;
	}
	else done = true;
}

int main(void){
	setup();

	uint8_t volatile data[255];
	for(uint16_t i=0; i<255; i++) data[i] = i+2;
	while(true){
		waitForStartSignal();
		sendStartAck();
		sendBytes(data, 0, 10);
	}
}

What is wrong?

 

As I am using a 30 byte buffer and data array with size 255 should the compiler report a memory usage greater than 285 or not? It reports only 49 bytes, why?

 

Here is the Python code:

# https://pyserial.readthedocs.io/en/latest/pyserial_api.html

from time import sleep
from serial.tools.list_ports import comports as getComPorts
import serial

class ComPortManager:

    def __init__(self, port_name, baud_rate=1000000, stop_bits=serial.STOPBITS_TWO):
        self.port = serial.Serial(port=port_name, baudrate=baud_rate, stopbits=stop_bits)
        sleep(3)
        if self.port.is_open:
            self.port.reset_input_buffer()
            print(f'{self.port.name} open!')
        else:
            print(f'{self.port.name} closed!')
        print(self.port.getSettingsDict(), '\n')

    def listen(self):
        terminator = b'\x01\x01'
        self.port.timeout = 0.5
        for i in range(5):
            self.port.write(terminator)
            start_response = self.port.read(size=2)
            if start_response != terminator:
                print("Start ack error!. Start response:", start_response)
                return
            data = self.port.read(size=255)     # waits until timout (0.5 sec) occures
            print(f'{i} Received in hex ({len(data)} bytes):', data.hex(sep=',', bytes_per_sep=1))

if __name__ == '__main__':

    ports = getComPorts()
    if len(ports) > 1:
        raise ValueError("Are there more then one com port available!")
    elif len(ports) < 1:
        raise ValueError("Are there not ports available!")

    port = ports[0]
    com_manager = ComPortManager(port.device)
    com_manager.listen()

 

And the python log:

Last Edited: Tue. Apr 26, 2022 - 06:15 PM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Welcome! ... good project that you will have in your toolbox.

Maciel B Nunes wrote:
I intend to implement a communication protocol for sending and receiving boolean, int8, uint8, int16, uint16, int32, uint32 and float32, and I will use the CRC algorithm to check the data integrity.
fyi, similar for QP/SpyTM, Data Visualizer (Microchip Studio extension, MPLAB X plug-in, standalone), and sigrok.

Maciel B Nunes wrote:
I also tested it with a 9V supply, since this increased the ADC accuracy on another project, I thought it could be a problem with poor or noisy power from the PC's USB port.
USB VBUS is noisy; most wall warts are noisy with possible faults (any power supply should have a voltage clamp or crowbar though commonly left out)

A capacitance multiplier will significantly reduce the noise other than at relatively low frequencies (off-line power)

IIRC, the only AVR with a PSRR is AVR DB's op amp.

Noise folds; anti-aliasing filters are first with power supply filters second. 

Maciel B Nunes wrote:
As I am using a 30 byte buffer and data array with size 255 should the compiler report a memory usage greater than 285 or not?
Maybe yes, maybe no; consider linters.

 


AVR - Calibration & Daq toolchain? | AVR Freaks

 

QTools: QP/Spy™ Data Protocol

MSG_CONF_STREAM | Data Visualizer Software User's Guide

Protocol decoder:numbers_and_state - sigrok

 

Power Stability for electrochemical sensors ? | AVR Freaks

 

ARR30-C. Do not form or use out-of-bounds pointers or array subscripts - SEI CERT C Coding Standard - Confluence

 

Maciel B Nunes wrote:
ring-buffer

What is it? | GitHub - QuantumLeaps/lock-free-ring-buffer

 

"Dare to be naïve." - Buckminster Fuller

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Maciel B Nunes wrote:

uint8_t volatile data[255];
	for(uint16_t i=0; i<255; i++) data[i] = i+2;

Optimizing compilers seem confusing to beginners, it's possible, the compiler sees this array is only written once, so it may place this fixed data array into flash, not sure this is happening, but that may explain why you don't see it as taking ram space in your compiler usage report.  Looking at the .map file will tell you where the data is located, flash vs ram.

Jim

 

 

FF = PI > S.E.T

 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Unless you are pressed for a high data rate, you can simply send a string of your data and value, that may be easier to debug, since you can see it on the terminal, or use the terminal to emulate your avr talking to the Pc.

 

example messages:

RPM=567

VBATT=234 

DOORSENSE=OPEN

PBRAKE=ON

POTVOL=47

POTBASS=621

FISHLEN=12.47 

...so the avr is simply reporting what is going on, say every second.

on reception it is broken into two fields, type and value (ex: type is DOLLARS & value is 75) 

 

This is very clear and easy to diagnose what is going on.

 

Or, perhaps to a specific query from the PC:

POTVOL?

RPM?

In that case, it may be enough just to report the value (RPM?  reports 567, or 567 and a checksum: 567, 2194)

 

A drawback is this is "open" and easy to read (if you want things hidden/secret)..the old dual-edge sword.

 

You can have this working in very short order.  You can always add modes of transmission (such as binary), blocks of data, etc.

When in the dark remember-the future looks brighter than ever.   I look forward to being able to predict the future!

Last Edited: Tue. Apr 26, 2022 - 09:47 PM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

ki0bk wrote:

Maciel B Nunes wrote:

uint8_t volatile data[255];
	for(uint16_t i=0; i<255; i++) data[i] = i+2;

Optimizing compilers seem confusing to beginners, it's possible, the compiler sees this array is only written once, so it may place this fixed data array into flash, not sure this is happening, but that may explain why you don't see it as taking ram space in your compiler usage report.  Looking at the .map file will tell you where the data is located, flash vs ram.

Jim

 

 

In this instance, the volatile qualifier will force the compiler to put the array in RAM. Since it is declared locally in a function it is being allocated on the stack. If it was declared with a static qualifier it would have a fixed address and would be included in the RAM size in linker output.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

The dearth of comments makes it hard to follow the intent of the code. I don't get why the code in flush() and ISR(USART_TX_vect) handle (out.count > 1) differently than (out.count > 0). I think that just the latter should be enough.

 

I had a devil of time figuring out how out.count ever became non-zero because of this coding style:

void out_putByte(uint8_t b){
	out.buff[out.head++] = b; out.head %= OUT_BUFF_SZ; out.count++;
}

 

I highly suggest starting at a lower baudrate. Running 8-N-2 at 1Mbps on a 16MHz MCU gives you only 176 CPU cycles per byte. When I compile your code I get a TX_ISR of 183 instructions (some of which are multi-cycle), not include the multiple multiplication calls (from % modulus).

 

The blocking waits in ISR(USART_TX_vect) are a big red flag. Here's a hint: I almost never use USART_TX_vect; I usually use USART_UDRE_vect.

 

You have the TX_ISR enabled all the time. I suspect that sending the ACK bytes triggers the ISR before sendBytes() finished filling the buffer.

 

 

 

 

 

 

 

 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Running 8-N-2 at 1Mbps on a 16MHz

I wonder why OP is using such an extreme speed compared to a typical 9600 baud or thereabouts, or even 115200 baud?  There is no reason mentioned---is the latency a huge factor? Even at 9600 baud, that a is only 1ms per char.  Maybe trying to take thousands of readings every second?  

When in the dark remember-the future looks brighter than ever.   I look forward to being able to predict the future!

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 1

Thanks for these informations, it will increase my knowledge about related solutions. Especially the Data Stream Protocol and QP/Spy™ Data Protocol.
The Data Stream Protocol seems like a simplifed version of my intending. And can even be enought.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

The data variable did not appear in the .map file. So for testing I declared it outside of main(), so it appeared in the .data section when I initialize it with non-zero value and in the .bss section otherwise. In both cases the compiler now reports the expected amount of memory usage, but it did not resolve the issue. I think that when it is declared inside a function it is allocated at runtime, so it doesn't appear in the .map.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Yes I pressed for a high rate.

This reply has been marked as the solution. 
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

I had an inkling as to what the real problem was; so I programmed your code into a clone Arduino UNO (with CH340) and connected it to Bray Terminal. I didn't expect it to work because of your Wacko baudrate of 1000000, but f&*k me; using Bray Terminal's custom baudrate setting it actually worked.

Sending the 0x01 0x01 start marker highlighted the problem immediately. I was confronted by a colossal wall of data: (Bray Crashed)

01 01

02 03 04 05 06 07 08 09 0A 0B 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

02 03 04 05 06 07 08 09 0A 0B 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

This sequence of 30 bytes was repeated over & over.

Obviously the variable (out.count) has become corrupted.

  • The problem is our old friend of concurrent unprotected access of a variable/variables shared by main and ISR code.
  • This is a common fault in simplistic RingBuffer implementations and more complex careful coding is necessary to allow overlapped access to the RingBuffer.
  • I've fixed up (and tested) your program. It uses a linear buffer as explained below:
  • Because you have a Request-Response mechanism controlled by your done flag, I'm going to suggest reverting to a linear buffer. We can use a version of that flag to easily enforce strict non-overlapping access to that buffer.
  • You seem to have a desire for really fast comms, but you used the TX Complete Interrupt. This would result in stuttering between characters so I switched to the Data Register Empty Interrupt which allows for a continuous burst of characters without delay between them. The Data Register Empty Interrupt is Condition Based as opposed to Event Based which means it must be enabled and disabled as required. You'll see this in the code.
  • I hope you agree the resulting code looks much simpler, but if you outgrow the non-overlapped linear buffer scheme, we can go over that later.

 

Enjoy:

/**
    * main.c
    *
    * Created: 4/26/2022 11:54:23 AM
    * Author: Maciel B Nunes
    */

#define F_CPU 16000000UL
#define OUT_BUFF_SZ 30

#include <stdbool.h>
#include <avr/io.h>
#include <avr/interrupt.h>
#include <util/delay.h>

#define set_bit(reg,bit) (reg |= (1<<bit))
#define clr_bit(reg,bit) (reg &= ~(1<<bit))
#define tst_bit(reg,bit) (reg &(1<<bit))
#define cpl_bit(reg,bit) (reg ^= (1<<bit))

// Types ===========================================================================================

typedef struct {
    uint8_t     buff[OUT_BUFF_SZ];
    uint16_t    rdIdx, wrIdx, count;
} OutBuff;

// Variables =======================================================================================

OutBuff              out;
bool        volatile bUartBusy = false;
uint8_t     volatile terminator[] = { 0x01, 0x01 };
uint16_t    volatile in_head = 0;
uint8_t     volatile in_buff[] = { 0,0,0,0,0,0,0,0 };

// Functions Prototyes =============================================================================

void out_putByte (uint8_t);
void waitForStartSignal (void);
void sendStartAck (void);
void flush (void);
void sendBytes (volatile uint8_t[], uint16_t, uint16_t);

// Functions =======================================================================================

void out_putByte (uint8_t b) {
    out.buff[out.wrIdx++] = b;
    out.count++;
}

void setup () {
    /* Baud rate 1Mb/s, 2 stop bits, no parity and TX and RX interruptions enabled. */
    UCSR0B = 0;
    UBRR0  = 1;						// 1Mb/s.
    UCSR0A = (1 << U2X0);			// Enables double speed (prescaler 8).
    UCSR0C = (1 << USBS0) | (1 << UCSZ01) | (1 << UCSZ00);	// 8-Data NoParity, 2-Stop bits.
    UCSR0B = (1 << TXEN0) | (0 << TXCIE0) | (0 << UDRIE0) | (1 << RXEN0) | (1 << RXCIE0);
    sei();
}

void waitForStartSignal () {
    while (true) {
        in_head = 0;
        while (in_head < 1);
        if (in_buff[0] != terminator[0]) continue;
        while (in_head < 2);
        if (in_buff[1] != terminator[1]) continue;
        return;
    }
}

void sendStartAck () {
    out.rdIdx = out.wrIdx = out.count = 0;
    out_putByte(terminator[0]);
    out_putByte(terminator[1]);
}

void flush () {
    if (out.count) {
        /* Enable UDR Empty ISR to Kick Things Off */
        bUartBusy = true;
        set_bit(UCSR0B, UDRIE0);
    }
}

void sendBytes (uint8_t volatile byte_array[], uint16_t start, uint16_t end) {
    for (uint16_t i = start; i < end; i++)
        out_putByte(byte_array[i]);
}

ISR(USART_RX_vect) {
    in_buff[in_head] = UDR0;
    if (in_head < (sizeof(in_buff) - 1))
        in_head++;
}

ISR(USART_UDRE_vect) {
    if (out.count) {
        out.count--;
        UDR0 = out.buff[out.rdIdx++];
    }
    else {
        /* Disable this ISR. We've Transmitted the whole buffer */
        clr_bit(UCSR0B, UDRIE0);
        bUartBusy = false;
    }
}


int main(void) {
    setup();

    uint8_t volatile data[255];
    for (uint16_t i = 0; i < 255; i++) data[i] = i + 2;

    while (true) {
        while (bUartBusy);
        waitForStartSignal();
        sendStartAck();
        sendBytes(data, 0, 10);
        flush();
    }
}

 

 

 

 

 

 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Thank you, balisong42, for your contribution!

 

 

balisong42 wrote:
I don't get why the code in flush() and ISR(USART_TX_vect) handle (out.count > 1) differently than (out.count > 0). I think that just the latter should be enough.

As I need high rate, I put 2 bytes to send each interrupt if I have at least 2 bytes in the buffer, so I check if there are more than 1 or more than 0 bytes in the buffer.
The intention is to decrease the number of times the sendding ISR is executed, but for that it is necessary to use USART_TX_vect, because this vector is executed when UDR0 is empty and the shift register is also, so I can load one byte into UDR0 and another as soon as UDR0 is empty.
I thought the ISR would be terminated before even the first byte was sent, because what it does is just check the amount of data in the buffer, check that UDRE is clean, increment tail and decrement count.
But as you very well observed (THANK YOU!) there are several instructions to perform these tasks, and, apparently, the sending of the 2 bytes ends before the ISR.

In order not to lose the high rate I changed the buffer variables (tail, head and count) to uint8_t to reduce the number of instructions on manipulating buffer variables and used buffer's length equal to 256 in order to don't needding use the modulus (%) since increasing an uint8_t variable by 1 it becomes 0.

So we find troubles but they are not all. As  N.Winterbottom realized, the mainly trouble is the  concurrent unprotected access of variables shared by sendBytes function and the ISR code.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Maciel B Nunes wrote:
The intention is to decrease the number of times the sendding ISR is executed,

so your saying in main() the cpu is being starved because of the ISR() overhead from the single character transmission rate is so high, does it really need to be that high? what percent of cpu is being used by the ISR()?

Jim

 

FF = PI > S.E.T

 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 1

Thank you very much,  N.Winterbottom, for solving the trouble!

I got your solution and made some improvements for reducing the times of executing the ISR for sendding data, reducing the amount of instrutions in the ISR and use a ring-buffer.

 

Summing up I:

- Used interrupt on TX done event (not in UDR0 empty state): for reducing how much the sendding ISR is executed;

- Disabled interruptions by TX done event while the sendBytes and flush functions execution: for solving the concurrent manipulating buffer's variables;

- Used buffer's variables of type uint8_t to: for decrease the computational cost of manipulating these variables (mainly saving time on ISR);

- Used buffer's variables of type uint8_t and buffer's length equal to 256: to avoid the needding of use the %, so decreasing the computational cost of manipulating these variables.

 

If some one want to see the final code, I can share a well commented version here.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

ki0bk wrote:
so your saying in main() the cpu is being starved because of the ISR() overhead from the single character transmission rate is so high, does it really need to be that high? what percent of cpu is being used by the ISR()?

 

For while the sendding is the only task the cpu is doing, but soon will do much more, how much I didn't know yet, but im trying to otmize the code to reduce the chance of having to switch to a more powerful microcontroller.

I intend to implement the controller of a servo system with at least 2 motors (which generate PWM signals as speed signals), a resistive touch panel (acting as an object position sensor) with high sampling rate and a dynamic system's state observer.