Autodetect CR, LF, or CRLF


Hi,

I have my UART running with interrupts, and I use it for a development/service menu. In that menu I can set some pins, run a sequence, read sensors, change settings in the EEPROM (using an EEMEM struct), and so on.

The terminal programs on my computer use CR, or LF, or CRLF.

Question:
Waiting until a line is entered and returning it as a string without CR or LF is no problem, but the next line could contain the 'LF' which was still in the buffer if the terminal sends CRLF.
I also want to request the number of characters in my buffer, and this should be the actual number of characters.
And I want to be able to enter an empty line.

Is there a common solution for this?


Kun.io wrote:
Question:
Waiting until a line is entered and returning it as a string without CR or LF is no problem, but the next line could contain the 'LF' which was still in the buffer if the terminal sends CRLF.
I also want to request the number of characters in my buffer, and this should be the actual number of characters.
And I want to be able to enter an empty line.

Is there a common solution for this?


I don't know if the following is common or standard, but it is what I have used in the past.

An ISR is used to read characters out of the USART and put them into a circular buffer. A routine in the main part of the program then reads characters from the buffer and if the character is a CR, it sets a variable called CR_Flag, and treats the CR as the End-of-Line (EOL) indicator. If it reads an LF from the buffer, it checks CR_Flag and if set it throws the LF away. If not set, it treats the LF as an EOL. The CR_Flag is then cleared. For any character that is not CR or LF, the CR_Flag is cleared.

As for getting a count of printable characters in the buffer, the ISR could increment a count variable and the routine that removes characters from the buffer would decrement the count variable.
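
A minimal sketch of the CR_Flag scheme described above, written as plain C so it can be tried on a PC. The names `eol_state_t`, `classify_byte` and `count_lines` are made up for this illustration and are not from any posted code. The main-loop reader would call `classify_byte` for each byte it takes out of the circular buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state: true if the previous byte was a CR. */
typedef struct {
    bool cr_flag;
} eol_state_t;

typedef enum {
    CH_DATA,   /* ordinary character, append to the line       */
    CH_EOL,    /* end of line, hand the line to the parser     */
    CH_SKIP    /* LF that belongs to a preceding CR, throw away */
} ch_class_t;

/* Classify one byte taken from the receive buffer. */
static ch_class_t classify_byte(eol_state_t *st, uint8_t c)
{
    if (c == '\r') {
        st->cr_flag = true;            /* remember the CR */
        return CH_EOL;
    }
    if (c == '\n') {
        bool after_cr = st->cr_flag;
        st->cr_flag = false;           /* flag is consumed either way */
        return after_cr ? CH_SKIP : CH_EOL;
    }
    st->cr_flag = false;               /* any other byte clears the flag */
    return CH_DATA;
}

/* Demo helper: count the line ends in a zero-terminated string. */
static size_t count_lines(const char *s)
{
    eol_state_t st = { false };
    size_t lines = 0;
    for (; *s != '\0'; s++) {
        if (classify_byte(&st, (uint8_t)*s) == CH_EOL)
            lines++;
    }
    return lines;
}
```

With this, CR, LF and CRLF each end exactly one line, and CR CR or CRLF CRLF still give an empty line in between. Note that LF followed by CR counts as two line ends in this sketch.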


I use a simple method:
Whenever the UART interrupt receives 0x0A or 0x0D, it is replaced by 0x00.
But if it would be the first byte in the FIFO, or the previous byte was 0x00, it is rejected.
So the first 0x0A or 0x0D marks the end of a line,
and every directly following 0x0A or 0x0D is ignored.

The main loop reads the FIFO up to the 0x00 and parses the line.

Peter
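
A rough sketch of that filter, for illustration only. It assumes a simplified append-only buffer instead of a real wrapping FIFO, and the names `line_fifo_t`, `FIFO_SIZE` and `fifo_put_filtered` are hypothetical. In a real receive ISR the function would be called with the byte read from the UART data register.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FIFO_SIZE 64u   /* hypothetical size */

/* Simplified buffer: bytes are only appended, no wrap-around. */
typedef struct {
    uint8_t data[FIFO_SIZE];
    size_t  len;             /* number of bytes stored so far */
} line_fifo_t;

/* Store one received byte; returns false if the byte was rejected. */
static bool fifo_put_filtered(line_fifo_t *f, uint8_t c)
{
    if (c == 0x0D || c == 0x0A) {
        /* Reject a terminator at the very start or directly after
         * another terminator (second half of CRLF, or an empty line). */
        if (f->len == 0 || f->data[f->len - 1] == 0x00)
            return false;
        c = 0x00;            /* first terminator becomes the end mark */
    }
    if (f->len >= FIFO_SIZE)
        return false;        /* buffer full, byte is lost */
    f->data[f->len++] = c;
    return true;
}
```

Feeding it `"ab\r\n\r\ncd\n"` stores `a b 0x00 c d 0x00`: the CRLF collapses to one end mark and the empty line disappears.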


Thank you, Chuck99 and danni!
I'm going to implement the flag, as suggested by Chuck99.


danni wrote:
I use a simple method:
Whenever the UART interrupt receives 0x0A or 0x0D, it is replaced by 0x00.
But if it would be the first byte in the FIFO, or the previous byte was 0x00, it is rejected.
So the first 0x0A or 0x0D marks the end of a line,
and every directly following 0x0A or 0x0D is ignored.

The main loop reads the FIFO up to the 0x00 and parses the line.

Peter


But what if you get two CRs back to back? In other words, a blank line. You would see 0x00,0x00 and not know if it was a CRLF or CRCR.


That's why I am implementing a 'cr_flag' right now.


Chuck99 wrote:
But what if you get two CRs back to back? In other words, a blank line. You would see 0x00,0x00 and not know if it was a CRLF or CRCR.

Yes, every empty line is ignored.
I don't use the UART to display books with empty lines.
I use it to receive control commands,
and an empty line cannot contain any command.
Furthermore, I also reject all leading spaces and tabs.

Peter


danni wrote:
Yes, every empty line is ignored.
I don't use the UART to display books with empty lines.
I use it to receive control commands,
and an empty line cannot contain any command.
Furthermore, I also reject all leading spaces and tabs.

Peter


Peter, I only mentioned it because Kun.io's original post stated:
Kun.io wrote:
And I want to be able to enter an empty line.


You are a bit stuck if you want to accept CR, LF, CRLF.

Any machine generated input will probably use LF or CRLF.
Human generated input will probably use CR.

One possibility is to treat LF as the principal terminator. If you receive CR + timeout, call it a terminator. A CRLF is a terminator too.

I really can't see how you can differentiate between CR, LF, CRLF if they are all machine generated.

A timeout is not too hard to identify. You just note the time that you received the CR.

David.
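
A small sketch of the CR-plus-timeout idea, assuming some free-running 16-bit tick counter is available (for example, incremented in a timer interrupt). The names `cr_timeout_t`, `on_byte`, `on_idle` and the value of `CR_TIMEOUT_TICKS` are all hypothetical. One assumption goes beyond the description above: any byte that arrives after a pending CR also closes that line, so the caller has to finish the old line before storing the new byte.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR_TIMEOUT_TICKS 50u   /* hypothetical: ticks to wait for an LF */

typedef struct {
    bool     cr_pending;   /* a CR was seen and not yet resolved */
    uint16_t cr_time;      /* tick count when that CR arrived    */
} cr_timeout_t;

/* Call for every received byte. Returns true if a line has ended.
 * Bytes other than CR and LF are data and are stored by the caller. */
static bool on_byte(cr_timeout_t *st, uint8_t c, uint16_t now)
{
    bool eol = st->cr_pending;   /* anything after a CR closes that line */
    st->cr_pending = false;

    if (c == '\r') {
        st->cr_pending = true;   /* wait for LF or timeout */
        st->cr_time = now;
    } else if (c == '\n') {
        eol = true;              /* bare LF, or the LF of a CRLF */
    }
    return eol;
}

/* Call regularly from the main loop. Returns true if a lone CR timed out. */
static bool on_idle(cr_timeout_t *st, uint16_t now)
{
    /* Unsigned subtraction keeps working when the tick counter wraps. */
    if (st->cr_pending &&
        (uint16_t)(now - st->cr_time) >= CR_TIMEOUT_TICKS) {
        st->cr_pending = false;
        return true;
    }
    return false;
}
```

The cost of this approach is latency: a line typed with a bare CR is only delivered once the timeout has passed.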


Chuck99 wrote:

As for getting a count of printable characters in the buffer, the ISR could increment a count variable and the routine that removes characters from the buffer would decrement the count variable.

Side question: what happens when your routine is decrementing the count variable and, halfway through the decrement (assuming it is 16-bit), the ISR fires and increments it?

Do you disable the interrupt before decrementing and re-enable it afterwards, or don't you care?


Quote:

Side question: what happens when your routine is decrementing the count variable and, halfway through the decrement (assuming it is 16-bit), the ISR fires and increments it?

Do you disable the interrupt before decrementing and re-enable it afterwards, or don't you care?


You are talking about atomicity. You always need interrupt atomicity protection when the foreground read/write cannot be done by a single atomic instruction (which on AVRs means a read/write of anything wider than 8 bits).
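
A sketch of what that protection can look like for the shared count variable. To keep it compilable on a PC, `irq_save_and_disable()` and `irq_restore()` are hypothetical stand-ins that do nothing here; on an AVR with avr-libc, `ATOMIC_BLOCK(ATOMIC_RESTORESTATE)` from `<util/atomic.h>` does this job. The `rx_count_*` names are made up as well. Note that a decrement is a read-modify-write, so it needs the same protection even if the counter is only 8 bits wide.

```c
#include <stdint.h>

/* Hypothetical stand-ins for saving SREG and disabling interrupts,
 * then restoring the saved state. They do nothing on a PC. */
static uint8_t irq_save_and_disable(void) { return 0; }
static void    irq_restore(uint8_t state) { (void)state; }

/* Shared between the receive ISR and the main loop. */
static volatile uint16_t rx_count;

/* Called from the receive ISR. On AVR, interrupts are already
 * disabled inside an ISR by default, so no extra protection here. */
static void rx_count_inc_from_isr(void)
{
    rx_count++;
}

/* Called from the main loop after removing one byte from the buffer. */
static void rx_count_dec(void)
{
    uint8_t s = irq_save_and_disable();
    if (rx_count > 0)
        rx_count--;          /* read-modify-write, must not be interrupted */
    irq_restore(s);
}

/* Called from the main loop to read a consistent 16-bit value. */
static uint16_t rx_count_get(void)
{
    uint8_t s = irq_save_and_disable();
    uint16_t n = rx_count;   /* two byte reads on an 8-bit CPU */
    irq_restore(s);
    return n;
}
```

Restoring the saved state, rather than unconditionally re-enabling interrupts, keeps the functions safe to call from code that already runs with interrupts disabled.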


Thanks for all your answers, I have it already working (still testing).

I started with these: https://www.avrfreaks.net/index.p...

It's not that hard:
UART_rbuflen() returns the number of bytes in the buffer (including CR and LF).
UART_getc() returns the next byte (including CR and LF).
UART_gets() delivers a zero-terminated string (without CR or LF) and returns the length of the string.

Both CR and LF make the function UART_gets() return with a zero-terminated string.
If a CR is detected, UART_gets() sets a flag. If it was followed by an LF, the LF is still in the buffer and is read next time, but then the flag indicates that the LF should be discarded.


Glad to see you have it working for you. Here is how I've done it in the past. In the RX ISR I test for CR or LF. [My API is standardized on LF as a terminator, but I accept CR or LF (and CRLF and LFCR) as an input terminator on the serial port.] So, if a CR or LF is seen, an LF is put into the buffer and a flag is set. If the next character is an LF or CR [note the opposite order here], nothing is added to the buffer and the flag is cleared. The end result is this:

CR LF ==> LF
LF CR ==> LF
CR CR ==> LF LF
LF LF ==> LF LF
CR LF CR ==> LF LF
LF CR LF ==> LF LF

The assumption is that whatever is providing the input will be consistent in its line termination so it is unlikely that deliberate empty lines will be dropped.

In my application I had no desire or need to know which method was used, so silently converting was OK. If your code needs to know, or has to handle different scenarios, then doing it at the library or application level is a better choice. My method also adds a small amount of additional overhead to the ISR.
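
One way to get exactly the mapping in the table above is to remember which terminator set the flag, and to swallow only the opposite one. This is a sketch under that assumption, not the poster's actual code. The names `eol_norm_t`, `eol_normalize` and `eol_normalize_buf` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state: 0 if no terminator is pending, otherwise the
 * terminator ('\r' or '\n') that was just converted to LF. */
typedef struct {
    uint8_t pending;
} eol_norm_t;

/* Process one received byte. Returns 1 and writes *out if a byte
 * should go into the buffer, 0 if the input byte is swallowed. */
static int eol_normalize(eol_norm_t *st, uint8_t in, uint8_t *out)
{
    if (in == '\r' || in == '\n') {
        if (st->pending != 0 && st->pending != in) {
            st->pending = 0;     /* second half of CRLF or LFCR */
            return 0;
        }
        st->pending = in;        /* new line end, remember which kind */
        *out = '\n';
        return 1;
    }
    st->pending = 0;             /* ordinary byte clears the flag */
    *out = in;
    return 1;
}

/* Demo helper: normalize n bytes from in to out, return bytes written.
 * out must have room for at least n bytes. */
static size_t eol_normalize_buf(const char *in, size_t n, char *out)
{
    eol_norm_t st = { 0 };
    size_t w = 0;
    for (size_t i = 0; i < n; i++) {
        uint8_t b;
        if (eol_normalize(&st, (uint8_t)in[i], &b))
            out[w++] = (char)b;
    }
    return w;
}
```

Running the six sequences from the table through `eol_normalize_buf` gives one LF for CR LF and LF CR, and two LFs for the other four.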

Writing code is like having sex.... make one little mistake, and you're supporting it for life.


I detect a CR LF.
But should I also detect a LF CR?
I have never seen a program that uses it.


It's not common, but I had to do it in an application where a device I was interfacing with sent CRLF as LFCR at one point. [It was probably a bug, but technically legal according to the protocol spec I was adhering to. Either way, I had to deal with it.] In my case, my code doesn't care, as LF or CR would have no other meaning except as part of a line termination.



Code uploaded as project: https://www.avrfreaks.net/index.p...


Hi Kun,
I landed here because I'm searching for a serial communication library for an ATtiny85 programmed from Arduino. Then I noticed that the ATtiny85 has no USART, only a USI and software UART, and that is where your library came up. Thanks for sharing.
But I have no experience with AVR Studio, only with the Arduino bootloader on the ATtiny85.
Could you please show me whether it's possible to call that library from Arduino?
Or could you point me to a place where I can learn how to port your library to Arduino?
Best regards, pescadito


pescadito, the library (only updated by me) is for AVR chips with hardware USART only. So it won't work for an ATtiny85.

I think you have to port the SoftwareSerial library from Arduino. As far as I know, using an ATtiny with Arduino is just one problem after another.
However, they have done it: http://hlt.media.mit.edu/?p=1695