[EDIT] A follow-on tutorial to this one is here: http://www.avrfreaks.net/forum/codec-parsing-json-based-config-file-using-micro-memory-ragel
This is my first tutorial here so please be nice. I'm pretty new to AVRs, not so new to programming. I've fiddled with AVRs a few times over the last couple of years but only recently got UART <-> RS232 communications up and running. As part of my first project doing that I needed to interpret strings sent over the UART. Naturally, I did this the complicated way.

I guess the first question to answer is why you might need to do this. The obvious case is the one given above - receiving commands in a text-based format via the UART. A full parser is possibly overkill for simple command parsing, but I really wanted to test and prove the feasibility of this technique.

There seems to be a common belief that small microcontrollers aren't fast enough, or don't have enough memory, to handle a 'proper' parser. A quick search suggests that people either use basic string comparisons or write small parsers by hand. It's true that most parser generators assume many KB of memory are available and as a result don't suit micros. However, hand-made parsers can be extremely difficult to change and debug, and often turn out to be less efficient than machine-generated ones.

Techniques used frequently on PCs don't fit well at all with microcontroller resource limits: using the stack heavily (e.g. recursive descent parsers), pushing and popping tokens on and off possibly large stacks (e.g. Lex/YACC), and using large tables in memory to control the parser.

A while ago I came across a state machine compiler and parser generator called Ragel ( http://www.complang.org/ragel/ ), whose generated code needs far fewer resources to run. It's stable, well-documented and, most importantly for us, a couple of its output modes are great for low-memory systems. The disadvantage is that it can be very complicated at first, and it almost lost me several times.
I've since learned to love it though, which is why I decided to write this tutorial to take people through its capabilities, quirks and potential stumbling points.

To illustrate what the system can do, I will go from start to finish through writing an interpreter for a tiny scripting language designed to run on an ATmega8. All examples use avr-gcc but should be fairly easy to port - Ragel adds no dependencies to the final code.

I assume readers have good experience with C programming and understand subjects such as pointers. I also assume readers are able to install Ragel and enter commands on the command line. Windows users can obtain Ragel via the Cygwin project installer and use it from the Bash shell that is installed with it. The details of input and output to the parser will not be covered - this sort of parser could be plugged in to any communication medium.

A note on versions of Ragel: this article was written for Ragel 6.0. Apparently some older versions may not support all of the features used here. In particular, 5.3 has been identified as not supporting some of the options used.
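
To make the memory argument concrete before we get to Ragel itself, here is a minimal hand-written sketch of the style of parser that suits a small micro: a state machine that is fed one received byte at a time and keeps only a single byte of state between calls. This is not Ragel output, and the commands ("on" and "off", each terminated by a newline) and all of the names are made up purely for illustration. It only shows the general shape of code we would like a tool to generate for us.

```c
#include <stdint.h>

/* Hypothetical commands for illustration only: "on\n" and "off\n". */
typedef enum { CMD_NONE, CMD_ON, CMD_OFF, CMD_ERROR } cmd_result_t;

/* Parser states: how much of a valid command has been seen so far. */
enum { ST_START, ST_O, ST_ON, ST_OF, ST_OFF, ST_SKIP };

/* The whole parser context is a single byte of RAM. */
typedef struct { uint8_t state; } cmd_parser_t;

static void cmd_parser_init(cmd_parser_t *p)
{
    p->state = ST_START;
}

/* Feed one byte (e.g. from a UART receive routine).
 * Returns CMD_NONE until a newline completes a line. */
static cmd_result_t cmd_parser_feed(cmd_parser_t *p, char c)
{
    if (c == '\n') {
        uint8_t s = p->state;
        p->state = ST_START;              /* ready for the next line */
        if (s == ST_ON)    return CMD_ON;
        if (s == ST_OFF)   return CMD_OFF;
        if (s == ST_START) return CMD_NONE;   /* empty line, ignore */
        return CMD_ERROR;                 /* incomplete or unknown command */
    }

    switch (p->state) {
    case ST_START: p->state = (c == 'o') ? ST_O : ST_SKIP; break;
    case ST_O:     p->state = (c == 'n') ? ST_ON
                            : (c == 'f') ? ST_OF : ST_SKIP; break;
    case ST_OF:    p->state = (c == 'f') ? ST_OFF : ST_SKIP; break;
    default:       p->state = ST_SKIP; break; /* trailing junk or bad line */
    }
    return CMD_NONE;
}

/* Convenience helper for trying the parser on a whole string.
 * Returns the last non-NONE result, or CMD_NONE if no line completed. */
static cmd_result_t cmd_parse_string(const char *s)
{
    cmd_parser_t p;
    cmd_result_t last = CMD_NONE;
    cmd_parser_init(&p);
    for (; *s != '\0'; s++) {
        cmd_result_t r = cmd_parser_feed(&p, *s);
        if (r != CMD_NONE) last = r;
    }
    return last;
}
```

Note that nothing is buffered and nothing is pushed on a stack: each byte is looked at once and then forgotten. The catch is that adding or changing commands means reworking the states by hand, which is exactly the maintenance problem a generator is meant to take away.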