Need help parsing commands more efficiently

Go To Last Post
14 posts / 0 new
Author
Message
#1
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Hi guys,

 

I'm working on a project that should, once completed, make developing stuff with the NRF24L01p a bit easier.

In short, I'm developing a GUI for it, using Qt. This sends commands over serial to an XMega, which in turn translates the commands to SPI commands for the NRF. All function of it should be accessible over the GUI.

The serial interface should however also be easy to use without the GUI, so everything needs to be human readable.

Therefore, I need to define a simple text based command set and parse the commands on the XMega.

 

And this part is causing trouble. I found a working solution, but I don't like it. It's not elegant and efficient enough for me.

Here's part of the ugly code:

 

uint8_t parseInst(char *input)
{

    //***** this part seems too ugly:
        char buffer[100];
	char *param;
	char *instruction=strtok_r(input,(char*)" ",&param);
	char *instruction2=strtok_r(NULL,(char*)" ",&param);
	strcat(strcat(strcpy(buffer,instruction)," "),instruction2);
    //*****

	if (strcasecmp(buffer,"write reg")==0)
	{
		return parseWrite(param);
	}

	else if (strcasecmp(buffer,"read reg")==0)
	{
		return parseRead(param);
	}
	else if (strcasecmp(buffer,"test")==0) //test for response
	{
		return OK;
	}

	return badInstr;
}

 

I marked the part for which I'd like another solution. Basically, I want to split the string I receive from USART after the first to words and pass the remainder to the other functions so they can parse it. But the length of the words can vary and there can also be only one word ("test" for example).

 

Please let's not discuss naming of function and variables, it's a work in progress ;) I'll refactor everything when it works as expected.

 

If anyone is interested, I might publish all of this project as completely open source and license free (or probably partially under LGPL v3, as dictated by Qt Community license)

 

Thanks in advance

-Patrick

 

Edit: I forgot to say: I'm using Atmel Studio 7.0.1006 on default settings concerning compiler and linker and do everything in C

"Some people die at 25 and aren't buried until 75." -Benjamin Franklin

 

What is life's greatest illusion?"  "Innocence, my brother." -Skyrim

 

Last Edited: Sat. Feb 18, 2017 - 03:32 PM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Parsing commands can be a tough and time consuming business, especially if you want to make it robust.  The more conditions you can remove the better.  For example, if you don't need upper/lower case, just lower case the entire command in the beginning.  If you don't need spaces, preprocess the command to remove the spaces.  If you don't need numbers or symbols, strip them out ahead of time.  The point is to make the command simpler.  I ended up converting a project I had to a console type project, and did exactly this.  Compilers also tend to preprocess a commands into tokens.  Here is an example of the interface:

 

TEXTENIGMA 1.00

ENTER HELP= FOR HELP

AAAA>HELP=
SHOW OPTIONS        OPT= (OPTIONS CHANGE DEPENDING ON CONFIGURATION)
SHOW CONFIG         SHOW=
INIT CONFIG         INIT=
LOAD CONFIG         LOAD=FILENAME
SAVE CONFIG         SAVE=FILENAME
QUIT                QUIT=
AAAA>OPT=
SET MACHINE         MACH=A865,D,G111,G260,G312,I,KD,M3,M4,N,R,SWK,T
SET REFLECTOR       REFL=BCD
SET ROTORS          ROTR=X111 (X=B OR G, 1=12345678)
SET GAP ROTORS      GAP=ABC,AMZ,GQZ
SET TURN NOTATION   TURN=INDICATOR,NOTCH
SET RINGS           RING=AAAA
SET PLUGBOARD       PLUG=AB .. YZ (1-13 PAIRS)
SET INDICATOR       INDC=AAAA
FORWARD INDICATOR   FWD=NNNN (NNNN IS 0-9999)
SET SPACE MODE      SPC=NONE,ENTERED,4CHARS,5CHARS
AAAA>

I am using some code that does this:

 

          if (strncmp(s1,"PLUG=",5)==0 && Machines[MachineState.Machine].HasPlugboard)
            {
            }

          //no commands with commas appear after this point
          //let,num remaining
          if (typepresent(0,TP_COMMA))
            goto invalidcommand;

          //let,num
          if (strncmp(s1,"LOAD=",5)==0)
            {
            }
            

I use a couple of functions to test for or remove characters from the command:

#define TP_LETTERS _BV(0)
#define TP_NUMBERS _BV(1)
#define TP_EQUAL   _BV(2)
#define TP_COMMA   _BV(3)
#define TP_SPACE   _BV(4)

uint8_t typepresent(uint8_t ASkip, uint8_t AType)
{
  uint8_t c1,c2;

  for (c1=0,c2=0;s1[ASkip+c1];c1++)
    if (((AType & TP_LETTERS) && s1[ASkip+c1]>='A' && s1[ASkip+c1]<='Z') ||
       ((AType & TP_NUMBERS) && s1[ASkip+c1]>='0' && s1[ASkip+c1]<='9') ||
       ((AType & TP_EQUAL) && s1[ASkip+c1]=='=') ||
       ((AType & TP_COMMA) && s1[ASkip+c1]==',') ||
       ((AType & TP_SPACE) && s1[ASkip+c1]==' '))
      c2++;

  return c2;
}

void typeremove(uint8_t AType)
{
  uint8_t c1,c2;

  AType=~AType;

  for (c1=0,c2=0;s1[c1];c1++)
    if (((AType & TP_LETTERS) && s1[c1]>='A' && s1[c1]<='Z') ||
       ((AType & TP_NUMBERS) && s1[c1]>='0' && s1[c1]<='9') ||
       ((AType & TP_EQUAL) && s1[c1]=='=') ||
       ((AType & TP_COMMA) && s1[c1]==',') ||
       ((AType & TP_SPACE) && s1[c1]==' '))
      s2[c2++]=s1[c1];

  s2[c2]=0;
  strcpy(s1,s2);
}

Hope this helps.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

I found a solution. It's not bullet proof but it works, should be much faster and is more elegant than what I previously had.

uint8_t parseInst(char *input)
{
	if (strncasecmp("write reg",input,9)==0) 
	{
		return parseWriteReg(input+10);
	}

	else if (strncasecmp("read reg",input,8)==0) 
	{
		return parseReadReg(input+9);
	}
	else if (strncasecmp(input,"read addr",9)==0) 
	{
		return parseReadAddress(input+10);
	}
	else if (strncasecmp(input,"write addr",10)==0) 
	{
		return parseWriteAddress(input+11);
	}
	else if (strncasecmp(input,"test",4)==0) 
	{
		usart_put_string(USART,"OK\r\n");
		return OK;
	}

	return badInstr;
}

Unfortunately, it requires counting characters, which makes it a bit ugly again. But I'm OK with that.

Other solutions and ideas are still welcome!

 

-Patrick

"Some people die at 25 and aren't buried until 75." -Benjamin Franklin

 

What is life's greatest illusion?"  "Innocence, my brother." -Skyrim

 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Making a command-line-interface robust and manageable can quickly become a complex sub-project, especially once you apply sanity-checks to the inputs.


My experience is that when the number of commands exceeds about 5 or 6 it becomes more manageable and maintainable to use table-driven systems instead of long chains of nested if-then-elses.

static const struct<br />
	{<br />
	const char *name;<br />
	bool ( *action )( void );<br />
	} cmds[] =<br />
	{<br />
		{ "MENU",			cmd_menu },<br />
		{ "STATUS",			cmd_status },<br />
		{ "COMMS",			cmd_comms },<br />
		{ "CALIBRATE",		cmd_calibrate },<br />
...<br />
		{ "AC_DEBOUNCE",	cmd_ac_debounce },<br />
		{ NULL, 			NULL },<br />
	};
The given command is found and invoked by
	n = 0;<br />
	while ( cmds[n].action != NULL )<br />
		{<br />
		if ( strcmp( &cli_input[0], &cmds[n].name[0] ) == 0 )<br />
			{<br />
			( *cmds[n].action )();<br />
			break;<br />
			};<br />
		n ++;<br />
		};
The action routine process the remainder of the command.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Am I the only one who thought you original posted code in #1 was just fine. I found it perfectly readable.

Your code has the look of some experience in a higher level language than C, is that so ?

 

I don't know if you plan to scale up at all, but I note you pass a pointer to the remainder of the command string into those functions that need to do more parsing.

 

+1 For use of strtok_r()

 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Just to say that there are well known lexical processing libraries you can use for complex vocabularies. Things like yacc (yet another compiler compiler). You feed them a language specification in something like BNF then they process input text.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

There are written many book about compilers and a big part is the lexical scanner, not that you need it but perhaps you can get some ideas.

 

The first thing make language syntax that are easy to handle take a look here https://en.wikipedia.org/wiki/Backus–Naur_form

 

Last Edited: Sun. Feb 19, 2017 - 11:38 AM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

If performance is your main issue you can use a switch( buffer[n])... to jump almost directly to the right command, and use a few if()else().

For elegance & robustness my vote goes to the table driven approach of mikech.

 

Some simple tricks for more speed is to sort your commands in the frequency used and to use all return values from your compare function.

RETURN VALUE
       The  strcasecmp()  and strncasecmp() functions return an integer less than,
       equal to, or greater than zero if s1 (or the  first  n  bytes  thereof)  is
       found, respectively, to be less than, to match, or be greater than s2.

 

And if you like to do (most) things twice you are of course permitted to go on and refector immediately after it works.

 

 

I also have no doubt of your good intentions, but I don't underdstand the goal of this project.

nRF24l01+ does a pretty simple task in an overly complex way (silly 6 datapipes model) but it is a popular chip, probably because of it's almost non existent price.

Most of the hobbyists probably just want a lib that works for their particular color of uC and are not going to get themselves familiar with an Xmega just to get anohter chip working.

If you really want to write a gui it's probably going to be a more usefull project if you make it work with a bus pirate

http://dangerousprototypes.com/d...

and also add support for directly connecting the nRF to some general I/O pins of a Raspi, beaglebone, whatever.

I believe I've read in the manual of AVRDUDE that you can configure it to work with about any board which has general purpose I/O pins.

 

And if you really want to add some usability:

There are some "mostly compatible" clones from china widely available and the differences between those clones are probably the source of most problems with these nordic chips.

So if you can document the differences between these clones and make your software autodetect them that will be usefull.

 

When I was writing my own lib for the nRF... one of the biggest problems was determining if the the chip was  not sending anything, or the other was not receiving, and what I wanted was a simple device which was constantly listening in on all air traffic and log it to a serial port or something. This was not so easy to do with ( a third) nRF... and is one of the reasons I don't like these chips very much. I've also looked breefly into the direction of buying a Televison RF dongle and using/modifiing it for software defined radio to sniff the air data, but they seem to go to slightly above 2GHz and do not reach the 2.4GHz band of the nRF...

 

Also note that soldering an extra decoupling cap as close as possible to the nRF chip on those cheap chinese boards can increase the range quite a bit.

 

So my best advise to other hobbyists regarding RF module's:

Don't use the nRF24L04+ It's an overly complex design with a bunch of silly features which only make it harder to use and do not add something usefull.

(A very usefull, almost mandataory feature for a mesh network would be collision avoidance, which the nRF... does not have).

Get yourself some modules in the sub GHz bands ( JeeNode could be a good start) and if you have the budget then add a dongle with SDR to sniff / debug your air data.

 

Edit / Addition:

Why do you think that an str_N_cmp() would be faster then an str_cmp() ?

The compare cops out at the first non equal character without checking the rest of the string, regardless of it's length.

Adding the string length only makes it slower because it has to do extra house keeping after comparing each character of the strings.

 

I have some extra respect for people who want to put in the extra effort to want to write robust & elegant software instead of just throwing something together untill "it works for me".

But ask yourself: "What is my project going to add to:"

https://github.com/search?utf8=%...

 

 

Doing magic with a USD 7 Logic Analyser: https://www.avrfreaks.net/comment/2421756#comment-2421756

Bunch of old projects with AVR's: http://www.hoevendesign.com

Last Edited: Sun. Feb 19, 2017 - 01:18 PM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

You really don't need to get into compiler techniques for something simple like this, KISS applies. Unless you have hundreds of strings to compare, performance should not be an issue.

 

However, a useful tip from language design is to make your syntax "parser friendly", and reduce ambiguity. That would suggest not allowing space between keywords if space is also a delimiter for parameters. You'll find people do perverse things like add multiple space or even tabs "because it look better". I would use "read_reg" "read_mem" etc.

 

Either way, I strongly recommend the table lookup version, it's neater and much more maintainable.

Bob. Engineer and trainee Rocket Scientist.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Why not just "rr", "rm" etc? I bet the user doesn't really want to have to type "read_mem", especially having to type a '_' character every time.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

And I suggest that you use space between so "read reg" will be handled as two words, so when you decode the line will get <read><reg number>  or <read><mem><addr>, those "tokens" are now easy to decode and keep in system, (and cleaner error check).

 

 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Thank you for your replies

 

I do this project to get more into Qt, C++ and also to improve my C. I know thinks like this have been done before, but I just like to make my own.

 

Paulvdh wrote:

Why do you think that an str_N_cmp() would be faster then an str_cmp() ?

The compare cops out at the first non equal character without checking the rest of the string, regardless of it's length.

Adding the string length only makes it slower because it has to do extra house keeping after comparing each character of the strings.

 

I know it isn't faster, but it allows me to completely omit the the strtok_r and strcat lines, which are expensive.

 

Paulvdh wrote:

Also note that soldering an extra decoupling cap as close as possible to the nRF chip on those cheap chinese boards can increase the range quite a bit.

 

So my best advise to other hobbyists regarding RF module's:

Don't use the nRF24L04+ It's an overly complex design with a bunch of silly features which only make it harder to use and do not add something usefull.

 

First line: Solid advice, not only for increased range but also because these modules can actually stop working at all when there's to much noise on the supply lines.

 

But I strongly disagree with your advice to not use these modules. They're really not that hard to use and the value you get for the price is simply unbeatable.

The pipelines are unnecessary, but you can always just not use them. The other features are quite useful I think, especially auto acknowledgement and re-transmission.

 

N.Winterbottom wrote:

Am I the only one who thought you original posted code in #1 was just fine. I found it perfectly readable.

Your code has the look of some experience in a higher level language than C, is that so ?

 

I don't know if you plan to scale up at all, but I note you pass a pointer to the remainder of the command string into those functions that need to do more parsing.

 

+1 For use of strtok_r()

 

 

My problem with that is that it makes three string from the input only to recombine the first two again. This just seems highly inefficient to me.

Higher level languages? I know some C++, but that's it.

The reason why I pass the remainders of the input to the functions actually is to parse it farther. My line of thinking was that it makes things easier to maintain and test.

 

clawson wrote:

Why not just "rr", "rm" etc? I bet the user doesn't really want to have to type "read_mem", especially having to type a '_' character every time.

I want things to be as easy to understand as possible, and the normal use is the GUI anyway, so it doesn't matter that much.

Since GUI will have a text window that logs all communication, it should be easy to read nonetheless.

 

Thanks again

-Patrick

"Some people die at 25 and aren't buried until 75." -Benjamin Franklin

 

What is life's greatest illusion?"  "Innocence, my brother." -Skyrim

 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

For my money i don't think you can really beat one character hierarchical menus with printed help showing the options if you just hit enter or ? or similar. Easy to implement but perhaps more importantly, easy for the user to drive.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

I agree with Clawson. If you want to connect to an uart then use short single (2 or 3 letter) commands (or decode only first character(s)) and use "?" for help / menu.

When using a gui on top of that the names of low level communication are moot anyway.

 

About nRF difficulties:

Once you've got your lib working, it works.

I believe I ended up using 2 of the data pipes. One to broadcast data to "multiple devices" and another pipe for the return signal.

And how do you want to use the AA function in a mesh network?

And because most hobbyists are not going to use more than a handfull of these devices anyway and prices are so low a module of 3 times the price is still in the same budget range for most.

Doing magic with a USD 7 Logic Analyser: https://www.avrfreaks.net/comment/2421756#comment-2421756

Bunch of old projects with AVR's: http://www.hoevendesign.com