split string in c (atmel studio)


Hi.
I want to split a string in C.
My input string looks something like this:
"hi, Monday,word,thanks"
I want to split it into this:
hi
Monday
word
thanks

I wrote this code and it works perfectly, but SplitedArray takes up a lot of space.
Does anybody have a better idea?

This is my code:

char SplitedArray[10][140];
void SplitString(char str[], char sep)
{
	char SplitStr[140]="";
	int i=0,j=0,z=0;
	SplitStr[0]='\0';
	for(int k=0;k<10;k++)
	SplitedArray[k][0]='\0';
	
	while(str[i]!='\0')
	{
		if(str[i]==sep)
		{
			strcpy(SplitedArray[z++],SplitStr);
			
			for(int k=139;k>=0;k--)
			SplitStr[k]='\0';
			j=0;
			i++;
		}
		else
		{
			if(str[i]!='\r' && str[i]!='\n')
			{
				SplitStr[j++]=str[i];
			}
			
			i++;
		}
	}
	strcpy(SplitedArray[z],SplitStr);
}

I guess you've never heard of strtok()? It was invented to do EXACTLY what you require!


If it's OK to alter the original string by replacing the delimiters it contains by string-terminating NUL characters, then use something like one of the strtok() functions on it. If you DON'T want the original string touched, then make a copy of it, and replace the delimiters in the copy with NULs with something like strtok(). Here's a function similar to strtok(), but with a better (IMHO) interface and without strtok()'s thread unsafeness:

char * nextSubstring(char *string, char delimiter)
{
    char scannedChar;
    while ((scannedChar = *string++) != '\0') {
        if (scannedChar == delimiter) {
            string[-1] = '\0';
            return string;
        }
    }
    return 0;
}

If you need to refer to the split-apart substrings more than once, or in a different order than they appeared in the original string, then build a loop that stores pointers to the substrings, something like so:

char *substrings[MAX_SUBSTRINGS];
char *substring = stringToScan;
for (uint8_t substr = 0; substr < MAX_SUBSTRINGS;) {
    substrings[substr++] = substring;
    if (substring == 0) {
        break;
    }
    substring = nextSubstring(substring, delimiter);
}

, rather than making more copies of them.


Like I say:

#include <stdio.h>
#include <string.h>
#include <tchar.h>

char text[] = "hi, Monday,word,thanks" ;

int _tmain(int argc, _TCHAR* argv[])
{
	char * p;
	p = strtok (text, ",");
	while (p)
	{
		printf ("%s\n",p);
		p = strtok (NULL, ",");
	}
	return 0;
}

It prints:

hi
 Monday
word
thanks

The only downside as Levenkay says is that text[] is modified in the process.


strtok() also relies on static data.

Sid

Life... is a state of mind


Quote:

strtok() also relies on static data.

You sure? This works just as well:

int _tmain(int argc, _TCHAR* argv[])
{
	char text[] = "hi, Monday,word,thanks" ;
	char * p;
	p = strtok (text, ",");
	while (p)
	{
		printf ("%s\n",p);
		p = strtok (NULL, ",");
	}
	return 0;
}

That's in MS VC 2008.

Similarly see the example on the (usually pretty reliable) cplusplus.com:

http://www.cplusplus.com/referen...


clawson wrote:
Quote:

strtok() also relies on static data.

You sure?

It retains a pointer in static data between calls. That means not only that it's not reentrant, but also that you have to complete the tokenizing of one string before you start tokenizing another.

(Some libraries have reentrant versions, but it's a safe bet that the one we use doesn't. The other problem remains either way.)

Sid



ChaunceyGardiner wrote:
Some libraries have reentrant versions, but it's a safe bet that the one we use doesn't.
Do you mean strtok_r? Why would you bet that gcc does not have it?


No, I was talking about the function Cliff is using - strtok().

Some compilers (e.g. VC) come with a double set of libraries: one for single-threaded programs and one for multithreaded ones. That allows you to use the standard function names and still choose which version to use.

Sid



In which case you could use it simultaneously from multiple threads, but it would still remain non-reentrant (i.e., you would still "have to complete the tokenizing of one string before you start tokenizing another" in the same thread)


Yes, that is what I said - except that isn't technically a reentrancy problem.

(I looked at VS2010, and I see that they only supply the multithreaded versions now. Earlier versions gave you a choice.)

Sid



Do you think re-entrancy is really going to be an issue on an AVR? I suppose if you were mad enough to call strtok() from an ISR it could be but I would kind of hope that a program would be better designed than that. I suppose there are (RT)OS's where two threads might try to use it but that sounds like a pretty rare scenario and you'd kind of hope that in a sensible design it'd only be the job of a single thread to do the parsing.


Cliff,

You claimed that the only downside to strtok() is that it modifies the source string. I have pointed out that there are two additional "features" a programmer should be aware of - and they only apply to strtok(). The function Levenkay posted doesn't have those problems.

There are plenty of examples here of people doing a lot of time consuming stuff in ISRs. I don't think that's a good idea, but it is a fact that many people do that.

For those people, it is an important distinction that strtok() is not reentrant - as opposed to Levenkay's function which is reentrant.

Also - and this is the most important part - a developer that is not aware of the inability of strtok() to parse more than one string at a time is going to find himself in trouble when trying to do just that. Again, Levenkay's function does not suffer from that problem.

That's all there was to it - I was not suggesting that using strtok() is going to end the world as you know it, I was just pointing out that there are a couple of things people need to know about it - in addition to the "downside that text[] is modified in the process".

Sid



Quote:

There are plenty of examples here of people doing a lot of time consuming stuff in ISRs. I don't think that's a good idea, but it is a fact that many people do that.

'Tis true, but let's face it: if they were doing the strtok() stuff in the ISR() it seems very unlikely they'd be doing it in main() as well. ;-)

I also think it a bit (though admittedly not totally) unlikely that someone would be making two separate, overlapping uses of strtok() in the same thread of execution, but I suppose it might happen.


Either way, I don't see a problem with making people aware of those strtok() "features".

It is what it is, i.e. a useful function within certain constraints.

Sid



Thank you for guiding me.
I wrote this, but it only works once!

After that my AVR micro stops working!

This is my code:

	char* p;
	int i=0;
	p = strtok (Resp, ",\r\n");
	while (p)
	{		
		if(i==1) strcpy(Array1,p);
		else if(i==2) strcpy(Array2,p);
		else if(i==3) strcpy(Array3,p);
		else if(i==4) strcpy(Array4,p);
		else if(i==6) strcpy(Array5,p);
		p = strtok (NULL, ",\r\n");
		i++;
	}

i is 0 the first time through the loop, so you're skipping the first token.

You're also skipping the sixth token (i == 5), and any tokens after the seventh token (i > 6).

How did you declare your Array1 etc variables ? How about Resp - how is it declared and what do you assign to it ?

What do you do after the loop ?

Sid



Quote:

How did you declare your Array1 etc variables ?

Just to say that a very common mistake is simply to do this:

char * Array1, * Array2, * Array3, etc. ;

That would create pointers but no associated storage so your strcpy's would be to mysterious destinations.


Levenkay wrote:

Here's a function similar to strtok(), but with a better (IMHO) interface and without strtok()'s thread unsafeness:
char * nextSubstring(char *string, char delimiter)
{
    char scannedChar;
    while ((scannedChar = *string++) != '\0') {
        if (scannedChar == delimiter) {
            string[-1] = '\0';
            return string;
        }
    }
    return 0;
}

If you need to refer to the split-apart substrings more than once, or in a different order than they appeared in the original string, then build a loop that stores pointers to the substrings, something like so:

char *substrings[MAX_SUBSTRINGS];
char *substring = stringToScan;
for (uint8_t substr = 0; substr < MAX_SUBSTRINGS;) {
    substrings[substr++] = substring; 
    if (substring == 0) {
        break;
    }
    substring = nextSubstring(substring, delimiter);
}

, rather than making more copies of them.

I am horrible at C to be honest, but quick question:

The last set of code:

substring = nextSubstring(substring, delimiter);

I assume it will start where it left off from? ie: substring is now the next segment of the string.

If so, won't:

substrings[substr++] = substring;

always be the leftover string, not the skipped sub string?

If I am wrong and it keeps the whole string the code would make much less sense to me, so I assume I am right so far. I am willing to code and test it out, but it seems like something is wrong. Perhaps I misunderstand the purpose of the code.

Thoughts?


Just tried this in Pelles on a PC:

#include <stdio.h>
#include <stdint.h>

#define MAX_SUBSTRINGS 10

char * nextSubstring(char *string, char delimiter)
{
    char scannedChar;
    while ((scannedChar = *string++) != '\0') {
        if (scannedChar == delimiter) {
            string[-1] = '\0';
            return string;
        }
    }
    return 0;
} 

char stringToScan[] = "hi, Monday,word,thanks" ;

int main(int argc, char *argv[])
{
	char delimiter = ',';
	char *substrings[MAX_SUBSTRINGS];
	char *substring = stringToScan;
	for (uint8_t substr = 0; substr < MAX_SUBSTRINGS;) {
	    substrings[substr++] = substring;
	    if (!substring) {
	        break;
	    }
	    substring = nextSubstring(substring, delimiter);
	} 
	for (uint8_t i=0; i < MAX_SUBSTRINGS; i++) {
		if  (!substrings[i]) {
			break;
		}
		printf("%s\n", substrings[i]);
	}
	return 0;
}

It printed:

hi
 Monday
word
thanks


Thanks for running it!

Genius code, I really appreciate it better now. I am horrible with these "C" tricks. I learned Java originally. It is indeed putting the whole string in the first slot (or rather a pointer to the beginning of the whole string), but printing from the pointer in the array stops at the end of the substring because of the null character, I assume. It all makes so much sense now....

Thank you for running the code. I definitely better understand C now.


clawson wrote:

char stringToScan[] = "hi, Monday,word,thanks";


Beware of what it does if you feed it this string, though:
char stringToScan[] = "hi,,,there";

It may be what you expect, and then again it may not be. ;o)

Sid
