help with Bubble Sort

Go To Last Post
32 posts / 0 new
Author
Message
#1
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0


Guys, i'm working on a bubble sort and its only resulting in a semi sorted list.  Long file names are not being sorted properly.

 

	for (l = 0; l < entry_nr; l++) {
		for (j = 0; j < entry_nr - 1; j++) {

			// compare strings
			if (strcmp(data[j].strings[j], data[j + 1].strings[j + 1]) > 0) {

				    strcpy(temp, data[j].strings[j]);

			            strcpy(data[j].strings[j], data[j + 1].strings[j + 1]);
		        	    strcpy(data[j + 1].strings[j + 1], temp);
	         }
	    }
	}

 

The Bubble sort results in a sorted list but long file names are not sorted properly.  See below:

 

This topic has a solution.
Last Edited: Tue. Jul 16, 2019 - 03:40 PM
This reply has been marked as the solution. 
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 1

Fianawarrior wrote:
but long file names are not sorted properly.
Define "properly". In the picture I see (first characters only)

 

N, N, S, n, o, s,, t, t

 

N, a, b, n, o, t

 

Given that ASCII character order is (roughly):

 

0..9, A..Z, a..z

 

then they ARE in correct order aren't they? I wonder if you might want to convert everything to either all upper or all lower case during the sort?

 

In a POSIX OS you should have access to strcasecmp(). In the MS world it's probably _stricmp()

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

You have an incomprehensible data structure in your sort loop.  What exactly does this mean?:

data[j].strings[j], data[j + 1].strings[j + 1]) > 0)

It should be:

    for (OuterIndex = 0; OuterIndex < entry_nr; OuterIndex++) {
       for (InsideIndex = 0; InsideIndex< entry_nr - 1;  InsideIndex++) {
          if (strcmp(strings[InsideIndex], strings[InsideIndex+ 1]) > 0) {
                strcpy(temp, strings[InsideIndex]);
                strcpy(strings[InsideIndex], strings[InsideIndex + 1]);
                strcpy(strings[InsideIndex + 1], temp);
          }
       }
    }
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0


Hi Simonetta, I had to remove a lot of code before presenting it here.  When sorting the list I needed to also carry other data around with the strings, such as directory info.

 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0
for (l = 0; l < entry_nr; l++) {

BTW, that's about the nastiest C I have seen in a long time. frown

 

Who in their right mind would use lower case L as a variable name? 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

clawson wrote:

for (l = 0; l < entry_nr; l++) {

BTW, that's about the nastiest C I have seen in a long time. frown

 

Who in their right mind would use lower case L as a variable name? 

 

mr! lmfao.  Okay, you have a point.  but it was the quickest decision in finding a suitable loop counter. 

 

 

 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 1

a,b,c, ...  i,j,k ... x,y,z

 

Anything has to be better than l surely?

 

(while i is ubiquitous the jury is still out on that one)

 

Coincidentally today, while doing a peer code review, I came across use of both ii and i1 as iterators - wasn't very keen on either of those too! They'd been used to avoid a name clash with the plain i in the outer loop.

 

Things like "byte_counter", "string_index" and so on tell the reader/maintainer far more about what for() loop iterators are being used for. I suppose it's a bit more typing and multiple use of the iterator can start to look complex.

 

(never really mind x and y though assuming we're talking about pixels!)

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

You're doing more loops than you need.

If you had 4 items, max number of compares you need would be

01 12 23

01 12

01

I think you're doing

01 12 23

01 12 23

01 12 23

01 12 23

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

hint:

 

There's a quicksort function built into the libraries.

The largest known prime number: 282589933-1

It's easy to stop breaking the 10th commandment! Break the 8th instead. 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0


Fianawarrior wrote:
... but long file names are not ...

Smiting from the patent holder...

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

 

I hate when the names are long & similar to others...they should be as distinct (and easily discerned) as possible 

 

BLAH:

led35mx_inp4a3acal_switch = 23;

led35mx_inp4b3acal_switch = 45;

led35nx_inp4a3acal_switch = 76;

led35mx_inp4a3bcal_switch = 29;

led35nx_inp4b3acal_switch = 47;

led35mx_inp3b4acal_switch = 28;

 

Someone gave me a bunch of code with many very similar names scattered all around...thank goodness for automatic highlighting!  My eyes hurt after 5 minutes...always on the edge of making a hard to find mistake!

 

When in the dark remember-the future looks brighter than ever.   I look forward to being able to predict the future!

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Torby wrote:

hint:

 

There's a quicksort function built into the libraries.

 

I'll second that. Use qsort with a custom comparator function.

 

Pedantic mode on: qsort does not necessarily use the Quicksort sorting algorithm, but generally you can expect it to use a fast algorithm (certainly faster than bubble sort!).

 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

But otherwise, as pointed out earlier, assuming that the screen grab was the result of two separate sort attempts with different inputs, it is correctly sorted per ASCII. Which is what Cliff pointed out earlier. And since you'll probably use strcmp() as the custom comparator function in qsort() you'll end up with the same result, only faster :)

 

strcmp() sorts by ASCII values. Some ASCII values - 0x30 = '0', 0x39 = '9', 0x41 = 'A', 0x5A = 'Z'; 0x61 = 'a'; 0x7A = 'z' (check out a convenient table at http://www.asciitable.com) so you can see why something starting with a capital letter will return that string as being shorter than something with the same letter in lower case. The comparison method also means that a shorter word such as 'bed' will always be considered of lower value than a longer which shares characters with it (e.g. beds). The comparison is performed letter by letter starting from the left and the order depends on the order of the first characters which are different; in this case the 'logical' comparison is the string end marker 0x00 vs 's', 0x73 (though the actual implementation probably doesn't do it quite like that).

 

Neil

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

clawson wrote:
Anything has to be better than l surely?

I have to admit to using "i" for loop index by nature from FORTRAN days where ints were i - n and variable name length was an issue, but you may not be old enough to remember that.  I did have a problem once typing a "1" instead of an "i".  It has never occurred to me to use anything but "i" for loop index after doing it for 50 years, and working on a LOT of other people's code for as long who all did the same.  That's what "i" is used for.  Now I am going to have to rethink all my code going forward.  Thanks a lot, Cliff!  But you really dont have to be so supercilious about it.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

MarkThomas wrote:
I have to admit to using "i" for loop index by nature from FORTRAN days where ints were i - n and variable name length was an issue

Capital L is obviously not a problem, but small l, well....

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Always use something searchable...if you need to search a file for some reason looking for "i" or "e" will give a baziliion wrong hits, but  battery_sample_num  is probably going to find what you intend to find (even "bsn" is better than  "i", if you despise typing).

 

 

 

When in the dark remember-the future looks brighter than ever.   I look forward to being able to predict the future!

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

kk6gm wrote:
Capital L is obviously not a problem, but small l, well....

I dont do that any more.  I learned my lesson.  Now I am more likely to use "ii".  I seem to remember in the FORTRAN days there was no distinction between upper and lower case, but that was 40 some years ago and my memory is gradually fading away.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

avrcandies wrote:
Always use something searchable...if you need to search a file for some reason looking for "i" or "e" will give a baziliion wrong hits, but  battery_sample_num  is probably going to find what you intend to find (even "bsn" is better than  "i", if you despise typing).

I can search on " i " and find all my loops.  I am very conscientious about how I use white space.  If I am looking for a particular bit of code I search on variable names or method names, being of the school that code be self documenting, and I never put comments in line with the code when one is needed for a particular tricky bit of algorithm.  Always off to the side.  Being in line just makes the code harder to read and is distracting.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Even though I use long names like battery_sample_num

for variables, I'll still use 'i' as a loop variable.  I find it's

easier to read this:

for (int i=1; i<10; ++i) {
    battery_sample[i-1] = battery_sample[i];
}

than this:

for (int battery_sample_num = 1; battery_sample_num < 10; ++battery_sample_num) {
    battery_sample[battery_sample_num - 1] = battery_sample[battery_sample_num];
}

 

As always, these are questions of style and people disagree.

I even disagree with my own style from last year at times!

 

--Mike

 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

This very thread proves the issue beyond doubt. I was not objecting to lower case India but to lower case Lima which is too similar to digit One.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

avr-mike wrote:

Even though I use long names like battery_sample_num

for variables, I'll still use 'i' as a loop variable.  I find it's

easier to read this:

for (int i=1; i<10; ++i) {
    battery_sample[i-1] = battery_sample[i];
}

I almost always use spaces around operators, except ++ or --, so I would write:

for (int i = 1; i < 10; i++)
{
    battery_sample[i - 1] = battery_sample[i]  // or maybe for clarity i would use [i-1]  for verydense code with lots of index manipulation I sometimes leave out the spaces 
}

For += and -= I always use the space:  i += 12;   j -= 11;  It is just easier to read for me.  One company I worked for had a style manual we were supposed to follow for our code.  It was mostly stuff I did anyway, so I didn't break the rules too often.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

MarkThomas wrote:
I almost always use spaces around operators,
Me too - the coding standard I've used for the last 20+ years mandates it ;-)

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

MarkThomas wrote:

clawson wrote:
Anything has to be better than l surely?

I have to admit to using "i" for loop index by nature from FORTRAN days where ints were i - n and variable name length was an issue, but you may not be old enough to remember that.  I did have a problem once typing a "1" instead of an "i".  It has never occurred to me to use anything but "i" for loop index after doing it for 50 years, and working on a LOT of other people's code for as long who all did the same.  That's what "i" is used for.  Now I am going to have to rethink all my code going forward.  Thanks a lot, Cliff!  But you really dont have to be so supercilious about it.

 

A coworker is out to correct all my carefully acquired bad programming habits.

The largest known prime number: 282589933-1

It's easy to stop breaking the 10th commandment! Break the 8th instead. 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

MarkThomas wrote:
One company I worked for had a style manual we were supposed to follow for our code.  It was mostly stuff I did anyway, so I didn't break the rules too often.

Fight a war over design issues, not style. - Jack Ganssle

via

The Embedded Muse 353 - Tools and Tips

...

 

Daniel Wisehart responded to my comments on MISRA-C:

...

A Clang pretty printer with VCS interfaces and IDE plug-ins or extensions :

ClangFormat — Clang 10 documentation

 

"Dare to be naïve." - Buckminster Fuller

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

You're all missing the point: the loop counter should of course be 'q'...cheeky

 

Neil

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

barnacle wrote:

You're all missing the point: the loop counter should of course be 'q'...cheeky

 

Neil

 

What a good idea.  I currently use 'q' as a command to quiet UART to get timings without all the time spent talking to PuTTY, but that is an easy change.  Neil, you are a genius and an original thinker.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0


barnacle wrote:

You're all missing the point: the loop counter should of course be 'q'...cheeky

 

Neil

???  Y'all are losing me. 

 

 

[side note:  I'm of the age where loop counters should obviously be i - j - k]

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Personal taste/choice. I suspect I started using them when I was modelling digital filters, where i, j, k tend to have domain specific understanding. I do prefer for loop counters where the counter name itself isn't particularly important to use q, r, s, t with a comment if required.

 

    ... // blah blah
    // don't compile this tight timing loop away mr compiler!
    for (volatile uint8_t q = 0; q < 10; q++)
    {
        // nothing to see here
    }
    
    // versus
    for (volatile uint8_t hard_counter = 0; hard_counter < 10; hard_counter++)
    {
        // nothing to see here
    }
    

Neil

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

You can cause some interference issues if you use i (and j)  as a variable in Matlab...please don't (just save the world)

 

Avoid Using i and j for Variables

MATLAB uses the characters ii and j to represent imaginary units. Avoid using ii and jj for variable names if you intend to use them in complex arithmetic.

When in the dark remember-the future looks brighter than ever.   I look forward to being able to predict the future!

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

I suppose it would be rude to point out that I like to use loop counters "r18" and "r19". cheeky  S.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

theusch wrote:

barnacle wrote:

You're all missing the point: the loop counter should of course be 'q'...cheeky

 

Neil

???  Y'all are losing me. 

 

 

[side note:  I'm of the age where loop counters should obviously be i - j - k]

 

Or maybe a James Bond reference?

The largest known prime number: 282589933-1

It's easy to stop breaking the 10th commandment! Break the 8th instead. 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Torby wrote:

theusch wrote:

barnacle wrote:

You're all missing the point: the loop counter should of course be 'q'...cheeky

 

Neil

???  Y'all are losing me. 

 

 

[side note:  I'm of the age where loop counters should obviously be i - j - k]

 

Or maybe a James Bond reference?

my first thought went to "Q & Q"  but a quick wiki search resulted in the conclusion that people knowing that had to be dutch as it was a dutch series, hahahahaha