Code never executed slowing down execution?

Go To Last Post
13 posts / 0 new
Author
Message
#1
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

I have a very strange problem that I don't even know where to start with.

 

Do you think that the strange behaviour noted below might be caused by the fact that my ATMega1284P is running at 20MHz but only at 3.3v which is technically out of spec?

 

 

 

--- even this stuff is pretty unhelpful ---

 

I have a main loop that checks some buttons, sends a bunch of stuff over SPI and then tells me how long it took to do all that.

 

OK - Ignore all the stuff below. I'm going to leave it here because it's part of the process of trying to figure out what's going on. It appears to be somehow related to the SIZE of my program. as things stand right now, compiled with -O3, the .hex file is 8166 bytes. Commenting out the function I've mentioned below brings that down to 7686 bytes and the program works properly. If instead I comment out something else that makes it smaller, the program works correctly. Changing the optimisation setting to either -O1 (7030b) or -Os (6726b) means the program works correctly (although slower overall, it's consistent slowness).

 

Changing the optimisation settings probably isn't very helpful because it ends up creating completely different code anyway. I can comment out all sorts of random bits of code and data and there's no magic number that if the code is smaller than it works, but it seems to happen around the 8K mark. I duplicated a whole bunch of flash data so it was over 11K and had the same slowdown issue, however, commenting out a bunch of the same data to get it below 7.6K had no effect. It seems to be related to were actual executable code lives, because commenting out even small functions makes it work again so now I'm completely lost.

 

 

---- probably not relevant any more ----

 

I have a piece of code attached to one of the buttons:

 

        if (action_timer <= t)
        {
            if ( buttons == BTN_A )
            {
                click();
                current_zone = scan_portals(&player, current_zone);
                current_map = current_zone->map;

                // check for items/containers
            }
            
            // check another button
            
            action_timer = t+200;
        }
        

So that checks roughly every 200ms to see if button A has been pressed. However, even if button A is never pressed, simply having the scan_portals() call in the code causes it to stutter and take 10 times longer than it should. If I comment out that line, it never takes longer than about 6 or 7ms to do everything, with that line in, roughly every 200ms, it takes about 70ms to complete, but the A button is never being pressed? If I move the call to scan_portals() outside the button check (so it IS actually being called, every 200ms) I get the same performance issue. Putting a single call to scan_portals() OUTSIDE the main loop appears to cause the same problem also?

 

The function itself is pretty innocuous:

 

Zone* scan_portals(Sprite* player, Zone* current_zone)
{
    if (player->x > current_zone->x_out &&
        player->x < current_zone->x_out+8 &&
        player->y > current_zone->y_out &&
        player->y < current_zone->y_out+8)
    {
        player->x = current_zone->x_in+4;
        player->y = current_zone->y_in+4;
        return current_zone->return_to;
    }

    for(uint8_t i=0 ; i<current_zone->num_portals ; i++)
    {
        if (player->x > current_zone->portals[i]->x_in &&
            player->x < current_zone->portals[i]->x_in+8 &&
            player->y > current_zone->portals[i]->y_in &&
            player->y < current_zone->portals[i]->y_in+8)
        {
            player->x = current_zone->portals[i]->x_out+4;
            player->y = current_zone->portals[i]->y_out+4;
            return current_zone->portals[i];
        }
    }
    return current_zone;
}

and there's only actually 1 portal in the starting zone, so the for loop only happens once BUT THE CODE NEVER EVEN RUNS ANYWAY!?!

 

So simply having that code as part of the program causes it to run slowly. What on earth could be causing this?

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

If what you are seeing is real, I suspect a JMP vs. RJMP situation.

As a test, try putting the dead function near the end of flash.

Iluvatar is the better part of Valar.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Usually these kind of weird things turn out to be dodgy code somewhere.

Enable lots of warnings if you haven't already (at least -Wall -Wextra assuming gcc)

Some possibilities that spring to mind to check for

1) unitialised local variable

2) out of bounds array

3) some other undefined behaviour

4) interrupt race conditions

5) run out of stack (but doesn't sound like it in this case)

 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

MrKendo wrote:

Usually these kind of weird things turn out to be dodgy code somewhere.

Enable lots of warnings if you haven't already (at least -Wall -Wextra assuming gcc)

Some possibilities that spring to mind to check for

1) unitialised local variable

2) out of bounds array

3) some other undefined behaviour

4) interrupt race conditions

5) run out of stack (but doesn't sound like it in this case)

 

 

The only thing I did that I haven't done before or anywhere else was:

 

I have a typedef

typedef struct Map {
    uint16_t cols;
    uint16_t rows;
    const uint8_t __flash *tileset;
    uint8_t tiles[];
} Map;

and usually I create constant maps

 

static const __flash Map VILLAGE_MAP = {
    .cols = 20,
    .rows = 12,
    .tileset = &VILLAGE_TILES[0],
    .tiles = {

	 11,  11,  11,  11,  11,  11,  15,  15,  15,  11,
	 11,   5,   6,   0,   0,   0,   2,   0,  39,  41,
	 ...
    }
};

however, in this particular case I wanted to be able to generate maps, so I created a 'map buffer':

 

static Map map_buffer = {
    .cols = 16,
    .rows = 8,
    .tileset = &VILLAGE_TILES[0],
    .tiles = {
    	 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    	 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
     }
 };

But I wasn't completely convinced that I shouldn't be using malloc()? I've not needed to do anything like that for AVR before. I'm not sure how to create a variable with a flexible length array in it.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

So it's the old 'struct hack' or flexible array member which was 'legalised' in C99.

I can see how it works with malloc, when you allocate storage for the whole thing, but I don't think you can do it the way you are trying with map_buffer?

What allocates the storage for the tiles array?

Does the compiler generate any warnings?

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

with my limited knowledge, I keep messing up pointers and such, but

 

the function call is:

               current_zone = scan_portals(&player, current_zone);

 

the function itself is:

Zone* scan_portals(Sprite* player, Zone* current_zone)

 

and the function returns are:

 

return current_zone->return_to;
 
return current_zone->portals[i];

return current_zone;

 

Now you have not shared the makeup of the Zone type, but I can imagine that things go wrong there.

 

Also the current_zone variable is not known to us.

I always mess up on pointers but to me it looks like you have to tripple check in that area

 

to my most likely totally wrong knowledge

if current_zone is a normal variable and not a pointer.

Then the value of that variable turns into a pointer address ( random number???)

then you return either a top pointer address of some sub address, that also does not look right, but then again we do not know how the Zone type is defined.

 

 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

MrKendo wrote:

So it's the old 'struct hack' or flexible array member which was 'legalised' in C99.

I can see how it works with malloc, when you allocate storage for the whole thing, but I don't think you can do it the way you are trying with map_buffer?

What allocates the storage for the tiles array?

Does the compiler generate any warnings?

 

The compiler only complained when it wasn't static. I hoped that it was enough to initialise it when I declare it for the storage to be allocated?

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

meslomp wrote:

Now you have not shared the makeup of the Zone type, but I can imagine that things go wrong there.

 

Also the current_zone variable is not known to us.

I always mess up on pointers but to me it looks like you have to tripple check in that area

 

the Zone type is defined as:

 

typedef struct Zone {
    uint8_t x_in;
    uint8_t y_in;
    uint8_t x_out;
    uint8_t y_out;

    const __flash Map* map;

    uint8_t num_portals;
    struct Zone* portals[MAX_PORTALS];
    struct Zone* return_to;
} Zone;

and current_zone is a pointer:

 

Zone* current_zone = &VILLAGE;

so I don't *think* I've got that wrong. I have removed the map_buffer related code and everything seems to work properly again now. I can see that I might have messed up with pointers/memory allocation somewhere, but I don't understand why it causes the problem it causes!

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

MalphasWats wrote:
I'm not sure how to create a variable with a flexible length array in it.
dynamic multidimensional array?

Pool?

 

https://en.wikibooks.org/wiki/C_Programming/Common_practices#Dynamic_multidimensional_arrays

 

"Dare to be naïve." - Buckminster Fuller

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Going back to the tiles[] flexible array member.

Using it with a statically allocated structure in this way, I'm pretty sure, is undefined behaviour as far as C99 standard is concerned.

You will get a warning if built with -pedantic (or an error with -pedantic-errors).

However, it seems that it is supported as a gcc extension.

So it may well be fine, just not portable.

I wonder if you couldn't just use a fixed size for tiles[], whatever size is big enough for all cases.

Even if ultimately you can't, I would do a test with a fixed size, see if it makes any difference, at least you might be able to rule it out as the cause of any problems.

Likewise, I would do a test with everything in RAM (no __flash), again see if it makes any difference.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

The way I've seen this kind of thing done

requires either a 'type' parameter or a

'length' field in the struct.  A type would

make sense if you have a few different

fixed-size arrays and these correspond to

some concept you can define in an enum.

 

In the absence of a type field, or if the

array can contain an arbitrary number of

entries, a length field can be used to specify

how many items/bytes have been allocated

to the array.

 

Then when creating these things, malloc

is used to allocate the entire structure with

the right amount of space to cover as many

bytes as needed for the array.  Fill in the

type/length as required and return the

pointer.

 

--Mike

 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 1

MalphasWats wrote:
Do you think that the strange behaviour noted below might be caused by the fact that my ATMega1284P is running at 20MHz but only at 3.3v which is technically out of spec?

 

I thought I'd post an update to this - the problem was getting worse and became completely random with every program loaded on. I'd noticed it got worse the longer I worked on the device, which suggested (to my very limited understanding) issues with clocking/temperature (although it didn't feel warm, I think it was timing issues somewhere).

 

Setting the fuses to use the internal oscillator at 8MHz made the problem go away. I've since replaced the 20MHz crystal with a 16MHz crystal and so far it seems to be working fine (even though it's still *technically* overclocked!).

 

Thank you for your efforts.

-Mike

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

You can’t equate overclocking your PC processor with overclocking an AVR in regards to temperature rise. You’re more likely to hit the design limit well before the AVR shows any signs of getting hot. Also note the design limit varies with voltage and temperature.