Any method of limiting generation of ijmp and icall

Hi,
for my project the compiler keeps generating a few icall and ijmp instructions.
They turn up all over the place: a few of them in switch/case,
but unfortunately they are also present in library functions
like fputc/fgetc.
I think calls through callback function pointers also translate to them.

Is there any way of getting rid of those instructions? I want to use offline stack size analysers,

let's say like this one:
http://www.embedded.com/columns/...

Or maybe you have a better recommendation for a stack size analyser for the ATmega128 and ATmega2560?

You can certainly quell their use in switch(): one of the optimizer switches is specifically for that (see the GCC manual).

As for the fputc/fgetc I'm afraid that it's an almost inevitable side effect of their multi-stream support. In fgetc for example:

int
fgetc(FILE *stream)
{
	int rv;

	if ((stream->flags & __SRD) == 0)
		return EOF;

	if ((stream->flags & __SUNGET) != 0) {
		stream->flags &= ~__SUNGET;
		stream->len++;
		return stream->unget;
	}

	if (stream->flags & __SSTR) {
		rv = *stream->buf;
		if (rv == '\0') {
			stream->flags |= __SEOF;
			return EOF;
		} else {
			stream->buf++;
		}
	} else {
		rv = stream->get(stream);
		if (rv < 0) {
			/* if != _FDEV_ERR, assume it's _FDEV_EOF */
			stream->flags |= (rv == _FDEV_ERR)? __SERR: __SEOF;
			return EOF;
		}
	}

	stream->len++;
	return (unsigned char)rv;
}

you will see it doing:

  rv = stream->get(stream);

where "get" is going to be a registered function pointer that knows how to get a character for this particular stream type. Short of rewriting this library code I don't see how you can avoid this if you intend to use the library.
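
To make the mechanism concrete, here is a minimal host-side sketch of the same dispatch pattern. The names `mini_stream`, `mini_fgetc`, `fake_uart_get` and `uart_in` are made up for illustration and are not avr-libc API. The point is only that the call target lives in data, so an analyser looking at `mini_fgetc` alone cannot tell which function is reached.

```c
#include <stdio.h>

/* Hypothetical, cut-down stand-in for avr-libc's FILE: only the
   per-stream "get" callback is kept. */
struct mini_stream {
    int (*get)(struct mini_stream *);
};

/* Hypothetical device routine; a real one would read a UART data register. */
static int fake_uart_get(struct mini_stream *s)
{
    (void)s;
    return 'A';
}

/* The call target is only known at run time, so on an AVR this call
   would typically compile to an indirect call (icall, or eicall on
   the large devices). */
static int mini_fgetc(struct mini_stream *s)
{
    int rv = s->get(s);
    return (rv < 0) ? EOF : (unsigned char)rv;
}

/* Registering the callback: from here on the target is just data. */
static struct mini_stream uart_in = { fake_uart_get };
```

Calling `mini_fgetc(&uart_in)` returns whatever `fake_uart_get` delivers, but nothing in the body of `mini_fgetc` names that function.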

Cliff

EDIT: the option I was thinking of is -fno-jump-tables
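
For reference, a sketch of the kind of switch this option affects. The function `apply_op` and the file name in the comment are made up; whether GCC actually builds a jump table for a given switch depends on the compiler version, the optimisation level and the number of cases. With `-fno-jump-tables` the switch should be emitted as a chain of compares and branches, which a static analyser can follow.

```c
/* Build idea (file name is hypothetical):
 *   avr-gcc -mmcu=atmega128 -Os -fno-jump-tables -c ops.c
 */
unsigned int apply_op(unsigned char op, unsigned int x)
{
    /* A dense set of case values like this is a typical candidate
       for a jump table (and hence ijmp) when the option is not given. */
    switch (op) {
    case 0:  return x + 1u;
    case 1:  return x - 1u;
    case 2:  return x << 1;
    case 3:  return x >> 1;
    case 4:  return x ^ 0xFFu;
    case 5:  return x & 0x0Fu;
    default: return x;
    }
}
```

Comparing the disassembly (for example with `avr-objdump -d`) with and without the option is the quickest way to confirm the effect on a particular toolchain.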

Well, the easiest way to avoid indirect jumps/calls in fputc()/fgetc() is to avoid fputc()/fgetc().

Similarly, indirect jumps resulting from function pointers are easiest to get rid of by not using function pointers.
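
As a rough illustration of the first point, output can be sent through a fixed routine instead of a stdio stream. This is only a sketch: `uart_putc` and `uart_puts` are hypothetical names, and `uart_putc` here just appends to a buffer so the code can be run on a PC. On the target it would wait for the transmitter and write the data register.

```c
#include <stddef.h>

/* Host-side stand-in for a UART transmit routine (hypothetical). */
static char   tx_log[64];
static size_t tx_len;

static void uart_putc(char c)
{
    if (tx_len < sizeof tx_log - 1) {
        tx_log[tx_len++] = c;
        tx_log[tx_len] = '\0';
    }
}

/* Direct call to a fixed function: the compiler can emit call/rcall,
   so the call graph stays fully static. */
static void uart_puts(const char *s)
{
    while (*s != '\0')
        uart_putc(*s++);
}
```

The price is that formatted output (printf and friends) has to be replaced by hand-written conversion routines.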

JW

kbosak wrote:
or maybe you have better recommendations for stack size analyser for ATMEGA128 and 2560?
Stack analyzers are only as good as the function tracing. Most of the stack analyzers I've seen work fine with static call structures but completely fall apart when Interrupt Service Routines or function pointers enter the scene. The stack is just too dependent on the dynamic operation of the application. In that case, some kind of real-time stack monitoring in the application is the best solution.

Also, indirect calls are not necessarily bad. I'm sure they are being generated in my rather large mega2560 application. Handled properly, they do not cause problems.

On the other hand, since you called out the mega2560, there are some gotchas with respect to the GCC compiler and function pointer calls, i.e. calls of the form:

typedef void (*funcptr) (uint8_t* param);
. . .
    void foo_bar (uint8_t* param) { . . . }
. . .
    funcptr myfunction = foo_bar;
. . .
    myfunction( gort );

Normally these pointers are embedded in structs of menu calls or parse command tables, but you get the idea.

In the thread FreeRTOS for ATmega2560/1 I talk about this problem and the way I worked around it.

The only other problem I have found is when a switch statement ends up straddling the 128 KByte boundary on the mega2560; the linker gets upset with this and throws a funky message. In that case, I change the link order so the switch statement always ends up on one side or the other of the boundary.

Certainly Cliff's and Jan's suggestions are good ones and address the specific question you asked. Perhaps I've added another dimension to your thinking.

Stu

Engineering seems to boil down to: Cheap. Fast. Good. Choose two. Sometimes choose only one.

Newbie? Be sure to read the thread Newbie? Start here!

stu_san wrote:
Most of the stack analyzers I've seen work fine with static call structures but completely fall apart when Interrupt Service Routines or function pointers enter the scene.
Quite so. That is one of the problems involved in creating a general purpose static code analyzer. The specialized stack usage analyzer that we created for the ZBasic compiler (which uses avr-gcc as a back-end compiler) has a means to utilize information about the set of destinations for icall/ijmp/eicall/eijmp instructions when such information cannot be determined from the object code itself. It also has special provisions to handle the indirect jumps/calls that exist in various avr-libc modules. Where necessary we examined the avr-libc code and manually computed the stack depth attributable to the destination(s) and then we use that pre-computed stack depth information during the call graph analysis. It would be much more difficult (but perhaps not impossible) to provide the same kinds of "assistance" to a generalized stack analysis tool.

Don Kinzer
ZBasic Microcontrollers
http://www.zbasic.net

dkinzer wrote:
It would be much more difficult (but perhaps not impossible) to provide the same kinds of "assistance" to a generalized stack analysis tool.
Do you mean "manually enter the range of addresses as ijmp/icall target"? IMHO that's more cumbersome than avoiding ijmp/icall altogether.

Of course the real solution is to "utilize information about the set of destinations", but that information is known only to the compiler, i.e. the analyser has to be part of the compiler. Now in your case this is somewhat easier than in case of "general" gcc... ;-)

JW

Dare I ask but can you not test the stack dynamically at runtime rather than statically on the build machine?

wek wrote:
i.e. the analyser has to be part of the compiler.
Wouldn't that be a great feature? Oh, well.

In my AVR applications I normally do not use much of avr-libc and never use function pointers. The exception is one recent application, in which I call all kinds of stuff, including sprintf, which ends up calling fprintf, which calls fputc, which does ICALL. I did not realize that until I saw this thread yesterday. In my own analyzer, I simply ignored ICALL instructions (I did not implement them in the first place, and then forgot about it). I changed that yesterday to at least complain about ICALLs, and I also added an option to ignore them. It looks like fputc's ICALL is not reached in the sprintf case, so this should work for me for now.

To the OP, you can try my analyzer, if you want. It seems to work with IJMPs generated for switches. (Although I think I have seen cases that may not be handled. In which case you would get an error). Again, it does not really support ICALLs, but you can ignore them. I am not sure if it will like atmega2560 - there may be several 64Kword arrays hardcoded here and there ... To run it:

ezstack file.elf -ignoreICall

It will display all ICALLs in the call tree and assume a call stack depth of 0 for each of them.

NOTE: The ignoreICall option is case sensitive! (Yes, I know, I am lazy).

Eugene

clawson wrote:
Dare I ask but can you not test the stack dynamically at runtime rather than statically on the build machine?
Cliff, the way your question is posed makes it sound like it is trivial to test the stack dynamically. I know there have been several threads here about that, but I have not really seen anything that I liked. How do you make sure that all code is covered during the test? I would love to test the stack dynamically, but I do not have an automated test coverage tool. It has long been on my list of things to do, but to me it appears to be an even more complicated tool than a static stack analyzer (well, for my class of applications anyway).

Eugene

kbosak wrote:
for my project the compiler is notoriously generating a few icall and ijmp instructions.
they land anywhere, a few of them in switch/case
but are unfortunately present in library functions
like fputc/fgetc.
I can't help you much there.
Quote:
I think using callback function pointer is also translating to them.

Is there any method getting rid of those instructions as I wanted to use offline stack size analysers?

If you authored the callback mechanism,
you might look for a way to replace a function pointer with an index.
It shouldn't be hard to automate converting a list of
function names into assembly code that picks one using skips and jumps.
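
The same idea can be sketched in C rather than assembly. The handler names and the `handler_id` enum below are invented for the example. Every target is named explicitly, so each call is a direct one; building with -fno-jump-tables as well should keep the compiler from turning the selection itself into an ijmp.

```c
/* Hypothetical handlers standing in for the former callback targets. */
static int last_handler;
static void on_start(void) { last_handler = 1; }
static void on_stop(void)  { last_handler = 2; }
static void on_reset(void) { last_handler = 3; }

/* The index that replaces the function pointer. */
enum handler_id { H_START, H_STOP, H_RESET };

/* Compare-and-call chain: an analyser sees three ordinary call sites
   and can take the deepest of the three as the worst case. */
static void dispatch(enum handler_id id)
{
    if (id == H_START)
        on_start();
    else if (id == H_STOP)
        on_stop();
    else if (id == H_RESET)
        on_reset();
}
```

Code that used to store a function pointer stores a `handler_id` instead and calls `dispatch(id)`.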

Iluvatar is the better part of Valar.

Quote:

How do you make sure that all code is covered during the test?

Exactly the same way that you do test coverage. Surely you have some way to exercise every code path (including every conditional branch) in the code. So just do that with a stack monitor also running.
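
One common form of stack monitor is "stack painting": fill the stack area with a known pattern at start-up and later check how much of the pattern survived. The sketch below simulates this on a plain array so it can be run on a PC; the function names and the fill value are assumptions, and on a real AVR the region would be the memory between the end of .bss and the initial stack pointer.

```c
#include <stddef.h>
#include <string.h>

#define STACK_PAINT 0xC5u  /* arbitrary fill pattern (assumption) */

/* Fill the whole region with the pattern; done once at start-up,
   before the region is used as a stack. */
static void stack_paint(unsigned char *base, size_t size)
{
    memset(base, STACK_PAINT, size);
}

/* AVR stacks grow downward, so untouched bytes are found by scanning
   up from the lowest address until the pattern is broken. */
static size_t stack_unused(const unsigned char *base, size_t size)
{
    size_t n = 0;
    while (n < size && base[n] == STACK_PAINT)
        n++;
    return n;
}

/* High-water mark: the deepest the stack has been so far, in bytes. */
static size_t stack_high_water(const unsigned char *base, size_t size)
{
    return size - stack_unused(base, size);
}
```

The result is only as good as the test run that preceded it, and a stacked byte that happens to equal the fill pattern at the deepest point makes the figure slightly optimistic.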

One other problem with static stack analyzers that I forgot to mention is that none that I know of deal with the "multiple stack" problem. For example, FreeRTOS generates a stack per task. So the analyzer needs to know not only what call is being made but also which stack it is being called on. Worse, it has to realize that a particular ISR (the OS tick) will likely cause a context switch to a different stack.

That case may be degenerate -- one can assume that all other ISRs will fire in all stack contexts, so the ISR analysis would apply to all stacks. On the other hand, the function pointer calls will likely be very restrictive to a particular task (for example, a "command" task) so a general approach will not work there.

Eugene's case seems to be the easiest - no function pointers, no ICALL or EICALL, perfect for a static analyzer. Until he introduces an ISR. IIRC, the one stack analyzer for AVRs that I've seen considers each ISR to fire separately, so it looks at the largest stack usage of all the ISRs and just lumps it into the stack estimation. Crude and pessimistic, but probably usable.

Don points out that, with more information, the analyzer can actually get quite a bit further. I'm not sure how much of Don's information is still in the .elf file, but since Don is adding his own call-graph info, I suspect that not much is there. (Edit: Is there a way to get call-graph info from GCC? I haven't looked.)

ezharkov wrote:
clawson wrote:
Dare I ask but can you not test the stack dynamically at runtime rather than statically on the build machine?
Cliff, the way your question is posed makes it sound like it is trivial to test the stack dynamically. I know there have been several threads here about that, but I have not really seen anything that I liked. How do you make sure that all code is covered during the test? I would love to test the stack dynamically, but I do not have an automated test coverage tool. It has long been on my list of things to do, but to me it appears to be an even more complicated tool than a static stack analyzer (well, for my class of applications anyway).
You might check the Gnu list or sourceforge for automatic test generation. Also, I remember an article in the MSDN Magazine a few months back where the author describes a system used with C# and C++ with enclosing classes that describe how the class is used, when it could be called, and so on. They used this for automatic generation of tests. It seems to me to also be a way to generate both static and dynamic stack checks. While their implementation is probably serious overload for any project that ends up on an AVR, still it provides food for thought. Another project for my copious spare time (or perhaps someone else's? :wink: ).

Sorry about the long post - hope I haven't bored everyone.

Stu

clawson wrote:
Exactly the same way that you do test coverage.
As embarrassing as it may sound, I do not really do test coverage. Not automated anyway. I just do not know how.

(One of the few exceptions is a recent project using the enemy's micro. It was a simple project, with just a few digital inputs and outputs. So, I wrote a simulator for the whole thing.)

And again, the way you talk about test coverage makes it sound like an easy thing to do. I must be missing something.

Eugene

Last Edited: Tue. Jun 15, 2010 - 04:05 PM

The way we do it is with special in-house tools that take the C source and process it into a form that puts a "test point" at every conditional branch. The code is run and logs "hits" into an array. We then have a tool that uses this data to annotate the original C and shows you which branches were/weren't taken. You can then fairly quickly build up a test plan that exercises the code to go through every path. After a few iterations you are left with just a few "hard to reach cases" for which you might actually include some harness code to exercise them.
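
A hand-written approximation of what such instrumented code might look like is shown below. It is not the in-house tool's actual output: the `TEST_POINT` macro, the `hits` array and the `clamp` function are invented for the example, and a real tool would generate the ids automatically.

```c
#include <stddef.h>

#define NUM_TEST_POINTS 4u   /* assumption: assigned by the instrumenting tool */

/* One flag per branch arm, set the first time that arm executes. */
static unsigned char hits[NUM_TEST_POINTS];

#define TEST_POINT(id) (hits[(id)] = 1u)

/* Example of instrumented code: both arms of each "if" are marked. */
static int clamp(int v, int lo, int hi)
{
    if (v < lo) { TEST_POINT(0); return lo; }
    else        { TEST_POINT(1); }
    if (v > hi) { TEST_POINT(2); return hi; }
    else        { TEST_POINT(3); }
    return v;
}

/* Number of branch arms never taken so far. */
static size_t untaken_branches(void)
{
    size_t i, n = 0;
    for (i = 0; i < NUM_TEST_POINTS; i++)
        if (!hits[i])
            n++;
    return n;
}
```

After a test run, the contents of `hits` are read back (over a serial port or a debugger, say) and mapped onto the source; the test plan is complete for this function once `untaken_branches()` reaches zero.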

stu_san wrote:
One other problem with static stack analyzers that I forgot to mention is that none that I know of deal with the "multiple stack" problem. For example, FreeRTOS generates a stack per task. So the analyzer needs to know not only what call is being made but also which stack it is being called on. Worse, it has to realize that a particular ISR (the OS tick) will likely cause a context switch to a different stack.
I might be missing something, but couldn't this fall back to running the stack analysis tool several times - once per stack - starting from a different point each time; plus adding the context overhead (which I suppose is a constant)?

I don't say it's easy, but this sounds less hopeless than to account for "classical" function calls.

JW

Heresy: make all your variables global then the only use of the stack you need worry about is CALL/RET and unless you use recursion you can probably analyse the worst case manually ;-)

clawson wrote:
Heresy: make all your variables global then the only use of the stack you need worry about is CALL/RET and unless you use recursion you can probably analyse the worst case manually ;-)
This is what '51/PIC compilers do behind the scene, due to small/no stack. The trick is to learn to overlay as much as possible. You must also make sure you don't re-enter functions.

You can also write the whole program as a single function, possibly using macros.

JW

stu_san wrote:
So the analyzer needs to know not only what call is being made but also which stack it is being called on. Worse, it has to realize that a particular ISR (the OS tick) will likely cause a context switch to a different stack.
In our stack use analyzer, we perform an analysis for each task. We know the routine that is the entry point for each task and begin the analysis there. Once the maximum depth is determined for the call graph representing a task, we then add the larger of (a) the context size and (b) the maximum depth of all known ISRs. The assumption is that all ISRs execute atomically (all those that we supply do and we recommend that for user-written ISRs, too).
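
The arithmetic described above can be written down in a few lines. This is a sketch of the rule as stated in the paragraph, not ZBasic's code; the function names and the numbers in the usage note are made up.

```c
#include <stddef.h>

/* Largest value in an array of ISR stack depths (0 for an empty list). */
static size_t max_isr_depth(const size_t *isr_depth, size_t count)
{
    size_t i, worst = 0;
    for (i = 0; i < count; i++)
        if (isr_depth[i] > worst)
            worst = isr_depth[i];
    return worst;
}

/* Per-task requirement: call-graph depth plus the larger of the context
   save size and the worst single ISR (ISRs are assumed not to nest). */
static size_t task_stack_needed(size_t call_depth, size_t context_size,
                                const size_t *isr_depth, size_t isr_count)
{
    size_t isr = max_isr_depth(isr_depth, isr_count);
    return call_depth + (context_size > isr ? context_size : isr);
}
```

For a task with a call-graph depth of 100 bytes, a 35-byte context and ISR depths of 12, 40 and 7 bytes, this gives 100 + max(35, 40) = 140 bytes.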

stu_san wrote:
I'm not sure how much of Don's information is still in the .elf file, [...]
We don't use the .elf file. Rather, we analyze the generated object code using information in the .sym file together with other information collected by the compiler when it translated the ZBasic source code to C source code.

The .sym file information tells us where in the binary file each task's main begins and also allows us to identify the problematic avr-libc code sequences. For the latter, we create call graph nodes and add the pre-computed stack depth information. As the call graphs are constructed, we stop when we reach one of the pre-computed nodes so the analyzer never sees the ijmp/icall in those problematic code sequences. The specialized ZBasic task switching code is similarly identified so the analysis can terminate when it reaches the context save and context restore points.

After the call graphs are constructed they are traversed again, this time computing the stack depth used by each node and passing back up the call chain the maximum depth attributable to the node and its callees. When finished, the maximum stack use of each node, including each task's entry point is known (not including ISR and task switching overhead).

ZBasic supports an O-O paradigm with virtual method capability. The virtual function table entries for each object and its derivation sequence are another area where data collected during compilation is used. Where it cannot be determined exactly which virtual method will be invoked, the analyzer assumes that any corresponding method in the derivation sequence might actually be invoked and incorporates the maximum stack use of that set of functions.

Lastly, for user-created dispatch tables where the compiler cannot deduce the set of call targets we provide a pragma to allow the programmer to specify the set of call targets so that stack use analysis may be completed.

If the stack use analyzer encounters any conditions under which it cannot determine the stack use (including detected recursion, indirect jump/calls that cannot be resolved, etc.) the ZBasic compiler issues a warning indicating that the stack use for the particular task is indeterminate.
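
A much simplified sketch of such a traversal is given below. It is not ZBasic's implementation: the `node` layout, the `precomputed` flag and `DEPTH_UNKNOWN` are assumptions made for the example, and it recomputes shared subtrees instead of caching them. A set of possible virtual or dispatch-table targets can be modelled by listing all of them as callees, since the maximum is taken over the whole list.

```c
#include <stddef.h>

#define MAX_CALLEES   4
#define DEPTH_UNKNOWN (-1L)

struct node {
    long   frame;        /* bytes used by this function itself,
                            including its return address */
    int    precomputed;  /* non-zero: 'frame' already covers all callees
                            (e.g. a hand-analysed library routine) */
    int    visiting;     /* set while the node is on the current path */
    size_t ncallees;
    struct node *callee[MAX_CALLEES];
};

/* Returns the worst-case depth from n downward, or DEPTH_UNKNOWN if
   recursion is detected anywhere below n. */
static long max_depth(struct node *n)
{
    long worst = 0;
    size_t i;

    if (n->precomputed)
        return n->frame;            /* stop here, never look inside */
    if (n->visiting)
        return DEPTH_UNKNOWN;       /* cycle: depth is indeterminate */

    n->visiting = 1;
    for (i = 0; i < n->ncallees; i++) {
        long d = max_depth(n->callee[i]);
        if (d == DEPTH_UNKNOWN) {
            n->visiting = 0;
            return DEPTH_UNKNOWN;
        }
        if (d > worst)
            worst = d;
    }
    n->visiting = 0;
    return n->frame + worst;
}
```

A caller that receives `DEPTH_UNKNOWN` for a task's entry node would report the stack use of that task as indeterminate rather than print a number.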

Don Kinzer

ezharkov wrote:
I changed that yesterday to at least complain about ICALLs.
If you've implemented a simple emulator (as we have) and the r31:r30 values are known at the point of the indirect call or jump then there is no problem, of course. For the extended indirect call/jmp, the EIND register value also needs to be known.
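
As a toy illustration of tracking r31:r30, the sketch below scans a straight-line run of opcodes and reports the Z value at the first ICALL or IJMP. It is far from a real emulator: it only understands LDI, treats every other instruction as clobbering Z, ignores branches, and does not handle EICALL/EIJMP or EIND. The function names are made up; the opcode patterns are taken from the AVR instruction set.

```c
#include <stddef.h>
#include <stdint.h>

#define TARGET_UNKNOWN 0xFFFFFFFFul

/* CALL, JMP, LDS and STS occupy two 16-bit words. */
static int is_two_word(uint16_t op)
{
    return (op & 0xFE0Cu) == 0x940Cu    /* JMP / CALL */
        || (op & 0xFC0Fu) == 0x9000u;   /* LDS / STS  */
}

/* Scans code[0..len) linearly and returns the word address held in Z
   (r31:r30) at the first ICALL/IJMP, or TARGET_UNKNOWN if both halves
   of Z were not set by LDI since the last unrecognised instruction. */
static unsigned long resolve_indirect(const uint16_t *code, size_t len)
{
    unsigned r30 = 0, r31 = 0;
    int known30 = 0, known31 = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        uint16_t op = code[i];

        if (op == 0x9509u || op == 0x9409u) {        /* ICALL / IJMP */
            if (known30 && known31)
                return ((unsigned long)r31 << 8) | r30;
            return TARGET_UNKNOWN;
        }
        if ((op & 0xF000u) == 0xE000u) {             /* LDI Rd, K */
            unsigned rd = 16u + ((op >> 4) & 0x0Fu);
            unsigned k  = ((op >> 4) & 0xF0u) | (op & 0x0Fu);
            if (rd == 30u) { r30 = k; known30 = 1; }
            if (rd == 31u) { r31 = k; known31 = 1; }
            continue;
        }
        known30 = known31 = 0;                       /* assume Z clobbered */
        if (is_two_word(op))
            i++;                                     /* skip operand word */
    }
    return TARGET_UNKNOWN;
}
```

For the sequence `ldi r30,0x34` / `ldi r31,0x12` / `icall` (opcodes 0xE3E4, 0xE1F2, 0x9509) this yields word address 0x1234. When the pointer is loaded from RAM instead, as with the stream callbacks in avr-libc, the sketch gives up, which is exactly the case where externally supplied target information is needed.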

Don Kinzer

dkinzer wrote:
If you've implemented a simple emulator (as we have)
At first I wanted to implement it like that. But eventually gave up. So, no, I do not really track the state of the registers.

Eugene

ezharkov wrote:
...the way you talk about test coverage makes it sound like an easy thing to do. I must be missing something.
:lol: HA! Hardly. People still do doctoral theses (thesises?) on the subject.

As you may know, I spent a few years designing ICs, where test coverage was a very big deal. In that environment, it was not unusual at all to design special test structures into the chip; in fact, it was required. Even more often, certain circuit configurations were flagged as "untestable" and were therefore "illegal".

If your chip had less than 99% test coverage, you were not allowed to ship the chip for manufacturing. Every exception was reviewed.

It's not hard to understand why. Even 15 years ago, a set of masks could run to the $millions and the standard run of 24 wafers was $3000 per wafer.

How often do we design software the same way? Are there restrictions on the kind of code you write because it is untestable? Do you have built-in test structures at every interface? Asserts are a very simple version of these test structures; how often do you list valid sequences, or implied relationships between inputs and outputs? How often do we assert an output from a function?
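
As a small illustration of asserting on both sides of a function, consider the sketch below. The function `adc_to_percent` and its contract are invented for the example.

```c
#include <assert.h>

/* Hypothetical example: scale a 10-bit ADC reading to a percentage. */
static unsigned adc_to_percent(unsigned raw)
{
    unsigned pct;

    assert(raw <= 1023u);            /* input contract: 10-bit value */

    /* unsigned long keeps raw * 100 from overflowing a 16-bit int */
    pct = (unsigned)(((unsigned long)raw * 100u + 511u) / 1023u);

    assert(pct <= 100u);             /* output contract */
    assert(raw != 0u || pct == 0u);  /* implied input/output relationship */
    return pct;
}
```

Defining NDEBUG removes the checks from a release build, so they cost nothing in the shipped firmware, but that also means the shipped code no longer rejects bad input on its own.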

If you design medical devices, this is standard fare. The FDA will require an audit by their people of every subroutine, with associated test harness, to prove the routine is doing what it is supposed to do and rejecting bad input in a way that is handled properly by the higher levels.

At any rate, automatic test generation and test coverage all come out of this. While there are folks that know how to do it, it is not industry standard in all but a few fields.

Most of all, it is not "easy".

Stu
