Number of Cycles for _divmodsi4

Go To Last Post
16 posts / 0 new
Author
Message
#1
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Hi guys,

I need to know the number of cycles for instruction _divmodsi4 in ATMega8 with -s optimization. does anybody around here know the number of cycles for _divmodsi4?

Thank you so much

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

This is clearly a school question.

Write some C code that calls _divmodsi4.
Then run the Simulator in AS6.

This will give you the number of cycles.
If you set the actual clock value, you can see how many microseconds it takes.

Hint. You need to perform % with a variable to ensure that the Compiler generates runtime code.

You can see the generated ASM code in the Disassembly window.

I bet that the function will take the same number of cycles regardless of Optimisation settings. i.e. _divmodsi4 is probably a library function.

David.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

I am sorry but i do not have any idea about assembly code. I hope there is someone have tried to calculate it or knows about it.

Thanks you so much for helping me.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Quote:

I am sorry but i do not have any idea about assembly code.

You can still run the C program in the simulator, and get the timing from that.

But I'd wager that the answer would be the same as to the question: "How long is a piece of string?" The answer might depend on the AVR model being used, the width of the operands; the signedness of the operands. (I'd guess that the "si4" tells something about that.)

But the big difference is that the number of cycles may well depend on the >>value<< of the operands, as division is often implemented as multiple subtraction. Thus there is no answer about the cycles for the function but rather the function call with a particular set of operands.

So, try it in the simulator with selected operand values, and tell us what you get.

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

You don't need to know assembly code.

We can't just give you the answer. I gave you a hint about what C statement might use _divmodsi4.
And I told you how to find the answer.

But I suppose you could just think of a number and hope for the best.

David.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

_divmodsi4 appears to be the internal name for ldiv().

There have been some threads on FP cycles but not much that I can recall on division. ( [l]div() is the same as a divide or a mod, with both quotient and remainder returned.)

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Just to note that all the library code is precompiled so your own choice of -Os will have little influence apart from the register setup before the call.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

I did your homework for you. The results were not as operand-dependent as I thought, but I didn't make a big selection of operand values.

#include <avr/io.h>
#define F_CPU 1000000
#include <stdlib.h>

int main(void)
{
long dividend;
volatile long divisor;
volatile ldiv_t result;

dividend = 12345678l;
divisor = 123l;
result = ldiv(dividend, divisor);

divisor = 654321l;
result = ldiv(dividend, divisor);

dividend = 456l;
divisor = 123l;
result = ldiv(dividend, divisor);

divisor = 1; // place for breakpoint
while(1)
{
}
}

The cycle counts were 696, 684, and 684 for the three invocations. This is for the entire C line which would include setup/teardown of the function call. And my particular test program configuration. YMMV

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

I did the same. An interesting result is when the divisor is 0. I expected a quick exit but it counted 720 cycles. Similarly a divisor of 1 gave 620 cycles.

Perhaps the authors thought it was a waste to check for the "special cases"?

BTW this code is in libgcc.a not AVR-LibC so may be sub-optimal for an AVR implementation.

I do wonder, if OP is not interested in looking at Asm, what the point of this little exercise is?

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

It seems pretty clear to me. The OP has got a school assignment. She just wants to know the answer without any effort or understanding on her part.

As always, I am more rude about the teacher than the lazy student.

What is the point of knowing the number of cycles?

Yes, it is useful to know that using an inappropriate width of variable costs you time and efficiency. e.g. don't use a long when an int will do.

But why should a teacher give an assignment that is fairly pointless to a class of disinterested students?

If the student spent her time creating a useful Arduino sketch, she might summon up some enthusiasm for the subject. On the other hand, her time might be better spent with nailcare, ecology, hair management, ...

David.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Quote:

...about the teacher...

What is the point of knowing the number of cycles?


Well, to be fair, we don't know the context.

If in a course on "microcontroller algorithms" then this question could be part of establishing a baseline against which to measure various approaches.

(My guess? OP has a chunk of oft-repeated code that is TDS [Too Darned Slow] with some kind of naive implementation.)

Quote:

BTW this code is in libgcc.a not AVR-LibC

Like Will Rogers, all I know is what I read in the papers. A Google search uncovered in avr-libc documentation:
Quote:
ldiv_t ldiv (long __num, long __denom) __asm__("__divmodsi4")

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Like all other topics, every possible AVR-related topic has been discussed here and a consensus reached:
https://www.avrfreaks.net/index.php?name=PNphpBB2&file=viewtopic&t=137522

Sprinter didn't give a link to the source, though. ;)

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

True ldiv() is in AVR-LibC. Want to take a peek?...

http://svn.savannah.nongnu.org/viewvc/trunk/avr-libc/libc/stdlib/ldiv.S?...

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Quote:

True ldiv() is in AVR-LibC.

I thought WriteFlyer just said it isn't?!? No wonder I'm confused...

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

I fear you missed my point ldiv() in AVR-LibC is a single JMP instruction to the routine _divmodsi4() which is located in libgcc.a

The stuff in libc.a is hand crafted for the AVR target. The stuff in libgcc.a is generic GCC source that is used across targets and is mostly in C not Asm.

My point was that because 99.999% of ldiv() lives in libgcc.a it may not be as efficient as a hand crafted, target oriented implementation.

EDIT: OK I'm wrong - it is written in AVR Asm anyway....

http://gcc.gnu.org/viewcvs/gcc/trunk/libgcc/config/avr/lib1funcs.S?annot...

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Running __divmodsi4 in a simulator or wherever does not give an upper bound for the time needed. Running for particular values just gives a lower bound, and running for all 2^64 inputs is a bit beyond current computation performance: on handware with 20MHz clock it will take > 20.000.000 years. If your run a simulator on your 20GHz PC which can simulate 1 AVR instruction every clock, you are still left with 20.000 years...

As __divmodsi4 is implemented in assembler in libgcc, it's easy enough to estimate the number of cycles:

__divmodsi4 calls __udivmodsi4, the unsigned div-and-mod which basically loops 32 times 20 ticks. __divmodsi4 adds some overhead by adjusting the sign(s) and calling __udivmodsi4. The actual upper bound will be around 700 ticks, perhaps a bit more for older implementations. (I tweaked __divmodsi4 a while ago, but there is not much room for improvement because most of the functions in lib1funcs*.S are strictly optimized for size).

Please take into account that the "700 ticks" above is just a guess; in order to get a safe upper bound you will have to statically(!) analyze the functions and take into account call overhead. And of course, you have to take into account delays caused by interrupts which add some challenge to the analysis.

avrfreaks does not support Opera. Profile inactive.