Mixing C with ASM Assembler

Go To Last Post
83 posts / 0 new

Pages

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Dan, in your negation optimization, you can change JMP R248XT to BRTC R248XT. It doesn't save time, but it does save 2 bytes.

Quote:
I predict it will fail when Y1 and Y2 are far apart and a big difference between X and X1.

I think that it will fail only when the difference between Y1 and Y2 is greater than 32768.

Regards,
Steve A.

The Board helps those that help themselves.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Sorry that should have been an RJMP, would that still save two bytes...
I'm not that familiar with the opcode set yet.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

BTW, thanks Steve for restoring my faith!

Let me explain....

I took one look at the original program and was heart-broken.
I figured that if that's what passed as good Assembly Code then this
old dinosaur was ready for the bone-yard!

It honestly took me an entire day going through it line-by-line because this OLD F-RT
did not want to give-up the farm just yet.

I'm glad to see that there are others not following the same confusing coding conventions.
Not that there is anything wrong with them, I'm just not used to seeing such bizzare looking ASM code. Bizzare that is to an old-timer.

Thanks again!

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

I found another optimization for your version Steve.

;ORIGINAL ROUTINE
SUBANYWAY:
    SUB RC,X2
    ROL ANS
    ROL ANSH
    XOR ANS,ONE
    DEC N
      BRNE DIVIDE
;
;ALTERNATE:
SUNANYWAY:
    SUB RC,X2
    LSL ANS     ;CHANGED!
    ROL ANSH
    DEC N
    BRNE DIVIDE

In the Original you were ROLing in a one and inverting it.
In alternate version we are LSL in a zero.

This should save one instruction inside loop (when it goes this route),
for an estimate of 8, and we no longer need the ONE/X1 register so we save by
NOT initializing it and this saves one instruction outside the loop.
For an average savings of 9 clocks and two program statements LDI X1,1 and EOR,X1
(4 bytes?)

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Looks good Steve!

I see that you're treating Y,Y1,Y2, and the result as signed. Otherwise we're pretty much on the same page.

Although, I have to admit that I have not yet grokked your rounding algorithm. :)

--Brian

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

There's a speedup that can apply to both of our functions: share storage between the Dividend and the low-word of the Quotient. That way, the ROLs can do dual-duty.

Here's my new division loop:

; *** do {
ip_DivLoop:
	; Shift the Quotient AND roll the Dividend bit into the LSB
	rol QuotientB0		;On the first loop, the Dividend bit is garbage
	rol QuotientB1
	rol QuotientB2		;High bit is shifted into carry

	brcs ip_BigQuotient	;if C=1, Quotient is guaranteed to be larger than Divisor

	;Determine if Quotient is smaller than Divisor
	cp QuotientB2, DivisorB2
	
	;if( QuotientB2 < DividendB2 ) then don't do subtraction
	brcs ip_DontSub

ip_BigQuotient:
	sub QuotientB2, DivisorB2	;Carry is unknown after this
	clc				;Ensure that Carry=0 after a successful subtraction

ip_DontSub:	;if this branch _was_ taken, carry is set

	;The Carry now holds the calculated Dividend Bit

; *** } while (--DivCount)
	dec DivCount
	brne ip_DivLoop

	; Roll the final Dividend bit in, and shift the first garbage bit out.
	rol QuotientB0
	rol QuotientB1

	; QuotientB0/B1 now holds the DividendB0/B1

This saves 30 cycles and 2 regs without changing the program size.

--Brian

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Wow, way to go Brian! That's an excellent algorithm!
Great work!

I''ve even found a way to optimize one step further.

DIVLUPE: ROL DXY
         ROL DXYH
         ROL DXY2
         CP DXY2,DX2
          BRCS NOSUB
DOSUB:   SUB DXY2,DX2
         CLC
NOSUB:   DEC N
          BRNE DIVLUPE
         ROL DXY         ;GRAB FINAL [c]
         ROL DXYH

I skip the comparison after the first ROL and go ahead and do the compare.
This eliminates another program step

Here is how I handle the Negative Slope at end of Division.
Rather than covert to a negative then ADD I just go right to a SUBtraction
saving the conversion process.

           BRTS SLPNEG    ;CHECK NEG-SLOPE FLAG
         ADD Y1,DXY
         ADC YH,DXYH
          RET     
SLPNEG:   SUB Y1,DXY
         SBC Y1H,DXYH
           RET

.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Thanks Dan!

Having several people refine the same code can produce many insights.

--Brian

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Dan, without the BRCS DOSUB, I think your routine can fail when (x2-x1) > 128.

DIVLUPE: ROL DXY
         ROL DXYH
         ROL DXY2
;        BRCS DOSUB	;Force a Sub, if a '1' was shifted out of quotient
         CP DXY2,DX2
         BRCS NOSUB
DOSUB:   SUB DXY2,DX2
         CLC
NOSUB:   DEC N
         BRNE DIVLUPE
         ROL DXY         ;GRAB FINAL [c]
         ROL DXYH

Here's an example in binary:

Given:
Quotient= 01000000 01000000 00000000 (dec 4210688)
Divisor= 10000001 (dec 129)

Consider the following division steps:

ITERATION 1
--
Step 1: LSL Quotient
Q=10000000 10000000 00000000
C=0

Step 2: Is the Quotient high-byte[10000000] < Divisor[10000001]?
Yes, skip subtraction. Dividend LSB = 0.

Step 3: Subtraction skipped

Step 4: ROL in Dividend LSB.

ITERATION 2
--
Step 1: LSL Quotient
Q=00000001 00000000 00000000
C=1

Step 2: Is the Quotient high-byte[00000001] < Divisor[10000001]?
Yes, skip subtraction. Dividend LSB = 0.

Step 3: Subtraction skipped

Step 4: ROL in Dividend LSB.

===============
This failed because the dividend needs to be compared against the combined 9 bit value of the carry and Quotient high-byte. So step 2 should have been:

Step 2: Is [1 00000001] < Divisor[10000001]?
No, Do subtraction. Dividend LSB = 1.

Does that make sense?

--Brian

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Couild be(?)
I only tested it on a limited number of data and results came back good.
You predict it will fail when Delta-X exceeds 128?
Let me try it...

Wow, you are right, it failed misserably!!!
(Kids don't attempt this at home!)

I guess we're trying to squeeze blood from a stone at this point.
I have another idea that might cut execution time quite a bit...

Let me see what I can do and throw another bug-ridden example at you

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Quote:
In the Original you were ROLing in a one and inverting it.
In alternate version we are LSL in a zero.

I should have caught that one! I think it was pretty late when I wrote it.

Your other optimization of subtraction instead of negation is great, but I can do one better. My original optimization with rounding and negation had to do with the fact that the processing after the divide involves a complement, adding one for rounding, and negation. But negation is complement and adding one, so some things canceled. By adding your subtraction idea, things simplify even more. Things cancel out if you do the opposite of what you expect. You add 1 if you don't round, and subtract if the answer is positive, and there is no complement involved at all!

Quote:
Having several people refine the same code can produce many insights.

Yes, we have been doing quite well! I have one more that I am working on. I had mentioned earlier that we could pre-shift if we knew the max difference of the data, reducing the number of times through the loop. But the data itself can tell us if we can pre-shift. I now run another loop before the multiply to shift the y difference until the high bit is one. Each shift reduces the number of times through the divide loop by one. At the moment the shift loop is 2 cycles shorter than the divide loop, which makes it worthwhile, especially with smaller y differences. I'm still trying to see if I can refine it though. I'll try and post the code tomorrow, once I've worked out the kinks.

Regards,
Steve A.

The Board helps those that help themselves.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

I upoaded a quick rough-draft of my idea for a Division Routine with only 8 iterations which theoretically could be much faster than the "Compare and Subract" routine we are working on now.

I stuck it in a new thread because I thought it might get confusing talling about two totally different routines in the same thread.

Any help would be greatly appreciated. It still fails sometimes and when it doesn't sometimes the accuracy leaves something to be desired.

Thanks for any help, this is fun!

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Koshchi wrote:

I have one more that I am working on. I had mentioned earlier that we could pre-shift if we knew the max difference of the data, reducing the number of times through the loop. But the data itself can tell us if we can pre-shift. I now run another loop before the multiply to shift the y difference until the high bit is one. Each shift reduces the number of times through the divide loop by one.

Hey, that's a fantastic idea! You're making the division routine "scalable" if size warrants it, it becomes a 10x8 Divide later it might be just a 6x8 Divide, etc.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

If it's not too much bother could you walk me through the logic of your rounding routine?
I've got a print-out of your code right here and I still don't get it.

I've been up-all-night perhaps it's just fatigue.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

If absolutely necessary, could you separate the instructions over a period of interrupt calls to get the same job done periodically while other events are allowed to happen?

In other words...what's more important, this calculation or the other events? Or are they equally important?

You guys have been working real hard at that code, which has improved amazingly over time. However, if you can put off needing the result of the calculation for some time, you should be able to use a timer to spread out your calculation over time, thus buying your other instructions the time they need to run.

This is kind of a shot in the dark. It would only work if you don't need to calculate the numbers immediately :-/.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Quote:
However, if you can put off needing the result of the calculation for some time, you should be able to use a timer to spread out your calculation over time

We don't really need it for anything. We started out just trying to help the OP come up with something. Since then we've just been playing with it to see what we could do with it. Personally I just like trying to squeeze out every last clock cycle from a routine, even if there is really no reason to do so.

Regards,
Steve A.

The Board helps those that help themselves.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Here is my latest version. It still assumes that y2 - y1 is less than 32768 (including in the normalization routine that I added), but this could certainly be changed if needed. The normalization saves 2 cycles for every pre-shift that it does. You can save 3 more clock cycles per shift if you don't shift 1's into the dividend but instead set them with a routine after the shift that gets the proper values from a look-up table. The look-up costs extra cycles, so you wouldn't see savings until the 3rd or 4th shift. But for small y differences the savings would be significant. It would also cost 32 bytes of memory for the table, plus you would need to use the z register, which means saving it somewhere and restoring it.

Interp:
	clr RA
	clr RB
	clt			;clear neg flag
;find (x - x1)
	sub x0, x1		;subtract x1 from x
	breq EarlyOut
;find (x2 - x1)
	sub x2, x1		;subtract x1 from x
;find (y2 - y1)
	sub y2L, y1L		;subtract y1 from y2	
	sbc y2H, y1H
	breq EarlyOut
	brpl NotNeg		;branch if we are not negative
;y dif is negative, so negate and set flag
	neg y2L		;negate yDif
	adc y2H, RA		;adjust high byte for carry
	neg y2H
	set			;set neg flag
NotNeg:
	ldi x1, 16
Normalize:
	dec	x1
	sec
	rol RA
	rol RB
	lsl y2L
	rol y2H
	brpl Normalize
Multiply:
;find (x dif * y dif) - 24 bit result
	mul y2L, x0		;multiply low byte
	or RA, r0		;or so we don't lose the partial divedend
	or RB, r1
	mul y2H, x0		;multiply high byte
	clr x0
	add RB, r0		;add mid byte
	adc RC, x0		;adjust high byte for overflow
	sec
Divide:				;24bit by 8bit divide, unsigned
	rol RA			;roll the quotient and dividend left
DividePlus1:	
	rol RB	
	rol RC
	brcs SubAnyways
	cp RC, x2		;compare divisor with high byte
	brcs NotLess	 	;branch if we can't subtract
	sub RC, x2		;do the subtraction
NotLess:
	dec x1			;decrement the loop count
	brne Divide		;branch if not done
	rol RA			;catch the last bit
	rol RB
;Since the output of the divide routine is the complement of 
;what we want, and the fact that rounding is adding one and
;negation is complement plus one, we can simplify to adding
;1 if we don't round, and subtracting when we have a positive
;and adding when we have a negative
RoundAndNegate:
	inc x2
	lsr x2			;shift dividend
	cpc x2, RC		;compare
	brcs Round		;branch if we need to round
	adiw RA, 1		;add when we don't want to round
Round:
	brtc SubY1
AddY1:
	add RA, y1L		;add if we're negative
	adc RB, y1H
	ret
SubY1:
	sub y1L, RA		;subtract if we're positive
	sbc y1H, RB
	movw RA, y1L
	ret

;take care of 9 bit subtraction
SubAnyways:
	sub RC, x2		;do the subtraction	
	lsl RA			;put 0 into the answer
	dec x1			;decrement the loop count
	brne DividePlus1	;branch if not done
	rjmp RoundAndNegate

;y difference is 0 or x - x1 is 0, so answer is y1
EarlyOut:
	movw RA, y1L
	ret

Total clocks - min 163, max 199
61 commands, 122 bytes

My round and negate algorithm goes something like this:

There are 4 posibilites for what we need to do for the answer after the divide:
1) no negate, no round, add to y1
2) no negate, round, add to y1
3) negate, no round, add to y1
4) negate, round, add to y1

Negating is complement and add 1, and rounding is add 1 (or sub 1 if the number is negative). But the answer after the divide is the complement of the real answer. So the actions we need to take for the for the cases are:
1) cmp, add to y1
2) cmp, add 1 (or sub 1, cmp), add to y1
3) cmp, cmp, add 1, add to y1
4) cmp, add1, sub 1, add to y1

But for the last two, things cancel, leaving:
3) add 1, add to y1
4) add to y1

This should explain my first variation. Now if we take Dan's idea of subtracting, we can use it on the first two cases. That is, we can subtract the negative of the actual answer. To make these two negative, we need these actions (including the previous actions):
1) cmp, cmp, add 1, sub from y1
2) cmp, add 1, cmp, add 1, sub from y1

In the first, the complements cancel, in the second, it is two negations that also cancel. The final result is:
1) add 1, sub from y1
2) sub from y1
3) add 1, add to y1
4) add to y1

I hope that this explains everything.

Regards,
Steve A.

The Board helps those that help themselves.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

What do you think is the optimal way to implement rounding into the following.

         LDI  N,16        ;CALC (Y2-Y1)(IX-X1)/(X2-X1)
LUPIE:   ROL Y2
         ROL Y2H
         ROL Y2HH
         CP Y2HH,X2
          BRCS SKPSUB
         SUB Y2HH,X2
         CLC
SKPSUB:  DEC N
          BRNE LUPIE
         ROL Y2        
         ROL Y2H
         COM Y2
         COM Y2H
           BRTS SLPNEG    ;NEG-SLOPE?
         ADD Y1,Y2        ;ANS=Y1+(Y2-Y1)(IX-X1)/(X2-X1)
         ADC YH,Y2H
          RJMP DONE     
SLPNEG:  SUB Y1,Y2        ;ANS=Y1-(Y2-Y1)(IX-X1)/(X2-X1)
         SBC Y1H,Y2H
          RJMP DONE
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Ok, Steve. I think we're bordering on extreme navel-gazing here.

My arbritary priorities for my original implementation were in order of size, clarity, and speed.

But in the relentless pursuit of speed, I have some changes for your divide loop. 8)

Insert this code where your "Divide:" label was originally located:

   rol RA         ;roll the quotient and dividend left
   rjmp Divide   ;skip over subtraction code and enter Divide loop

;take care of 9 bit and 8 bit subtraction
SubAnyways:
   sub RC, x2      ;do the subtraction   
   lsl RA         ;Shift a 0 into dividend
   dec x1         ;decrement the loop count
   breq DivideDone

Divide:            ;24bit by 8bit divide, unsigned
   rol RB   
   rol RC
   brcs SubAnyways
   cp RC, x2      ;compare divisor with high byte
   brcc SubAnyways ;branch if we can subtract

   rol RA         ;roll the quotient and dividend left

   dec x1         ;decrement the loop count
   brne Divide      ;branch if not done

DivideDone:
   rol RB

The goal of this loop is to have _exactly_ one followed-branch per loop.

It saves ~6.8 cycles per function call in my testing. (and one word!)

--Brian

Attachment(s): 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

bcb wrote:
It saves ~6.8 cycles per function call in my testing. (and one word!)
--Brian

Brian, how did you get these figures? Is there a way to get them from the Studio4 Software -- I've been counting clocks by hand and it's tedious and prone to errors.

Thanks.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

In the "I/O View"==>"Processor" section, there's a "Cycle Counter" item that's available while debugging.

--Brian

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Wow, how Cool!! And a Stop-Watch too!!

Ya' know, I like to rib people about the software for fun, but the Debug within the Studio4 really is excellent!

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

How do I stop the processor?

BREAK ans SLEEP don't work and I can't find a HALT instruction.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Not bad Brian! There are three paths through the loop.
1) 8 bit subtraction, 10 clocks (which is the same as my loop)
2) no subtraction, 9 clocks
3) 9 bit subtraction, 8 clocks

If the divisor is less that 128, the 9 bit subtraction will never be hit. For >= 128, the higher the number, the more often 9 bit subtraction will occur. So I would guess that it would save on average a little less than one clock per loop over my code. You do lose 2 or 3 clocks (the rjmp to start, and the possible branch out if the last loop is a subtraction). So the 6.8 sound about right (it might even be a hair low).

On the rounding, the way I did it is probably optimal (except with one change that I just noticed from Brian's original code).

         COM Y2 
         COM Y2H

         LSR X2        ;divide divisor by 2
         CPC Y2HH, X2  ;compare to remainder
         BRCS TESTNEG  ;branch if not rounding
         ADIW Y2, 1    ;round

TESTNEG:
         BRTS SLPNEG    ;NEG-SLOPE? 
         ADD Y1,Y2        ;ANS=Y1+(Y2-Y1)(IX-X1)/(X2-X1) 
         ADC YH,Y2H 
         RJMP DONE      
SLPNEG:  SUB Y1,Y2        ;ANS=Y1-(Y2-Y1)(IX-X1)/(X2-X1) 
         SBC Y1H,Y2H 
         RJMP DONE 

And if you skip the complement, reverse both branches to clear instead of set.

Regards,
Steve A.

The Board helps those that help themselves.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Okay, thanks so much Steve. Now I understand! I got lost in all this bit shuffling.

The idea is if the Remainder is more than 1/2 of original Divisor then we Round Upward, makes prefect sense to me now. Thanks for clearing that up for me.

Did you get a chance to look at my amateur attempt at Division-by-Inverse-Multiply routine I tried? Accuracy is terrible, perhaps your rounding modification will improve results.

Since it only requires 8 interations to invert the Dividend, it should save a lot of
time on larger divides like 24x8, 32x8 and beyond.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

To Martin, the original poster (OP):

I have done four different version of the Bilinear Interpolation program that you requested. Each Bilinear Interpolation requires an internal 3 Linear Interpolations. The slowest version uses the standard "Compare and Divide" method for long division. The fastest uses Binary Division but Data Points for your BIN map must be by multiples of 2. The others use two different versions of a "Division by Inversion" method that I have beeb experimenting with.

Here are the results:

METHOD: Size in Bytes @ Speed: Best Case - Worst Case

BINARY METHOD: 100 @ 104 3.8uS to 10 uS
DIV by INV (a): 130 @ 3.8uS to 23 uS
DIV by INV (b): 120 @ 3.8uS to 24 uS
STANDARD: 110 @ 3.8uS to 39 uS

The RPM and MAP readings are Binary Usigned Inputs and Output is an Ignition Timing Rate in the range 0 to 65,536 (2 Bytes) based on a Binary Interpolation of the Four Nearest Data Points in your !2x12 BIN map.

I hope this helps your project along, as I am taking a lot of "heat" for asking too many questions. Although I have never programmed for the AVRs before (my first one arrived about week-and-a-half ago) I think the AVR has an excellent "core" with not too many "quirks." and even though it was quite frustrating at times being under a time constraint, it was an excellent learning experience.

PS. Are you glad you didn't have to recode your entire project in C just for a few complex math routines?

TTYL

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

MartinM57 wrote:

Hi Dan

This looks like the one! A reasonable amount of testing done and it
seems to work great :-) Same results in a real AVR via my VB2005
stimulator and Studio4.

As I've said before, the program really is great and I'm very IMPRESSED. Thanks
very much for your help.

Quick Q: If the table entries were only 8-bit, would there be a
reduction in code size/speed? I only ask, because I can *think* I need
BINTERP with both 16-bit and 8-bit contents - I can obviously use the
16-bit version with 8-bit contents, but I was just wondering....

You might want to publish this code in the AVRFreaks Academy section.

Thanks again for everything

You are very welcome. I am sorry it took so long to write, but trying to learn everything I needed to know about Assembly Programming for AVRs in a week what a bit frustrating, but I am extremely delighted that you are happy with the results.

If the Bilinear Interpolation (BINTERP) speed of 4uS to 39 uS is not fast enough I have some new ideas to speed it up even more. BTW - YES!!!! Switching you 8 bit table entries would drastically speed the program up!

I'm not sure what the AVR Academy is, perhaps I'll look into making BINTERP available there.

Glad to hear my BINTERP will be used in real-life control application. It was an excellent learning experience for me. Have a great day!

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Thanks Dan.

Just to re-iterate, Dan's done a great job for me.

I'm hoping for a few great days as I'm off to Colorado tonight for two weeks skiing - but how will I get on without access to AVRFreaks???? 8)

If you have the spare time, a BINTERP8 would be very nice!

Martin

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

MartinM57 wrote:
Thanks Dan.

Just to re-iterate, Dan's done a great job for me.

- Martin

Thank you very much for the recommendation!

Please, pass my name around, I'd like to find a "Paid Gig" doing this someday.

However, I don't imagine there's much call for Super-Fast, Super-Small AVR code is there?

BTW - I took original ButterFly BootLoader (over 1500 words) and did ASM version in under 255 (1/6 size) for practice.

It has been a wonderful learning experience.

MartinM57 wrote:

If you have the spare time, a BINTERP8 would be very nice!

- Martin

I have completed BINTERP8 (8 bit version of BINTERP) and from initial test the best to slowest speeds were 4uS to 20uS - so about twice the speed for a worst-case scenario over the 16-Bit Version at 39uS. Not bad for three internal linear interpretations and I'm not excluding all the overhead of loading all the registers, variables, stack. et al, from the above calculations. If it's possible for you to get me figures for just the routine-time excecution-time, that would be great.

Hope you find BINTERP and BINTERP/8 useful, please let me know where my progeny ends up. Thanks.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

BINTERP8 received - no time to do the testing, got to pack the bags right now...

I'll be back in just over 2 weeks ... 8)

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Great! Have a wonderful trip and when you get back please let me know.

One thing I'd still like to know is how important speed is in your application? Is there any point in trying to increase the speed of the Bilinear Interpretations?

BTW - Martin, if you can get manage to get my program to fail or produce any errors, please let me know right away.

TTYL

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Does anyone else have a need for an ASM program to do Bilinear Interpolations?

Pages