Need some help understanding how to move the interrupt vector


I'm trying to use this approach.
https://www.raphnet.net/programm...

If I use the normal ISR I'm good

 

Working method.


ISR (INT2_vect )
{
PORTF |= 0x01;	PORTF &= ~0x01;//debug
PORTF |= 0x01;	PORTF &= ~0x01;//debug
PORTF |= 0x01;	PORTF &= ~0x01;//debug
PORTF |= 0x01;	PORTF &= ~0x01;//debug
PORTF |= 0x01;	PORTF &= ~0x01;//debug
}

EICRA = (1<<ISC20);
EIMSK = (1<<INT2);

sei();//enable interrupts. 

When my pin changes I get my debug.

 

Here is how I changed it, but I'm not understanding the MCUCR register from the documentation nor do I know how to convert this over to my chip (atmega32u4).

 

Code wise I think I do the same thing.

void fastint(void) __attribute__((naked)) __attribute__((section(".boot")));
void fastint(void)
{
PORTF |= 0x01;	PORTF &= ~0x01;//debug
PORTF |= 0x01;	PORTF &= ~0x01;//debug
PORTF |= 0x01;	PORTF &= ~0x01;//debug

asm volatile( --- )::);
}

EICRA = (1<<ISC20);
EIMSK = (1<<INT2);

MCUCR |= (1<<IVCE);
MCUCR |= (1<<IVSEL); 

sei();//enable interrupts. 

 

 

He says

The atmega8 supports moving the interrupt vector from address 0x0000 to the start of the bootloader section. The effective address depends on how the "fuses" are configured. In my case, the address is 0x1800 (Word address 0xC00).

I'm using an atmega32u4 and it has its own bootloader (FLIP). So confused there...

fuses

0xf4
0xd9
0x5e

 

 

Then he says

I created a .boot section by adding -Wl,--section-start=.boot=0x1800 when linking. The interrupt handler "function" that I will place there will therefore have to be marked with __attribute__((section(".boot"))).

So does that mean I need to add this to the makefile? I'm just using a basic LUFA makefile.

MCU          = atmega32u4
ARCH         = AVR8
BOARD        = MINIMUS
F_CPU        = 16000000
F_USB        = $(F_CPU)
OPTIMIZATION = s
TARGET       = bridge
SRC          = bridge.c  conPad.c psx.c wii.c wiimote.c gc.c gen.c nes.c ../shared/bridge_protocal.c  $(LUFA_SRC_USB)
LUFA_PATH    = ../shared/LUFA
CC_FLAGS     = -DUSE_LUFA_CONFIG_HEADER -IConfig/
LD_FLAGS     =

# Default target
all:

# Include LUFA build script makefiles
include $(LUFA_PATH)/Build/lufa_core.mk
include $(LUFA_PATH)/Build/lufa_sources.mk
include $(LUFA_PATH)/Build/lufa_build.mk
include $(LUFA_PATH)/Build/lufa_cppcheck.mk
include $(LUFA_PATH)/Build/lufa_doxygen.mk
include $(LUFA_PATH)/Build/lufa_dfu.mk
include $(LUFA_PATH)/Build/lufa_hid.mk
include $(LUFA_PATH)/Build/lufa_avrdude.mk

Do I put this line in OPTIMIZATION, replacing the s? He also mentioned using -Os.

 

 

Last Edited: Sat. Dec 19, 2020 - 08:17 PM

A section-start is a linker option so you'd put it on LD_FLAGS

 

(You won't need the -Wl, in this case as that is implied/obvious)

 

Note however this "trick" of using IVSEL and the BLS IVT is only going to work if the BLS is "empty". If this is a 32U4 and you are relying on something like DFU up there you are screwed.
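On the MCUCR part of the question: the datasheet describes IVSEL as a timed sequence, which the two `|=` writes in the first post most likely do not satisfy. IVCE is written to one first, and then, within four clock cycles, IVSEL is written with IVCE written as zero in that same store. A minimal sketch of that sequence, modeled on the datasheet's C example. The function name is made up, and the `#else` branch is only a host stand-in (bit positions assumed from the ATmega32U4 register summary) so the snippet also compiles off-target:

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>
#else
/* Host stand-ins so the sketch compiles off-target. Bit positions are
   assumed from the ATmega32U4 register summary (IVCE = 0, IVSEL = 1). */
static volatile uint8_t MCUCR;
#define IVCE  0
#define IVSEL 1
#endif

/* Hypothetical helper: point the interrupt vectors at the boot section. */
static void move_vectors_to_boot(void)
{
    /* Keep the other MCUCR bits, start with both vector bits cleared. */
    uint8_t tmp = MCUCR & (uint8_t)~((1u << IVCE) | (1u << IVSEL));

    MCUCR = tmp | (1u << IVCE);   /* step 1: enable the change */
    MCUCR = tmp | (1u << IVSEL);  /* step 2: set IVSEL, IVCE written as 0 */
}
```

Call it before `sei()`. It does not change the caveat about the boot section: the relocated table is only usable if nothing else lives up there.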

 

IMAO a far better approach is to stay in appland: go with -nostartfiles so it ditches gcrt1.S, and provide your own CRT and IVT with the time-sensitive code positioned on the vector in question. Do it all in .S. While .org in avr-as is only section relative, not absolute, it is effectively absolute if you arrange for that section to be at the fixed 0x0000 address. (Which is what happens for the existing ".vectors" in gcrt1.S.)

Last Edited: Sat. Dec 19, 2020 - 09:01 PM

How do I determine what is a safe address?

http://ww1.microchip.com/downloa...

Am I reading that right, as 0x3800?

 

And a stupid question here but  can I call the memory section anything I like? I do not like the word boot.
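On the address question, a hedged sketch of the arithmetic: the boot section size is selected by the BOOTSZ bits in the high fuse byte, and the datasheet table gives the start as a word address, while `--section-start` takes a byte address (word address times two). The sketch assumes the 16K-word flash of the ATmega32U4, BOOTSZ in bits 2:1 of the high fuse, and that 0xd9 in the fuse list from the first post is the high byte. The function names are made up for illustration:

```c
#include <stdint.h>

#define FLASH_WORDS 0x4000u   /* assumption: ATmega32U4, 32 KB flash = 16K words */

/* Boot section start as a word address, derived from the high fuse byte. */
static uint16_t boot_start_word(uint8_t hfuse)
{
    uint8_t  bootsz = (hfuse >> 1) & 0x03u;          /* BOOTSZ1:0 */
    uint16_t words  = (uint16_t)(2048u >> bootsz);   /* 00 -> 2048 ... 11 -> 256 words */
    return (uint16_t)(FLASH_WORDS - words);
}

/* The same address in bytes, which is the unit the linker option uses. */
static uint16_t boot_start_byte(uint8_t hfuse)
{
    return (uint16_t)(boot_start_word(hfuse) * 2u);
}
```

With a high fuse of 0xd9 this gives word address 0x3800, which is byte address 0x7000, so 0x3800 from the table would be the word address. As for the name, as far as I know the section name is arbitrary as long as the attribute and the linker option use the same one.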

Last Edited: Sat. Dec 19, 2020 - 09:04 PM

See my edit. -nostartfiles is a far better way to achieve this.

 

BTW is your code so time sensitive that one extra JMP/RJMP really matters?


ah, so this approach is lost with c code? Way too messy for me to do it all in S. Yeah, the timing is a must.

https://www.avrfreaks.net/forum/...


If you have the room in the vectors (vectors unused where code will go), you can replace existing vectors.

 

copy avr5.xn linker script to project folder, add linker option to use new script-

-Xlinker -script=avr5.xn 

 

in the script change .vectors name to something else, .vector0 in this case. The original .vectors will disappear (they no longer have a place to park)-

 

    *(.vector0)
    KEEP(*(.vector0))

 

Now you have to create the vector table yourself, putting in the normal jmp's, adding your code (which will eliminate the use of irq's that are now occupied by code). It may be easier in asm, but C will do also-

 

void fastint(void) __attribute__((naked, section(".vector0")));
void fastint(void)
{
    asm("jmp __init"); //RESET
    asm("jmp __bad_interrupt"); //INT0
    asm("jmp __bad_interrupt"); //INT1
    PORTF |= 0x01;    PORTF &= ~0x01;// INT2 <- we end up here for INT2 irq 
    PORTF |= 0x01;    PORTF &= ~0x01;// INT3 (cannot use INT3)
    PORTF |= 0x01;    PORTF &= ~0x01;// reserved
    asm("reti"); //reserved

    //you have to setup the jmp to the __vector_n
    //or to __bad_interrupt for the remaining irq's
    //asm("jmp __bad_interrupt"); //INT6
    // ...
    // asm("jmp __vector_10"); //USB General
    // asm("jmp __vector_11"); //USB Endpoint
    // ...
}

 

result-

 

00000000 <fastint>:
   0:    0c 94 0d 00     jmp    0x1a    ; 0x1a <__ctors_end>
   4:    0c 94 17 00     jmp    0x2e    ; 0x2e <__bad_interrupt>
   8:    0c 94 17 00     jmp    0x2e    ; 0x2e <__bad_interrupt>
   c:    88 9a           sbi    0x11, 0    ; 17
   e:    88 98           cbi    0x11, 0    ; 17
  10:    88 9a           sbi    0x11, 0    ; 17
  12:    88 98           cbi    0x11, 0    ; 17
  14:    88 9a           sbi    0x11, 0    ; 17
  16:    88 98           cbi    0x11, 0    ; 17
  18:    18 95           reti

 

 

I only have a vague idea of what is going on, but you could probably poll the pcint2 pin faster than using an irq (which takes 5 cycles to get to the isr it seems). May not work for whatever else is going on, but if it does work it would be simpler.

 

    for(;;){
        while( ! (PINB & 1<<2) ); //rising edge //sbis 0x03,2; rjmp .-4
        PORTF |= 0x01;    PORTF &= ~0x01;
        PORTF |= 0x01;    PORTF &= ~0x01;
        PORTF |= 0x01;    PORTF &= ~0x01;
        while( (PINB & 1<<2) ); //falling edge
        PORTF |= 0x01;    PORTF &= ~0x01;
        PORTF |= 0x01;    PORTF &= ~0x01;
        PORTF |= 0x01;    PORTF &= ~0x01;
    };


I used the exact code you refer to. I was able to get things very close but it was still not good enough. I was hoping all this relocating of vectors would help me but I'm starting to conclude it is sort of a rabbit hole. I think the author only tested it in a handful of situations and honestly didn't do it any better than a while loop.

 

My  exact code using defines is something like this.

	        //|___|~~~~|
		timer =1;
		WHILE_PULSE_IS_HI;//find f_low_state_portd_second falling for second polls
		PORTD = _low_state_portd_first;

		//|___|~~~~|____|
		timer =1;
		WHILE_PULSE_IS_LO;//move out of second pulse
		PORTD = _low_state_portd_second;

		//|___|~~~~|____|~~~~~|
		timer =1;
		WHILE_PULSE_IS_HI;//find falling for second pulls
		PORTD = _6button;//same as above or this to set 6 button mode.

		//|___|~~~~|____|~~~~~|____| - check 6 button states.
		timer =1;
		WHILE_PULSE_IS_LO;//move out of third pulse
		PORTD = _low_state_portd_6button;

		//|___|~~~~|____|~~~~~|____|~~~~~| -  
		timer =1;
		WHILE_PULSE_IS_HI;//find falling for second pulse
		//PORTD = _low_state_portd_first;

		//|___|~~~~|____|~~~~~|____|~~~~|____|
		timer =1;
		WHILE_PULSE_IS_LO;//move out of last pulse, idle state
		PORTD = _low_state_portd_second;

If moving around vectors isn't going to help me I'm going with my proposed hardware solution in that other thread. The main issue with the code above is that setting the port register takes a long time. My limited knowledge of things could not offer a faster way to set the register. Something like storing the values elsewhere and speeding up "PORTD = xx". Things are very close as is; I miss a beat here and there.

Last Edited: Sun. Dec 20, 2020 - 01:59 PM

I expect PORTF is out of range of SBI and CBI.

In that case, some of the manipulations of PORTF could

be replaced by toggles, i.e., assignments to PINF.

Two cycles.

 

In assembly, the WHILE_PULSE loops can be 3 or 5 cycles,

depending on the pin being tested.

In the case of 5 cycles, a register will also be needed.

The assignments to PORTD can be single cycles.

At most 4 registers will be required.

Depending on the minimum length of a phase,

you might get away with just one,

but using four would produce code easier to reason about.

Moderation in all things. -- ancient proverb


The while is on port D, but it can be moved to any port.

#define PULSE 0x04    
#define PULSE_PIN PIND
#define PULSE_PORT PORTD

#define WHILE_PULSE_IS_LO  while ( (PULSE_PIN & PULSE)==0x00) {};    
#define WHILE_PULSE_IS_HI  while ( (PULSE_PIN & PULSE)      ) {};    

 

The port I'm outputting on is also currently D, but that could be changed too. Are you suggesting I move my output register to port F?

 


I'd assumed, apparently incorrectly, that the PORTD assignment was for the entire register.

An OUT instruction would have done the job in a single cycle.

To change some bits of PORTD, but not others, use an OUT to PIND.

Still a single cycle.

If PULSE_PIN is PIND, the spin waits can be 3-cycle loops.

 

Some of the posts mention PORTF.

I was pretty sure this is outside the range of SBI, CBI and OUT.

An explicit RMW would cost at least 3 cycles.

An assignment to PINF would cost only 1.

 

Code that pushes the speed limit should be in assembler.

Even if it can be made to work in C, a close case such as

this will require reading the output assembler on every build.

 

Also, some of the code is rather repetitive.

That is why Odin invented macros.

 

Moderation in all things. -- ancient proverb

Last Edited: Mon. Dec 21, 2020 - 01:06 PM

OH port F is debug ;)

 

So possible misunderstanding of what you said here, but thinking it's more of a misunderstanding in general... Are you saying I can influence the PIN registers to get the same effect as the ports? PIND |= x will bring bits high on the external pins. I'm well aware that the PINx registers can be used to get a reading but I never really gave that any thought. I always figured setting pins high or low would depend on the PORT registers?

i.e 

PORTD = 1;

PIND = 0;

this will still be a hi bit because of the internal pull up, will it not?
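A small host-side model may help with this question. On this AVR family, writing a 1 to a bit in PINx toggles the matching bit in PORTx, and writing a 0 leaves it alone; the PIN register never stores what you write. The struct and function names below are made up for the model and are not real register accesses:

```c
#include <stdint.h>

/* Toy model of one I/O port: only the PORTx latch is tracked. */
typedef struct {
    uint8_t port;   /* what PORTx would hold */
} port_model;

/* Models "PORTx = value": the whole latch is replaced. */
static void write_portx(port_model *p, uint8_t value)
{
    p->port = value;
}

/* Models "PINx = value": every 1 bit toggles the matching PORTx bit. */
static void write_pinx(port_model *p, uint8_t value)
{
    p->port ^= value;
}
```

In the model, `PORTD = 1; PIND = 0;` leaves bit 0 set, and a following `PIND = 1;` would clear it again. Note that `PIND |= x` is a read-modify-write of the pin levels, so on real hardware it can toggle more than the bits in x; a plain assignment is what the toggle trick wants.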

 

I'm in my learning circle here. I'm following most of what you are explaining but not all. I'm trying to keep up :) 

 

Yes I should be using ASM and macros here, had most all of this in C for debugging and easy understanding till I worked out my issues. You're encouraging me to go back and review my code now, I mean it was really working well, just need a few cycles shorter. I may just revert to where I had it and give this one more hurrah before I go with a hardware solution.

 

 


>Some of the posts mention PORTF. I'm pretty sure this is outside the range of SBI, CBI and OUT.

 

I'm pretty sure it is in range as the asm in post #5 shows (cbi/sbi), and the register summary in the datasheet shows.

 

 

If using PINn to toggle a bit (via out) or simply assigning the port a value, that is a single cycle out vs 2 for sbi/cbi, but you then have to already have the data ready in a register to take advantage.

 

Not quite sure if these values are known before the pulse of interest (or if it makes a difference at that point), but if so then simply get them into a register before the pulse watching so your out instruction is all ready to go-

 

    timer =1;

    uint8_t v = _low_state_portd_first;

    WHILE_PULSE_IS_HI;//find f_low_state_portd_second falling for second polls
    PORTD = v;

 

    timer =1;

    v = _low_state_portd_second;
    WHILE_PULSE_IS_LO;//move out of second pulse
    PORTD = v;

 

  42:    81 e0           ldi    r24, 0x01    ; timer=1
  44:    80 93 00 01     sts    0x0100, r24

 

  48:    90 91 03 01     lds    r25, 0x0103    ; 0x800103 <_low_state_portd_first>

  4c:    4a 99           sbic    0x09, 2
  4e:    fe cf           rjmp    .-4

  50:    9b b9           out    0x0b, r25

 

  52:    80 93 00 01     sts    0x0100, r24    ; timer=1
  56:    90 91 02 01     lds    r25, 0x0102    ; 0x800102 <_low_state_portd_second>

  5a:    4a 9b           sbis    0x09, 2
  5c:    fe cf           rjmp    .-4 
  5e:    9b b9           out    0x0b, r25    ; 11

  ...


curtvm wrote:
>Some of the posts mention PORTF. I'm pretty sure this is outside the range of SBI, CBI and OUT.

 

I'm pretty sure it is in range as the asm in post #5 shows (cbi/sbi), and the register summary in the datasheet shows.

I sit corrected.
Quote:
If using PINn to toggle a bit (via out) or simply assigning the port a value, that is a single cycle out vs 2 for sbi/cbi, but you then have to already have the data ready in a register to take advantage.
Hence the possible need for four registers.

I'm not clear on how many bits of PORTD OP is changing.

If more than one, cbi/sbi will not do the trick. LDI, OUT is just as fast.

OUT is even faster.

Quote:
Not quite sure if these values are known before the pulse of interest (or if it makes a difference at that point), but if so then simply get them into a register before the pulse watching so your out instruction is all ready to go-

 

    timer =1;

    uint8_t v = _low_state_portd_first;

    WHILE_PULSE_IS_HI;//find f_low_state_portd_second falling for second polls
    PORTD = v;

This will work if the minimum pulse is long enough.

 

Again, if time is tight, I recommend doing this directly in assembly:

Six invocations of the same two-parameter macro.

IIRC avr-as macros will not take OP codes as parameters,

so they would need to be C preprocessor macros.

 

What is timer?

That might be important.

 

Edit: quote correction

Moderation in all things. -- ancient proverb

Last Edited: Wed. Dec 23, 2020 - 01:27 AM

Taking this one step at a time here... Timer is not used anymore; it was in the loop to prevent a freeze but my new code does not need it. It's also before the loop anyway.

 

"This will work if the minimum pulse is long enough." Pulse is always 6-7us.

 

Once I get this close I'd like to move the loops to ASM. Since the time between pulses is no less than 6us, calling a macro should be ok. Could I get some help on how the macro could look? I'm a bit rusty on ASM, still learning.

 

 

So, a few things here: with my project I do not see my lst file (or was it lss)? Normally AVR Studio makes them for me; do I need to do something to make it output them? I do not know any other way to see my ASM.

 

 

Interestingly, this code half works. The rising edge is caught in less than 400 ns, but the falling edge takes over 500, closer to 600 (see image).

 

        //|___|~~~~|____|
        loadRegister=_low_state_portd_second;
        WHILE_PULSE_IS_LO;
        PORTD = loadRegister;
    
        //|___|~~~~|____|~~~~~|
        loadRegister=_6button;
        WHILE_PULSE_IS_HI;
        PORTD = loadRegister;

 

but yeah going to need to see the ASM here, how can I do this?


Last Edited: Tue. Dec 22, 2020 - 01:36 AM

OH I see the lss file is merged in to one file now, used to be separate files... Looks like I had one timer check left in there.

	//|___|~~~~|____|
		timer =1;
		loadRegister=_low_state_portd_second;
		WHILE_PULSE_IS_LO;
    21de:	36 99       	sbic	0x06, 6	; 6
    21e0:	03 c0       	rjmp	.+6      	; 0x21e8 <Poll+0xa8>
    21e2:	01 97       	sbiw	r24, 0x01	; 1
    21e4:	00 97       	sbiw	r24, 0x00	; 0
    21e6:	d9 f7       	brne	.-10     	; 0x21de <Poll+0x9e>
		PORTD = loadRegister;
    21e8:	2b b9       	out	0x0b, r18	; 11
    21ea:	8f e2       	ldi	r24, 0x2F	; 47
    21ec:	95 e7       	ldi	r25, 0x75	; 117

		//|___|~~~~|____|~~~~~|
		timer =1;
		loadRegister=_6button;
		WHILE_PULSE_IS_HI;
    21ee:	36 9b       	sbis	0x06, 6	; 6
    21f0:	03 c0       	rjmp	.+6      	; 0x21f8 <Poll+0xb8>
    21f2:	01 97       	sbiw	r24, 0x01	; 1
    21f4:	00 97       	sbiw	r24, 0x00	; 0
    21f6:	d9 f7       	brne	.-10     	; 0x21ee <Poll+0xae>
		PORTD = loadRegister;
    21f8:	4b b9       	out	0x0b, r20	; 11

 

Last Edited: Tue. Dec 22, 2020 - 01:50 AM

Yeah that got it down.

 

 

   //|___|~~~~|
        loadRegister=_low_state_portd_first;
        WHILE_PULSE_IS_HI_NO_CHECK;
    21cc:    36 99           sbic    0x06, 6    ; 6
    21ce:    fe cf           rjmp    .-4          ; 0x21cc <Poll+0x8c>
        PORTD = loadRegister;
    21d0:    3b b9           out    0x0b, r19    ; 11

        //|___|~~~~|____|
        loadRegister=_low_state_portd_second;
        WHILE_PULSE_IS_LO_NO_CHECK;
    21d2:    36 9b           sbis    0x06, 6    ; 6
    21d4:    fe cf           rjmp    .-4          ; 0x21d2 <Poll+0x92>
        PORTD = loadRegister;
    21d6:    8b b9           out    0x0b, r24    ; 11

 

Not working 100% yet, but it is about as fast as I'm going to get it, no?

 

 

 

Last Edited: Tue. Dec 22, 2020 - 02:07 AM

#define fred(opcode, changes)  \
    LDI R16, changes $1: opcode PIND, 6 $ RJMP 1b $ OUT PIND, R16

 

$ is the statement separator for avr-as .
opcode should be SBIS or SBIC .
Each loop will be 3 cycles.
An OUT to PORTD might work.
On input pins, the out would only affect internal pullups.

Moderation in all things. -- ancient proverb


>not working %100 yet but it is about as fast as I'm going to get it, no?

 

The pin reading having to be synchronized throws another delay in there (something like 1/2 to 1 1/2 clocks), and depending when the original async signal took place in relation to the avr clock and current instruction, you may end up with worst case as maybe in the 400ns area (just a guess), before you finally get to the out instruction. Since you have a logic analyzer, I assume you can get a feel for the best/worst case times. You are probably simply up against limits of what can be done with your mcu.

 

>Pulse is always 6-7us.

 

If that means you know when the next pulse shows up, maybe you can kill time to get somewhere toward the end of a pulse period/level then set the data before you check the pin for the next level. Assuming that point where you preset the data is past a time where its state is checked, then you will be as fast as anything else (pin state already set, 0ns response) and you are then simply checking where you are in the pin level progression. The trailing time precision is probably more forgiving, so detecting the pulse edges on a leading edge is no longer time critical.

 

portd = something; //preset

while( pin_is_high() ); //edge is here, data was already set

//kill some time close to the end of a pulse time, but before the next

portd = something_else; //preset

while( pin_is_low() ); //edge is here, data was already set

//kill some time

...

 

 


maybe you can kill time to get somewhere toward the end of a pulse

Yeah, I wanted to play with this, but I think it changes to some degree. But yes, I didn't think of doing the while after setting data to keep in sync. I like that.

 

 

skeeve thx, I'll try my hand at the macro.

 


S_K_U_N_X wrote:
"This will work if the minimum pulse is long enough." Pulse is always 6-7us.
At 16 MHz, that is a variation of 16 cycles.

Something like this might be useful:

  delay 5.5 microseconds
  .rept 16  ; 2 microseconds
    skip
    RJMP 1f
  .endr
  1:
  OUT PIND, --

I think this has the same longest case delay as the loops,

but the shortest will be a cycle longer,

reducing the jitter.

 

What this code is not is fault-tolerant.

If you can live with loops,

I suggest using them.

If not, you need to decide whether you can live with rather fragile code.

If not, you might be SOL.

 

All this assumes general purpose pins.

If the pulse pin is an input capture pin,

improvement might be possible.

In the above code, replace RJMP with OUT PORTD.

PIND will not work.  PORTD needs to have a known next value.

Replace the fixed delay with a computation based on the captured input.

This should do as well as a 2-cycle loop

It is still not very fault-tolerant.

 

Moderation in all things. -- ancient proverb


All of the tweaks to this point have helped. There is just the matter of the first falling edge, which I think I can fix to sync the entire loop altogether.

 

Here is the code

void Poll(void)
{

	unsigned char _low_state_portd_first =0xF3;//prep bits
	unsigned char _low_state_portd_second =0xF3;//prep bits
	unsigned char _low_state_portd_6button=0xF3;//prep bits
	unsigned char loadRegister=0;
	//set direction
	DDRD |= 0xF3;

	_low_state_portd_first &= ~0xc0; //controller detection.  always low no matter what.

	//These are the only two that matter during the first pulse other then the detection.
	if ( reportBuffer[BUTTON_ROW_1] & 0x20) {_low_state_portd_first &= ~0x01;}//Start
	if ( reportBuffer[BUTTON_ROW_1] & 0x01) {_low_state_portd_first &= ~0x02;}//A

	loadRegister=_low_state_portd_first;
	WHILE_PULSE_IS_HI_NO_CHECK
	PORTD = loadRegister;

	.
	.
	

This produces the following ASM as the first edge check

 

loadRegister=_low_state_portd_first;
    WHILE_PULSE_IS_HI_NO_CHECK
    PORTD = loadRegister;
    //prep second
    _low_state_portd_second = 0xf3;
    2162:    83 2f           mov    r24, r19
    2164:    88 1f           adc    r24, r24
    2166:    88 27           eor    r24, r24
    2168:    88 1f           adc    r24, r24
    216a:    93 ef           ldi    r25, 0xF3    ; 243
    216c:    98 1b           sub    r25, r24
    216e:    89 2f           mov    r24, r25

 

 

and for that reason all other detection (even though they are 3 cycles and match) is pointless. That first detection is critical and less forgiving than the others. So I need to make this a bit more efficient.

 

Going back to my 3 cycle code in ASM, my hope is to use that.

 

asm volatile(
        
        "    sbic   0x06, 6 \n"
        "    rjmp     .-4   \n"
        "    out   0x0b, r24\n"
            ::);
}

 

but I do not understand all this 0x06,-4,0x0b stuff? Normally I fill in my registers as presented in the : : section. Maybe like this ?

asm volatile(
        "    mov   r16, %0   \n"
        " start%=:     "
        "    sbic   0x06, 6  \n" // <-- assuming 0x06 is a memory map?
        "    rjmp   start%=  \n"    
        "    out   0x0B, r16\n" // <-- assuming 0x0b is a memory map?
        : : "w" (_low_state_portd_first)
        : "r16"
    );
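On the `0x06`, `.-4`, `0x0b` question, a hedged reading of the listing: `sbic`/`sbis`/`out` take I/O-space addresses, which sit 0x20 below the data-space addresses used by `lds`/`sts`, and the `.-4` that avr-objdump prints is a byte offset counted from the instruction after the `rjmp`. The helper names here are made up; they only reproduce the arithmetic:

```c
#include <stdint.h>

/* I/O-space address (as used by in/out/sbi/cbi/sbic/sbis) -> data-space address. */
static uint16_t io_to_data_addr(uint8_t io_addr)
{
    return (uint16_t)(io_addr + 0x20u);
}

/* Target of "rjmp .+disp" or "brne .+disp" as printed by avr-objdump:
   the displacement is relative to the address after the 2-byte instruction. */
static uint16_t relative_target(uint16_t insn_addr, int16_t disp)
{
    return (uint16_t)(insn_addr + 2 + disp);
}
```

So `0x06` is data address 0x26 and `0x0b` is 0x2B, which the ATmega32U4 register summary lists as PINC and PORTD, and the `rjmp .-4` at 0x21ce in the earlier listing lands back on the `sbic` at 0x21cc. In inline asm, `_SFR_IO_ADDR(PORTD)` from avr-libc can be passed as an `"I"` operand instead of a hard-coded number.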

 

That gets me 500ns, I need to be closer to 300.

 

 

 

skeeve, missed that post, thx for the repeat time, that will help a lot. Was a bit concerned about freezing code.

Last Edited: Tue. Dec 22, 2020 - 10:32 PM

I expect it would be good to put most of this pulse stuff in a .S file.

If it includes the infinite loop, any upper register can be the one needed.

If not, R30 or R31 would do the trick.

Functions are allowed to clobber Z.

 

Again, use macros.

Saves a lot of editing.

 

Of course, setup can be in C.

Moderation in all things. -- ancient proverb


I wouldn't do any inline asm, if you absolutely cannot get the c compiler to do what you want then create an asm file. Since you know how to read an asm listing, and in many cases C code can do the job (you can monitor what it does in the asm), I would stick to C until it no longer can do the job. You probably do not want a big mix of macros/defines/inline asm, as it just becomes more fragile over time and you lose sight of what code is actually doing (both the big picture and the details).

 

I took your code and changed a little-

https://godbolt.org/z/K5ThWj

 

It looks like ~7 instructions to setup the first loop in any case, and when you get to the loop it produces the same loop code as asm would do. As also can be seen, not everything has to be a macro as I put those pulse pin checks into a function. The compiler is smart, so you can take advantage of it while still watching what gets produced. Notice you can also let delay.h put in us delay code if wanted, or if need more than us resolution just use __builtin_avr_delay_cycles(), without any need to do this in asm.

Last Edited: Wed. Dec 23, 2020 - 06:18 AM

After a lot of observations with the real hardware using a scope, it turns out there is only a concern at the first falling edge (and only for some cases, not all). The other edges seem to tolerate 500ns, which the ASM can do just fine.

 

There are two forms, the longer and shorter.

 

//|___|~~~~|____|~~~~~|____|~~~~|____|

  |

 critical

 

//|___|

  |

 very critical

 

The longer form does not seem to be as critical as the shorter one. At least so far... And I know what form is coming and I track that. So, for the short one I set PORTD before the fall because I must react in under 500ns and there are times the ASM below just takes longer than that. So I get a rapid effect where it thinks the response is being pressed off and on. Setting PORTD and then waiting for the falling edge does work in this case. As for the longer form, that trick is not working, but fortunately the timing is not as critical.

 

if (_console_pulling_mode == 3) PORTD=_low_state_portd_first;

asm volatile(
        "    mov    r31, %0   \n"
        " start%=:     "
        "    sbic   0x06, 6  \n"
        "    rjmp   start%=  \n"    
        "    out   0x0B,  r31\n"
        : : "w" (_low_state_portd_first)
        : "r31"
    );

 

This works out rather well, but it's a band-aid for the lack of speed. It would be much more efficient if I could get it working without this.  I'll settle for less I guess... Watching the scope I see a range of 300~-500~ ns, guessing 3-5 clock cycles. If there was a way to get just one cycle faster I'd have the perfect solution.  
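For the cycle estimate, the conversion is simple enough to check. This sketch assumes the 16 MHz F_CPU from the makefile in the first post; the function names are made up:

```c
#define F_CPU_HZ 16000000.0   /* assumption: F_CPU from the makefile */

/* Duration of a number of CPU cycles, in nanoseconds. */
static double cycles_to_ns(double cycles)
{
    return cycles * 1e9 / F_CPU_HZ;
}

/* Number of CPU cycles that fit in a duration given in nanoseconds. */
static double ns_to_cycles(double ns)
{
    return ns * F_CPU_HZ / 1e9;
}
```

One cycle is 62.5 ns, so 300 to 500 ns is about 4.8 to 8 cycles rather than 3 to 5. The best-case path through the loop (sbic taking the skip, then out) is 3 cycles, or 187.5 ns, before any pin synchronizer delay is counted.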

 

 

 

 

produced code

2228:    36 99           sbic    0x06, 6    ; 6
222a:    fe cf           rjmp    .-4          ; 0x2228 <start374>
222c:    fb b9           out    0x0b, r31    ; 11

 

Another thought I had was adding a pull-down resistor to speed up the lowering of the pulse, thinking it's just taking too long to fall. But I never played with that idea. As it is, there is currently no resistor on the MCU, but presumably the device I'm connected to has a pull-up (could measure it).

 

If I'm at my limit, so be it.

 

 

 

 

Last Edited: Fri. Dec 25, 2020 - 11:08 PM

Show the C code.


I wrote that piece in asm?

 

 

Did you mean the surrounding code?

void Poll(void)
{
	unsigned char _low_state_portd_first =0xF3;//prep bits
	unsigned char _low_state_portd_second =0xF3;//prep bits
	unsigned char _low_state_portd_6button_First;//prep bits
	unsigned char _low_state_portd_6button_Second=0xF3;//prep bits
	unsigned char _low_state_portd_6button_Second_Third=0xF3;//prep bits
	
	//set direction
	DDRD |= 0xF3;

	if ( reportBuffer[BUTTON_ROW_1] & 0x20) {_low_state_portd_first &= ~0x01;}//Start
	if ( reportBuffer[BUTTON_ROW_1] & 0x01) {_low_state_portd_first &= ~0x02;}//A

	_low_state_portd_first &= ~0xc0; //controller detection.  
	
	//prep second
	_low_state_portd_second = 0xf3;
	
	if ( reportBuffer[BUTTON_ROW_1] & 0x80) _low_state_portd_second &= ~0x01; //C 
	if ( reportBuffer[BUTTON_ROW_1] & 0x02) _low_state_portd_second &= ~0x02; //B 
	if ( reportBuffer[HAT] == DPAD_RIGHT || reportBuffer[HAT] == DPAD_DOWNRIGHT || reportBuffer[HAT] == DPAD_UPRIGHT) {_low_state_portd_6button_Second_Third |= 0x80; _low_state_portd_first &= ~0x80; _low_state_portd_second  &= ~0x80; }
	if ( reportBuffer[HAT] == DPAD_LEFT || reportBuffer[HAT] == DPAD_DOWNLEFT || reportBuffer[HAT] == DPAD_UPLEFT )	  {_low_state_portd_6button_Second_Third |= 0x40; _low_state_portd_first &= ~0x40; _low_state_portd_second  &= ~0x40; }
	if ( reportBuffer[HAT] == DPAD_UP || reportBuffer[HAT] == DPAD_UPLEFT || reportBuffer[HAT] == DPAD_UPRIGHT)    	  {_low_state_portd_6button_Second_Third |= 0x10; _low_state_portd_first &= ~0x10; _low_state_portd_second  &= ~0x10; }
	if ( reportBuffer[HAT] == DPAD_DOWN || reportBuffer[HAT] == DPAD_DOWNLEFT || reportBuffer[HAT] == DPAD_DOWNRIGHT) {_low_state_portd_6button_Second_Third |= 0x20; _low_state_portd_first &= ~0x20; _low_state_portd_second  &= ~0x20; }

	
	if ( reportBuffer[BUTTON_ROW_1] & 0x40) _low_state_portd_6button_Second  &= ~0x10;//z
	if ( reportBuffer[BUTTON_ROW_1] & 0x08) _low_state_portd_6button_Second  &= ~0x20;//y
	if ( reportBuffer[BUTTON_ROW_1] & 0x04) _low_state_portd_6button_Second  &= ~0x40;//x
	if ( reportBuffer[BUTTON_ROW_1] & 0x10) _low_state_portd_6button_Second  &= ~0x80;//mode

if (_console_pulling_mode == 3) PORTD=_low_state_portd_first;
	asm volatile(
		"	mov    r31, %0   \n"
		" start%=:     "
		"	sbic   0x06, 6  \n"
		"	rjmp   start%=  \n"	
		"	out   0x0B,  r31\n"
		: : "w" (_low_state_portd_first) 
        : "r31"
    );

	if ( _console_pulling_mode==6) 
	{
	_low_state_portd_6button_First=_low_state_portd_first;
	_low_state_portd_6button_First &= ~0xf0;
	
		//|___| 
		asm volatile("start%=:\n sbis 0x06,6 \n rjmp start%=\n":::);//sync to rise.
		PORTD = _low_state_portd_second;
 
		//|___|~~~~|
		asm volatile("start%=:\n sbic 0x06,6 \n rjmp start%=\n":::);//sync to fall.
		PORTD = _low_state_portd_first; 
 
		//|___|~~~~|____|
		asm volatile("start%=:\n sbis 0x06,6 \n rjmp start%=\n":::);//sync to rise.
		PORTD=_low_state_portd_second;

		//|___|~~~~|____|~~~~~|
		asm volatile("start%=:\n sbic 0x06,6 \n rjmp start%=\n":::);//sync to fall.
		PORTD=_low_state_portd_6button_First; 

		//|___|~~~~|____|~~~~~|____|  
		asm volatile("start%=:\n sbis 0x06,6 \n rjmp start%=\n":::);//sync to rise.
		PORTD=_low_state_portd_6button_Second;

		//|___|~~~~|____|~~~~~|____|~~~~~|  
		asm volatile("start%=:\n sbic 0x06,6 \n rjmp start%=\n":::);//sync to fall.
		PORTD = _low_state_portd_6button_Second_Third;
	}
	

	asm volatile("start%=:\n sbis 0x06,6 \n rjmp start%=\n":::);//sync to rise.
	PORTD=_low_state_portd_second;
 

}

 


I think the suggestion is to ditch inline and put the entire algorithm as .S in a standalone Asm source (obviously you can use -save-temps as a starting point to initially generate it then hand optimize)


Will that change the interaction time for that first falling edge?

 


curtvm wrote:
I wouldn't do any inline asm, if you absolutely cannot get the c compiler to do what you want then create an asm file.
Unless one likes reading assembly after each recompile, not a good criterion.

If one has to struggle to get the compiler to emit the assembly one wants,

'tis time for assembly.

Moderation in all things. -- ancient proverb


I would not recommend putting all of Poll in a .S file.
I'd just do its tail,
starting with "if ( reportBuffer[BUTTON_ROW_1] & 0x40) ...//z"

The existing code has explicit constants that should probably have names.
There are at least six very similar asm statements.
A macro would be good whether or not they move to a .S file.
Those asm statements also contain some of the explicit constants I mentioned.

Moderation in all things. -- ancient proverb


The asm is doing nothing special that the C is not already doing, and there is no struggle to get C to do what is wanted here. If the time constraint simply boils down to the checking of the pin state (if unable to preset), you cannot escape that loop time (sbis/c, rjmp). You can probably shave a clock off for some amount of time if you are willing to do a bunch of consecutive if checks instead of a while loop, but you still would have to contain that in a loop so eventually still need an rjmp. You would increase your odds, but still a chance the pulse takes place when the rjmp happens.

something like-

https://godbolt.org/z/fdfYYG

 

Copied latest code, modified-

https://godbolt.org/z/MrobPf

https://godbolt.org/z/8GxK3z

the checking for port bit set/clr can be done with a C function, and is exactly the same as what is being done in asm. In this case, asm is gaining nothing.

(I'm making some guesses in these codes, the second one is easier to read/understand and although I didn't take the time to figure out the details they both compile to the same amount of asm lines)

 

why do this-

asm volatile("start%=:\n sbis 0x06,6 \n rjmp start%=\n":::);//sync to rise.

 

when this is the exact same thing (and if decide on a different pin, change the function and all the other uses are taken care of)-

while( pulseIsLow() );//sync to rise.

 

or

if (_console_pulling_mode == 3) PORTD=_low_state_portd_first;
    asm volatile(
        "    mov    r31, %0   \n"
        " start%=:     "
        "    sbic   0x06, 6  \n"
        "    rjmp   start%=  \n"    
        "    out   0x0B,  r31\n"
        : : "w" (_low_state_portd_first) 
        : "r31"
    );

 

when this is the same thing-

    uint8_t v = _low_state_portd_first;
    if( _console_pulling_mode == 3 ) PORTD = v; //preset only in 3 button mode (?)
    while( pulseIsHigh() );
    PORTD = v;
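The helper functions used above are not shown in the thread (they live behind the godbolt links), so here is one plausible shape for them, as a sketch only. It assumes the pulse is on PINC bit 6, which is what `sbic 0x06, 6` in the listings would correspond to on the ATmega32U4; the `#else` branch is a host stand-in so the snippet compiles off-target:

```c
#include <stdbool.h>
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>
#define PULSE_PIN PINC             /* assumption: I/O address 0x06 */
#else
static volatile uint8_t PULSE_PIN; /* host stand-in for the pin register */
#endif

#define PULSE_BIT 6

/* True while the pulse line reads high. */
static inline bool pulseIsHigh(void)
{
    return (PULSE_PIN & (1u << PULSE_BIT)) != 0;
}

/* True while the pulse line reads low. */
static inline bool pulseIsLow(void)
{
    return !pulseIsHigh();
}
```

Changing the pin then means touching only PULSE_PIN and PULSE_BIT. Whether the compiler still emits the two-instruction sbic/sbis loop is worth confirming in the .lss after each change.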

 

 

Last Edited: Sun. Dec 27, 2020 - 10:41 AM

curtvm wrote:
when this is the exact same thing (and if decide on a different pin, change the function and all the other uses are taken care of)-
Emphasis added.

+1

 

I'm still dubious about relying on C to always produce the 3-cycle loop.

Compilers do strange and apparently inconsistent things sometimes.

Sometimes can include the time one forgets to read the assembly.

 

Also, OP has stated that only the first transition needs to be precise.

A delay followed by 2 microseconds of skips and outs should do the trick.

The rest can be done with loops.

Moderation in all things. -- ancient proverb