ATmega32U4: 16MHz to 20MHz or greater.


I have hit a brick wall with my current project and I have no choice but to raise my clock from the standard 16MHz. So just brainstorming here.

1) Adjusting my code: I'll have to address any time-sensitive code, which should not be an issue.

2) I use LUFA, and I imagine there is a clock setting. Simple enough, I'm guessing?

3) Bootloader: not really sure here. I'd imagine the 16MHz was set in the makefile at build time, so I'll need to find the source for the DFU bootloader that comes with the chip, then recompile and flash it.

Anything I'm not thinking of here? I'll need at least 20MHz, but if 32MHz is something I can come by, it may make adjusting the time-sensitive code easier (just doubling it). I'm not really sure where to find the Atmel bootloader source. Would it be easier to switch between a 16MHz and a 20MHz crystal? I could tie that to a DPDT switch and simultaneously pull HWB low.

Last Edited: Sat. Nov 28, 2020 - 01:48 PM

Per the SOA (safe operating area), 16MHz is the max.

ATmega32U4 - 8-bit AVR Microcontrollers

 

"Dare to be naïve." - Buckminster Fuller


Well that's that then.


Yeah, I saw that; it's a bit of an upgrade. I wish there was a better option with the same form factor. I could also consider a co-processor for my issue.


Have you done a thorough review of your code architecture?
Analyzed the flow for the bottleneck that's causing you to need to go faster than 16 MHz? 
Tried ASM for critical time functions?
Looked at other code optimization hints and ideas?

Unless you're doing DSP, high speed motor control, or other math intensive applications, 16 MHz should be able to handle any human controlled project.

If you do need the extra processing, don't do a little jump. Go with 32-bit or at least a DSP like the dsPIC33 and make your life easier.
Today's devices are too inexpensive to hobble your software development time (which can be much more expensive than hardware).

Just my $0.02

"If you find yourself in an even battle, you didn't plan very well."
https://www.gameactive.org
https://github.com/CmdrZin


Why do you say you need to increase speed?  Use a faster type of calculation instead.

When in the dark remember-the future looks brighter than ever.   I look forward to being able to predict the future!



It started here. And yes, that did work, but it got even tighter because the timing depends on the game. The project is to emulate a video game controller, and each game apparently can be more aggressive than the last. I had to give up on simulating the hardware (74HCT04) as the expected response time is now around 100 ns. The AVR just cannot deal with it alone (hence my comment about a co-processor). I'm very open to suggestions here, but I'm afraid 16MHz will not cut it. Happy to go into more detail on that other topic or this one, should anyone want to dive deeper into the rabbit hole.


There is a poorly written but useful write-up here on the controller I'm trying to emulate.

https://www.raspberryfield.life/...

but see the image here

The select line is driven by the console and the other lines are responses to it. In the shortest case, when the select line falls, I need to raise or lower the other 6 lines accordingly. At the moment they are on PORTD and PORTC, and I may be able to rewire it so they are all on PORTD to save code time, but as I mentioned I must react within 100-200ns. I tried to time it so the 6 pins were set just before the fall, but that didn't please all games. I think I'm just going to have to play by the rules and find a faster solution.

 

 

Last Edited: Sat. Nov 28, 2020 - 05:27 PM

I assume you picked 32U4 for USB. If so it needs to be either an 8MHz or 16MHz crystal to run the USB correctly.


Yes, mainly for LUFA support. I was not aware of the 8/16MHz requirement, but I'm not surprised.


The way USB works is that either 8MHz is multiplied by 6 (using the PLL) to make 48MHz, or 16MHz is multiplied by 3 to get to the same 48MHz; then in either case it is divided by 4 to get the 12MHz that is actually required for USB. So you have to start with 8 or 16.
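The arithmetic in that paragraph can be written out as a small sketch. It takes the multiply and divide figures exactly as stated in the post and does not model the chip's actual prescaler or PLL registers; `usb_clock_hz` is a name made up for illustration.

```c
#include <stdint.h>

/* USB clock as described in the post: crystal * PLL multiplier gives
 * 48MHz, which is then divided by 4. Returns 0 for any crystal value
 * the post does not cover (assumption: only 8MHz and 16MHz qualify). */
uint32_t usb_clock_hz(uint32_t crystal_hz)
{
    uint32_t multiplier;

    if (crystal_hz == 8000000UL) {
        multiplier = 6;           /* 8MHz * 6 = 48MHz */
    } else if (crystal_hz == 16000000UL) {
        multiplier = 3;           /* 16MHz * 3 = 48MHz */
    } else {
        return 0;                 /* e.g. a 20MHz crystal has no entry here */
    }
    return crystal_hz * multiplier / 4;
}
```

Both supported crystals come out at 12000000, while a 20MHz crystal falls through to 0, which is the point being made about having to start from 8 or 16.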

 

As noted above many Xmega run at 32MHz but also come with USB as standard. However note that until recently most are 3.3V only. 


S_K_U_N_X wrote:
I had to give up on simulating the hardware ( 74HCT04 ) as the expected response time is now around 100 ns.
Brad's territory ... likely do-able.

S_K_U_N_X wrote:
The AVR just can not deal with it alone (hence my comment about a co -processor ).
Might gain just enough logic and speed via CCL.

S_K_U_N_X wrote:
I'm very open to suggestions here but I'm afraid 16MHz will not cut it.
An AVR D (24MHz) with USB by mega32U4 (functions reside on one or the other)

cPLD and FPGA don't have USB; essential minimal set of functions on programmable logic with the remainder on the mega32U4.

 


XMega cranks out NTSC color and digital stereo sound! | AVR Freaks

CCL - Configurable Custom Logic | Migration from the megaAVR® to AVR® Dx Microcontroller Families

Icestorm | AVR Freaks

 

"Dare to be naïve." - Buckminster Fuller


Might gain just enough logic and speed via CCL.

 

Correct me if I'm wrong but that is not an option for the mega32u4 ?


I'd use a different approach.
From a review of the timing diagrams in the raspberry link, some things stand out.

 

The message transfer from Controller to Host takes about 40us out of 20ms, a ~0.2% duty cycle. So the code can stay inside the ISR for the entire transfer.
The SELECT pulse IDLE state is SELECT (HIGH). Pulse 1 is /SELECT > SELECT ... Pulse 4 is /SELECT > SELECT (IDLE)
Sample near the middle of the pulse using a ~2us (32 cycles) delay after an edge of the pulse.
 

Button    Pressed
Up        Pin1 HIGH during 4th /SELECT only if it is LOW during the 2nd /SELECT pulse.
Down      Pin2 HIGH during 4th /SELECT only if it is LOW during the 2nd /SELECT pulse.
LEFT      Pin3 HIGH during 4th /SELECT only if it is LOW during the 2nd SELECT pulse.
RIGHT     Pin4 HIGH during 4th /SELECT only if it is LOW during the 2nd SELECT pulse.
A         Pin6 LOW during 4th /SELECT only if it is LOW during the 2nd /SELECT pulse.
B         Pin6 HIGH during 4th /SELECT only if it is LOW during the 2nd SELECT pulse.
START     Pin9 LOW during 4th /SELECT only if it is LOW during the 2nd /SELECT pulse.
C         Pin9 HIGH during 4th /SELECT only if it is LOW during the 2nd SELECT pulse.
X         Pin3 LOW during 3rd SELECT pulse.
Y         Pin2 LOW during 3rd SELECT pulse.
Z         Pin1 LOW during 3rd SELECT pulse.
MODE      Pin4 LOW during 3rd SELECT pulse.

So, collect IO pin data during the 2nd /SELECT, 2nd SELECT, 3rd SELECT, and 4th /SELECT pulses during the ISR. (40us)

Set a flag that the transfer took place.
On flag, take these four patterns and process them through a series of IF statements setting button flags for which buttons are pressed.
Twelve IFs should not take much time at all out of the 20ms available.

"If you find yourself in an even battle, you didn't plan very well."
https://www.gameactive.org
https://github.com/CmdrZin

Last Edited: Sat. Nov 28, 2020 - 10:13 PM

Not an option other than what logic exists within the peripherals.

 

"Dare to be naïve." - Buckminster Fuller


CmdrZin, some of the info is not explained well, and some is not in the thread, but from what you said I think knowing more would show why I'm not convinced this will work.

 

I pretty much do all my prep work before the falling edge. I then spin in a while loop for as long as the line is high, then set the ports. I have experimented with ASM and a pin change interrupt with no acceptable results.

 

One example:

WHILE_PULSE_IS_HI_NO_CHECK        // a while loop that checks a pin state
PORTD = _low_state_portd_first;

 

The issue I'm having is that the time from the fall to the check is less than 200ns.

 

123456789012345678

|||||||||||||||||||||||||||||   1--ns segments

~~~~~~~~|_______

 

Take, for example, the fall at the second '1'; call this the 10th clock if you will. Clock number 12 is where the console checks the other 6 pins. I do not have 100ns to spare: detecting the fall and setting the port takes 3 or 4 MCU cycles, and that adds up to more than 100ns.
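To put numbers on that cycle budget, here is a minimal sketch that converts a cycle count into time for a given clock. The 3 and 4 cycle figures come from the post; `cycles_to_ps` is a hypothetical helper name, and picoseconds are used only so the integer math stays exact.

```c
#include <stdint.h>

/* Duration of `cycles` CPU clock cycles in picoseconds, for a clock
 * frequency given in hertz. */
uint64_t cycles_to_ps(uint32_t cycles, uint32_t f_cpu_hz)
{
    return (uint64_t)cycles * 1000000000000ULL / f_cpu_hz;
}
```

At 16MHz, 3 cycles is 187.5ns and 4 cycles is 250ns. At 20MHz the same work takes 150ns to 200ns, and at 32MHz it takes 93.75ns to 125ns, so under this estimate even doubling the clock only just approaches a 100ns window.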

 

 

 

 

 

 


 but as I mentioned I must react with in 100-200ns.

All of these pulses are way longer than that...have you seen this number mentioned or specified somewhere? When exactly is the read event compared to the edge?

I suppose the pulse width could be very long, but maybe the game unit does the reading of the switches (which you hope to emulate), right after the edges of the clock (select) pulses??...so maybe their wide width does nothing.

 

However, you don't have to wait for the pulse to put your switch data out there

...do it way ahead of the rising edge....the rising edge will come along from the game, but you were ahead of the game & your switch data is already there...right after the edge, the game reads the data within 100ns of the edge... next, in a quick wink you spit out the falling edge  switch data....way before the falling edge....then the falling edge arrives from the game and immediately thereafter does a read, but you were already 10 miles ahead with the info. After 100ns from the edge, the game reads your data & you race to beat it to the next edge. 

 

Each edge is your starting pistol to get new data out before the next edge.....but there is loads of time between edges.   Just be sure to let the data linger after the edge long enough so the game has time to read it.

Now you will have lots of microseconds (or even milliseconds) to get ready--no 100ns rush is needed.

 

I might be off my spaceship, but this seems doable without going supernova.

 

maybe this will inspire you

https://www.youtube.com/watch?v=...

 

 

 

 

 

When in the dark remember-the future looks brighter than ever.   I look forward to being able to predict the future!

Last Edited: Sun. Nov 29, 2020 - 09:35 AM

DOH, just ignore my suggestion. For some reason I thought you were trying to decode the controller, not generate its signals.

"If you find yourself in an even battle, you didn't plan very well."
https://www.gameactive.org
https://github.com/CmdrZin


avrcandies wrote:

However, you don't have to wait for the pulse to put your switch data out there

...do it way ahead of the rising edge..

I'm not sure that's an option because S_K_U_N_X would be changing data lines whilst the "Game" could be reading it.

 

Is this a "Double-Data-Rate" hack where the lines are read an arbitrary delay after each SELECT line transition ? (as opposed to only a low going edge)

 


I'm not sure that's an option because S_K_U_N_X would be changing data lines whilst the "Game" could be reading it.

 From the description of the keypad it appears that the rising edge & falling edges each cause different data to be output in a step-by-step parade ...at some short time after each edge (maybe the 100ns being tossed around) the controller reads whatever has been put there*.  Instead you can put the data there ahead of the edge (but after the 100ns reading of the prior edge). The edges are separated by long time intervals, maybe 10 or 20us at least.  So if you wait 5us post-edge (to allow the sampling), you still have 5-10us to get ready for the next edge....much more than the 100ns.  It would be really nice to know when the reading (sampling) occurs relative to the high & low edges.   Even if you could only pre-position the data 2us ahead of the edges, that is a big improvement.

 

   * I'd suspect in the TTL days, the edges would cause a FF & gates to immediately react and post some data, well within a 100ns sampling setup time..
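As a rough illustration of the budget described above, the sketch below turns "edge spacing minus post-edge wait" into a count of CPU cycles. The 10us to 20us spacing and the 5us wait are the guesses from this post, not measured values, and `prep_cycles` is a hypothetical name.

```c
#include <stdint.h>

/* CPU cycles available to pre-position the next data, given the spacing
 * between SELECT edges and the time to hold the old data after an edge.
 * Returns 0 if the hold time uses up the whole interval. */
uint32_t prep_cycles(uint32_t edge_spacing_ns, uint32_t hold_ns,
                     uint32_t f_cpu_hz)
{
    if (hold_ns >= edge_spacing_ns) {
        return 0;
    }
    uint64_t window_ns = (uint64_t)edge_spacing_ns - hold_ns;
    return (uint32_t)(window_ns * f_cpu_hz / 1000000000ULL);
}
```

With edges 10us apart and a 5us hold, a 16MHz part has 80 cycles to work with; with 20us spacing it has 240. Whether real games leave that much room is exactly what is in question in this thread.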

When in the dark remember-the future looks brighter than ever.   I look forward to being able to predict the future!

Last Edited: Sun. Nov 29, 2020 - 10:42 AM

 (maybe the 100ns being tossed around)

 Precisely what confused me at first: it's not up to the hardware. This is also not the first time I have seen this with video game consoles. Somehow or other the poll rate/check is set by the game. So each time I adjusted the "clocked out bits", if you will (in other words, the condition of the port register), I found another case that was more demanding. I suspect what is going on here is that the pulse is hardware generated and the game software reads the 74HCT04 chip. If the game is lightweight, the read is going to happen faster. This is in line with the games I chose: the less graphically intense, the shorter the window. So, in short, I'm giving up on using an AVR to do the job. Even if I used an IRQ and had the called function set the ports, I bet it would still not be fast enough. If I could abuse the SPI, maybe, but that path is unclear as there are not enough pins to fake an SPI simulation. Software just cannot react in time in all cases.

 

I suppose for all practical intents and purposes I could use a 74HCT04 chip, if I can find one on the market. Or maybe a CD74HCT04.

 

Last Edited: Sun. Nov 29, 2020 - 04:19 PM

Well, given this schematic I found, I can see that you really need a hardware solution, as you've concluded.

Schematic for Sega Controller

If the host is generating the SELECT line, you'll never know how fast they sample the lines after the edges. A slow system may be 1us. A fast system may be 50ns.
I'd go with a mux and two 4-bit registers. Preload the registers per the button pattern for different controllers and let the hardware do the work.
May need a mux for the SELECT line also.
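A behavioral model of the mux-and-registers idea might look like the sketch below. It only models the logic: the MCU preloads two 4-bit values at its leisure, and the level of SELECT picks which one appears on the outputs. The type and function names are hypothetical, and which register pairs with which SELECT level is an assumption.

```c
#include <stdint.h>
#include <stdbool.h>

/* Two 4-bit registers preloaded by the MCU, feeding a 2:1 mux that is
 * steered directly by the SELECT line. */
typedef struct {
    uint8_t reg_select_high;  /* nibble presented while SELECT is high */
    uint8_t reg_select_low;   /* nibble presented while SELECT is low  */
} mux_model_t;

/* Slow path: the MCU loads both registers from the current button state. */
void mux_preload(mux_model_t *m, uint8_t when_high, uint8_t when_low)
{
    m->reg_select_high = when_high & 0x0F;
    m->reg_select_low  = when_low  & 0x0F;
}

/* Fast path: in hardware this is pure combinational logic, so the
 * response time is the mux propagation delay, not MCU cycles. */
uint8_t mux_output(const mux_model_t *m, bool select_level)
{
    return select_level ? m->reg_select_high : m->reg_select_low;
}
```

The point of the split is that only `mux_preload` runs in software, and it has the whole polling interval to finish.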

"If you find yourself in an even battle, you didn't plan very well."
https://www.gameactive.org
https://github.com/CmdrZin

Last Edited: Sun. Nov 29, 2020 - 04:55 PM

 

 here is the timing of the select (TH) line, from sega

     

     this is from page 133 of the technical information

   https://fabiensanglard.net/anoth...

 

To me, it seems like you can start putting out the next data after 2us after the clock HI or LO edge...you will have plenty of time before the next edge occurs to get it done.

 

 

 

When in the dark remember-the future looks brighter than ever.   I look forward to being able to predict the future!

Last Edited: Sun. Nov 29, 2020 - 06:40 PM

Nice find.

avrcandies wrote:
To me, it seems like you can start putting out the next data after 2us after the clock HI or LO edge...you will have plenty of time before the next edge occurs to get it done.

I'd agree that's the way to go or at least try out. Good luck.

"If you find yourself in an even battle, you didn't plan very well."
https://www.gameactive.org
https://github.com/CmdrZin


I'd agree that's the way to go or at least try out. Good luck.

The 2us is probably to allow margin between getting the pulse, putting out data, & reading the data in 2us later. Maybe some games violate this & read much quicker, leading to a 100ns time crunch.  My suggestion is a proposed workaround--give it a try, nothing to lose!!  You could possibly output several updates, until the next pulse arrives (would not be much benefit, unless the waiting interval is really long, like 20ms).

When in the dark remember-the future looks brighter than ever.   I look forward to being able to predict the future!


Evidence to the contrary, but damn nice find. What I see on my scope is well before 2us. One trick I tried was to set the states before the fall so they were ready, but somehow games defeated this. Or at least it didn't work. It almost felt like the lines were checked both before and after the poll. I used this trick on PC PACMAN because it was checking too soon after the fall, and there the trick did work. But on a slow game like Mortal Kombat 3, it would mess things up. Since game 1 was a 3 button game and game 2 was a 6 button game, I changed my code based on that. That worked, but I then found another game that failed in yet another way. So as I said above, I'm throwing in the towel on this and going with hardware.

Last Edited: Mon. Nov 30, 2020 - 12:20 AM

 One trick I tried was to set the states before the fall so they were ready but somehow games defeated this.

You'd have to do the same on the rise as well.  Have the data ready to go before either type of edge.  However, not too early before the edge (in other words, not before the game has had time to read the old data).  You'd also have to keep good track of exactly which edge you are dealing with (out of the 5 or 6 edges), or things will get completely hosed.
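One way to keep track of which edge you are on is a small step counter over a table of precomputed port values, sketched below. The table contents, the step count, and all names are placeholders for illustration; the real values depend on the button state and the controller type.

```c
#include <stdint.h>
#include <stddef.h>

#define SEQ_MAX_STEPS 8   /* hypothetical upper bound on edges per poll */

typedef struct {
    uint8_t pattern[SEQ_MAX_STEPS]; /* port value to present at each step */
    size_t  steps;                  /* valid entries in pattern, at least 1 */
    size_t  index;                  /* step currently on the pins */
} edge_seq_t;

/* Start of a poll (or idle timeout): go back to step 0 and return the
 * value to put on the pins. */
uint8_t seq_reset(edge_seq_t *s)
{
    s->index = 0;
    return s->pattern[0];
}

/* Call once the old data has been held long enough after an edge:
 * advance to the next step and return the value to pre-position
 * before the next edge arrives. Wraps to step 0 after the last step. */
uint8_t seq_advance(edge_seq_t *s)
{
    s->index = (s->index + 1) % s->steps;
    return s->pattern[s->index];
}
```

A missed or extra edge leaves `index` out of step with the console, which is the "completely hosed" case; calling `seq_reset` after a long idle gap is one possible way to resynchronize.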

 

this guy made a USB controller for 3 brands of controllers & an AVR...maybe useful?

https://digitalcommons.calpoly.e...

 

a nice thing to look at (but it reads a controller rather than emulating one)

https://github.com/jonthysell/Se...

 

this guy seems to sort of have my idea (or my idea is junk)

(look at) Preparing in advance for the next transition    https://www.raphnet.net/programm...

 

a nice link

http://www.msarnoff.org/gen2usb/

 

 

 

 

 

When in the dark remember-the future looks brighter than ever.   I look forward to being able to predict the future!


To make the AVR impersonator work reliably for ALL games S_K_U_N_X would have to emulate the hardware pretty closely because I'll bet the farm that some games don't follow the rules.

I would say that emulating the hardware multiplexor using a 16MHz AVR is impossible. Using a 20MHz AVR - also impossible. Using a 32MHz XMEGA - An interesting challenge. Using a 60MHz SAM or PIC32 fairly easy.

 

Adding the multiplexor seems the easier solution, but may need more I/O pins.

 


N.Winterbottom wrote:

I would say that emulating the hardware multiplexor using a 16MHz AVR is impossible. Using a 20MHz AVR - also impossible. Using a 32MHz XMEGA - An interesting challenge. Using a 60MHz SAM or PIC32 fairly easy.

 


Then a 240MHz ESP32 should do the job nicely or failing that get a 600MHz Teensy 4!


You'd have to do the same on the rise as well.  Have the data ready to go before either type of edge.  However,  not toooo early before the edge

 This is precisely what I tried; it seemed the obvious thing to try, with great hope it would work. Alas, it did not. Many are coming to the same conclusion here. I ordered a few multiplexors to play with and am just messing around with the code for fun. Every time I make one game work, another fails. A hardware add-on in this case will work with my design, and I wager it's best.


and came a bit obvious to try it with great hope it would work, alas,  did not.

Well, I should send you a few kegs of beer for trying.

When in the dark remember-the future looks brighter than ever.   I look forward to being able to predict the future!


An option is to attach the equivalent to the AVR.

GreenPAK | Dialog Semiconductor (mixed-signal; sPLD plus analog)

Some GreenPAK™ parts are the price of a small tinyAVR.

 

"Dare to be naïve." - Buckminster Fuller


Going to dive back in the code here for a few reasons.

 

1) To get a 6 button controller working I'm going to need a few multiplexers and the BOM is getting a bit over budget and taking up too much space. It's not a show stopper but I'd rather this be my backup.

2) I reread avrcandies' post; I didn't see the links the first time. I know raphnet pretty well, and if he pulled it off, it probably worked. He does not have the game collection I do, so I'll put it to the test.

3) Reading over his design and what avrcandies said, it may just work. It is going to come down to what N.Winterbottom said, that some games just do not play by the book, and I'm pretty sure I have an idea which games those would be.

4) There are a number of advantages if I make this work.

 

Though, if not, I have a backup plan. So I'm doing this mainly for the experience and as a learning exercise.  That said, I wired up the pin change interrupt for a test, and with the C-style code the call to the ISR shows a 1us delay before the debug reaction. That's slower than my design, so I'm guessing I need to understand a bit more of what is going on here. So I have a few questions.

 

1) Is a PCINT any slower than an INT?

 

PCICR |= (1 << PCIE0);

PCMSK0 |= (1 << PCINT7);

 

vs

 

EICRA = (1<<ISC00);
EIMSK = (1<<INT0);

 

?

 

2) What does this do?

/* Move the vector to the bootloader section where we have direct code for INT0. */
    MCUCR = (1<<IVCE); <--- confused by these, because MCUCR normally talks about bits 0 and 1 being the pin condition to watch for (ISC00 and ISC01)?
    MCUCR = (1<<IVSEL); <---

 

Are we making the vector more easily accessible? Will that interfere with my bootloader that is running? How do I do the equivalent for my PCINT ?

 

3) Currently I'm using C-style code. I of course realize ASM will give me more control, but is the pin change interrupt going to jump into the code any faster just because it's done in ASM? Perhaps the delay comes from the fact that the compiler does not inline the function.

 

I tried to use the code below to make this work

 

PCMSK0 |= (1 << PCINT7); // Enable PCINT7
PCICR |= (1 << PCIE0);

MCUCR |= (1<<IVCE);
MCUCR |= (1<<IVSEL);

sei(); // enable interrupts
_delay_ms(17);
PCMSK0 &= ~(1 << PCINT7);

but all that does is reset my atmega32u4 after 200us. 

 

 

 

 

Last Edited: Sat. Dec 5, 2020 - 04:14 PM

S_K_U_N_X wrote:
1) To get a 6 button controller working I'm going to need a few multiplexers and the BOM is getting a bit over budget and taking up too much space.
GreenPAK have several to maybe many multiplexers.

8-bit Multiplexer | Dialog Semiconductor

 

"Dare to be naïve." - Buckminster Fuller


For a completely ASM project the author could reserve a CPU register or two so that an ISR merely has to OUT that reserved register to PORTC etc.

 

    out PORTC,r2
    out PORTD,r3
    reti

Because OUT doesn't affect SREG there is no need to save it -> Super fast.

 


Register reservation in FSF AVR GCC?

Fixed Registers | avr-gcc - GCC Wiki

...

User-defined global registers by means of global register asm and / or -ffixed-n won't be saved or restored in function pro- and epilogue.

https://gcc.gnu.org/onlinedocs/gcc-5.5.0/gcc/Code-Gen-Options.html#index-ffixed

 

"Dare to be naïve." - Buckminster Fuller