So I moved my initial PDP-8 emulation code over to the XMEGA 128A1U running at 29.4912 MHz. It started at around 108K instructions per second running my Enigma test code, set to never stop. Switching from "optimize for size" (-Os) to "optimize most" (-O3) made a decent improvement, and together with a few other things I've tried it is now at 133K instructions per second.
I'm attaching the project so far for anyone willing to look through it and give me any performance tips.
My goal is 0.333 MIPS, or 333K instructions per second, which may or may not be obtainable. Looking at the .lss listing, the compiler is generating a lot of AVR instructions for each emulated instruction.
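To put the numbers in terms of AVR clock cycles per emulated instruction, here is a quick calculation. It is only arithmetic on the figures quoted in this post; the function name is made up for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: whole AVR clock cycles available per emulated
 * PDP-8 instruction at a given clock and emulated instruction rate. */
uint32_t cycles_per_instruction(uint32_t f_cpu_hz, uint32_t instr_per_sec)
{
    return f_cpu_hz / instr_per_sec;   /* integer division, rounds down */
}

/* With the figures from this post (F_CPU = 29,491,200 Hz):
 *   108K instr/s -> 273 cycles per emulated instruction
 *   133K instr/s -> 221 cycles per emulated instruction
 *   333K instr/s ->  88 cycles per emulated instruction (the goal)
 */
```

So to hit the goal, the average emulated instruction has to drop from roughly 221 clock cycles to about 88, including fetch, decode and any call overhead.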
I tried replacing the uint8_t PDP-8 registers with some of the GPIORx registers using #defines, but it made it slower, not faster.
I know that putting a cycle count inside the pdp8_execute loop, so that one call runs many instructions, would mean its entry pushes and exit pops are paid once per call instead of once per instruction, so that is one thing I can do.
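Roughly what I have in mind for the cycle-count version is below. This is a minimal sketch, not the code from the attached project: the `pdp8_t` layout, the `cpu` global and the `count` parameter are all assumptions for illustration, auto-indexing is left out, and IOT/OPR are stubbed. The point is only that the function prologue and epilogue run once per batch, and that PC and AC are copied into locals, which may let the compiler keep them in registers for the whole loop.

```c
#include <stdint.h>

/* Hypothetical state layout, not taken from the attached project. */
typedef struct {
    uint16_t mem[4096];  /* 4K words, 12 bits used per word */
    uint16_t pc;         /* 12-bit program counter */
    uint16_t ac;         /* bit 12 = link, bits 0..11 = accumulator */
} pdp8_t;

pdp8_t cpu;

/* Run `count` emulated instructions in one call. */
void pdp8_execute(uint16_t count)
{
    uint16_t pc = cpu.pc;   /* work on local copies inside the loop */
    uint16_t ac = cpu.ac;

    while (count--) {
        uint16_t ir = cpu.mem[pc];
        uint16_t ea = ir & 0177;                  /* 7-bit page offset */
        if (ir & 0200) ea |= pc & 07600;          /* current page bit  */
        if (ir & 0400) ea = cpu.mem[ea] & 07777;  /* indirect, no auto-index */
        pc = (pc + 1) & 07777;

        switch ((ir >> 9) & 7) {
        case 0: /* AND: keep the link bit untouched */
            ac &= cpu.mem[ea] | 010000;
            break;
        case 1: /* TAD: a carry out of bit 11 toggles the link */
            ac = (ac + cpu.mem[ea]) & 017777;
            break;
        case 2: /* ISZ: increment memory, skip if it wrapped to zero */
            cpu.mem[ea] = (cpu.mem[ea] + 1) & 07777;
            if (cpu.mem[ea] == 0) pc = (pc + 1) & 07777;
            break;
        case 3: /* DCA: store AC, then clear it (link preserved) */
            cpu.mem[ea] = ac & 07777;
            ac &= 010000;
            break;
        case 4: /* JMS: save return address, continue after it */
            cpu.mem[ea] = pc;
            pc = (ea + 1) & 07777;
            break;
        case 5: /* JMP */
            pc = ea;
            break;
        default: /* IOT and OPR omitted from this sketch */
            break;
        }
    }

    cpu.pc = pc;            /* write state back once per batch */
    cpu.ac = ac;
}
```

The main loop would then call something like `pdp8_execute(1000)` between checks for I/O or a stop condition, instead of calling it once per instruction.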
Any tips based on what you see?