Any optimization tips for this...


Hi Everyone,

 

So I moved my initial PDP-8 emulation code over to the XMEGA 128A1U running at 29.4912 MHz. It started at around 108K instructions per second running my enigma test code, set to never stop. Switching from "Optimize for size" (-Os) to "Optimize most" (-O3) made a decent improvement, and together with a few other things I've tried it is now running at 133K instructions per second.

 

I'm attaching the project so far for anyone willing to look through it and give me any performance tips.

 

My goal is 0.333 MIPS (333K instructions per second), which may or may not be attainable; looking at the LSS, the AVR is running a lot of instructions for each emulated one.
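To put the goal in terms of a cycle budget, here is a minimal sketch that divides the CPU clock by the emulated instruction rate. The function name is made up for illustration, and it assumes the emulator is purely CPU-bound.

```c
#include <stdint.h>

/* Whole AVR clock cycles available per emulated PDP-8 instruction.
 * Integer division, so the result is rounded down. */
static uint32_t cycles_per_instruction(uint32_t f_cpu_hz, uint32_t instr_per_sec)
{
    return f_cpu_hz / instr_per_sec;
}
```

At 29,491,200 Hz this gives roughly 221 cycles per emulated instruction at 133K instructions per second, and a budget of about 88 cycles at the 333K target.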

 

I tried replacing the uint8_t pdp8 registers with GPIORx registers using #defines, but that made it slower, not faster.

 

I know that putting a cycle count inside the pdp8_execute loop could eliminate all of its starting pushes/pops, so that is one thing I can do.

 

Any tips based on what you see?

 


Why use Xmega if you need performance?

Use something like ARM Cortex M4 @400MHz and you'll have plenty of performance.

There are also plenty of ARM processors in between with different speed options.

I like the "Blue Pill" and "Maple Mini" boards, mainly because they're cheap, especially for their performance.

But don't expect too much for that price. Today I looked at a new unused board from Ali, and the STM32 had 2 solder shorts between its pins.

A bit of optical inspection with a decent stereo microscope is useful here.

 

 

GCC's optimisation settings are also based a bit on guesswork. If you really want to optimize here, you first need speed benchmarks to identify the slow bits of the code. Maybe you have some silly algorithms somewhere; maybe you can offload the processor by clever use of a timer. It may help to try different optimisation settings for different parts of the program and benchmark them separately.

But if the overall code is written decently, the difference won't be very big.

Also, if your Xmega runs at 30 MHz and it simulates 100k PDP-8 instructions/s, then you have a ratio of 300:1, which seems pretty lousy, but I don't know how much those architectures differ (probably a lot).

 

Writing decent emulators is a real art, and it has been done many times before. How closely have you studied other projects? Were they any good? Were there big differences between those projects, and why? You can learn a lot by studying such things.

Doing magic with a USD 7 Logic Analyser: https://www.avrfreaks.net/comment/2421756#comment-2421756

Bunch of old projects with AVR's: http://www.hoevendesign.com


I really want to do it with an AVR if I can.

 

I'm up to 212765 instructions per second with things I've tried.

 

Does anyone know how to modify these to read and write in words?  I've tried to understand these before but haven't had the best luck...

 

#define far_mem_read(addr)                            \
        (__extension__({                                \
    uint32_t temp32 = (uint32_t)(addr);     \
    uint8_t result;                         \
    asm volatile(                           \
      "in __tmp_reg__, %2"     "\n\t" \
      "out %2, %C1"            "\n\t" \
      "movw r30, %1"           "\n\t" \
      "ld %0, Z"               "\n\t" \
      "out %2, __tmp_reg__"    "\n\t" \
      : "=r" (result)                 \
      : "r" (temp32),                 \
        "I" (_SFR_IO_ADDR(RAMPZ))     \
      : "r30", "r31"                  \
    );                                      \
    result;                                 \
  }))

#define far_mem_write(addr, data)                     \
        (__extension__({                                \
    uint32_t temp32 = (uint32_t)(addr);     \
    asm volatile(                           \
      "in __tmp_reg__, %1"     "\n\t" \
      "out %1, %C0"            "\n\t" \
      "movw r30, %0"           "\n\t" \
      "st Z, %2"               "\n\t" \
      "out %1, __tmp_reg__"           \
      :                               \
      : "r" (temp32),                 \
        "I" (_SFR_IO_ADDR(RAMPZ)),    \
        "r" ((uint8_t)(data))           \
      : "r30", "r31"                  \
    );                                      \
  }))

 


I think I might have it:

 

#define far_mem_read_word(addr)                            \
        (__extension__({                                \
    uint32_t temp32 = (uint32_t)(addr);     \
    uint16_t result;                         \
    asm volatile(                           \
      "in __tmp_reg__, %2"     "\n\t" \
      "out %2, %C1"            "\n\t" \
      "movw r30, %1"           "\n\t" \
      "ld %A0, Z+"               "\n\t" \
      "ld %B0, Z"               "\n\t" \
      "out %2, __tmp_reg__"    "\n\t" \
      : "=w" (result)                 \
      : "r" (temp32),                 \
        "I" (_SFR_IO_ADDR(RAMPZ))     \
      : "r30", "r31"                  \
    );                                      \
    result;                                 \
  }))

#define far_mem_write_word(addr, data)                     \
        (__extension__({                                \
    uint32_t temp32 = (uint32_t)(addr);     \
    asm volatile(                           \
      "in __tmp_reg__, %1"     "\n\t" \
      "out %1, %C0"            "\n\t" \
      "movw r30, %0"           "\n\t" \
      "st Z+, %A2"               "\n\t" \
      "st Z, %B2"               "\n\t" \
      "out %1, __tmp_reg__"           \
      :                               \
      : "r" (temp32),                 \
        "I" (_SFR_IO_ADDR(RAMPZ)),    \
        "w" ((uint16_t)(data))           \
      : "r30", "r31"                  \
    );                                      \
  }))

 


alank2 wrote:
I know that putting a cycle count inside the pdp8_execute loop could eliminate all of its starting pushes/pops, so that is one thing I can do.
Extend to have EBI SRAM back internal SRAM?

Reason : one extra clock cycle for EBI SRAM

Cache(s)?

Speculative execution (two temporary short duration PDP-8 instruction streams)?

 

DMA can easily reach 2*333KB/sec; DMA controller will transition in parallel with CPU (there's a DMA arbitration hit that's already built-in to AVR instruction timing on XMEGA)

 

edit :

Maximize stack use as stack should be totally in internal SRAM.

 

"Dare to be naïve." - Buckminster Fuller

Last Edited: Thu. Apr 25, 2019 - 11:52 PM

Paulvdh wrote:
If you really want to optimize here, you first need speed benchmarks to identify the slow bits of the code.
Or, in lieu of benchmarks, dynamic analysis.

An emulator that has been profiled and improved on a PC "should" also be quicker on an XMEGA.

Valgrind Home
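One low-tech form of dynamic analysis on the PC side is simply counting how often each PDP-8 opcode is executed, so the hot cases get attention first. The sketch below is an assumption about how that could be done, not something from the attached project; the function name and the idea of recording a trace of executed instruction words are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Count how often each of the 8 PDP-8 opcodes occurs in a trace of
 * executed 12-bit instruction words (opcode = top 3 bits). */
static void opcode_histogram(const uint16_t *trace, size_t n, uint32_t counts[8])
{
    for (int i = 0; i < 8; i++)
        counts[i] = 0;
    for (size_t i = 0; i < n; i++)
        counts[(trace[i] >> 9) & 7]++;
}
```

Running the same test program through a host build and looking at the histogram gives a rough idea of which switch cases are worth hand-tuning.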

 


I was going to say the very same thing!

 

PDP8 address range is 4k words. There is some extra hardware in the PDP8 to extend via paging to 32k words. Access to the hardware would give a hint when to move a page from external ram to internal ram.

 

When I did a PDP8 emulation, I had functions that would pack/unpack the 12-bit words in order to save RAM space, i.e. 4k 12-bit words became 6k bytes instead of 8k bytes. This had a significant (negative) impact on performance.
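For reference, packing two 12-bit words into three bytes might look like the sketch below. The byte layout and function names are assumptions for illustration, not the code from that earlier emulator. The extra masking and shifting on every access is where the performance goes.

```c
#include <stdint.h>

/* Assumed packed layout: two 12-bit words (A, B) in three bytes.
 *   byte 0 = low 8 bits of A
 *   byte 1 = high 4 bits of A (low nibble) | low 4 bits of B (high nibble)
 *   byte 2 = high 8 bits of B */
static uint16_t packed_read(const uint8_t *mem, uint16_t index)
{
    const uint8_t *p = mem + (uint32_t)(index >> 1) * 3;
    if (index & 1)
        return (uint16_t)((p[1] >> 4) | ((uint16_t)p[2] << 4));
    return (uint16_t)(p[0] | ((uint16_t)(p[1] & 0x0F) << 8));
}

static void packed_write(uint8_t *mem, uint16_t index, uint16_t value)
{
    uint8_t *p = mem + (uint32_t)(index >> 1) * 3;
    value &= 0x0FFF;                       /* keep 12 bits */
    if (index & 1) {
        p[1] = (uint8_t)((p[1] & 0x0F) | ((value & 0x0F) << 4));
        p[2] = (uint8_t)(value >> 4);
    } else {
        p[0] = (uint8_t)(value & 0xFF);
        p[1] = (uint8_t)((p[1] & 0xF0) | (value >> 8));
    }
}
```

Storing each word in a full 16-bit cell avoids all of this at the cost of 2k extra bytes per 4k-word field.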

 

Some suggestions:

 

declare your core_read and write functions as 'inline'

 

Think about the impact of your AField shift. Think about how you can avoid this and precompute a pointer only when it changes.
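A minimal sketch of the precomputed-pointer idea, written as portable C with a flat core array. All names here are hypothetical. On the XMEGA, where external memory is reached through the RAMPZ macros, the same idea would apply to caching the shifted field value rather than a pointer.

```c
#include <stdint.h>

#define FIELD_WORDS 4096u   /* one PDP-8 field is 4K words */

typedef struct {
    uint16_t *core;     /* start of the 32K-word core image */
    uint16_t *ifield;   /* cached pointer to the current instruction field */
    uint8_t   if_reg;   /* instruction field register, 0..7 */
} pdp8_fields_t;

/* Called only when the instruction field changes, so the
 * field-to-address computation is not repeated on every fetch. */
static void set_instruction_field(pdp8_fields_t *s, uint8_t field)
{
    s->if_reg = field & 7;
    s->ifield = s->core + (uint16_t)s->if_reg * FIELD_WORDS;
}

/* Fetch needs only an index into the cached field base. */
static uint16_t fetch(const pdp8_fields_t *s, uint16_t pc)
{
    return s->ifield[pc & 07777];
}
```

Field changes are rare compared with fetches, so the shift is paid once per change instead of once per instruction.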

 

 

//decode

c1=(uint8_t)(inst>>9); -> How does the compiler translate this: as a byte shuffle plus 1 shift, or as 9 shifts? Hopefully the former. The AVR has only 1-bit shifts, so a multi-bit shift is expensive compared with the ARM, which has a barrel shifter.
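If the compiler did emit nine single-bit shifts, the decode could be spelled out as "take the high byte, then shift once". The sketch below shows both forms; the function names are made up for illustration. The pdp8_execute() listing posted in this thread shows avr-gcc emitting a mov followed by a single lsr for inst>>9, so with that build the rewrite is probably unnecessary.

```c
#include <stdint.h>

/* Straightforward form: 16-bit shift right by 9. */
static uint8_t opcode_shift9(uint16_t inst)
{
    return (uint8_t)(inst >> 9);
}

/* Explicit form: select the high byte, then one 1-bit shift. */
static uint8_t opcode_bytewise(uint16_t inst)
{
    uint8_t hi = (uint8_t)(inst >> 8);
    return (uint8_t)(hi >> 1);
}
```

Both return the same value for every 16-bit input, so the choice only matters for the generated code.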

Last Edited: Fri. Apr 26, 2019 - 01:02 AM

I'm up to 270K so much improved over what I had.

 

I will dig through all the responses and see what else I can tweak!

 

Here are the memory functions I'm using now. Next I'm going to make the word ones take word addresses, so I don't have to shift the address by 1 bit each time, and see what that does.

 

#ifndef FARMEM
#define FARMEM

#include <avr/io.h>

#define extmem_read_byte(addr)               \
    (__extension__({                         \
    uint8_t result;                          \
    asm volatile(                            \
      "in __tmp_reg__, %2\n"                 \
      "ldi r30,1\n"                          \
      "out %2, r30\n"                        \
      "movw r30, %1\n"                       \
      "ld %0, Z\n"                           \
      "out %2, __tmp_reg__\n"                \
      : "=r" (result)                        \
      : "r" (addr),                          \
        "I" (_SFR_IO_ADDR(RAMPZ))            \
      : "r30", "r31"                         \
    );                                       \
    result;                                  \
  }))

#define extmem_read_word(addr)               \
    (__extension__({                         \
    uint16_t result;                         \
    asm volatile(                            \
      "in __tmp_reg__, %2\n"                 \
      "ldi r30,1\n"                          \
      "out %2, r30\n"                        \
      "movw r30, %1\n"                       \
      "add r30, r30\n"                       \
      "adc r31, r31\n"                       \
      "ld %A0, Z+\n"                         \
      "ld %B0, Z\n"                          \
      "out %2, __tmp_reg__\n"                \
      : "=r" (result)                        \
      : "r" (addr),                          \
        "I" (_SFR_IO_ADDR(RAMPZ))            \
      : "r30", "r31"                         \
    );                                       \
    result;                                  \
  }))

#define extmem_write_byte(addr, data)        \
        (__extension__({                     \
    asm volatile(                            \
      "in __tmp_reg__, %1\n"                 \
      "ldi r30,1\n"                          \
      "out %1, r30\n"                        \
      "movw r30, %0\n"                       \
      "st Z+, %2\n"                          \
      "out %1, __tmp_reg__\n"                \
      :                                      \
      : "r" (addr),                          \
        "I" (_SFR_IO_ADDR(RAMPZ)),           \
        "r" ((uint8_t)(data))                \
      : "r30", "r31"                         \
    );                                       \
  }))

#define extmem_write_word(addr, data)        \
        (__extension__({                     \
    asm volatile(                            \
      "in __tmp_reg__, %1\n"                 \
      "ldi r30,1\n"                          \
      "out %1, r30\n"                        \
      "movw r30, %0\n"                       \
      "add r30, r30\n"                       \
      "adc r31, r31\n"                       \
      "st Z+, %A2\n"                         \
      "st Z, %B2\n"                          \
      "out %1, __tmp_reg__\n"                \
      :                                      \
      : "r" (addr),                          \
        "I" (_SFR_IO_ADDR(RAMPZ)),           \
        "r" (data)                           \
      : "r30", "r31"                         \
    );                                       \
  }))

#endif

edit: modified the above for word addressing on the word macros.
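For checking the word macros on a PC, a portable model of what they do can help. The sketch below assumes, as in the macros, that RAMPZ = 1 selects the 64 KB bank starting at 0x10000 and that words are stored low byte first. The function names and the `bank` array standing in for that 64 KB region are hypothetical.

```c
#include <stdint.h>

#define EXT_BANK_BASE 0x10000UL   /* RAMPZ = 1 */

/* Byte address reached for a given word address: Z is doubled modulo 65536. */
static uint32_t ext_byte_address(uint16_t word_addr)
{
    uint16_t z = (uint16_t)(word_addr << 1);
    return EXT_BANK_BASE + z;
}

/* Host-side model of extmem_read_word: low byte at Z, high byte at Z+1. */
static uint16_t model_read_word(const uint8_t *bank, uint16_t word_addr)
{
    uint16_t z = (uint16_t)(word_addr << 1);
    return (uint16_t)(bank[z] | ((uint16_t)bank[z + 1u] << 8));
}

/* Host-side model of extmem_write_word. */
static void model_write_word(uint8_t *bank, uint16_t word_addr, uint16_t data)
{
    uint16_t z = (uint16_t)(word_addr << 1);
    bank[z] = (uint8_t)data;
    bank[z + 1u] = (uint8_t)(data >> 8);
}
```

The PDP-8's 15-bit word addresses (0 to 0x7FFF) double to exactly one 64 KB bank; a word address above 0x7FFF would wrap around within that bank.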

Last Edited: Fri. Apr 26, 2019 - 03:25 AM

Is there a word shift left instruction?

 

edit: figured it out:

 

add rL,rL   (low byte)

adc rH,rH   (high byte, picks up the carry from the low byte)
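The add/adc pair can be modelled in portable C to confirm that it is a 16-bit shift left by one. This is only a sketch of the arithmetic, with a made-up function name. On the AVR, lsl and rol are assembler aliases for add Rd,Rd and adc Rd,Rd, so `lsl` on the low byte followed by `rol` on the high byte produces the same opcodes.

```c
#include <stdint.h>

/* Model of: add lo,lo / adc hi,hi on a 16-bit register pair. */
static uint16_t shl16_add_adc(uint16_t v)
{
    uint8_t lo = (uint8_t)v;
    uint8_t hi = (uint8_t)(v >> 8);

    uint16_t sum_lo = (uint16_t)(lo + lo);          /* add lo,lo */
    uint8_t carry = (uint8_t)(sum_lo >> 8);         /* carry flag */
    uint8_t new_hi = (uint8_t)(hi + hi + carry);    /* adc hi,hi */

    return (uint16_t)(((uint16_t)new_hi << 8) | (uint8_t)sum_lo);
}
```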

 

 

Last Edited: Fri. Apr 26, 2019 - 02:42 AM

Up to 315K. This is at a baud-friendly 29.4912 MHz, but I forgot that the scalable (fractional) baud rate generator in the XMEGA USART can do very nice baud rates even at a 32 MHz clock. That speed change should put me over 333K, at least with the code as it is now. The bottom line is that I think the XMEGA will be workable, which is a big improvement over my initial instruction rate.
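As a rough check of that estimate, the sketch below scales the measured rate linearly with the clock. It assumes the emulator is entirely CPU-bound and that nothing else (for example external memory timing) changes with the clock; the function name is hypothetical.

```c
#include <stdint.h>

/* Scale an instruction rate measured at f_old_hz to f_new_hz,
 * assuming throughput is proportional to the CPU clock. */
static uint32_t scale_rate(uint32_t rate, uint32_t f_old_hz, uint32_t f_new_hz)
{
    return (uint32_t)(((uint64_t)rate * f_new_hz) / f_old_hz);
}
```

With 315,000 instructions per second at 29,491,200 Hz, scaling to 32,000,000 Hz gives about 341,796 instructions per second, which is above the 333K goal.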


There might be some advantage to putting the pdp8 regs in a struct. Since the AVR has base+offset loads and stores (LDD/STD through Y or Z), this might give the compiler an easier time accessing them. A study of the LSS will show whether the compiler is always loading up X, Y or Z to access the individual pdp8 regs.
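A sketch of what the struct layout could look like. The type, field and function names are hypothetical and only loosely follow the variable names visible in the listing posted in this thread. Note that the listing shows the compiler already loading the globals into registers at function entry and storing them back at exit, so the gain inside the main loop may be small.

```c
#include <stdint.h>

#define MASK_13BITS 017777u   /* LINK (bit 12) plus the 12-bit AC */
#define MASK_LINK   010000u

/* All emulated registers in one block, reachable from a single base pointer. */
typedef struct {
    uint16_t link_ac;   /* bit 12 = LINK, bits 0..11 = AC */
    uint16_t pc;
    uint16_t mq;
    uint16_t sr;
    uint8_t  if_reg;
    uint8_t  df_reg;
    uint8_t  ibr;
    uint8_t  sfr;
    uint8_t  halt;
} pdp8_state_t;

/* IAC: increment the combined LINK+AC value, wrapping at 13 bits. */
static void op_iac(pdp8_state_t *s)
{
    s->link_ac = (uint16_t)((s->link_ac + 1u) & MASK_13BITS);
}

/* CLA: clear AC, keep LINK. */
static void op_cla(pdp8_state_t *s)
{
    s->link_ac &= MASK_LINK;
}
```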


Here is the pdp8_execute() function only:

void pdp8_execute(uint16_t ACount)
{
     c5a:	6f 92       	push	r6
     c5c:	7f 92       	push	r7
     c5e:	8f 92       	push	r8
     c60:	9f 92       	push	r9
     c62:	af 92       	push	r10
     c64:	bf 92       	push	r11
     c66:	cf 92       	push	r12
     c68:	df 92       	push	r13
     c6a:	ef 92       	push	r14
     c6c:	ff 92       	push	r15
     c6e:	0f 93       	push	r16
     c70:	1f 93       	push	r17
     c72:	cf 93       	push	r28
     c74:	df 93       	push	r29
  uint16_t operand_field;
  uint16_t operand_addr;
  uint16_t t1;

  //we are not halted
  pdp8_HALT=0;
     c76:	10 92 0d 21 	sts	0x210D, r1	; 0x80210d <pdp8_HALT>
     c7a:	40 91 15 21 	lds	r20, 0x2115	; 0x802115 <inst_op>
     c7e:	60 91 13 21 	lds	r22, 0x2113	; 0x802113 <pdp8_PC>
     c82:	70 91 14 21 	lds	r23, 0x2114	; 0x802114 <pdp8_PC+0x1>
     c86:	50 91 10 21 	lds	r21, 0x2110	; 0x802110 <pdp8_IF>
     c8a:	c0 90 12 21 	lds	r12, 0x2112	; 0x802112 <pdp8_IBR>
                              pdp8_LINK_AC|=pdp8_SFR;
                              break;

                            case 04:
                              //rmf
                              pdp8_IBR=pdp8_SFR>>4;
     c8e:	20 91 11 21 	lds	r18, 0x2111	; 0x802111 <pdp8_SFR>
     c92:	72 2e       	mov	r7, r18
     c94:	72 94       	swap	r7
     c96:	af e0       	ldi	r26, 0x0F	; 15
     c98:	7a 22       	and	r7, r26
                              pdp8_DF=pdp8_SFR & 0007;
     c9a:	32 2f       	mov	r19, r18
     c9c:	37 70       	andi	r19, 0x07	; 7
     c9e:	63 2e       	mov	r6, r19
                              pdp8_LINK_AC|=pdp8_IF << 3;
                              break;

                            case 03:
                              //rib
                              pdp8_LINK_AC|=pdp8_SFR;
     ca0:	82 2e       	mov	r8, r18
     ca2:	91 2c       	mov	r9, r1
     ca4:	d0 90 18 21 	lds	r13, 0x2118	; 0x802118 <pdp8_DF>
     ca8:	c0 91 16 21 	lds	r28, 0x2116	; 0x802116 <pdp8_LINK_AC>
     cac:	d0 91 17 21 	lds	r29, 0x2117	; 0x802117 <pdp8_LINK_AC+0x1>
     cb0:	00 91 0e 21 	lds	r16, 0x210E	; 0x80210e <pdp8_MQ>
     cb4:	10 91 0f 21 	lds	r17, 0x210F	; 0x80210f <pdp8_MQ+0x1>
                    if (inst & PDP8_BIT_04)
                      pdp8_LINK_AC&=MASK_LINK; //cla

                    //osr hlt
                    if (inst & PDP8_BIT_09)
                      pdp8_LINK_AC|=pdp8_SR; //osr
     cb8:	a0 90 0b 21 	lds	r10, 0x210B	; 0x80210b <pdp8_SR>
     cbc:	b0 90 0c 21 	lds	r11, 0x210C	; 0x80210c <pdp8_SR+0x1>
  uint16_t t1;

  //we are not halted
  pdp8_HALT=0;

  while (ACount--)
     cc0:	00 97       	sbiw	r24, 0x00	; 0
     cc2:	09 f4       	brne	.+2      	; 0xcc6 <pdp8_execute+0x6c>
     cc4:	70 c0       	rjmp	.+224    	; 0xda6 <pdp8_execute+0x14c>
    {
      //fetch this instruction
      inst=extmem_read_word((pdp8_IF<<12) | pdp8_PC);
     cc6:	a5 2f       	mov	r26, r21
     cc8:	b0 e0       	ldi	r27, 0x00	; 0
     cca:	7d 01       	movw	r14, r26
     ccc:	fe 2c       	mov	r15, r14
     cce:	ee 24       	eor	r14, r14
     cd0:	f2 94       	swap	r15
     cd2:	f0 ef       	ldi	r31, 0xF0	; 240
     cd4:	ff 22       	and	r15, r31
     cd6:	97 01       	movw	r18, r14
     cd8:	26 2b       	or	r18, r22
     cda:	37 2b       	or	r19, r23
     cdc:	0b b6       	in	r0, 0x3b	; 59
     cde:	e1 e0       	ldi	r30, 0x01	; 1
     ce0:	eb bf       	out	0x3b, r30	; 59
     ce2:	f9 01       	movw	r30, r18
     ce4:	ee 0f       	add	r30, r30
     ce6:	ff 1f       	adc	r31, r31
     ce8:	21 91       	ld	r18, Z+
     cea:	30 81       	ld	r19, Z
     cec:	0b be       	out	0x3b, r0	; 59

      //decode
      inst_op=inst>>9;
     cee:	43 2f       	mov	r20, r19
     cf0:	46 95       	lsr	r20

      //operand
      if (inst_op<=INST_JMP)
     cf2:	46 30       	cpi	r20, 0x06	; 6
     cf4:	08 f0       	brcs	.+2      	; 0xcf8 <pdp8_execute+0x9e>
     cf6:	7a c0       	rjmp	.+244    	; 0xdec <pdp8_execute+0x192>
        {
          //keep lowest 7 bits (becomes page zero)
          operand_addr=inst & 00177;
     cf8:	d9 01       	movw	r26, r18
     cfa:	af 77       	andi	r26, 0x7F	; 127
     cfc:	bb 27       	eor	r27, r27

          //if the memory page bit is set (current page), add the five bits from the instruction address
          if (inst & PDP8_BIT_04)
     cfe:	27 ff       	sbrs	r18, 7
     d00:	05 c0       	rjmp	.+10     	; 0xd0c <pdp8_execute+0xb2>
            operand_addr|=pdp8_PC & 07600;
     d02:	fb 01       	movw	r30, r22
     d04:	e0 78       	andi	r30, 0x80	; 128
     d06:	ff 70       	andi	r31, 0x0F	; 15
     d08:	ae 2b       	or	r26, r30
     d0a:	bf 2b       	or	r27, r31

          //if the indirect bit is set, get the value from memory
          if (inst & PDP8_BIT_03)
     d0c:	30 ff       	sbrs	r19, 0
     d0e:	a9 c0       	rjmp	.+338    	; 0xe62 <pdp8_execute+0x208>
            {         
              //increment auto increment memory locations if necessary
              if ((operand_addr & 07770)==00010)
     d10:	9d 01       	movw	r18, r26
     d12:	28 7f       	andi	r18, 0xF8	; 248
     d14:	3f 70       	andi	r19, 0x0F	; 15
     d16:	28 30       	cpi	r18, 0x08	; 8
     d18:	31 05       	cpc	r19, r1
     d1a:	09 f4       	brne	.+2      	; 0xd1e <pdp8_execute+0xc4>
     d1c:	75 c1       	rjmp	.+746    	; 0x1008 <pdp8_execute+0x3ae>
     d1e:	9d 01       	movw	r18, r26
     d20:	2e 29       	or	r18, r14
     d22:	3f 29       	or	r19, r15
                extmem_write_word((pdp8_IF<<12) | operand_addr,(extmem_read_word((pdp8_IF<<12) | operand_addr)+1) & MASK_12BITS);

              //indirect
              operand_addr=extmem_read_word((pdp8_IF<<12) | operand_addr);
     d24:	0b b6       	in	r0, 0x3b	; 59
     d26:	e1 e0       	ldi	r30, 0x01	; 1
     d28:	eb bf       	out	0x3b, r30	; 59
     d2a:	f9 01       	movw	r30, r18
     d2c:	ee 0f       	add	r30, r30
     d2e:	ff 1f       	adc	r31, r31
     d30:	21 91       	ld	r18, Z+
     d32:	30 81       	ld	r19, Z
     d34:	0b be       	out	0x3b, r0	; 59
     d36:	d9 01       	movw	r26, r18

              operand_field=pdp8_DF<<12;
     d38:	2d 2d       	mov	r18, r13
     d3a:	30 e0       	ldi	r19, 0x00	; 0
     d3c:	32 2f       	mov	r19, r18
     d3e:	22 27       	eor	r18, r18
     d40:	32 95       	swap	r19
     d42:	30 7f       	andi	r19, 0xF0	; 240
            }
          else operand_field=pdp8_IF<<12;

          //increment pc
          pdp8_PC=(pdp8_PC+1) & MASK_12BITS;
     d44:	6f 5f       	subi	r22, 0xFF	; 255
     d46:	7f 4f       	sbci	r23, 0xFF	; 255
     d48:	7f 70       	andi	r23, 0x0F	; 15


          //instruction
          switch (inst_op)
     d4a:	43 30       	cpi	r20, 0x03	; 3
     d4c:	09 f4       	brne	.+2      	; 0xd50 <pdp8_execute+0xf6>
     d4e:	90 c0       	rjmp	.+288    	; 0xe70 <pdp8_execute+0x216>
     d50:	44 30       	cpi	r20, 0x04	; 4
     d52:	08 f0       	brcs	.+2      	; 0xd56 <pdp8_execute+0xfc>
     d54:	9e c0       	rjmp	.+316    	; 0xe92 <pdp8_execute+0x238>
     d56:	41 30       	cpi	r20, 0x01	; 1
     d58:	09 f4       	brne	.+2      	; 0xd5c <pdp8_execute+0x102>
     d5a:	34 c1       	rjmp	.+616    	; 0xfc4 <pdp8_execute+0x36a>
     d5c:	42 30       	cpi	r20, 0x02	; 2
     d5e:	09 f0       	breq	.+2      	; 0xd62 <pdp8_execute+0x108>
     d60:	21 c1       	rjmp	.+578    	; 0xfa4 <pdp8_execute+0x34a>
              case INST_TAD:
                pdp8_LINK_AC=(pdp8_LINK_AC+extmem_read_word(operand_field | operand_addr)) & MASK_13BITS;
                break;

              case INST_ISZ:
                t1=(extmem_read_word(operand_field | operand_addr)+1) & MASK_12BITS;
     d62:	2a 2b       	or	r18, r26
     d64:	3b 2b       	or	r19, r27
     d66:	0b b6       	in	r0, 0x3b	; 59
     d68:	e1 e0       	ldi	r30, 0x01	; 1
     d6a:	eb bf       	out	0x3b, r30	; 59
     d6c:	f9 01       	movw	r30, r18
     d6e:	ee 0f       	add	r30, r30
     d70:	ff 1f       	adc	r31, r31
     d72:	e1 90       	ld	r14, Z+
     d74:	f0 80       	ld	r15, Z
     d76:	0b be       	out	0x3b, r0	; 59
     d78:	ef ef       	ldi	r30, 0xFF	; 255
     d7a:	ee 1a       	sub	r14, r30
     d7c:	fe 0a       	sbc	r15, r30
     d7e:	ef e0       	ldi	r30, 0x0F	; 15
     d80:	fe 22       	and	r15, r30
                extmem_write_word(operand_field | operand_addr,t1);
     d82:	0b b6       	in	r0, 0x3b	; 59
     d84:	e1 e0       	ldi	r30, 0x01	; 1
     d86:	eb bf       	out	0x3b, r30	; 59
     d88:	f9 01       	movw	r30, r18
     d8a:	ee 0f       	add	r30, r30
     d8c:	ff 1f       	adc	r31, r31
     d8e:	e1 92       	st	Z+, r14
     d90:	f0 82       	st	Z, r15
     d92:	0b be       	out	0x3b, r0	; 59
                if (t1==0)
     d94:	ef 28       	or	r14, r15
     d96:	19 f4       	brne	.+6      	; 0xd9e <pdp8_execute+0x144>
                  pdp8_PC=(pdp8_PC+1) & MASK_12BITS;
     d98:	6f 5f       	subi	r22, 0xFF	; 255
     d9a:	7f 4f       	sbci	r23, 0xFF	; 255
     d9c:	7f 70       	andi	r23, 0x0F	; 15
     d9e:	01 97       	sbiw	r24, 0x01	; 1
  uint16_t t1;

  //we are not halted
  pdp8_HALT=0;

  while (ACount--)
     da0:	00 97       	sbiw	r24, 0x00	; 0
     da2:	09 f0       	breq	.+2      	; 0xda6 <pdp8_execute+0x14c>
     da4:	90 cf       	rjmp	.-224    	; 0xcc6 <pdp8_execute+0x6c>
     da6:	50 93 10 21 	sts	0x2110, r21	; 0x802110 <pdp8_IF>
     daa:	60 93 13 21 	sts	0x2113, r22	; 0x802113 <pdp8_PC>
     dae:	70 93 14 21 	sts	0x2114, r23	; 0x802114 <pdp8_PC+0x1>
     db2:	40 93 15 21 	sts	0x2115, r20	; 0x802115 <inst_op>
     db6:	d0 92 18 21 	sts	0x2118, r13	; 0x802118 <pdp8_DF>
     dba:	c0 93 16 21 	sts	0x2116, r28	; 0x802116 <pdp8_LINK_AC>
     dbe:	d0 93 17 21 	sts	0x2117, r29	; 0x802117 <pdp8_LINK_AC+0x1>
     dc2:	c0 92 12 21 	sts	0x2112, r12	; 0x802112 <pdp8_IBR>
     dc6:	00 93 0e 21 	sts	0x210E, r16	; 0x80210e <pdp8_MQ>
     dca:	10 93 0f 21 	sts	0x210F, r17	; 0x80210f <pdp8_MQ+0x1>
                  }
                break;
            }
        }
    }
}
     dce:	df 91       	pop	r29
     dd0:	cf 91       	pop	r28
     dd2:	1f 91       	pop	r17
     dd4:	0f 91       	pop	r16
     dd6:	ff 90       	pop	r15
     dd8:	ef 90       	pop	r14
     dda:	df 90       	pop	r13
     ddc:	cf 90       	pop	r12
     dde:	bf 90       	pop	r11
     de0:	af 90       	pop	r10
     de2:	9f 90       	pop	r9
     de4:	8f 90       	pop	r8
     de6:	7f 90       	pop	r7
     de8:	6f 90       	pop	r6
     dea:	08 95       	ret
            }
        }
      else
        {
          //increment pc
          pdp8_PC=(pdp8_PC+1) & MASK_12BITS;
     dec:	6f 5f       	subi	r22, 0xFF	; 255
     dee:	7f 4f       	sbci	r23, 0xFF	; 255
     df0:	7f 70       	andi	r23, 0x0F	; 15

          //instruction
          switch (inst_op)
     df2:	46 30       	cpi	r20, 0x06	; 6
     df4:	09 f4       	brne	.+2      	; 0xdf8 <pdp8_execute+0x19e>
     df6:	57 c0       	rjmp	.+174    	; 0xea6 <pdp8_execute+0x24c>
     df8:	47 30       	cpi	r20, 0x07	; 7
     dfa:	89 f6       	brne	.-94     	; 0xd9e <pdp8_execute+0x144>
                      break;
                  }
                break;

              case INST_OPR:
                if ((inst & PDP8_BIT_03)==0)
     dfc:	30 fd       	sbrc	r19, 0
     dfe:	8e c0       	rjmp	.+284    	; 0xf1c <pdp8_execute+0x2c2>
                  {
                    //group 1

                    //cla cll
                    if (inst & PDP8_BIT_04)
     e00:	27 ff       	sbrs	r18, 7
     e02:	02 c0       	rjmp	.+4      	; 0xe08 <pdp8_execute+0x1ae>
                      pdp8_LINK_AC&=MASK_LINK; //cla
     e04:	cc 27       	eor	r28, r28
     e06:	d0 71       	andi	r29, 0x10	; 16
                    if (inst & PDP8_BIT_05)
     e08:	26 fd       	sbrc	r18, 6
                      pdp8_LINK_AC&=MASK_12BITS; //cll
     e0a:	df 70       	andi	r29, 0x0F	; 15

                    //cma cml
                    if (inst & PDP8_BIT_06)
     e0c:	25 ff       	sbrs	r18, 5
     e0e:	03 c0       	rjmp	.+6      	; 0xe16 <pdp8_execute+0x1bc>
                      pdp8_LINK_AC^=MASK_12BITS; //cma
     e10:	c0 95       	com	r28
     e12:	ef e0       	ldi	r30, 0x0F	; 15
     e14:	de 27       	eor	r29, r30
                    if (inst & PDP8_BIT_07)
     e16:	24 ff       	sbrs	r18, 4
     e18:	02 c0       	rjmp	.+4      	; 0xe1e <pdp8_execute+0x1c4>
                      pdp8_LINK_AC^=MASK_LINK; //cml
     e1a:	e0 e1       	ldi	r30, 0x10	; 16
     e1c:	de 27       	eor	r29, r30

                    //iac
                    if (inst & PDP8_BIT_11)
     e1e:	20 ff       	sbrs	r18, 0
     e20:	02 c0       	rjmp	.+4      	; 0xe26 <pdp8_execute+0x1cc>
                      pdp8_LINK_AC=(pdp8_LINK_AC+1) & MASK_13BITS; //iac
     e22:	21 96       	adiw	r28, 0x01	; 1
     e24:	df 71       	andi	r29, 0x1F	; 31
              
                    //rar ral rtr rtl bsw
                    switch (inst & (PDP8_BIT_08 | PDP8_BIT_09 | PDP8_BIT_10))
     e26:	2e 70       	andi	r18, 0x0E	; 14
     e28:	33 27       	eor	r19, r19
     e2a:	26 30       	cpi	r18, 0x06	; 6
     e2c:	31 05       	cpc	r19, r1
     e2e:	09 f4       	brne	.+2      	; 0xe32 <pdp8_execute+0x1d8>
     e30:	71 c1       	rjmp	.+738    	; 0x1114 <pdp8_execute+0x4ba>
     e32:	08 f0       	brcs	.+2      	; 0xe36 <pdp8_execute+0x1dc>
     e34:	0f c1       	rjmp	.+542    	; 0x1054 <pdp8_execute+0x3fa>
     e36:	22 30       	cpi	r18, 0x02	; 2
     e38:	31 05       	cpc	r19, r1
     e3a:	09 f4       	brne	.+2      	; 0xe3e <pdp8_execute+0x1e4>
     e3c:	2f c1       	rjmp	.+606    	; 0x109c <pdp8_execute+0x442>
     e3e:	24 30       	cpi	r18, 0x04	; 4
     e40:	31 05       	cpc	r19, r1
     e42:	09 f0       	breq	.+2      	; 0xe46 <pdp8_execute+0x1ec>
     e44:	ac cf       	rjmp	.-168    	; 0xd9e <pdp8_execute+0x144>
                        case PDP8_BIT_08 | PDP8_BIT_10:
                          pdp8_LINK_AC=((pdp8_LINK_AC>>2) | (pdp8_LINK_AC<<11)) & MASK_13BITS; //rtr
                          break;

                        case PDP8_BIT_09:
                          pdp8_LINK_AC=((pdp8_LINK_AC<<1) | ((pdp8_LINK_AC>>12) & 01)) & MASK_13BITS; //ral
     e46:	9e 01       	movw	r18, r28
     e48:	23 2f       	mov	r18, r19
     e4a:	33 27       	eor	r19, r19
     e4c:	22 95       	swap	r18
     e4e:	2f 70       	andi	r18, 0x0F	; 15
     e50:	21 70       	andi	r18, 0x01	; 1
     e52:	33 27       	eor	r19, r19
     e54:	cc 0f       	add	r28, r28
     e56:	dd 1f       	adc	r29, r29
     e58:	c2 2b       	or	r28, r18
     e5a:	d3 2b       	or	r29, r19
     e5c:	df 71       	andi	r29, 0x1F	; 31
     e5e:	01 97       	sbiw	r24, 0x01	; 1
     e60:	9f cf       	rjmp	.-194    	; 0xda0 <pdp8_execute+0x146>
              //indirect
              operand_addr=extmem_read_word((pdp8_IF<<12) | operand_addr);

              operand_field=pdp8_DF<<12;
            }
          else operand_field=pdp8_IF<<12;
     e62:	97 01       	movw	r18, r14

          //increment pc
          pdp8_PC=(pdp8_PC+1) & MASK_12BITS;
     e64:	6f 5f       	subi	r22, 0xFF	; 255
     e66:	7f 4f       	sbci	r23, 0xFF	; 255
     e68:	7f 70       	andi	r23, 0x0F	; 15


          //instruction
          switch (inst_op)
     e6a:	43 30       	cpi	r20, 0x03	; 3
     e6c:	09 f0       	breq	.+2      	; 0xe70 <pdp8_execute+0x216>
     e6e:	70 cf       	rjmp	.-288    	; 0xd50 <pdp8_execute+0xf6>
                if (t1==0)
                  pdp8_PC=(pdp8_PC+1) & MASK_12BITS;
                break;

              case INST_DCA:
                extmem_write_word(operand_field | operand_addr,pdp8_LINK_AC & MASK_12BITS);
     e70:	2a 2b       	or	r18, r26
     e72:	3b 2b       	or	r19, r27
     e74:	de 01       	movw	r26, r28
     e76:	bf 70       	andi	r27, 0x0F	; 15
     e78:	0b b6       	in	r0, 0x3b	; 59
     e7a:	e1 e0       	ldi	r30, 0x01	; 1
     e7c:	eb bf       	out	0x3b, r30	; 59
     e7e:	f9 01       	movw	r30, r18
     e80:	ee 0f       	add	r30, r30
     e82:	ff 1f       	adc	r31, r31
     e84:	a1 93       	st	Z+, r26
     e86:	b0 83       	st	Z, r27
     e88:	0b be       	out	0x3b, r0	; 59
                pdp8_LINK_AC&=MASK_LINK;
     e8a:	cc 27       	eor	r28, r28
     e8c:	d0 71       	andi	r29, 0x10	; 16
     e8e:	01 97       	sbiw	r24, 0x01	; 1
     e90:	87 cf       	rjmp	.-242    	; 0xda0 <pdp8_execute+0x146>
          //increment pc
          pdp8_PC=(pdp8_PC+1) & MASK_12BITS;


          //instruction
          switch (inst_op)
     e92:	44 30       	cpi	r20, 0x04	; 4
     e94:	09 f4       	brne	.+2      	; 0xe98 <pdp8_execute+0x23e>
     e96:	a6 c0       	rjmp	.+332    	; 0xfe4 <pdp8_execute+0x38a>
     e98:	45 30       	cpi	r20, 0x05	; 5
     e9a:	09 f0       	breq	.+2      	; 0xe9e <pdp8_execute+0x244>
     e9c:	83 c0       	rjmp	.+262    	; 0xfa4 <pdp8_execute+0x34a>
                pdp8_PC=(operand_addr+1) & MASK_12BITS;
                pdp8_IF=pdp8_IBR;
                break;

              case INST_JMP:
                pdp8_PC=operand_addr;
     e9e:	bd 01       	movw	r22, r26
          //increment pc
          pdp8_PC=(pdp8_PC+1) & MASK_12BITS;


          //instruction
          switch (inst_op)
     ea0:	5c 2d       	mov	r21, r12
     ea2:	01 97       	sbiw	r24, 0x01	; 1
     ea4:	7d cf       	rjmp	.-262    	; 0xda0 <pdp8_execute+0x146>

          //instruction
          switch (inst_op)
            {
              case INST_IOT:
                switch ((inst & 00700)>>6)
     ea6:	f9 01       	movw	r30, r18
     ea8:	e0 7c       	andi	r30, 0xC0	; 192
     eaa:	f1 70       	andi	r31, 0x01	; 1
     eac:	e0 38       	cpi	r30, 0x80	; 128
     eae:	f1 05       	cpc	r31, r1
     eb0:	09 f0       	breq	.+2      	; 0xeb4 <pdp8_execute+0x25a>
     eb2:	75 cf       	rjmp	.-278    	; 0xd9e <pdp8_execute+0x144>
                        }
                      break;

                    case 02:
                      //62XX
                      if (inst & PDP8_BIT_11)
     eb4:	20 ff       	sbrs	r18, 0
     eb6:	0a c0       	rjmp	.+20     	; 0xecc <pdp8_execute+0x272>
                        pdp8_DF=(inst & 00070)>>3;
     eb8:	f9 01       	movw	r30, r18
     eba:	e8 73       	andi	r30, 0x38	; 56
     ebc:	ff 27       	eor	r31, r31
     ebe:	f6 95       	lsr	r31
     ec0:	e7 95       	ror	r30
     ec2:	f6 95       	lsr	r31
     ec4:	e7 95       	ror	r30
     ec6:	f6 95       	lsr	r31
     ec8:	e7 95       	ror	r30
     eca:	de 2e       	mov	r13, r30
                      if (inst & PDP8_BIT_10)
     ecc:	21 ff       	sbrs	r18, 1
     ece:	0a c0       	rjmp	.+20     	; 0xee4 <pdp8_execute+0x28a>
                        pdp8_IBR=(inst & 00070)>>3;
     ed0:	f9 01       	movw	r30, r18
     ed2:	e8 73       	andi	r30, 0x38	; 56
     ed4:	ff 27       	eor	r31, r31
     ed6:	f6 95       	lsr	r31
     ed8:	e7 95       	ror	r30
     eda:	f6 95       	lsr	r31
     edc:	e7 95       	ror	r30
     ede:	f6 95       	lsr	r31
     ee0:	e7 95       	ror	r30
     ee2:	ce 2e       	mov	r12, r30
                      if (inst & PDP8_BIT_09)
     ee4:	22 ff       	sbrs	r18, 2
     ee6:	5b cf       	rjmp	.-330    	; 0xd9e <pdp8_execute+0x144>
                        switch ((inst & 00070)>>3)
     ee8:	28 73       	andi	r18, 0x38	; 56
     eea:	33 27       	eor	r19, r19
     eec:	36 95       	lsr	r19
     eee:	27 95       	ror	r18
     ef0:	36 95       	lsr	r19
     ef2:	27 95       	ror	r18
     ef4:	36 95       	lsr	r19
     ef6:	27 95       	ror	r18
     ef8:	22 30       	cpi	r18, 0x02	; 2
     efa:	31 05       	cpc	r19, r1
     efc:	09 f4       	brne	.+2      	; 0xf00 <pdp8_execute+0x2a6>
     efe:	1f c1       	rjmp	.+574    	; 0x113e <pdp8_execute+0x4e4>
     f00:	08 f4       	brcc	.+2      	; 0xf04 <pdp8_execute+0x2aa>
     f02:	f8 c0       	rjmp	.+496    	; 0x10f4 <pdp8_execute+0x49a>
     f04:	23 30       	cpi	r18, 0x03	; 3
     f06:	31 05       	cpc	r19, r1
     f08:	09 f4       	brne	.+2      	; 0xf0c <pdp8_execute+0x2b2>
     f0a:	15 c1       	rjmp	.+554    	; 0x1136 <pdp8_execute+0x4dc>
     f0c:	24 30       	cpi	r18, 0x04	; 4
     f0e:	31 05       	cpc	r19, r1
     f10:	09 f0       	breq	.+2      	; 0xf14 <pdp8_execute+0x2ba>
     f12:	45 cf       	rjmp	.-374    	; 0xd9e <pdp8_execute+0x144>
                              pdp8_LINK_AC|=pdp8_SFR;
                              break;

                            case 04:
                              //rmf
                              pdp8_IBR=pdp8_SFR>>4;
     f14:	c7 2c       	mov	r12, r7
                              pdp8_DF=pdp8_SFR & 0007;
     f16:	d6 2c       	mov	r13, r6
     f18:	01 97       	sbiw	r24, 0x01	; 1
     f1a:	42 cf       	rjmp	.-380    	; 0xda0 <pdp8_execute+0x146>
                          pdp8_LINK_AC=( ((pdp8_LINK_AC>>6)&00077) | ((pdp8_LINK_AC<<6)&07700) | (pdp8_LINK_AC & MASK_LINK) ) & MASK_13BITS; //bsw
                          break;
                      }
                  }
                else
                if ((inst & PDP8_BIT_11)==0)
     f1c:	f9 01       	movw	r30, r18
     f1e:	e1 70       	andi	r30, 0x01	; 1
     f20:	ff 27       	eor	r31, r31
     f22:	20 fd       	sbrc	r18, 0
     f24:	89 c0       	rjmp	.+274    	; 0x1038 <pdp8_execute+0x3de>

                    //sma sza snl spa sna szl
                    if (((((inst & PDP8_BIT_05) && (pdp8_LINK_AC & PDP8_BIT_00)) ||    //sma spa
                          ((inst & PDP8_BIT_06) && (pdp8_LINK_AC & MASK_12BITS)==0) || //sza sna
                          ((inst & PDP8_BIT_07) && (pdp8_LINK_AC & MASK_LINK))         //snl szl
                       ) ? 0 : PDP8_BIT_08)==(inst & PDP8_BIT_08))
     f26:	26 ff       	sbrs	r18, 6
     f28:	2b c0       	rjmp	.+86     	; 0xf80 <pdp8_execute+0x326>
                if ((inst & PDP8_BIT_11)==0)
                  {
                    //group 2

                    //sma sza snl spa sna szl
                    if (((((inst & PDP8_BIT_05) && (pdp8_LINK_AC & PDP8_BIT_00)) ||    //sma spa
     f2a:	d3 ff       	sbrs	r29, 3
     f2c:	29 c0       	rjmp	.+82     	; 0xf80 <pdp8_execute+0x326>
     f2e:	d9 01       	movw	r26, r18
     f30:	a8 70       	andi	r26, 0x08	; 8
     f32:	bb 27       	eor	r27, r27
     f34:	ea 17       	cp	r30, r26
     f36:	fb 07       	cpc	r31, r27
     f38:	89 f1       	breq	.+98     	; 0xf9c <pdp8_execute+0x342>
                          ((inst & PDP8_BIT_07) && (pdp8_LINK_AC & MASK_LINK))         //snl szl
                       ) ? 0 : PDP8_BIT_08)==(inst & PDP8_BIT_08))
                      pdp8_PC=(pdp8_PC+1) & MASK_12BITS;

                    //cla
                    if (inst & PDP8_BIT_04)
     f3a:	27 ff       	sbrs	r18, 7
     f3c:	02 c0       	rjmp	.+4      	; 0xf42 <pdp8_execute+0x2e8>
                      pdp8_LINK_AC&=MASK_LINK; //cla
     f3e:	cc 27       	eor	r28, r28
     f40:	d0 71       	andi	r29, 0x10	; 16

                    //osr hlt
                    if (inst & PDP8_BIT_09)
     f42:	22 ff       	sbrs	r18, 2
     f44:	02 c0       	rjmp	.+4      	; 0xf4a <pdp8_execute+0x2f0>
                      pdp8_LINK_AC|=pdp8_SR; //osr
     f46:	ca 29       	or	r28, r10
     f48:	db 29       	or	r29, r11

                    if (inst & PDP8_BIT_10)
     f4a:	21 ff       	sbrs	r18, 1
     f4c:	28 cf       	rjmp	.-432    	; 0xd9e <pdp8_execute+0x144>
     f4e:	50 93 10 21 	sts	0x2110, r21	; 0x802110 <pdp8_IF>
     f52:	60 93 13 21 	sts	0x2113, r22	; 0x802113 <pdp8_PC>
     f56:	70 93 14 21 	sts	0x2114, r23	; 0x802114 <pdp8_PC+0x1>
     f5a:	87 e0       	ldi	r24, 0x07	; 7
     f5c:	80 93 15 21 	sts	0x2115, r24	; 0x802115 <inst_op>
     f60:	d0 92 18 21 	sts	0x2118, r13	; 0x802118 <pdp8_DF>
     f64:	c0 93 16 21 	sts	0x2116, r28	; 0x802116 <pdp8_LINK_AC>
     f68:	d0 93 17 21 	sts	0x2117, r29	; 0x802117 <pdp8_LINK_AC+0x1>
     f6c:	c0 92 12 21 	sts	0x2112, r12	; 0x802112 <pdp8_IBR>
     f70:	00 93 0e 21 	sts	0x210E, r16	; 0x80210e <pdp8_MQ>
     f74:	10 93 0f 21 	sts	0x210F, r17	; 0x80210f <pdp8_MQ+0x1>
                      {
                        pdp8_HALT=1; //hlt              
     f78:	81 e0       	ldi	r24, 0x01	; 1
     f7a:	80 93 0d 21 	sts	0x210D, r24	; 0x80210d <pdp8_HALT>
                        return;
     f7e:	27 cf       	rjmp	.-434    	; 0xdce <pdp8_execute+0x174>
                if ((inst & PDP8_BIT_11)==0)
                  {
                    //group 2

                    //sma sza snl spa sna szl
                    if (((((inst & PDP8_BIT_05) && (pdp8_LINK_AC & PDP8_BIT_00)) ||    //sma spa
     f80:	25 fd       	sbrc	r18, 5
     f82:	84 c0       	rjmp	.+264    	; 0x108c <pdp8_execute+0x432>
                          ((inst & PDP8_BIT_06) && (pdp8_LINK_AC & MASK_12BITS)==0) || //sza sna
     f84:	24 ff       	sbrs	r18, 4
     f86:	02 c0       	rjmp	.+4      	; 0xf8c <pdp8_execute+0x332>
                          ((inst & PDP8_BIT_07) && (pdp8_LINK_AC & MASK_LINK))         //snl szl
     f88:	d4 fd       	sbrc	r29, 4
     f8a:	d1 cf       	rjmp	.-94     	; 0xf2e <pdp8_execute+0x2d4>
                       ) ? 0 : PDP8_BIT_08)==(inst & PDP8_BIT_08))
     f8c:	e8 e0       	ldi	r30, 0x08	; 8
     f8e:	f0 e0       	ldi	r31, 0x00	; 0
                if ((inst & PDP8_BIT_11)==0)
                  {
                    //group 2

                    //sma sza snl spa sna szl
                    if (((((inst & PDP8_BIT_05) && (pdp8_LINK_AC & PDP8_BIT_00)) ||    //sma spa
     f90:	d9 01       	movw	r26, r18
     f92:	a8 70       	andi	r26, 0x08	; 8
     f94:	bb 27       	eor	r27, r27
     f96:	ea 17       	cp	r30, r26
     f98:	fb 07       	cpc	r31, r27
     f9a:	79 f6       	brne	.-98     	; 0xf3a <pdp8_execute+0x2e0>
                          ((inst & PDP8_BIT_06) && (pdp8_LINK_AC & MASK_12BITS)==0) || //sza sna
                          ((inst & PDP8_BIT_07) && (pdp8_LINK_AC & MASK_LINK))         //snl szl
                       ) ? 0 : PDP8_BIT_08)==(inst & PDP8_BIT_08))
                      pdp8_PC=(pdp8_PC+1) & MASK_12BITS;
     f9c:	6f 5f       	subi	r22, 0xFF	; 255
     f9e:	7f 4f       	sbci	r23, 0xFF	; 255
     fa0:	7f 70       	andi	r23, 0x0F	; 15
     fa2:	cb cf       	rjmp	.-106    	; 0xf3a <pdp8_execute+0x2e0>
          //instruction
          switch (inst_op)
            {
             
              case INST_AND:
                pdp8_LINK_AC&=extmem_read_word(operand_field | operand_addr) | MASK_LINK;
     fa4:	2a 2b       	or	r18, r26
     fa6:	3b 2b       	or	r19, r27
     fa8:	0b b6       	in	r0, 0x3b	; 59
     faa:	e1 e0       	ldi	r30, 0x01	; 1
     fac:	eb bf       	out	0x3b, r30	; 59
     fae:	f9 01       	movw	r30, r18
     fb0:	ee 0f       	add	r30, r30
     fb2:	ff 1f       	adc	r31, r31
     fb4:	21 91       	ld	r18, Z+
     fb6:	30 81       	ld	r19, Z
     fb8:	0b be       	out	0x3b, r0	; 59
     fba:	30 61       	ori	r19, 0x10	; 16
     fbc:	c2 23       	and	r28, r18
     fbe:	d3 23       	and	r29, r19
     fc0:	01 97       	sbiw	r24, 0x01	; 1
     fc2:	ee ce       	rjmp	.-548    	; 0xda0 <pdp8_execute+0x146>
                break;

              case INST_TAD:
                pdp8_LINK_AC=(pdp8_LINK_AC+extmem_read_word(operand_field | operand_addr)) & MASK_13BITS;
     fc4:	2a 2b       	or	r18, r26
     fc6:	3b 2b       	or	r19, r27
     fc8:	0b b6       	in	r0, 0x3b	; 59
     fca:	e1 e0       	ldi	r30, 0x01	; 1
     fcc:	eb bf       	out	0x3b, r30	; 59
     fce:	f9 01       	movw	r30, r18
     fd0:	ee 0f       	add	r30, r30
     fd2:	ff 1f       	adc	r31, r31
     fd4:	21 91       	ld	r18, Z+
     fd6:	30 81       	ld	r19, Z
     fd8:	0b be       	out	0x3b, r0	; 59
     fda:	c2 0f       	add	r28, r18
     fdc:	d3 1f       	adc	r29, r19
     fde:	df 71       	andi	r29, 0x1F	; 31
     fe0:	01 97       	sbiw	r24, 0x01	; 1
     fe2:	de ce       	rjmp	.-580    	; 0xda0 <pdp8_execute+0x146>
                extmem_write_word(operand_field | operand_addr,pdp8_LINK_AC & MASK_12BITS);
                pdp8_LINK_AC&=MASK_LINK;
                break;

              case INST_JMS:
                extmem_write_word((pdp8_IF<<12) | operand_addr,pdp8_PC);
     fe4:	ea 2a       	or	r14, r26
     fe6:	fb 2a       	or	r15, r27
     fe8:	0b b6       	in	r0, 0x3b	; 59
     fea:	e1 e0       	ldi	r30, 0x01	; 1
     fec:	eb bf       	out	0x3b, r30	; 59
     fee:	f7 01       	movw	r30, r14
     ff0:	ee 0f       	add	r30, r30
     ff2:	ff 1f       	adc	r31, r31
     ff4:	61 93       	st	Z+, r22
     ff6:	70 83       	st	Z, r23
     ff8:	0b be       	out	0x3b, r0	; 59
                pdp8_PC=(operand_addr+1) & MASK_12BITS;
     ffa:	bd 01       	movw	r22, r26
     ffc:	6f 5f       	subi	r22, 0xFF	; 255
     ffe:	7f 4f       	sbci	r23, 0xFF	; 255
    1000:	7f 70       	andi	r23, 0x0F	; 15
                pdp8_IF=pdp8_IBR;
                break;
    1002:	5c 2d       	mov	r21, r12
    1004:	01 97       	sbiw	r24, 0x01	; 1
    1006:	cc ce       	rjmp	.-616    	; 0xda0 <pdp8_execute+0x146>
          //if the indirect bit is set, get the value from memory
          if (inst & PDP8_BIT_03)
            {         
              //increment auto increment memory locations if necessary
              if ((operand_addr & 07770)==00010)
                extmem_write_word((pdp8_IF<<12) | operand_addr,(extmem_read_word((pdp8_IF<<12) | operand_addr)+1) & MASK_12BITS);
    1008:	9d 01       	movw	r18, r26
    100a:	2e 29       	or	r18, r14
    100c:	3f 29       	or	r19, r15
    100e:	0b b6       	in	r0, 0x3b	; 59
    1010:	e1 e0       	ldi	r30, 0x01	; 1
    1012:	eb bf       	out	0x3b, r30	; 59
    1014:	f9 01       	movw	r30, r18
    1016:	ee 0f       	add	r30, r30
    1018:	ff 1f       	adc	r31, r31
    101a:	a1 91       	ld	r26, Z+
    101c:	b0 81       	ld	r27, Z
    101e:	0b be       	out	0x3b, r0	; 59
    1020:	11 96       	adiw	r26, 0x01	; 1
    1022:	bf 70       	andi	r27, 0x0F	; 15
    1024:	0b b6       	in	r0, 0x3b	; 59
    1026:	e1 e0       	ldi	r30, 0x01	; 1
    1028:	eb bf       	out	0x3b, r30	; 59
    102a:	f9 01       	movw	r30, r18
    102c:	ee 0f       	add	r30, r30
    102e:	ff 1f       	adc	r31, r31
    1030:	a1 93       	st	Z+, r26
    1032:	b0 83       	st	Z, r27
    1034:	0b be       	out	0x3b, r0	; 59
    1036:	76 ce       	rjmp	.-788    	; 0xd24 <pdp8_execute+0xca>
                else
                  {
                    //group 3

                    //cla
                    if (inst & PDP8_BIT_04)
    1038:	27 ff       	sbrs	r18, 7
    103a:	02 c0       	rjmp	.+4      	; 0x1040 <pdp8_execute+0x3e6>
                      pdp8_LINK_AC&=MASK_LINK; //cla
    103c:	cc 27       	eor	r28, r28
    103e:	d0 71       	andi	r29, 0x10	; 16

                    //mqa mql
                    t1=pdp8_MQ;
                    if (inst & PDP8_BIT_07)
    1040:	24 fd       	sbrc	r18, 4
    1042:	1f c0       	rjmp	.+62     	; 0x1082 <pdp8_execute+0x428>
    1044:	f8 01       	movw	r30, r16
                      {
                        pdp8_MQ=pdp8_LINK_AC & MASK_12BITS; //mql
                        pdp8_LINK_AC&=MASK_LINK;
                      }
                    if (inst & PDP8_BIT_05)
    1046:	26 ff       	sbrs	r18, 6
    1048:	02 c0       	rjmp	.+4      	; 0x104e <pdp8_execute+0x3f4>
                      pdp8_LINK_AC|=t1; //mqa
    104a:	c0 2b       	or	r28, r16
    104c:	d1 2b       	or	r29, r17
    104e:	8f 01       	movw	r16, r30
    1050:	01 97       	sbiw	r24, 0x01	; 1
    1052:	a6 ce       	rjmp	.-692    	; 0xda0 <pdp8_execute+0x146>
                    //iac
                    if (inst & PDP8_BIT_11)
                      pdp8_LINK_AC=(pdp8_LINK_AC+1) & MASK_13BITS; //iac
              
                    //rar ral rtr rtl bsw
                    switch (inst & (PDP8_BIT_08 | PDP8_BIT_09 | PDP8_BIT_10))
    1054:	28 30       	cpi	r18, 0x08	; 8
    1056:	31 05       	cpc	r19, r1
    1058:	09 f4       	brne	.+2      	; 0x105c <pdp8_execute+0x402>
    105a:	40 c0       	rjmp	.+128    	; 0x10dc <pdp8_execute+0x482>
    105c:	2a 30       	cpi	r18, 0x0A	; 10
    105e:	31 05       	cpc	r19, r1
    1060:	09 f0       	breq	.+2      	; 0x1064 <pdp8_execute+0x40a>
    1062:	9d ce       	rjmp	.-710    	; 0xd9e <pdp8_execute+0x144>
                        case PDP8_BIT_08:
                          pdp8_LINK_AC=((pdp8_LINK_AC>>1) | (pdp8_LINK_AC<<12)) & MASK_13BITS; //rar
                          break;

                        case PDP8_BIT_08 | PDP8_BIT_10:
                          pdp8_LINK_AC=((pdp8_LINK_AC>>2) | (pdp8_LINK_AC<<11)) & MASK_13BITS; //rtr
    1064:	9e 01       	movw	r18, r28
    1066:	36 95       	lsr	r19
    1068:	27 95       	ror	r18
    106a:	36 95       	lsr	r19
    106c:	27 95       	ror	r18
    106e:	dc 2f       	mov	r29, r28
    1070:	cc 27       	eor	r28, r28
    1072:	dd 0f       	add	r29, r29
    1074:	dd 0f       	add	r29, r29
    1076:	dd 0f       	add	r29, r29
    1078:	c2 2b       	or	r28, r18
    107a:	d3 2b       	or	r29, r19
    107c:	df 71       	andi	r29, 0x1F	; 31
    107e:	01 97       	sbiw	r24, 0x01	; 1
    1080:	8f ce       	rjmp	.-738    	; 0xda0 <pdp8_execute+0x146>

                    //mqa mql
                    t1=pdp8_MQ;
                    if (inst & PDP8_BIT_07)
                      {
                        pdp8_MQ=pdp8_LINK_AC & MASK_12BITS; //mql
    1082:	fe 01       	movw	r30, r28
    1084:	ff 70       	andi	r31, 0x0F	; 15
                        pdp8_LINK_AC&=MASK_LINK;
    1086:	cc 27       	eor	r28, r28
    1088:	d0 71       	andi	r29, 0x10	; 16
    108a:	dd cf       	rjmp	.-70     	; 0x1046 <pdp8_execute+0x3ec>
                  {
                    //group 2

                    //sma sza snl spa sna szl
                    if (((((inst & PDP8_BIT_05) && (pdp8_LINK_AC & PDP8_BIT_00)) ||    //sma spa
                          ((inst & PDP8_BIT_06) && (pdp8_LINK_AC & MASK_12BITS)==0) || //sza sna
    108c:	de 01       	movw	r26, r28
    108e:	bf 70       	andi	r27, 0x0F	; 15
    1090:	ab 2b       	or	r26, r27
    1092:	09 f0       	breq	.+2      	; 0x1096 <pdp8_execute+0x43c>
    1094:	77 cf       	rjmp	.-274    	; 0xf84 <pdp8_execute+0x32a>
                          ((inst & PDP8_BIT_07) && (pdp8_LINK_AC & MASK_LINK))         //snl szl
                       ) ? 0 : PDP8_BIT_08)==(inst & PDP8_BIT_08))
    1096:	e0 e0       	ldi	r30, 0x00	; 0
    1098:	f0 e0       	ldi	r31, 0x00	; 0
    109a:	49 cf       	rjmp	.-366    	; 0xf2e <pdp8_execute+0x2d4>
                        case PDP8_BIT_09 | PDP8_BIT_10:
                          pdp8_LINK_AC=((pdp8_LINK_AC<<2) | ((pdp8_LINK_AC>>11) & 03)) & MASK_13BITS; //rtl
                          break;

                        case PDP8_BIT_10:
                          pdp8_LINK_AC=( ((pdp8_LINK_AC>>6)&00077) | ((pdp8_LINK_AC<<6)&07700) | (pdp8_LINK_AC & MASK_LINK) ) & MASK_13BITS; //bsw
    109c:	fe 01       	movw	r30, r28
    109e:	00 24       	eor	r0, r0
    10a0:	ee 0f       	add	r30, r30
    10a2:	ff 1f       	adc	r31, r31
    10a4:	00 1c       	adc	r0, r0
    10a6:	ee 0f       	add	r30, r30
    10a8:	ff 1f       	adc	r31, r31
    10aa:	00 1c       	adc	r0, r0
    10ac:	ef 2f       	mov	r30, r31
    10ae:	f0 2d       	mov	r31, r0
    10b0:	ef 73       	andi	r30, 0x3F	; 63
    10b2:	ff 27       	eor	r31, r31
    10b4:	9e 01       	movw	r18, r28
    10b6:	00 24       	eor	r0, r0
    10b8:	36 95       	lsr	r19
    10ba:	27 95       	ror	r18
    10bc:	07 94       	ror	r0
    10be:	36 95       	lsr	r19
    10c0:	27 95       	ror	r18
    10c2:	07 94       	ror	r0
    10c4:	32 2f       	mov	r19, r18
    10c6:	20 2d       	mov	r18, r0
    10c8:	20 7c       	andi	r18, 0xC0	; 192
    10ca:	3f 70       	andi	r19, 0x0F	; 15
    10cc:	2e 2b       	or	r18, r30
    10ce:	3f 2b       	or	r19, r31
    10d0:	cc 27       	eor	r28, r28
    10d2:	d0 71       	andi	r29, 0x10	; 16
    10d4:	c2 2b       	or	r28, r18
    10d6:	d3 2b       	or	r29, r19
    10d8:	01 97       	sbiw	r24, 0x01	; 1
    10da:	62 ce       	rjmp	.-828    	; 0xda0 <pdp8_execute+0x146>
              
                    //rar ral rtr rtl bsw
                    switch (inst & (PDP8_BIT_08 | PDP8_BIT_09 | PDP8_BIT_10))
                      {
                        case PDP8_BIT_08:
                          pdp8_LINK_AC=((pdp8_LINK_AC>>1) | (pdp8_LINK_AC<<12)) & MASK_13BITS; //rar
    10dc:	9e 01       	movw	r18, r28
    10de:	36 95       	lsr	r19
    10e0:	27 95       	ror	r18
    10e2:	dc 2f       	mov	r29, r28
    10e4:	cc 27       	eor	r28, r28
    10e6:	d2 95       	swap	r29
    10e8:	d0 7f       	andi	r29, 0xF0	; 240
    10ea:	c2 2b       	or	r28, r18
    10ec:	d3 2b       	or	r29, r19
    10ee:	df 71       	andi	r29, 0x1F	; 31
    10f0:	01 97       	sbiw	r24, 0x01	; 1
    10f2:	56 ce       	rjmp	.-852    	; 0xda0 <pdp8_execute+0x146>
                      if (inst & PDP8_BIT_11)
                        pdp8_DF=(inst & 00070)>>3;
                      if (inst & PDP8_BIT_10)
                        pdp8_IBR=(inst & 00070)>>3;
                      if (inst & PDP8_BIT_09)
                        switch ((inst & 00070)>>3)
    10f4:	21 30       	cpi	r18, 0x01	; 1
    10f6:	31 05       	cpc	r19, r1
    10f8:	09 f0       	breq	.+2      	; 0x10fc <pdp8_execute+0x4a2>
    10fa:	51 ce       	rjmp	.-862    	; 0xd9e <pdp8_execute+0x144>
                          {
                            case 01:
                              //rdf
                              pdp8_LINK_AC|=pdp8_DF << 3;
    10fc:	2d 2d       	mov	r18, r13
    10fe:	30 e0       	ldi	r19, 0x00	; 0
    1100:	22 0f       	add	r18, r18
    1102:	33 1f       	adc	r19, r19
    1104:	22 0f       	add	r18, r18
    1106:	33 1f       	adc	r19, r19
    1108:	22 0f       	add	r18, r18
    110a:	33 1f       	adc	r19, r19
    110c:	c2 2b       	or	r28, r18
    110e:	d3 2b       	or	r29, r19
    1110:	01 97       	sbiw	r24, 0x01	; 1
    1112:	46 ce       	rjmp	.-884    	; 0xda0 <pdp8_execute+0x146>
                        case PDP8_BIT_09:
                          pdp8_LINK_AC=((pdp8_LINK_AC<<1) | ((pdp8_LINK_AC>>12) & 01)) & MASK_13BITS; //ral
                          break;

                        case PDP8_BIT_09 | PDP8_BIT_10:
                          pdp8_LINK_AC=((pdp8_LINK_AC<<2) | ((pdp8_LINK_AC>>11) & 03)) & MASK_13BITS; //rtl
    1114:	9e 01       	movw	r18, r28
    1116:	23 2f       	mov	r18, r19
    1118:	33 27       	eor	r19, r19
    111a:	26 95       	lsr	r18
    111c:	26 95       	lsr	r18
    111e:	26 95       	lsr	r18
    1120:	23 70       	andi	r18, 0x03	; 3
    1122:	33 27       	eor	r19, r19
    1124:	cc 0f       	add	r28, r28
    1126:	dd 1f       	adc	r29, r29
    1128:	cc 0f       	add	r28, r28
    112a:	dd 1f       	adc	r29, r29
    112c:	c2 2b       	or	r28, r18
    112e:	d3 2b       	or	r29, r19
    1130:	df 71       	andi	r29, 0x1F	; 31
    1132:	01 97       	sbiw	r24, 0x01	; 1
    1134:	35 ce       	rjmp	.-918    	; 0xda0 <pdp8_execute+0x146>
                              pdp8_LINK_AC|=pdp8_IF << 3;
                              break;

                            case 03:
                              //rib
                              pdp8_LINK_AC|=pdp8_SFR;
    1136:	c8 29       	or	r28, r8
    1138:	d9 29       	or	r29, r9
    113a:	01 97       	sbiw	r24, 0x01	; 1
    113c:	31 ce       	rjmp	.-926    	; 0xda0 <pdp8_execute+0x146>
                              pdp8_LINK_AC|=pdp8_DF << 3;
                              break;

                            case 02:
                              //rif
                              pdp8_LINK_AC|=pdp8_IF << 3;
    113e:	aa 0f       	add	r26, r26
    1140:	bb 1f       	adc	r27, r27
    1142:	aa 0f       	add	r26, r26
    1144:	bb 1f       	adc	r27, r27
    1146:	aa 0f       	add	r26, r26
    1148:	bb 1f       	adc	r27, r27
    114a:	ca 2b       	or	r28, r26
    114c:	db 2b       	or	r29, r27
    114e:	01 97       	sbiw	r24, 0x01	; 1
    1150:	27 ce       	rjmp	.-946    	; 0xda0 <pdp8_execute+0x146>

 


alank2 wrote:
This is at a baud friendly 29.4912 MHz, but I forgot that the scalable UART on the XMEGA can do very nice baud rates even at a 32 MHz clock.
USB wouldn't be retro cheeky

An FBRG (fractional baud rate generator) is also in the unified-memory AVRs, along with auto-baud; both are also in the more recent USB PICs, along with an accurate crystal-less oscillator.

Some USB UARTs can create an accurate reference clock from the USB SOF signal that the XMEGA can feed into its PLL (48 MHz / 2^n, i.e. 12 MHz or less, into the PLL).

 

PIC18F25K50 - Microcontrollers and Processors

 

"Dare to be naïve." - Buckminster Fuller


When I was making (for fun) an AVR emulator running on an AVR (so that code could live in EEPROM, RAM, etc.), I did it by reading the two bytes of the instruction (high byte into ZL, low byte into a normal register, with ZH holding a known value) and doing an IJMP into a table of rjmps to the "instruction" handler, which then did the register decoding etc. As I remember, most instructions took about 15-20 clk (nop about 10) on a normal AVR, so at 32 MHz that would be close to 2 MHz.
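For the pdp8 case, roughly the same table-dispatch idea can be sketched in C as an array of function pointers indexed by the top 3 bits of the 12-bit instruction. This is only a sketch under assumptions: `cpu_t`, `op_handler`, `handlers` and `dispatch` are made-up names, only AND and TAD are filled in (following the expressions visible in the listing), and the other six slots are stubs.

```c
#include <stdint.h>

/* Hypothetical CPU state for illustration; not the layout used in the attached project. */
typedef struct {
    uint16_t link_ac;  /* 13 bits: link in bit 12, AC in bits 0..11 */
    uint16_t pc;       /* 12-bit program counter */
} cpu_t;

/* Every handler gets the raw instruction and the already-fetched operand word. */
typedef void (*op_handler)(cpu_t *cpu, uint16_t inst, uint16_t operand);

static void op_and(cpu_t *c, uint16_t inst, uint16_t v)
{
    (void)inst;
    c->link_ac &= (uint16_t)(v | 010000u);          /* AND must not clear the link */
}

static void op_tad(cpu_t *c, uint16_t inst, uint16_t v)
{
    (void)inst;
    c->link_ac = (uint16_t)((c->link_ac + v) & 017777u); /* carry lands in the link */
}

/* Placeholder for ISZ, DCA, JMS, JMP, IOT and OPR in this sketch. */
static void op_stub(cpu_t *c, uint16_t inst, uint16_t v)
{
    (void)c; (void)inst; (void)v;
}

/* One entry per major opcode (instruction bits 0..2, PDP-8 numbering). */
static const op_handler handlers[8] = {
    op_and, op_tad, op_stub, op_stub, op_stub, op_stub, op_stub, op_stub
};

static void dispatch(cpu_t *c, uint16_t inst, uint16_t operand)
{
    handlers[(inst >> 9) & 7u](c, inst, operand);
}
```

A call such as `dispatch(&cpu, inst, word)` replaces the outer `switch (inst_op)`. The compiler would usually turn the table lookup into an indirect call, but whether that actually beats the switch on the XMEGA is something to check in the LSS and with a benchmark.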


sparrow2 I love the idea of that!


This topic is way over my head...

 

But, that said, is the Xmega going to use any of its analog modules, (ADC, DAC, AC, etc.), or the EEPROM?

 

If not, then you might use a 2 or 4 MHz external Xtal and the PLL to bump the Xmega's clock up a little bit past its 32 MHz in-spec rating.

 

Atomic Zombie swears they run in the low 60 MHz range, (at room temperature).

 

I ran a project in the upper 40 MHz range, (room temperature), without difficulty.

 

Obviously it depends on your end game.

How many do you wish to make?

Where are they expected, (temperature wise), to operate?

Are you willing to run your own test code to validate that the chip will run properly for you at X MHz?

Are you willing to sample a few chips to weed out any that don't like running at the increased clock frequency?

 

If you needed the EEPROMs for start up data, then with the Xmega it is obviously easy to start up at 2 MHz, switch to 32 MHz and run your EEPROM reads, and then switch to the overclocked frequency for the rest of the program.

 

Just a thought or two since you seem to be very close to hitting your target emulation rate!

 

JC 


Thanks JC - it doesn't have to be perfect, just in the ballpark, so I am satisfied already.  I _didn't_ have to attempt any assembly code, though that would probably teach me quite a bit if I could pull it off.

 

Good point on overclocking it - I'm not using any of those things, so that might be an option to explore!


I have never looked at the pdp8 before, so I had a look at the wiki page.

 

How fast is an OPR instruction: is it always executed as 1 instruction, or 1 for each bit that is 1, or does it depend?

 

  


I can't help noticing a lot of shifts going on in the instruction decode. Could this be optimised by a one time set of shifting to get the various parts of the instruction into separate pieces then use those?
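As an illustration of that suggestion: in the listing above, `(inst & 00070)>>3` is recomputed with three 16-bit lsr/ror pairs in each of the three 62XX branches. A minimal sketch of pulling the fields out once, into 8-bit values, could look like this. `decoded_t`, `decode` and the field names are hypothetical, not taken from the project.

```c
#include <stdint.h>

/* Hypothetical decoded form of one 12-bit instruction; names are illustrative only. */
typedef struct {
    uint8_t op;     /* major opcode, instruction bits 0..2 (value 0..7) */
    uint8_t lo8;    /* low 8 bits of the instruction, for single-bit tests */
    uint8_t field;  /* (inst & 00070) >> 3, as used by the 62XX IOT group */
    uint8_t low3;   /* inst & 00007 */
} decoded_t;

static decoded_t decode(uint16_t inst)
{
    decoded_t d;
    d.op    = (uint8_t)((inst >> 9) & 7u);
    d.lo8   = (uint8_t)inst;
    d.field = (uint8_t)((d.lo8 >> 3) & 7u);  /* shift done on a byte, once */
    d.low3  = (uint8_t)(d.lo8 & 7u);
    return d;
}
```

The handlers would then test `d.field` and `d.low3` instead of re-shifting `inst`. Whether this is a net win depends on register pressure in `pdp8_execute`, so it would need to be measured rather than assumed.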


sparrow2 - the OPR instruction has different groups in it based on how it is configured, and within each of those groups it has multiple operations that occur in a fixed order.  For example, group 1 has these subgroups/phases: cla cll, cma cml, iac, rar ral rtr rtl bsw (separated by commas).
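The group 1 ordering above can be sketched as a single function that applies the phases in sequence to the 13-bit link+AC value. This is a sketch, not the project's code: `opr_group1` is a made-up name, the mask values are inferred from the disassembly (for example `andi r29, 0x10` for MASK_LINK), the bit tests are written as octal literals rather than the PDP8_BIT_xx defines, and the rotate expressions are the ones shown in the listing.

```c
#include <stdint.h>

/* Values inferred from the disassembly; assumed to match the project's defines. */
#define MASK_12BITS 07777u
#define MASK_13BITS 017777u
#define MASK_LINK   010000u

/* Applies one group 1 OPR instruction to link_ac and returns the new value. */
static uint16_t opr_group1(uint16_t link_ac, uint16_t inst)
{
    /* phase 1: cla cll */
    if (inst & 00200u) link_ac &= MASK_LINK;    /* cla */
    if (inst & 00100u) link_ac &= MASK_12BITS;  /* cll */

    /* phase 2: cma cml */
    if (inst & 00040u) link_ac ^= MASK_12BITS;  /* cma */
    if (inst & 00020u) link_ac ^= MASK_LINK;    /* cml */

    /* phase 3: iac */
    if (inst & 00001u) link_ac = (uint16_t)((link_ac + 1u) & MASK_13BITS);

    /* phase 4: rar ral rtr rtl bsw */
    switch (inst & 00016u) {
    case 00010u: /* rar */
        link_ac = (uint16_t)(((link_ac >> 1) | (link_ac << 12)) & MASK_13BITS);
        break;
    case 00012u: /* rtr */
        link_ac = (uint16_t)(((link_ac >> 2) | (link_ac << 11)) & MASK_13BITS);
        break;
    case 00004u: /* ral */
        link_ac = (uint16_t)(((link_ac << 1) | ((link_ac >> 12) & 01u)) & MASK_13BITS);
        break;
    case 00006u: /* rtl */
        link_ac = (uint16_t)(((link_ac << 2) | ((link_ac >> 11) & 03u)) & MASK_13BITS);
        break;
    case 00002u: /* bsw */
        link_ac = (uint16_t)((((link_ac >> 6) & 00077u) | ((link_ac << 6) & 07700u) |
                              (link_ac & MASK_LINK)) & MASK_13BITS);
        break;
    default:
        break;
    }
    return link_ac;
}
```

For example, `opr_group1(x, 07305)` (CLA CLL IAC RAL) goes through clear, increment and rotate in one call and leaves 2 in AC with the link clear, whatever `x` was.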

 

clawson - thanks for finding that - I will look into it - I did find that decoding the main instruction (first 3 bits of the 12 bit instruction) into a uint8_t did help a bit, but as for the other groups of bits that are looked at (probably mostly for OPR/IOT), I haven't dug into them too much.  This may be where you are seeing the excessive shifting, which I could eliminate if I analyze it and see what can be done.

 

edit: I had tried making the _IF and _DF both uint16_t, shifting them once and storing them that way instead of as uint8_t, but it didn't improve it...

Last Edited: Fri. Apr 26, 2019 - 06:35 PM

My question was: does an instruction like:

CLA CLL IAC RAL  

That basically means load AC with 2; does it take 1 clk or 4?

 

It's just that if you want to make it timing-correct, perhaps you should optimize some instructions with a LUT, or perhaps keep some low registers holding special fixed values, and write simple code for the other instructions.

 

About odd shifts: other than LUTs, sometimes the MUL instruction is good (mul with a power of 2); it will split a byte into two bytes with the rest of the bits cleared.
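The MUL trick can be checked in plain C: multiplying a byte by 2^k gives a 16-bit product whose high byte is the byte shifted right by (8 - k) and whose low byte is the remaining bits shifted left by k. A small sketch, with `split_t` and `split_by_mul` as made-up names:

```c
#include <stdint.h>

typedef struct {
    uint8_t hi;  /* b >> (8 - k) */
    uint8_t lo;  /* (b << k) truncated to 8 bits */
} split_t;

/* Multiply b by 2^k (k = 1..7) and return the two halves of the 16-bit product. */
static split_t split_by_mul(uint8_t b, uint8_t k)
{
    uint16_t p = (uint16_t)((uint16_t)b * (uint16_t)(1u << k));
    split_t s;
    s.hi = (uint8_t)(p >> 8);
    s.lo = (uint8_t)p;
    return s;
}
```

So `(inst & 00070) >> 3` on the low byte is the high half of a multiply by 32: `split_by_mul(0x38, 5).hi` is 7. On the AVR the product of MUL lands in r1:r0, but whether the compiler emits a single MUL for C code like this is not guaranteed, so it is worth checking the LSS.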


I think it was done in 1 instruction cycle.

 

Details in the small computer handbook, page 3-17 onward.

 

https://archive.org/details/bits...

 

Last Edited: Sun. Apr 28, 2019 - 01:51 AM

So, I got it doing something last night:

 

https://youtu.be/C3gzhxVC_g0

 


How cool!

 

Maybe Version 2 will have big old toggle / paddle switches on the "panel" !

 

JC


Thanks JC - I appreciate the feedback.

 

I've already got the more authentic big version - Oscar's PiDP-8:

 

https://obsolescence.wixsite.com...

 


Very nice Kartman - that is indeed SMALL!!!