As the execution units mature and get integrated as one block, things become clear, at least concerning the computation instructions. I'm currently focusing on the 16-bit flavour of YASEP and I expect that the following will hold true for YASEP32.

The ALU16 is nearing completion, though feature creep is still rampant. But I have identified a bunch of instructions that will not change much in the future, and they are gathered here :

- ROP2 : AND, OR, XOR, ANDN, ORN, XNOR, NAND, NOR
- ASU : ADD, SUB, ADDS1, SUBB1, ADDS2, SUBS2, MIN, MAX
- SHL : SHR, SHL, ROR, ROL, SAR  + MUL : MUL8L, MUL8H, MULINIT
- IE : MOV, SB, LSB, LZB (16/32b) SH, SHH, LSH, LZH (32 bits only)

This nice and square table represents the large majority of the used instructions, and this fits into 4 groups of 8 instead of the planned 8 groups. So...

This saves a bit that is used to encode other addressing modes. In 2008, there were 2 modes : short mode (RR) and long mode (RRImm16). Now, it is also possible to encode a short immediate in the short mode (RImm4, the register is replaced by a value), or use another register as a destination in the long mode (but 12 bits are unused).

Yes there are now 4 addressing modes and most code should feel their binary size shrink ! Furthermore, the datapath complexity is not impacted and the 3-registers version should reduce the number of cycles for a given portion of code.

How this affects usual code :

- add 1, r1 ==> r1 += 1

now takes 2 bytes instead of 4. The constant can range from -8 to +7.

- add r1, r2, r3 ==> r1 = r2 + r3

It takes 4 bytes as previously but it saves 1 clock cycle, compared to

- mov r2, r1
- add r3, r1

Note that the yasep.org site is not yet updated, I'll wait until things settle down.