(edit 20140207: some stupid typos crept into the tables)
(edit 2014-02-08 : dropping all the pre-modifications)
The instruction set of the YASEP architecture is finally frozen, after years
of fine-tuning and exploration !
In August 2013, during a discussion with JCH, I came up with a new encoding
for the 4 remaining bits of the extended instructions that were reserved for
register auto-updates. I've been struggling with the one big shortcoming of the
architecture : the very limited range of Imm4, particularly for conditional
relative jumps. I had hacked a few tricks but none were really satisfying.
JCH pointed to some auto-update codes that didn't make sense in combination
with other flags, and that's how he found a way to get 2 more bits for the
immediate field.
I tried to simplify the system down to a few simpler codes, following these
rules :
- A little reminder : when a D register (a memory access register) is
referenced, it's the corresponding A register that gets updated, according to
the size of the accessed word (1, 2 or 4 bytes). Otherwise, A registers are
incremented by 2 or 4 bytes (depending on the datapath width, 16 or 32 bits)
and R registers are incremented by 1. It's not very orthogonal but quite
- Any register may be post-incremented or post-decremented with one
instruction (handy for string/vector code)
- There must be "room" for 2 Imm bits and it should not break existing
compiled code (NOP=0000)
- Any of the 4 register fields may be affected
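
The reminder in the first rule can be written down as a small function. This is only a sketch of the rule as stated above: the function name, the "R"/"A"/"D" labels and the parameters are hypothetical and are not part of any YASEP tool.

```python
def post_update_step(reg_kind, datapath_bits=16, access_bytes=None):
    """Return the amount by which one auto-update changes a register.

    reg_kind      -- "D" (memory access register), "A" or "R" (hypothetical labels)
    datapath_bits -- 16 or 32, the width of the datapath
    access_bytes  -- size of the accessed word, only used for D registers
    """
    if reg_kind == "D":
        # Referencing Dx updates the matching Ax by the size of the access.
        if access_bytes not in (1, 2, 4):
            raise ValueError("a D access is 1, 2 or 4 bytes wide")
        return access_bytes
    if reg_kind == "A":
        # An A register used directly steps by one datapath word.
        if datapath_bits not in (16, 32):
            raise ValueError("the datapath is 16 or 32 bits wide")
        return datapath_bits // 8
    if reg_kind == "R":
        # Plain R registers are counters: they step by 1.
        return 1
    raise ValueError("unknown register kind: %r" % (reg_kind,))
```

A post-decrement would use the same amount with the opposite sign.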
The important trick that JCH found is that the Imm/Reg field invalidates
certain auto-updates and frees some bits. In particular, it makes no sense to
update SI4 when this source operand is immediate, so SI4 is associated with NOP
in certain cases.
There is very little room and I had to make some compromises. For example,
the CND field can't be updated when other registers are. Pre-increments
are also avoided (see why at the bottom). It's not possible to increment one
register and decrement another.
The resulting format provides Imm6 and one post-update for all extended
instructions, and one to three post-updates when no immediate is present.
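
To give an idea of what the 2 extra bits buy, here is a quick calculation of the values a 4-bit and a 6-bit immediate can hold. How an instruction interprets the field (unsigned, or signed for relative jumps) is an assumption here, so both readings are shown; the helper names are made up for this sketch.

```python
def unsigned_range(bits):
    """Smallest and largest value of an unsigned field of the given width."""
    return (0, (1 << bits) - 1)

def signed_range(bits):
    """Smallest and largest value of a two's complement field of the given width.

    ASSUMPTION: two's complement is used only as an illustration of a signed
    reading; the text above does not say how the immediate is interpreted.
    """
    half = 1 << (bits - 1)
    return (-half, half - 1)

for name, bits in (("Imm4", 4), ("Imm6", 6)):
    print(name, "unsigned:", unsigned_range(bits), "signed:", signed_range(bits))
```

Either way, the number of reachable values goes from 16 to 64.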
- iRR instructions use 2 bits to encode Imm6 along with 2 bits for updates
11 CND- (this helps loops)
- RRR instructions use 4 bits to encode more complex updates
          00           01               10      11
    00    NOP          SND+,SI4+,DST+   SI4-    SI4+
    01    SND-,SI4-    SND+,SI4+        SND-    SND+
    10    DST-,SI4-    DST+,SI4+        DST-    DST+
    11    DST-,SND-    DST+,SND+        CND-    CND+
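
A decoder for the RRR table can be a plain lookup. The sketch below copies the 16 entries from the table; which 2 bits select the row and which select the column is not stated here, so the split used in the code (row = upper bits, column = lower bits) is an assumption, and the names are hypothetical.

```python
# Entries copied from the RRR table. An empty tuple stands for NOP.
# ASSUMPTION: the row is selected by the two upper bits of the 4-bit field
# and the column by the two lower bits; swap the indices if it is the reverse.
RRR_UPDATES = (
    ((),               ("SND+", "SI4+", "DST+"), ("SI4-",), ("SI4+",)),
    (("SND-", "SI4-"), ("SND+", "SI4+"),         ("SND-",), ("SND+",)),
    (("DST-", "SI4-"), ("DST+", "SI4+"),         ("DST-",), ("DST+",)),
    (("DST-", "SND-"), ("DST+", "SND+"),         ("CND-",), ("CND+",)),
)

def decode_rrr_update(code):
    """Return the tuple of post-updates selected by a 4-bit update field."""
    if not 0 <= code <= 0b1111:
        raise ValueError("the update field is 4 bits wide")
    return RRR_UPDATES[code >> 2][code & 0b11]

def check_constraints():
    """Check the compromises listed in the text against every table entry."""
    for code in range(16):
        updates = decode_rrr_update(code)
        # One to three post-updates, or none for NOP.
        assert len(updates) <= 3
        # Never an increment and a decrement in the same instruction.
        assert len({u[-1] for u in updates}) <= 1
        # CND is never updated together with another register.
        if any(u.startswith("CND") for u in updates):
            assert len(updates) == 1
    return True
```

Code 0000 still decodes to NOP, which is what keeps existing compiled code working.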
- The big advantage of this encoding is that it increases code density for a
lot of very common sequences : stack manipulation, string/vector processing,
counters... Code density increase does not always mean faster execution but it
helps. Different microarchitectures might implement these flags with different
approaches (serial or parallel)
- There are several drawbacks as well : the encoding favors density over
decoding ease (but what can we do with only 4 bits ?). The new encoding also
breaks Imm4, so a new assembler must be written from scratch (the current one
is aging and its flexibility has been stretched to its limits).
- In the end, it is progress :
  - Code density will increase again (maybe 20%).
  - Auto-updates are an optional feature but we have freed 2 Imm6 bits
for general consumption. This will benefit all the YASEPs out there (which must
be updated, fortunately there are not a lot yet ;-D ). Post-update is a first
level of compatibility, and pre-update is more difficult to implement so it's a
second level (less expected to be available).
  - This helps extend the range of PC-relative conditional jumps
  - This solves the limitation of Shift/rotate operations
  - No more unused bits in the instructions !
- Some questions remain :
  - It makes sense to update the A register of a D register that has just been
written to (to update the destination for the next write, in a string-copy
sequence for example). What about the case where an instruction writes to an R
register with post-increment ? What is the priority ? Auto-updates were
initially meant for address registers only but later extended, should this be
restricted again ? If so, would that break even more symmetry and create more
Right now, the priority is to rewrite the assembler/disassembler and keep the
simulator and VHDL up-to-date. My work system is in a bad state and it will
take time to get everything back in order.
Why no pre-increment or pre-decrement ?
Pre-modifications were removed because they break the very important rule that an
instruction should not trap (or be able to trap) in the middle of its execution.
In the case of pre-modifying an address register, such as MOV -D1, R1, the
validity of the new address in A1 is known only after it has been computed, but
there is no way to gracefully stop the instruction in the middle or even
restart it. The proper way to do it is to move the -D1 update into a previous
instruction, or simply emit a short ADD instruction before the actual move to
R1.
Remember : all the operands must be directly ready for use (at decode
stage) before the instruction can proceed to execution.
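
Here is a toy model of that rule, only to illustrate the reasoning. It is not the YASEP simulator: the memory is a Python list, the register file is a dict, and every name (Trap, check_address, add_imm, load_post_update) is hypothetical.

```python
class Trap(Exception):
    """Raised when an access is rejected (toy model only)."""

def check_address(mem, addr):
    # In this toy model an address is valid if it indexes the memory list.
    if not 0 <= addr < len(mem):
        raise Trap("invalid address %d" % addr)

def add_imm(regs, name, imm):
    """A short ADD: plain register arithmetic, it never touches memory."""
    regs[name] += imm

def load_post_update(mem, regs, step=0):
    """Load R1 from the address held in A1, then post-update A1 by step.

    The address is already in A1 when the instruction is decoded, so it is
    checked before any state changes: the instruction either traps right at
    the start or runs to completion.
    """
    addr = regs["A1"]
    check_address(mem, addr)   # operand is ready at decode stage
    regs["R1"] = mem[addr]
    regs["A1"] = addr + step   # the update itself cannot trap

def load_with_separate_pre_decrement(mem, regs):
    """What a pre-decrementing load becomes: an ADD, then a plain load.

    ASSUMPTION: a step of -1 models a one-byte access.
    """
    add_imm(regs, "A1", -1)
    load_post_update(mem, regs)
```

If the decremented address turns out to be invalid, the trap happens at the start of the second instruction: the ADD is already complete and the load has changed nothing, so the sequence can be stopped or restarted cleanly.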
The previous table was :
          00           01      10      11
    00    NOP          +SI4    SI4+    SI4-
    01    SND+,SI4+    +SND    SND+    SND-
    10    DST+,SI4+    +DST    DST+    DST-
    11    DST+,SND+    +CND    CND+    CND-
The new table uses the 4 pre-inc entries for 2-post-decrement and