Things are getting quite messy in the architecture now. Sorry for the long
It started with the Carry flag. I turned the problem over in every direction
and from every perspective, and couldn't find a proper way to deal with it. The Golden Rule of
all modern architectures is : no f****** carry flag or status register. MIPS
has an overflow trap, F-CPU has a 2-reads-2-writes instruction. But the YASEP
can't afford this luxury...
I thought I had solved it with a dirty trick : storing the carry flag in the
Program Counter's Least Significant Bit (PC's bit #0, to make the number odd or
even). But it was really too ugly (I want this critical bit for other
purposes later) so I moved the carry bit somewhere else, out of the
It's still messy but... OK, it still works. For example, if an exception
occurs, it's still easy for the same HW thread to save and restore the Carry
flag's value. It's still longer than the single byte needed to save all the
flags of an x86 CPU (PUSHF) but hey, we're RISC aren't we ?...
; save Carry to R1 :
mov 0 R1
mov 1 R1 CARRY
; restore carry from R1 :
add -1 R1
This works for a few entangled reasons. The carry flag is updated by only a
handful of adder-specific instructions. This means that it is not updated
by mundane MOV instructions, for example, if the register set is manually saved
or restored. The PC's bit #0 can also host the carry flag when it is
automatically saved in hardware for an exception handler (that is, if I totally
forget the fact that it's SMT and the handler could simply execute from another
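The restore step in the listing above relies on nothing more than the carry-out of a plain addition. Here is a minimal Python sketch of that arithmetic, assuming a 16-bit datapath, an immediate -1 that is sign-extended to 0xFFFF, and a carry flag equal to the adder's carry-out. The helper names `add16` and `restore_carry` are made up for this illustration and are not part of any YASEP tool.

```python
MASK16 = 0xFFFF  # assumption: 16-bit datapath


def add16(a, b):
    """Hypothetical model of a 16-bit ADD: returns (result, carry_out)."""
    total = (a & MASK16) + (b & MASK16)
    return total & MASK16, total >> 16


def restore_carry(r1):
    """Model of 'add -1 R1': the carry-out reproduces the saved flag (0 or 1)."""
    _, carry = add16(r1, -1)  # -1 & 0xFFFF == 0xFFFF
    return carry
```

With R1 = 1 the sum wraps around and produces a carry-out; with R1 = 0 it does not, so the flag comes back with the value that was saved.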
And before the carry stuff, there was no such flag in the early YASEP/VSP,
true to the Holy Dogma. There were carry-generating versions of ADD and SUB,
where ADDC and SUBB would set the destination to 0 or 1 depending on the
result's overflow. The problem is that unlike most RISC architectures, the
YASEP does not have a lot of available registers. ADD and SUB are often used to
compare values' magnitudes and the results require a temporary register. Since
the YASEP has only 5 "normal" registers, it means that 20% of that space gets
screwed up. Compare that to about 3% (1 register out of 32) for MIPS...
ADDC and SUBB were thus replaced by CMPU and CMPS around 2009. They
work by adjusting sign bits, subtracting, and not writing the result, so a
register is saved. The result goes to the carry flag, which is read by a new
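The "adjust sign bits, subtract, discard the result" idea can be sketched in Python as follows. This is only a behavioural guess under stated assumptions: a 16-bit datapath, a flag that is 1 when the first operand is smaller than the second (the real YASEP polarity and operand order may differ), and function names `cmpu` and `cmps` chosen to echo the opcodes rather than copied from any specification.

```python
MASK16 = 0xFFFF  # assumption: 16-bit datapath
SIGN16 = 0x8000  # position of the sign bit


def cmpu(a, b):
    """Unsigned compare: subtract, drop the difference, keep the borrow."""
    diff = (a & MASK16) - (b & MASK16)
    return 1 if diff < 0 else 0  # assumed polarity: 1 means a < b


def cmps(a, b):
    """Signed compare: flip both sign bits, then compare as unsigned."""
    return cmpu(a ^ SIGN16, b ^ SIGN16)
```

Flipping the sign bit maps the signed range onto the unsigned range in the same order, which is why a single subtractor can serve both comparisons without a destination register.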
OK, so now the YASEP has a Carry flag. It has found a nice place in the
condition codes map. But wait, there is another condition bit available...
At first, I thought that it could be useful as an "overflow" flag (that is :
if the result's sign was different from the operands' signs). But to this day,
I have yet to find a case where it is useful. Furthermore, the CMPS and CMPU
instructions already deal with the operands' types.
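For reference, the "overflow" condition described above can be written down in a few lines. This is a generic two's-complement sketch for a 16-bit addition, not YASEP code, and the name `add_overflows` is invented for the example.

```python
MASK16 = 0xFFFF  # assumption: 16-bit datapath
SIGN16 = 0x8000  # position of the sign bit


def add_overflows(a, b):
    """1 when both operands share a sign and the 16-bit sum has the other sign."""
    a &= MASK16
    b &= MASK16
    result = (a + b) & MASK16
    same_sign_operands = ~(a ^ b)   # sign bit set when operand signs match
    sign_changed = a ^ result       # sign bit set when the result's sign differs
    return 1 if (same_sign_operands & sign_changed & SIGN16) else 0
```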
Another useful bit would be a flag that indicates whether a critical section
(opened by the CRIT opcode) had been aborted. This is the quintessential flag
because it is totally context-dependent and write-only. No need to save it or
restore it : if a trap occurs inside a critical section, just flush the bit and
hopefully the critical section will be restarted by the application.
But I got lazy. Or rather, I wrote real code and found that I would
be happy with a flag that indicates whether the last result was zero.
I got lazy and wrote this in the first microYASEP VHDL code :
-- Carry and Zero are latched together, for every carry-changing opcode
if WB_en='1' and FlagChangeCarry(int_opcode)='1' then
  Carry <= Carry_out;
  FlagZero <= zero_out;
end if;
Yes, I realise just now that it gets updated at the same time as the Carry
flag. Worse : I see only now that the zero flag gets its value not from the
Adder's result, but from the binary difference of both operands.
-- '1' when every bit of ROP2_xor is set
zero_out <= '1' when ROP2_xor=(ROP2_xor'range=>'1') else '0';
It's a nice trick because it doesn't increase the critical datapath for the
adder : the big combining ANDN (with 16 or 32 input bits) is computed in
parallel with the carry chain of the adder. But then it means that the Zero
flag makes sense ONLY with the CMPU/CMPS instructions, not ADD or SUB... whereas
a Zero flag that follows ADD and SUB is the standard, expected behaviour in most
other architectures :-/
But wait, in the above VHDL code, FlagZero is also updated for ADD and SUB ?
Now this could well explain certain curious bugs I had...
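To see why a zero flag derived from the operands' XOR fits a comparison but not an addition, here is a small Python sketch. It rests on assumptions that the VHDL excerpt does not confirm: that ROP2_xor is the XOR of the first operand with ROP2, that ROP2 is the inverted second operand on the compare path, and that ROP2 is the unmodified second operand for ADD. All function names are hypothetical.

```python
MASK16 = 0xFFFF  # assumption: 16-bit datapath


def zero_from_xor(a, rop2):
    """Mirror of zero_out: 1 when every bit of (a xor rop2) is set."""
    return 1 if ((a ^ rop2) & MASK16) == MASK16 else 0


def compare_zero(a, b):
    """Compare path: rop2 is assumed to be the inverted second operand."""
    return zero_from_xor(a, ~b)


def add_zero(a, b):
    """ADD path: rop2 is assumed to be the second operand, unmodified."""
    return zero_from_xor(a, b)
```

Under these assumptions `compare_zero` is 1 exactly when the operands are equal, which is what a comparison wants. `add_zero(1, 0xFFFF)` stays at 0 although the 16-bit sum is zero, and `add_zero(1, 0xFFFE)` reports 1 although the sum is 0xFFFF.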
OK, this is messy now. But it's not finished.
What I would ideally want is to have the zero flag updated any time a
register is written, too. This corresponds to tying a big OR to the write bus
of the register set. This potentially adds some significant latency to the
pipeline. But... Is that useful ? If the result is written to a register then
this register can be tested anyway with the usual conditions.
So really, the Zero flag is useful only for CMPU and CMPS, which are indeed
the only computation instructions that don't write back the result. The VHDL
code must be corrected.
And how/where can one save both the carry flag and the Zero flag ?
Saving one flag was already complex enough, saving two flags is still
possible but longer.
; save Carry and Zero to R1 : 10 bytes
mov 0 R1
mov 1 R1 ZERO
add 2 R1 CARRY
; restore carry from R1 : 6 bytes
add -2 R1
; restore Zero
and 1 R1
CMPU 1 R1
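The encoding used by the two sequences above (Zero in bit 0, Carry in bit 1) can be checked with a short Python model. It is a sketch of the arithmetic only, assuming a 16-bit datapath: it deliberately ignores whatever side effects the conditional ADD or the final CMPU may have on the other flag in real hardware, and `save_flags` and `restore_flags` are illustrative names.

```python
MASK16 = 0xFFFF  # assumption: 16-bit datapath


def save_flags(carry, zero):
    """mov 0 R1 ; mov 1 R1 ZERO ; add 2 R1 CARRY"""
    r1 = 0
    if zero:
        r1 = 1
    if carry:
        r1 = (r1 + 2) & MASK16
    return r1


def restore_flags(r1):
    """add -2 R1 ; and 1 R1 ; CMPU 1 R1  ->  (carry, zero)"""
    total = r1 + (-2 & MASK16)    # add -2 R1
    carry = total >> 16           # the carry-out is the saved Carry
    r1 = total & MASK16 & 1       # and 1 R1 keeps the saved Zero bit
    zero = 1 if r1 == 1 else 0    # CMPU 1 R1: operands are equal
    return carry, zero
```

Every combination of the two flags survives the round trip through R1 in this model.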
Wouldn't it be easier if the flags were accessible as a single normal
register ? There are no registers left but Special Registers could do it.
; save Carry and Zero to R1 : 2 bytes
GET -1 R1
; restore Carry and Zero from R1 : 2 bytes
PUT R1 -1
But should it ? Moving data to the Special Registers is a slippery slope and
once we start doing it, we want to apply this principle over and over and... it
gets even messier ! How many status flags will end up there, and will this
special register be called "status register" ?
Obviously I don't want this so I'll just avoid it and use the slow method