Things are getting quite messy in the architecture now. Sorry for the long rant !


It started with the Carry flag. I turned the problem over from every angle and couldn't find a proper way to deal with it. The Golden Rule of all modern architectures is : no f****** carry flag or status register. MIPS has an overflow trap, F-CPU has a 2-reads-2-writes instruction. But the YASEP can't afford this luxury...


I thought I had solved it with a dirty trick : storing the carry flag in the Program Counter's Least Significant Bit (PC's bit #0, to make the number odd or even). But it was really too ugly (I want this critical bit for other purposes later) so I moved the carry bit somewhere else, out of the register set.

It's still messy but... OK, it still works. For example, if an exception occurs, it's still easy for the same HW thread to save and restore the Carry flag's value. It's still longer than the single byte needed to save all the flags of an x86 CPU (PUSHF) but hey, we're RISC aren't we ?...

; save Carry to R1 :
mov 0 R1
mov 1 R1 CARRY
; restore carry from R1 :
add -1 R1
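
To see why a single "add -1" is enough to restore the flag, here is a minimal Python sketch of the arithmetic. It assumes a 16-bit datapath and a carry flag that is simply the adder's carry-out; the function names are made up for the illustration and it does not model any other YASEP behaviour.

```python
MASK = 0xFFFF  # assumption: 16-bit registers

def add16(a, b):
    """Return (result, carry_out) of a 16-bit addition."""
    total = (a & MASK) + (b & MASK)
    return total & MASK, total >> 16

def save_carry(carry):
    """Models: mov 0 R1 ; mov 1 R1 CARRY"""
    r1 = 0
    if carry:
        r1 = 1
    return r1

def restore_carry(r1):
    """Models: add -1 R1 (-1 is 0xFFFF in two's complement)."""
    _, carry = add16(r1, -1)
    return carry
```

With R1=1 the sum is 0x10000, so the carry-out is 1; with R1=0 the sum is 0xFFFF and the carry-out is 0. Note that R1 itself is destroyed by the restore.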

This works for a few entangled reasons. The carry flag is updated by only a handful of adder-specific instructions. This means that it is not updated by mundane MOV instructions, for example when the register set is manually saved or restored. The PC's bit #0 can also host the carry flag when it is automatically saved in hardware for an exception handler (that is, if I totally forget the fact that it's SMT and the handler could simply execute from another hardware thread).

And before the carry stuff, there was no such flag in the early YASEP/VSP, true to the Holy Dogma. There were carry-generating versions of ADD and SUB, where ADDC and SUBB would set the destination to 0 or 1 depending on the result's overflow. The problem is that unlike most RISC architectures, the YASEP does not have a lot of available registers. ADD and SUB are often used to compare values' magnitudes and the results require a temporary register. Since the YASEP has only 5 "normal" registers, it means that 20% of that space gets screwed up. Compare that to 3% for MIPS...

ADDC and SUBB have thus been replaced by CMPU and CMPS around 2009. They work by adjusting sign bits, subtracting, and not writing the result, so a register is saved. The result goes to the carry flag, which is read by a new condition code.
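
The "adjusting sign bits" part can be sketched in a few lines of Python. This is only a model of the general technique (flip the sign bit of both operands, then do an unsigned subtraction and keep the borrow); the 16-bit width, the operand order and the polarity of the flag are assumptions, not the actual YASEP definition.

```python
WIDTH = 16                 # assumption: 16-bit datapath
MASK = (1 << WIDTH) - 1
SIGN = 1 << (WIDTH - 1)

def cmpu(a, b):
    """Unsigned compare: 1 if a < b (borrow out of a - b), else 0.
    Nothing is written back, only the flag is produced."""
    diff = (a & MASK) - (b & MASK)
    return 1 if diff < 0 else 0

def cmps(a, b):
    """Signed compare of two's-complement bit patterns:
    flipping both sign bits turns it into an unsigned compare."""
    return cmpu(a ^ SIGN, b ^ SIGN)
```

For example 0xFFFF is larger than 1 when unsigned, but it is -1 when signed, so cmpu(0xFFFF, 1) gives 0 while cmps(0xFFFF, 1) gives 1.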

...

OK, so now the YASEP has a Carry flag. It has found a nice place in the condition codes map. But wait, there is another condition bit available...

At first, I thought that it could be useful as an "overflow" flag (that is : if the result's sign was different from the operands' signs). But to this day, I have yet to find a case where it is useful. Furthermore, the CMPS and CMPU instructions already deal with the operands' types.

Another useful bit would be a flag that indicates whether a critical section (opened by the CRIT opcode) had been aborted. This is the quintessential flag because it is totally context-dependent and write-only. No need to save it or restore it : if a trap occurs inside a critical section, just flush the bit and hopefully the critical section will be restarted by the application.

But I got lazy. Or, instead, I programmed real code and found that I would be happy with a flag that indicates if the last result was zero.

I got lazy and wrote this in the first microYASEP VHDL code :

  if WB_en='1' and FlagChangeCarry(int_opcode)='1' then
    Carry <= Carry_out;
    FlagZero <= zero_out;
  end if;

Yes, I realise just now that it gets updated at the same time as the Carry flag. Worse : I see only now that the zero flag gets its value not from the Adder's result, but from the binary difference of both operands.

  zero_out <= '1' when ROP2_xor=(ROP2_xor'range=>'1')
         else '0';

It's a nice trick because it doesn't increase the critical datapath for the adder : the big combining ANDN (with 16 or 32 input bits) is computed in parallel with the carry chain of the adder. But then it means that the Zero flag makes sense ONLY with the CMPU/CMPS instructions, not ADD or SUB... whereas a Zero flag that follows ADD and SUB is the standard and expected behaviour in most other architectures :-/
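
A small Python sketch makes the mismatch visible. It compares an equality detector built on the XOR of the operands with the Zero flag one would conventionally expect after an ADD. The 16-bit width and the function names are assumptions, and the polarity of ROP2_xor in the real VHDL is not modelled.

```python
MASK = 0xFFFF  # assumption: 16-bit operands

def zero_from_xor(a, b):
    """Equality detect that never waits for the adder:
    1 when no bit differs between the operands."""
    return 1 if ((a ^ b) & MASK) == 0 else 0

def zero_from_add(a, b):
    """What a conventional Zero flag would report after ADD:
    1 when the 16-bit sum is zero."""
    return 1 if ((a + b) & MASK) == 0 else 0
```

With operands 5 and 5 the XOR detector reports "zero" although the sum is 10, and with 1 and 0xFFFF it reports "not zero" although the 16-bit sum wraps to 0. So the flag is a good answer to "are the operands equal ?" and a wrong answer to "is the sum zero ?".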

But wait, in the above VHDL code, FlagZero is also updated for ADD and SUB ? Now this could possibly explain certain curious bugs I had...

...

OK, this is messy now. But it's not finished.

What I would ideally want is to have the zero flag updated any time a register is written, too. This corresponds to tying a big OR to the write bus of the register set. This potentially adds some significant latency to the pipeline. But... Is that useful ? If the result is written to a register then this register can be tested anyway with the usual conditions.

So really, the Zero flag is useful only for CMPU and CMPS, which are indeed the only computation instructions that don't write back the result. The VHDL code must be corrected.

And how/where can one save both the carry flag and the Zero flag ?

Saving one flag was already complex enough; saving two flags is still possible but longer.

; save Carry and Zero to R1 : 10 bytes
mov 0 R1
mov 1 R1 ZERO
add 2 R1 CARRY

; restore carry from R1 : 6 bytes
add -2 R1
; restore Zero
and 1 R1
CMPU 1 R1
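
The packing used by the sequences above (Zero in bit 0, Carry in bit 1) can be checked with a short Python sketch. It only verifies the arithmetic of each step on an assumed 16-bit datapath; it does not model which flags each YASEP instruction really updates along the way, and the function names are invented for the illustration.

```python
MASK = 0xFFFF  # assumption: 16-bit registers

def add16(a, b):
    """Return (result, carry_out) of a 16-bit addition."""
    total = (a & MASK) + (b & MASK)
    return total & MASK, total >> 16

def save_flags(carry, zero):
    r1 = 0                        # mov 0 R1
    if zero:
        r1 = 1                    # mov 1 R1 ZERO
    if carry:
        r1, _ = add16(r1, 2)      # add 2 R1 CARRY
    return r1

def restore_flags(r1):
    r1, carry = add16(r1, -2)     # add -2 R1 : carry out only if R1 >= 2
    r1 &= 1                       # and 1 R1  : bit 0 survived the add
    zero = 1 if r1 == 1 else 0    # CMPU 1 R1 : equal operands give Zero
    return carry, zero
```

Adding -2 (0xFFFE) leaves bit 0 untouched, which is why the Zero bit can still be isolated after the Carry has been regenerated.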

Wouldn't it be easier if the flags were accessible as a single normal register ? There are no registers left but Special Registers could do it.

; save Carry and Zero to R1 : 2 bytes
GET -1 R1

; restore carry from R1 : 2 bytes
PUT R1 -1

But should it ? Moving data to the Special Registers is a slippery slope and when we start doing it, we want to apply this principle over and over and... it gets even more messy ! How many status flags will end up there and will this special register be called "status register" ?

Obviously I don't want this so I'll just avoid it and use the slow method for now.