(post version : 20110108)
(update : 20110515 : environment inheritance)
Recently (2010/11/20) I found the critical elements that solve a crucial
problem that the Hurd team submitted to me in ... 2002. It took time and many
attempts but I think that the YASEP is a great place to experiment with this
idea and prove its worth.
The Hurd uses a lot of processes to separate functions, enforce security and
modularize the operating system. It relies on "Inter Process Communication"
(IPC) such as message passing, which is painfully slow on x86 and most other
architectures.
The YASEP uses hardware threads, a concept that is close, but not identical,
to the processes of an operating system. In the last few days I have found what
was missing : the "execution context" ! So with the YASEP, a process is
a hardware thread (a set of registers and special registers)
associated with an execution context (the memory mapping, the
access rights etc.).
Repeat after me : a process is a thread in a context.
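To make the mantra concrete, here is a minimal Python model (an illustration
only, not YASEP code). The class names, the 16-entry register list and the
enter() helper are assumptions made for this sketch; the point is only that the
same thread, with the same registers, can be paired with different contexts.

```python
from dataclasses import dataclass, field


@dataclass
class HardwareThread:
    """A set of registers. The size (16) is an illustrative assumption."""
    tid: int
    registers: list = field(default_factory=lambda: [0] * 16)


@dataclass
class ExecutionContext:
    """Memory mapping and access rights, reduced to plain containers here."""
    cid: int
    memory_map: dict = field(default_factory=dict)
    access_rights: set = field(default_factory=set)


@dataclass
class Process:
    """A process is a thread in a context."""
    thread: HardwareThread
    context: ExecutionContext

    def enter(self, other_context):
        # Hypothetical helper: the same thread (same registers) continues
        # in another context, which gives a different process.
        return Process(self.thread, other_context)
```

With this model, a thread that moves from a user context to a driver context
keeps its register contents, so nothing has to be copied or saved.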
This distinction is necessary because threads are activated for handling
interrupts, operating system functions, library function calls and
communication between the programs. It's a major feature of the processor which
should provide functionalities that go beyond a mere microcontroller...
So IPC is necessary to make a decent OS and it requires several hardware
threads (threads can be interleaved at the hardware level to provide
concurrency and better performance) and several contexts (for the operating
system, device drivers, libraries, interrupt handlers...). The processor state
can jump at will from one to the other with much less latency than on a usual
CPU.
The conflicting requirements are as follows :
- A process must be able to call code from another context FAST, as fast as
possible.
- The mechanism must be totally SAFE and SECURE.
- The physical implementation must be SIMPLE.
Simple and fast go hand in hand (ask Seymour Cray. Oh, wait, too late...).
In the YASEP, communication takes place with a restricted variant of the
function call instruction. Plain function calls are difficult to "harden", so
other architectures usually provide dedicated instructions for IPC or system
calls. These are quite simple to implement in a CISC architecture like x86
because microcode can do whatever is required... But they are slow because
several dependent memory fetches must be performed (read the access rights
table, then find the address of the code to execute, and so on).
The YASEP is a RISC-inspired architecture and requires a new approach. What I
have found requires just 3 new opcodes :
- IPC : InterProcess Call
- IPE : InterProcess Call Entry
- IPR : InterProcess Call Return
Since the YASEP has a bank of several threads in the register set, the context
switch is a matter of a few cycles only. One way to further reduce the
execution time is to pre-calculate the destination address of the called code :
no call table, nothing that requires several chained/dependent memory accesses.
In order to obtain the jump address, a thread must register itself with the
called process and obtain the context number and the effective address. The
calling thread can then modify its own code (update the constants) or variables
to make the proper IPC later. Here is how simple it gets :
IPC R1, R2 ; call context number R2 at address R1
IPC 1234h, R2 ; call context number R2 at immediate address 1234h
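The registration step described above can be modelled in a few lines of
Python. This is only a sketch under assumptions : the Callee and Caller
classes, the register() and bind() methods and the service name are all
hypothetical, and the real mechanism would be defined by the OS. It shows that
the lookup happens once, ahead of time, so the later IPC only needs the two
operands it already holds.

```python
class Callee:
    """Hypothetical service living in its own context."""

    def __init__(self, context_number, entry_points):
        self.context_number = context_number
        # name -> entry address, known only to the callee
        self.entry_points = entry_points

    def register(self, service_name):
        # Done once: hand out the effective address and the context number.
        return self.entry_points[service_name], self.context_number


class Caller:
    """Hypothetical client that caches the IPC operands in R1 and R2."""

    def __init__(self):
        self.r1 = 0  # destination address
        self.r2 = 0  # destination context number

    def bind(self, callee, service_name):
        # The slow, table-based lookup happens here, not at call time.
        self.r1, self.r2 = callee.register(service_name)

    def ipc_operands(self):
        # What "IPC R1, R2" would use: no memory access is needed.
        return self.r1, self.r2
```

After bind(), every call reuses the cached pair, which is the software
equivalent of patching the constants in the caller's own code.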
Security is a bigger beast and just changing the TID (Thread ID) value is not a
good method. The first big problem is that any code can call any context at ANY
address, so a security mechanism is required to block unwanted calls. The
policies could be arbitrarily complex (depending on the OS strategies) and
don't belong in hardware (unlike x86) : a software-based authorisation system
is preferred (like MIPS !). This is the role of the IPE instruction :
- IPE provides the Thread ID and Process ID of the calling thread (it's a
kind of GET). From this, the callee can choose to accept or refuse the call,
provide a specific service or even choose to not check at all. Any software can
create its own policy, call by call !
- IPE is NECESSARY for the IPC instruction to complete. If IPC points to an
instruction that is NOT IPE, an error is triggered. This prevents applications
from jumping to arbitrary places in another context's code.
- Each thread can restrict the range of callable addresses so calls can't
enter data sections. This is the role of additional registers.
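The three checks listed above can be simulated with a toy model in Python. It
is a sketch, not the hardware : the instruction memory is a dictionary, the two
range registers are called entry_low and entry_high (names assumed, bounds
assumed inclusive, attached to the context as in the hardware summary further
down), and the policy is any function the callee chooses to supply.

```python
class IPCError(Exception):
    """Raised when the (simulated) hardware rejects an IPC."""


class Context:
    def __init__(self, number, code, entry_low, entry_high, policy=None):
        self.number = number
        self.code = code              # address -> opcode mnemonic (toy memory)
        self.entry_low = entry_low    # assumed name of the first range register
        self.entry_high = entry_high  # assumed name of the second one
        self.policy = policy          # software check chosen by the callee


def ipc(contexts, caller_tid, caller_pid, address, context_number):
    target = contexts.get(context_number)
    if target is None:
        raise IPCError("unknown context")
    # Hardware check 1: the address must lie in the callable range.
    if not (target.entry_low <= address <= target.entry_high):
        raise IPCError("address outside callable range")
    # Hardware check 2: the target instruction must be IPE.
    if target.code.get(address) != "IPE":
        raise IPCError("target instruction is not IPE")
    # Software check: IPE gives the caller's IDs to the callee's own policy.
    if target.policy is not None and not target.policy(caller_tid, caller_pid):
        return "refused"
    return "accepted"
```

A callee that passes policy=None simply does not check at all, which the
scheme explicitly allows.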
When the thread calls code from another context at the right address, the
register set is preserved (not touched) so the transmission of parameters takes
no effort. However several new issues appear.
For example, how can a thread running in a different context access data from
the previous context ? The proposed solution is to give each Address register
an attribute : the context number. Upon call, the newly spawned process
modifies the necessary attributes to access both the current and the calling
process. This means that all the previous contexts must be kept in the
processor (since interthread calls must be reentrant). Before the call, the
calling process should mark the memory ranges it agrees to share with the
called process (marking the range as "shareable"). This way, no data copy is
necessary !
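Here is a small Python sketch of that idea, under simplifying assumptions :
memory is a dictionary keyed by (context, address), a shared range is open to
every other context (restricting it to specific threads is not modelled), and
the names Memory, AddressRegister and load() are invented for the example.

```python
class Memory:
    def __init__(self):
        self.cells = {}      # (context number, address) -> value
        self.shareable = {}  # context number -> list of (start, end) ranges

    def mark_shareable(self, owner, start, end):
        # The owner opens an inclusive address range to other contexts.
        self.shareable.setdefault(owner, []).append((start, end))

    def can_access(self, current_context, target_context, address):
        if current_context == target_context:
            return True  # a context can always reach its own memory
        ranges = self.shareable.get(target_context, [])
        return any(start <= address <= end for start, end in ranges)


class AddressRegister:
    """An address plus its attribute: the context number it refers to."""

    def __init__(self, address, context):
        self.address = address
        self.context = context


def load(memory, current_context, reg):
    if not memory.can_access(current_context, reg.context, reg.address):
        raise PermissionError("range not shared by its owner")
    return memory.cells.get((reg.context, reg.address), 0)
```

The callee reads the caller's buffer in place through an Address register
tagged with the caller's context number, so no copy is made.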
The return address and thread/process/context IDs must be managed by the CPU
core itself to prevent tampering by the caller or callee. This is the last
point that needs some big work and HW real estate ... A classic stack, with a
stack pointer, stack base and stack limit, is a necessary hardware resource to
add.
So let's sum up the added hardware :
- Each context must be able to mark memory ranges as data-read and/or
data-write by other threads. This can be indicated by flags for each page in
the page table. How this can be restricted to certain threads (those in the
call stack) is still uncertain : a token scheme should be created where a
permission can be passed to (and inherited from) another thread.
- Each context has 2 registers that are compared to the called address to
restrict unwanted calls.
- Each process has a set of 3 registers for the IPC stack (pointer, base,
limit). Pointer and limit are compared for equality upon call and pointer and
base are compared during return.
- There are also 5 new thread-private registers that determine the owner
(thread number) of a pointer. They must be preserved in HW if the caller and
callee do not trust each other.
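The behaviour of the IPC stack registers (pointer, base, limit) can be sketched
in Python as follows. This is a model under assumptions : the stack grows
upward from base toward limit, each entry holds a return address and the
caller's context number, and the class and method names are invented. Only the
two equality tests come from the description above.

```python
class IPCStackError(Exception):
    """Raised when the (simulated) hardware traps on the IPC stack."""


class IPCStack:
    def __init__(self, base, limit):
        self.base = base
        self.limit = limit
        self.pointer = base  # an empty stack has pointer == base
        self.frames = {}     # slot -> (return address, caller context)

    def call(self, return_address, caller_context):
        # On IPC: pointer and limit are compared for equality.
        if self.pointer == self.limit:
            raise IPCStackError("IPC stack full")
        self.frames[self.pointer] = (return_address, caller_context)
        self.pointer += 1

    def ret(self):
        # On IPR: pointer and base are compared for equality.
        if self.pointer == self.base:
            raise IPCStackError("IPR without matching IPC")
        self.pointer -= 1
        return self.frames.pop(self.pointer)
```

Since neither the caller nor the callee can write to this structure directly,
a return always goes back to the address and context recorded at call time.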
That makes about 10 new registers ! How this will be implemented is still
uncertain. Maybe a hardcoded sequence of instructions will be streamed through
the instruction decoder, unless everything is done in parallel in big enough
chips. This reminds me that in the past, I wanted to add "attributes" to the
address generators of the VSP, with base/index/limit/stride. Now there is the
context number, which is some kind of "address space number" (ASN). We can
finally merge these ideas and, in 16-bit code, use ASNs like segments in
x86 : one for executable data, one for the stack, several for data, and no
opcode prefix is needed.
Whatever the implementation, we are moving from a system initially designed
for libraries and system calls to the next level : a micro-kernel
oriented architecture where processes can share memory they own so others can
work on it, with little overhead. Will the Hurd people finally be happy now ?