YASEP news


Friday 11 May 2012

The social YASEP

The quest for more gadgets never ceases !

Among some of the newest and shiniest gizmos is a connexion to a world-class bouchot, directly integrated into the YGWM GUI. You can test it yourself at

http://yasep.org/yasep2011/#!bouchot

For the record, it is a direct connexion with YGLLO's bouchot, one of the many hangouts of the "mussel" community :-) You can even watch a demonstration video made by Finss.

More unrelated surprises are in store...

Sunday 29 April 2012

More JavaScript gadgets

There, I've done it !

Now you can click on example source code in the documentation windows. It creates a new editor window with this listing. No copy-paste needed and you can test and modify the examples ! And soon you'll be able to move blocks of code to/from different editor windows...

I am also adding the "Examples" menu, which imports external .yas source files. More and more examples will be added in the future :-)

Saturday 7 April 2012

Licensing freedom

How does one license freedom ? And what freedom is there to license ?

I recently got an email that contained this bit :

"with your current license, could your microYASEP be linked into a commercial product without releasing the product VHDL ?"

I think there is a valid point here. I chose the AGPL (a slightly modified GPL) for specific reasons : one of my goals is to foster totally free and open source designs, a bit like Arduino does. It's something of a mission statement and I stick to it.

I know well that a good CPU with good support is a gift, a fantastic tool not only for hobbyists but also for industry, which plays by different, sometimes opposite, rules. I am not considering a different licence for the whole thing : it's too late anyway, and I like the AGPL in the context of the YASEP project. I believe that, "as is", the AGPL is not ill-suited to hardware designs, as it is very close to the GPL, which also spawned the LGPL that many HDL designs use. Furthermore, I believe it is best to use only one license for the whole project, otherwise it can become confusing.

I've seen other projects use "dual licensing" but I am not sure that it would work for a hardware project. It's still a good idea so I thought about something slightly different, like a "partners program". It's still an ongoing thought and it will certainly evolve but my idea looks like this :
Commercial entities who want to integrate the YASEP core in commercial products (along with other HDL) would submit their designs (HDL and finished product) to me for a confidential evaluation and certification. They would also disclose on their website all the YASEP source code that they used, in exchange for an exemption agreement and a mention on the YASEP site.

I know that most companies feel more comfortable with cores from ARM or Microchip or Atmel... But I already know that there are exceptions and those excited by the YASEP are sensitive to my perspective so we'll tune the partnership details together.

So far, I haven't given much thought to this "issue" because without an advanced enough design, there is no point in licensing. I don't want to waste time in endless conversations about hypotheses and what-ifs. And after all I am the author so I have the final word :-P

Monday 19 March 2012

24 bits per instruction

I started writing the microYASEP for a specific project, one month ago, and started to write more software.

The benefits of writing code while designing the architecture can't be overemphasised. When you eat your own food, you are more careful with the recipes and the ingredients !

I had thought about how to use the last remaining condition code. I considered an "overflow" bit but it's useless because the comparisons already carry the signed/unsigned information. Writing code made me realise that the ZERO condition was not enough, as many operations require a scratch register for the result of a comparison (using CMPU). So I made an "EQ" (equal) flag and reduced the register pressure a bit.

About the title : my first real program is a bit more than 200 bytes so it is time to evaluate my early estimations. In 2009 I guessed that on average one half of the instructions are "long" (4 bytes) and the other half are short (2 bytes), and now I have a first result :

The program comprises 70 instructions, 34 are short and 36 are long. Pretty good guess so one can consider the YASEP architecture as a "roughly 24-bit instruction machine" :-)
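As a quick check of that estimate, here is a small sketch that recomputes the program size and the average instruction width from the counts quoted above. The function and constant names are only illustrative; the 2-byte and 4-byte sizes are those given in the post.

```python
# Figures quoted in the post: 70 instructions, 34 short and 36 long.
SHORT_BYTES = 2  # short instruction form
LONG_BYTES = 4   # long instruction form


def average_instruction_bits(short_count, long_count):
    """Return (total size in bytes, average bits per instruction)."""
    total_bytes = short_count * SHORT_BYTES + long_count * LONG_BYTES
    total_instructions = short_count + long_count
    return total_bytes, 8 * total_bytes / total_instructions


total_bytes, avg_bits = average_instruction_bits(34, 36)
print(total_bytes, round(avg_bits, 1))  # prints: 212 24.2
```

The 212 bytes match the "bit more than 200 bytes" figure, and the average lands just above 24 bits.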

Furthermore, I know that I had a loooooot of time to get used to the architecture and learn how to use it efficiently, but overall the YASEP is pretty comfortable to use. I could do almost everything with just 8 opcodes :

ADD MOV GET PUT CALL AND XOR CMPU

The only limitation I found was the very short (-8 to +7) 4-bit immediate range. Sometimes it would be very handy to extend it to 5 or 6 bits, for short jumps or loops for example, but it was a conscious compromise from the start, as finding more bits somewhere would make the architecture more complex and less orthogonal... Food for thought for the next architecture (wink...)
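To make the trade-off concrete, this sketch shows the range reachable with a signed immediate field of a given width, and how a 4-bit field would be sign-extended. It assumes plain two's-complement encoding, which the -8 to +7 range suggests; the helper names are hypothetical and are not part of the YASEP tools.

```python
def immediate_range(bits):
    """Smallest and largest value of a two's-complement field of `bits` bits."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


def sign_extend(field, bits):
    """Interpret the low `bits` bits of `field` as a signed number."""
    field &= (1 << bits) - 1
    if field & (1 << (bits - 1)):   # sign bit set: negative value
        return field - (1 << bits)
    return field


for width in (4, 5, 6):
    print(width, immediate_range(width))
# prints:
# 4 (-8, 7)
# 5 (-16, 15)
# 6 (-32, 31)
print(sign_extend(0b1001, 4))  # prints: -7
```

Each extra bit doubles the reach of a short jump, which is why 5 or 6 bits would be tempting for loops.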

Tuesday 28 February 2012

at a glance...

How did I write 350 lines of code in one day to create a (mostly) working CPU ? It's thanks to a long and careful preparation, using diagrams like this one :

I use different colors to show the separate paths for control, address, data etc.

With this kind of general diagram, it's easy to code : just follow the wires and code the corresponding function...

A cleaner version will appear later :-)

Wednesday 22 February 2012

microYASEP's first boot !

Today is a big milestone : a tiny implementation of the YASEP has executed tens of instructions :

phase='1' PC=3FE RAM=0000 Result=0000 DST=0 R1=???? R2=???? R3=???? R4=????
phase='1' PC=3FE RAM=0000 Result=0000 DST=0 R1=???? R2=???? R3=???? R4=????
*** releasing reset ***
phase='1' PC=3FE RAM=0000 Result=0000 DST=0 R1=???? R2=???? R3=???? R4=????
phase='1' PC=3FE RAM=1009 Result=0000 DST=0 R1=???? R2=???? R3=???? R4=????
phase='0' PC=000 RAM=1009 Result=0000 DST=0 R1=???? R2=???? R3=???? R4=????
phase='1' PC=002 RAM=1234 Result=1234 DST=1 R1=???? R2=???? R3=???? R4=????
writing 1234 to R1
phase='0' PC=004 RAM=4009 Result=4009 DST=1 R1=1234 R2=???? R3=???? R4=????
phase='1' PC=006 RAM=5678 Result=5678 DST=4 R1=1234 R2=???? R3=???? R4=????
writing 5678 to R4
phase='0' PC=008 RAM=1115 Result=1115 DST=4 R1=1234 R2=???? R3=???? R4=5678
phase='1' PC=00A RAM=4321 Result=5555 DST=1 R1=1234 R2=???? R3=???? R4=5678
writing 5555 to R1
phase='0' PC=00C RAM=1117 Result=234B DST=1 R1=5555 R2=???? R3=???? R4=5678
phase='1' PC=00E RAM=0100 Result=AAAA DST=1 R1=5555 R2=???? R3=???? R4=5678
writing AAAA to R1
phase='0' PC=010 RAM=220A Result=5556 DST=2 R1=AAAA R2=???? R3=???? R4=5678
phase='1' PC=010 RAM=330A Result=0002 DST=2 R1=AAAA R2=???? R3=???? R4=5678
writing 0002 to R2
phase='0' PC=012 RAM=330A Result=0002 DST=2 R1=AAAA R2=0002 R3=???? R4=5678
phase='1' PC=012 RAM=2408 Result=0003 DST=3 R1=AAAA R2=0002 R3=???? R4=5678
writing 0003 to R3
phase='0' PC=014 RAM=2408 Result=0003 DST=3 R1=AAAA R2=0002 R3=0003 R4=5678
phase='1' PC=014 RAM=3408 Result=0002 DST=4 R1=AAAA R2=0002 R3=0003 R4=5678
writing 0002 to R4
phase='0' PC=016 RAM=3408 Result=0002 DST=4 R1=AAAA R2=0002 R3=0003 R4=0002
phase='1' PC=016 RAM=1317 Result=0003 DST=4 R1=AAAA R2=0002 R3=0003 R4=0002
writing 0003 to R4
....

You can find the source code there and play with the parameters :-)

The "microYASEP" is a compatible subset of the usual YASEP but with many limitations : only 23 instructions (not all implemented yet), 2 cycles per instruction (no pipeline), not even data memory access... It is designed for tiny FPGAs and the core source code takes about 350 lines of VHDL. Data widths are 16 bits but could potentially be even smaller if needed (I'll have to check). I think it will run at around 12 MIPS in the first system that will use it ; it could be faster but that would be pointless.

This would not be possible without all the software tools I have written in the last months and years ! I can now assemble and export in hexadecimal or VHDL, create new custom configuration files with a few clicks, or tweak details at will. I have created a new system of "CPU profiles" that goes beyond the basic YASEP16/YASEP32 distinction.

The microYASEP is just one of several microarchitectures possible with the YASEP. Later configurations will be faster, larger and with more features like the multiplier, shifter and memory interfaces... But with one first application and a running, basic core, the details of the whole YASEP design can be tuned with more real-life feedback !

Monday 13 February 2012

Interactive Assembler, take 2

The YASEP is progressing toward the 2012 revision, with great features and hopefully a first micro-YASEP soon. This burst of productivity has no secret : I just NEED a YASEP as soon as possible for another project. At this moment, I'm working on the assembly environment, a quick development that reuses some code from listed (the LISTing Editor created and stopped in 2009).

So far it can already import and assemble source code from a textarea, not bad for a whole day of coding... I've also been surprised by what I could do in one day for the VHDL code ! This new interface is also able to edit the imported data and export the assembled listing, but I want it to move lines from one editor window to another, and much more... I'll even reuse the recycle bin, just like I did for listed ! But I changed the name to YASMed. Don't ask me why...

Hopefully, a new release should be online on the "prototype area" in a few weeks.

Monday 28 November 2011

A YASEP assembler in C by DeforaOS

Today, khorben from DeforaOS, sent me a surprise : this screenshot !

defora disassembler screenshot

He is implementing his assembler/disassembler in C for his operating system project. A graphic interface is also available, among the many features in development ! In parallel, I am implementing some features in the YGWM interface that synthesise and export the relevant information needed by his assembler. In the end we'll both have the tools to create a full working and autonomous system :-)

Thanks again for the screenshot !

Tuesday 8 November 2011

Register Parking

As the YASEP architecture specifies, there are 5 normal registers (R1-R5) and 5 pairs of data/address registers (A1-D1, A2-D2...) and it's quite difficult to find the right balance between the two : each application and approach requires a different optimal number of registers.

When more registers are needed (if you need R6 or R7) then you could assign them to D1 and D2 for example. However you have to set A1 and A2 to a safe location otherwise chaos could propagate in the software. Another issue is that each write to the D registers will update the memory. A similar situation appears if we use the Ax registers as normal registers : each write will trigger a memory read. And in paged/protected memory systems, this would kill the TLB...

This is now "solved" with today's system, which defines hardwired "parking" addresses and internal behaviour (this is still preliminary but looking promising).

  • "Parking" addresses are defined as "negative" addresses (that is : all the MSB are set to 1). This addressing range, at the "top" of the memory space, is normally not used, or used for special purposes, such as "fast constants" addressed by the short immediate values :
    MOV -7, A3 ; mem[-7] contains a constant or a scratch value,
    MOV D3,... ; the address fits in 3 bits
  • To keep the "parking" system compatible with non-parked versions, the addresses are defined globally for all software. They are easy to remember, as the following code shows :
    ; Park all the registers
    MOV -1, A1
    MOV -2, A2
    MOV -3, A3
    MOV -4, A4
    MOV -5, A5
    These will become macros or pseudo-instructions.
  • The internal numbering of the registers is changed to ease hardware implementation. There is a direct match between the binary register number and the binary code of the address (bits 1 to 3) :

    park address   binary   reg.bin   reg.number   register
         -1         1111     1111         15          A1
         -2         1110     1101         13          A2
         -3         1101     1011         11          A3
         -4         1100     1001          9          A4
         -5         1011     0111          7          A5
  • Architecturally, it does not change much. The Data registers are "cached" by the register set. What the hardware parking system adds is just an inhibition of the "data write" signal that would occur normally each time the core writes to a D register.
  • Aliasing : No alias detection is expected. If A4/D4 writes to -2, D2 is not updated. Otherwise it would mean that the result bus could write to 5 registers in parallel, which is not reasonable.
  • Thread backup and restoration : the register set contains the cached version of the memory, it must be refreshed when a thread is restored (swapped in). If the Ax register matches a parked address, the memory doesn't need to be fetched to refresh the cache. Another solution is to save the Dx register through another Ax/Dx, so there is nothing to test during restoration (but memory read cycles could not be spared).
  • This system where the "parking" is defined by an auxiliary value (that is inherently preserved through context switches) is "cleaner" than a more radical approach where "status bits" (one per A/D pair) park the registers. The advantage of the radical approach is that two registers can be parked at once (instead of one) but it gets harder to use with a compiler or from user software (you can play with pointers in C or Pascal easily, though you won't be able to define which pair is used). On top of that, adding status/control bits is usually a nightmare.
In the end, it's not very complex (not as much as it seems). The hardware price is a few logic gates that detect the parking addresses to inhibit memory writes. For the software writer, it just means more registers on demand and it will work whether the YASEP has the parking hardware or not. You CAN have R6, R7 or R8 but then you'll have to restrict data access and give up A1/D1, A2/D2 and A3/D3. You make the choice !
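The mapping in the table and the "few logic gates" can be sketched in software as follows. This is only a model under stated assumptions : a 16-bit address width, and a pair counting as parked only when its A register holds that pair's own parking address (the post is preliminary and does not spell this out). All names are hypothetical.

```python
WORD_BITS = 16                      # assumption: YASEP16 address width
WORD_MASK = (1 << WORD_BITS) - 1


def park_address(pair):
    """Parking address of the pair A<pair>/D<pair> (1..5), as an unsigned word."""
    if not 1 <= pair <= 5:
        raise ValueError("the YASEP has 5 A/D pairs")
    return (-pair) & WORD_MASK      # "negative" address: all the MSB are set


def register_number(pair):
    """Internal number of A<pair>: the low 3 address bits land in bits 1 to 3."""
    return ((park_address(pair) & 0b111) << 1) | 1


def data_write_enabled(pair, a_value):
    """Inhibit the memory write when the pair is parked (assumed rule)."""
    return (a_value & WORD_MASK) != park_address(pair)


print([register_number(p) for p in range(1, 6)])  # prints: [15, 13, 11, 9, 7]
print(hex(park_address(3)))                       # prints: 0xfffd
print(data_write_enabled(3, -3))                  # prints: False
print(data_write_enabled(3, 0x0100))              # prints: True
```

The register numbers reproduce the reg.number column of the table, and the comparison in data_write_enabled stands in for the address detector.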

Sunday 25 September 2011

The YASEP and Defora

Today I think that one big issue with the YASEP project has been solved.

I met Pierre this week, and I am starting to discover the awesomeness of his Defora project. "Debian For All" turned into the creation of a whole new, compact, totally GPLv3 system, with almost no dependency on existing systems, yet compatible with them... Perfect for embedded computing too !

We just started to work on a C version of the existing JS assembler and we consider writing a C99-compliant compiler.

YES, you have read it : the YASEP will have binaries generated from C code ! And good code, at that, since it does not go through GCC !

Many roadblocks are now removed. When the code generation tools are in place, we can then simulate/emulate the core and start to write a microkernel...

What this means for me is that I can finally stop worrying about the operating system and application layer. The YASEP will not use Linux and I won't be forced to use the huge GCC armada. I will also have more time to focus on the hardware architecture and implementation. And Pierre is a security specialist...

Oh, by the way : YGWM took 2nd place (ex aequo with Pierre's Defora) at the Open World Forum Code Contest this week. A new, shiny, professional laptop was given away by HP and will become my main workstation. Going from an Atom to a Core i5 makes me feel spoiled :-)

Wednesday 7 September 2011

YASEP2011

Development is still happening, at a slow pace (due to work duties) but nothing is forsaken.

I'm still working toward a cheap Actel board that can be easily replicated and cheaply fabricated, and the professional projects might bring some interesting results.

On another front, I resumed work on YGWM and extended the functionalities. You can even test the results at http://ygdes.com/~whygee/yasep2011/ and the whole website will be reimplemented with this new paradigm. No more tabs ! Everything in one browser window with a huge virtual desk !


Sunday 8 May 2011

This little Least Significant Bit

(update : 2011/05/11)

I've been wondering since March of this year if the Least Significant Bit (LSB) of the Next Instruction Pointer (NIP or NPC) could be better used than now.

The YASEP instructions are 16-bit aligned and the instruction addresses have their LSB cleared by convention. This bit is usually wasted in word-aligned byte-oriented computer architectures.

In the current YASEP architecture, this LSB holds the carry flag of ADD/SUB operations. It is the only status flag that I couldn't get rid of with the usual architectural tricks. As a reminder, instructions can check 3 conditions : register is cleared, has its LSB cleared (odd/even) or MSB (sign) cleared. Every condition can be negated and a 4th condition serves as "always" or "reserved" case. Reading the LSB and MSB is easy, checking for a cleared register is more costly. In some implementations, the register set has "shadow" bits with precomputed/cached "register is clear" bits. But otherwise, no dirty trick is employed.

The Carry bit is less easy to handle : it's a dynamic result that can't be reconstructed from the 16 or 32 bits of the registers. It is not possible to restore it after a thread switch. It can't be added to the "condition cache" because it will have to be saved and restored (16 more bits to save ? Bleh...)

Here come the latest changes :

  • The carry bit is now "hidden", not available from the register set for computations (that would make other things more difficult). It exists as a bit that can only be tested via a specific condition code in the conditional instruction forms (certainly one that tests NIP).
  • The LSB of NIP is always cleared. However, when saving/restoring the state in memory, it will hold the carry bit. This is the only case when the two functions (carry and pointer) are mixed.
  • Writing a "1" to the LSB of NIP (other than for saving/restoring the state) triggers a trap. There are several uses :
  1. Breakpointing / tracing / debugging : inject a "1" in the LSB and you can see where the pointer is used.
  2. Safety : for example if the stack is corrupted, there is a chance that the LSB will be set and trigger the trap
In future iterations, this bit could be used for something else more pertinent (such as a second instruction memory bank selector) so it must be carefully handled by programmers now.
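The rules above can be modelled in a few lines. This is a sketch of the described behaviour, not of the actual hardware : the function names and the Trap exception are hypothetical.

```python
class Trap(Exception):
    """Stands in for the hardware trap raised on a misaligned NIP write."""


def save_nip(nip, carry):
    """State save: the always-clear LSB of NIP is reused to hold the carry."""
    if nip & 1:
        raise ValueError("NIP is 16-bit aligned, its LSB must be clear")
    return nip | (1 if carry else 0)


def restore_nip(saved):
    """State restore: split the saved word back into (NIP, carry)."""
    return saved & ~1, bool(saved & 1)


def write_nip(value):
    """Any other write to NIP: a set LSB triggers the trap."""
    if value & 1:
        raise Trap("LSB of NIP set outside of save/restore")
    return value


saved = save_nip(0x0124, carry=True)
print(hex(saved))          # prints: 0x125
print(restore_nip(saved))  # prints: (292, True)
```

With this model a corrupted return address has a chance of hitting the trap in write_nip, which is the safety use mentioned above.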

Wednesday 8 December 2010

ACTUINO day 1

Yesterday, while talking with Jeff about our respective and converging goals, a new idea came.

Today, actuino.org is registered. The website will appear later, one day, but the name is found and secured while we work toward the new milestone of an electronic board that is DIY-friendly, very powerful, affordable and paving the way for developing the YASEP.

The one big issue for me though is that I'll have to cope with the Atmel architecture, which I don't "speak" so any help is appreciated :-)




Saturday 20 November 2010

Fast and secure InterProcess Communications

(post version : 20110108)

(update : 20110515 : environment inheritance)

Recently (2010/11/20) I found the critical elements that solve a crucial problem that the Hurd team submitted to me in ... 2002. It took time and many attempts but I think that the YASEP is a great place to experiment with this idea and prove its worth.

The Hurd uses a lot of processes to separate functions, enforce security and modularize the operating system. It uses "Inter Process Communication" (IPC) such as message passing and this is snail slow on x86 and most other architectures.

The YASEP uses hardware threads, a concept close, but not identical, to the processes of an operating system. And in the last few days I have found what was missing : the "execution context" ! So with the YASEP, a process is a hardware thread (a set of registers and special registers) associated with an execution context (the memory mapping, the access rights etc.)

Repeat after me : a process is a thread in a context.

This distinction is necessary because threads are activated for handling interrupts, operating system functions, library function calls and communication between the programs. It's a major feature of the processor which should provide functionalities that go beyond a mere microcontroller...

So IPC is necessary to make a decent OS and it requires several hardware threads (threads can be interleaved at the hardware level to provide concurrency and better performance) and several contexts (for the operating system, device drivers, libraries, interrupt handlers...). The processor state can jump at will from one to the other with much less latency than a usual CPU.

The antagonistic requirements are as follows :

  1. A process must be able to call code from another context FAST, as fast as possible.
  2. The mechanism must be totally SAFE and SECURE.
  3. The physical implementation must be SIMPLE.

Simple and fast go hand in hand (ask Seymour Cray. Oh, wait, too late...). In the YASEP, communication takes place with a restricted variant of the function call instruction. Function calls are difficult to "harden", so other architectures usually provide dedicated instructions for IPC or system calls. These are quite simple to implement in a CISC architecture like x86 because microcode can do whatever is required... But they are slow because several dependent memory fetches must be performed (read the access rights table then find the address of the code to execute, whatever...)

The YASEP is a RISC-inspired architecture and requires a new approach. What I have found requires just 3 new opcodes :
  1. IPC : InterProcess Call
  2. IPE : InterProcess Call Entry
  3. IPR : InterProcess Call Return
Since the YASEP has a bank of several threads in the register set, the context switch is a matter of a few cycles only. One way to further reduce the execution time is to pre-calculate the destination address of the called code : no call table or things that require several chained/dependent memory accesses. In order to obtain the jump address, a thread must register itself in the called process and obtain the context number and the effective address. The calling thread can then modify its own code (update the constants) or variables to make the proper IPC later. Here is how simple it gets :
    IPC R1, R2    ; call context number R2 at address R1
    IPC 1234h, R2 ; call context number R2 at immediate address 1234h
Security is a bigger beast and just changing the TID (Thread ID) value is not a good method. The first big problem is that any code can call any context at ANY address, so a security mechanism is required to block unwanted calls. The policies could be arbitrarily complex (depending on the OS strategies) and don't belong in hardware (unlike x86) : a software-based authorisation system is preferred (like MIPS !). This is the role of the IPE instruction :
  1. IPE provides the Thread ID and Process ID of the calling thread (it's a kind of GET). From this, the callee can choose to accept or refuse the call, provide a specific service or even choose to not check at all. Any software can create its own policy, call by call !
  2. IPE is NECESSARY for the IPC instruction to complete. If IPC points to an instruction that is NOT IPE, an error is triggered. This prevents all applications from jumping anywhere in any code.
  3. Each thread can restrict the range of callable addresses so calls can't enter data sections. This is the role of additional registers.
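The three rules above can be modelled as follows. This is only a sketch : the Context class, the dictionary used as code memory, the inclusive range bounds and the order of the two hardware checks are all assumptions made for illustration, not the actual design.

```python
class IPCError(Exception):
    """Raised when the modelled hardware refuses an inter-process call."""


class Context:
    """Hypothetical model of a callee context (names are illustrative only)."""

    def __init__(self, code, entry_low, entry_high, policy=None):
        self.code = code              # address -> opcode mnemonic
        self.entry_low = entry_low    # range registers restricting the calls
        self.entry_high = entry_high
        # Software policy run after IPE; accepts every caller by default.
        self.policy = policy or (lambda tid, pid: True)


def ipc(target, address, caller_tid, caller_pid):
    """Model one IPC; return True if the callee's own policy accepts the call."""
    # Hardware check: the address must lie in the callable range.
    if not (target.entry_low <= address <= target.entry_high):
        raise IPCError("address outside the callable range")
    # Hardware check: the destination instruction must be IPE.
    if target.code.get(address) != "IPE":
        raise IPCError("IPC must land on an IPE instruction")
    # IPE hands the caller's IDs to the callee, which decides in software.
    return bool(target.policy(caller_tid, caller_pid))


server = Context({0x1234: "IPE", 0x1236: "ADD"}, 0x1000, 0x1FFF,
                 policy=lambda tid, pid: pid == 7)
print(ipc(server, 0x1234, caller_tid=2, caller_pid=7))  # prints: True
print(ipc(server, 0x1234, caller_tid=2, caller_pid=8))  # prints: False
```

A call to 0x1236 raises IPCError in this model because the destination is not an IPE instruction, while the accept/refuse decision for a valid entry point stays entirely in the callee's software.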
When the thread calls code from another context at the right address, the register set is preserved (not touched) so the transmission of parameters takes no effort. However several new issues appear.

For example, how can one thread in a different context access data from the previous context ? The proposed solution is to provide an attribute to each Address register : the context number. Upon call, the newly spawned process will modify the necessary attributes to access both the current and the calling process. Which means that all the previous contexts must be kept in the processor (since interthread calls must be reentrant). Before the call, the calling process should mark the memory ranges it accepts to share with the called process (marking the range as "shareable"). This way, no data copy is necessary !

The return address and thread/process/context IDs must be managed by the CPU core itself to prevent tampering by the caller or callee. This is the last point that needs some big work and HW real estate ... A classic stack, with a stack pointer, stack base and stack limit, is a necessary hardware resource to add.

So let's sum up the added hardware :
  • Each context must be able to mark memory ranges as data-read and/or data-write by other threads. This can be indicated by flags for each page in the page table. How this can be restricted to certain threads (that are in the call stack) is still uncertain, a token scheme should be created where a permission can be passed to (and inherited from) another thread.
  • Each context has 2 registers that are compared to the called address to restrict unwanted calls.
  • Each process has a set of 3 registers for the IPC stack (pointer, base, limit). Pointer and limit are compared for equality upon call and pointer and base are compared during return.
  • There are also 5 new thread-private registers that determine the owner (thread number) of a pointer. They must be preserved in HW if the caller or callee are not trusting or trusted.
That makes about 10 new registers ! How this will be implemented is still uncertain. Maybe a hardcoded sequence of instructions will be streamed through the instruction decoder, unless everything is done in parallel in big enough chips. This reminds me that in the past, I wanted to add "attributes" to the address generators of the VSP, with base/index/limit/stride, now there is the context number that is some kind of "address space number" (ASN). We can finally merge these ideas and in 16 bits code, we can use ASNs like segments in x86 : one for executable data, one for the stack, several for data, and no opcode prefix is needed.
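The IPC stack checks listed above (pointer against limit upon call, pointer against base upon return) can be sketched like this. The class is hypothetical, and the upward growth direction and the contents of a frame are assumptions; only the two equality tests come from the text.

```python
class IPCStackError(Exception):
    """Stands in for the hardware error on IPC stack overflow/underflow."""


class IPCStack:
    """Hypothetical model of the per-process IPC stack (pointer, base, limit)."""

    def __init__(self, base, limit):
        self.base = base
        self.limit = limit
        self.pointer = base      # empty stack: pointer equals base
        self.frames = []         # (return address, caller context) pairs

    def call(self, return_address, caller_context):
        # Upon call, pointer and limit are compared for equality.
        if self.pointer == self.limit:
            raise IPCStackError("IPC stack full")
        self.frames.append((return_address, caller_context))
        self.pointer += 1        # assumption: the stack grows upward

    def ret(self):
        # Upon return, pointer and base are compared.
        if self.pointer == self.base:
            raise IPCStackError("IPC return without a matching call")
        self.pointer -= 1
        return self.frames.pop()


stack = IPCStack(base=0, limit=2)
stack.call(0x0104, caller_context=1)
stack.call(0x0230, caller_context=3)
print(stack.ret())  # prints: (560, 3)
print(stack.ret())  # prints: (260, 1)
```

Since only the core touches this stack, neither caller nor callee can forge a return address or a context ID.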

Whatever the implementation, we're going here from a system initially designed for libraries and system calls, extended to the next level : a micro-kernel oriented architecture where processes can share memory they own so others can work on it, with little overhead. Will the Hurd people be finally happy now ?

Saturday 8 May 2010

YASEP2010


The main YASEP site has not been updated for a while...
Worse : the f-cpu.seul.org mirror has been down since January !

Is the project dead ?

No :-)

In fact a lot of things are being prepared, mostly in the commercial, infrastructure and very-low-level hardware (like : where are those 0402 capacitors ?) fronts. It's really exciting but it takes a lot of time and money ! Fortunately I'm not completely alone.

A string of good news will probably come in 2011, they will help the bootstrap of the whole YASEP project with different kinds of support, with broad public exposure. It will be possible to have a YASEP implementation in hand, I work both on the hardware and software sides :-) BTW, a recent Wikipedia article has appeared with a short summary of the YASEP's architecture.

Another critical part of the project (the VHDL source code and its infrastructure) is in active development : GHDL is now the officially supported simulator. I have interviewed the main developer (Tristan Gingold) for GNU/Linux Magazine France n°127. With Laura, we started a series of articles about VHDL development under Linux and I am proposing increasingly advanced ... hacks :-) The first YASEP implementations will be designed with "design for test" in mind.

In parallel, another subproject is the design of a Libre, affordable, compact and Ethernet-enabled JTAG programming probe. More on this subject in the future, but it's critical for the rest of the whole project : my JTAG probes are either USB (and constrained to Actel parts, and don't work under Linux) or parallel-port (no new consumer-grade computer today has this port anymore).

Finally, after the seul.org debacle (due to the main server being compromised because of its participation in the Tor network), I have opened a new mirror at TuxFamily.

So I'm still polishing the tools and gathering the parts. It's not a visible activity but it's probably the most important. What does an architecture mean if there is no infrastructure behind ? With no physical implementation that one can buy and hack oneself ?

Friday 13 November 2009

Support of Alphanumeric LCD with YASEP

I have been very busy since August, unfortunately not with YASEP but I keep an eye on this project. Even though I can't dedicate days and weeks to this, I try to gather things here and there when they appear, like electronic parts, ideas, and ways to implement them.

For example I've been thinking about how to display information with a simple FPGA kit. I already have a nice collection of alphanumeric LCD modules that is expanding, so they are a good and cheap output peripheral.

From there, at least three things follow :

  1. The modules I own have different resolutions : from 1x8char to 4x20 but there is no electronic means to distinguish them from one another. So I recently imagined a method, discussed it a bit on USENET and decided that it was worth implementing. I am writing an RFC about this now.
  2. I'm going to add a set of Special Registers that support the parallel interface to a LCD module in nibble mode. This is going to provide automatic strobes, and ease application software development. This unit will also support readback of LCD resolution, supporting the protocol defined in 1. Contrast voltage is controlled by a simple PWM/PD circuit instead of a trimpot.
  3. While looking around for more information about the HD44780-compatible modules, Wikipedia sent me to a JavaScript HD44780 simulator designed years ago by Dincer Aydin. He has done even crazier things like a graphic LCD simulator or a PIC assembler in JavaScript ! I asked if I could reuse the alphanumeric code and Dincer kindly accepted :-D I have not looked at the source code but I presume that it's going to need a lot of work (particularly for updating the display engine, because updates are "optimised out" in Firefox). Anyway the YASEP simulator is not even mature enough so there is no hurry...
Everything seems to be in place for a future use of alphanumeric LCD modules. I have more than 20pc available, I have already used some of them on a past PIC project, and the JavaScript framework will support them. I'm not saying it's going to be easy, but it's far easier than I thought !
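For readers unfamiliar with the nibble mode mentioned in point 2 : an HD44780-compatible module in 4-bit mode receives each byte as two 4-bit transfers, high nibble first. Here is a minimal sketch of that split; the function names are hypothetical and say nothing about the planned Special Registers, which would also have to generate the strobes.

```python
def to_nibbles(byte):
    """Split one byte into (high nibble, low nibble), in HD44780 4-bit order."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value between 0 and 255")
    return (byte >> 4) & 0xF, byte & 0xF


def nibble_stream(data):
    """Flatten a sequence of bytes into the nibbles sent to the module."""
    stream = []
    for byte in data:
        stream.extend(to_nibbles(byte))
    return stream


print(nibble_stream(b"Hi"))  # prints: [4, 8, 6, 9]
```

Each nibble then needs its own enable strobe, which is the part the hardware unit is meant to automate.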

Friday 2 October 2009

When you connect the power supply, it works...

As one can guess from the past messages on this "*log", I have been slowly preparing custom FPGA boards as a background activity. It's not an easy thing and can be quite expensive. So I patiently gathered the necessary parts through online stores and eBay, looking for interesting deals.

Finally I have all the necessary parts for a cheap and repeatable prototype. Among others :

* A bunch of A3P250VQG100 : I got them from a really nice Canadian guy and I'll use this specific reference as the main target for the future works. I originally intended to target the A3P125 but I got more powerful for less money so why refuse ? :-D The A3P250 has enough logic for moderately complex stuff (though SRAM is really TIGHT) and can replace microcontrollers in many cases.

* QFP100 adapter boards : FUTURLEC has cheap and good proto boards. The tin makes soldering easy, just add some liquid flux, no need for additional solder.

With the help of Actel's docs and the schematics of other boards, including ACME's Colibri, I easily wired the power rails. The board is not recommended for high-speed signals but the goal is only to check the schematics for more ambitious boards (probably manufactured through FUTURLEC too, as their PCB pooling service looks great).

I created a small dumb VHDL design (8-bit counter with clock, reset and increment/decrement inputs) backed by a small board. The additional board also provides 3.3V from a battery, so I could avoid long wires from power supplies.

And in order to be programmed, the FPGA needs a JTAG interface. I soldered everything correctly but the JTAG/USB interface would refuse to work. After a small nap and many hypotheses, the problem was obvious : the JTAG signals were correctly wired but I forgot to wire the power supplies :-/ Obviously, once that is fixed, things work considerably better... I'm amazed that it is the only error, considering my sleep deprivation :-D

First Actel proto board \o/

No, really, it just works as expected. I may have finally become good at this, after all the failures and false starts of the past :-D It even reproduces the strange behaviour that I had seen in other designs : the pins are REALLY sensitive ! Don't forget the pull-downs ! I made a basic/passive anti-bounce (just an RC filter) but it is useless : a single clock push creates many strobes and the counter advances unpredictably. But I half-expected it, and I did not even register the inputs in VHDL, so the design is naturally glitch-prone and I don't care. It "works".
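
For the curious, here is a small Python model of what a digital debouncer would change. It is only a sketch: the sample values, the `stable_count` threshold and the function names are assumptions chosen for illustration, and it is not the VHDL that runs on the board. The output changes only after the raw input has held a new level for several consecutive samples.

```python
def debounce(samples, stable_count=4, initial=0):
    """Return the debounced level for each raw sample.

    The output switches only after `stable_count` consecutive samples
    differ from the current output (hypothetical threshold).
    """
    out = []
    state = initial
    run = 0  # consecutive samples that disagree with the current state
    for s in samples:
        if s == state:
            run = 0
        else:
            run += 1
            if run >= stable_count:
                state = s
                run = 0
        out.append(state)
    return out


def count_rising_edges(levels, initial=0):
    """Count 0 -> 1 transitions, i.e. what a clocked counter would see."""
    edges = 0
    prev = initial
    for level in levels:
        if level and not prev:
            edges += 1
        prev = level
    return edges


# One button push with contact bounce on press and on release
bouncy = [0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
print(count_rising_edges(bouncy))            # raw input: several strobes
print(count_rising_edges(debounce(bouncy)))  # debounced: a single strobe
```

In hardware the equivalent would be a pair of synchronising flip-flops followed by a small counter, clocked much faster than the bounce period.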

What does this mean ?

* When enough resources are gathered, complex things become easier. I have invested a lot of money and time in the past years just to get to that point and ... it feels good !

* Great things that were "possible" now become "available" for future projects. This includes YASEP and other (commercial ?) designs.

* FPGAs are damn cool ! Actel's chips are certainly slower and less capable than those of other makers but their products make this little board possible and easy : once the power supplies and the JTAG are (hum, correctly) wired, the board can be plugged into other cheap prototypes. No need for an external Flash chip, a bootstrapping EPLD, or whatever...

Next in line : a parallel port interface (hooked to a computer) then the SRAM chips :-D Then I'll try to develop an embedded CPU design with Ethernet that could replace the Rabbit, PIC and AVR. Finally, I wish to create an Ethernet-based JTAG programmer that will replace the proprietary and USB-bound FlashPro3. This proprietary probe is not extremely expensive (Actel has wisely created even cheaper versions) but USB is such an annoyance !

Sunday 23 August 2009

Back from vacations...

The lack of Internet access during 2 weeks of vacation was a very good thing for the YASEP : development was stimulated and efficient !

I should mention that the environment helped me stay in a great mood, if you don't count all the insects. Have a look at this picture or this video if you wonder what it's like to develop in VHDL in the country, under a wonderful tree and sitting next to the tent. BTW, thanks Toshiba for the extra-life-battery pack for the Portégé 3490 : I could work about 5 hours in a row, though it recharges very slowly.

I did a lot of cleanup, completed some pages, integrated the first extended instructions and re-enabled the disassembler. I also examined the multiply instructions and created an algorithm that initialises the multiply lookup-tables ! I also added an algorithm that generates random opcode examples, instead of the fixed strings of before. It's more efficient at finding bugs !

Before I upload the new site, I still have to change some fields and remove the _X forms (as they are useless now, because the "always" condition has the same effect).

I'm also working in parallel on the VHDL source code. I'm adding a CRC32 unit mapped in the SRs so communications and files will have better and faster checks. Unfortunately, I lost a few days of work to a defunct hard disk...
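
To give an idea of what such a unit computes, here is a bit-serial CRC32 reference model in Python. It is a sketch under one loud assumption: the post does not say which CRC32 variant is used, so this takes the one found in zlib (reflected polynomial 0xEDB88320, initial value and final XOR of 0xFFFFFFFF), since an older entry mentions a zlib port. The function names are illustrative and say nothing about the actual SR layout.

```python
CRC32_POLY = 0xEDB88320  # assumption: reflected polynomial used by zlib


def crc32_update(crc, byte):
    """Feed one byte into the running CRC, one bit at a time.

    This shift-and-conditional-XOR loop is what a hardware unit
    would typically unroll into a few layers of XOR gates.
    """
    crc ^= byte
    for _ in range(8):
        if crc & 1:
            crc = (crc >> 1) ^ CRC32_POLY
        else:
            crc >>= 1
    return crc


def crc32(data):
    """Compute the CRC32 of a bytes-like object."""
    crc = 0xFFFFFFFF          # initial value
    for b in data:
        crc = crc32_update(crc, b)
    return crc ^ 0xFFFFFFFF   # final inversion


# Standard check string for CRC algorithms
print(hex(crc32(b"123456789")))
```

A model like this is also handy as a golden reference when writing the VHDL testbench.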

Stay tuned !

edit :

The site is updated, enjoy !

I also recovered the few days of work locked in one of the computers : the disk is not completely dead (it's just dead slow, so a Slackware LiveCD is necessary).

The next steps are : website minification, VHDL code development, further development of listed, pointer update, short jump/call instructions...

I'm also looking at compression/decompression algorithms such as deflate and range coding.

Friday 24 July 2009

YASEP in French

Thanks to Laura's help, part of the YASEP website is being translated into French. So far, the following pages have been integrated : the index, the registers, the instructions, the interactive opcode map and YASEP16/32. More pages should follow, I am waiting for Laura to submit them.

At the start of the project, I had decided to do everything in English only. Experience has shown me that supporting several formats or languages increases the workload, and therefore reduces the time spent creating useful things. Moreover, there is always one version that lags behind, which makes the project look inconsistent from the outside. People then tend to refer only to the "main" version (in English) and the translated version sinks into uselessness.

This time, it is clear that the "official" version of the project is in English. The French translation will probably lag behind on an unknown number of points as time passes. But translating brings several benefits :

* First, I tend to write abstruse English, and in the end I am the only one who understands what I wrote. The translation confronts me with my bad habits and forces me to rephrase my sentences to make them clearer. This fits my accessibility requirement, all the more since the translator, Laura, is less proficient than I am in English and in technical matters, and I would like to be understood by even less experienced readers.

* Next, Laura is closer and more demanding than the previous collaborators. I expect better quality and better follow-up.

* Also, having two versions of the same web page forces a separation of presentation, content and scripts : the need for modularity and non-redundancy becomes important.

In addition, this lets me review, and therefore improve, the original pages and sort things out...

As usual, I am interested in any constructive remark that could improve the site.

Saturday 4 July 2009

Probable new features

When a project has practical uses and implications, it is interesting to see how it evolves and fills the gaps that a purely theoretical design would overlook. For YASEP, the modifications have been very deep, while many of the neat original ideas remain. Lately, there have been a few new ideas that may or may not be implemented.

  • A new CRIT instruction :

This is a method to perform atomic instruction sequences. It opens a HW-guaranteed CRITical section that lasts for a small, constant number of instructions (1 to 16 depending on the imm4 argument). After/before this, IRQs and other things are checked, to prevent the system from hanging because of back-to-back CRIT instructions...

  • External bus expansion with off-chip buffers

In the case where the number of FPGA pins is low, a lot of them are used by external SRAM. The address and data bus could be used to expand the I/O count, by adding a few 74LVC574 and 74LVC245. In this case, a few specific instructions are required because the GET and PUT instructions work only with internal resources. Another issue is the bus loading that might affect the timings and/or speed. The inputs and outputs could be easily separated : the output latches can be tied to the address bus (because it is unidirectional) while the input buffers can only be tied to the data bus. Voltage translation is also a desired feature.

  • CRC32 accelerator

As the need for a zlib port arises, the necessity to check CRC32 signatures becomes a problem. I have already designed CRC routines and... well... they can become quite heavy. OTOH, it is rather straightforward to do in hardware. I don't want to make yet another instruction here because this would make the pipeline more complex (and the number of registers is already too small) but a small set of SRs will do the trick.

  • DMA for SPI

SPI is used when booting the CPU from an SPI Flash memory, or when communicating with Ethernet or 2.4GHz interfaces. Adding a simple DMA capability would save a lot of cycles and latency.
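
Going back to the first item of this list, the CRIT idea can be played with in a toy Python model. Everything here is an assumption made for illustration : the instruction tuples, the imm4 encoding (0..15 meaning 1..16 protected instructions), and the choice to treat a CRIT found inside an open section as an ordinary instruction. It only shows why checking IRQs before a section opens prevents back-to-back CRITs from starving interrupts.

```python
def run(program, irq_at):
    """Execute a toy instruction list and return a trace of events.

    program : list of ("CRIT", imm4) or ("OP", name) tuples (hypothetical format)
    irq_at  : set of instruction indices at which an IRQ becomes pending
    """
    trace = []
    pending = False
    protected = 0  # instructions left in the open critical section
    for pc, (kind, arg) in enumerate(program):
        if pc in irq_at:
            pending = True
        # IRQs are served only when no critical section is open,
        # which includes the moment just before a new CRIT opens one.
        if pending and protected == 0:
            trace.append("IRQ")
            pending = False
        if kind == "CRIT" and protected == 0:
            if not 0 <= arg <= 15:
                raise ValueError("imm4 must be in 0..15")
            protected = arg + 1  # assumed encoding: imm4 + 1 instructions
            trace.append(f"CRIT {arg}")
        else:
            trace.append(f"CRIT {arg}" if kind == "CRIT" else arg)
            if protected:
                protected -= 1
    return trace


# Two back-to-back critical sections, with an IRQ arriving inside the first
program = [("CRIT", 1), ("OP", "a"), ("OP", "b"),
           ("CRIT", 1), ("OP", "c"), ("OP", "d"), ("OP", "e")]
print(run(program, irq_at={1}))
```

In this model the IRQ that arrives during the first section is served right after it closes and before the second CRIT takes effect, so the worst-case latency stays bounded by the longest section.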

Other things will certainly come later...
