OpenRISC Gains Atomic Operations and Multicore Support
An anonymous reader writes "You might recall the Debian port that is coming to OpenRISC (which is by the way making good progress with 5000 packages building) — Olof, a developer on the OpenRISC project, recently posted a lengthy status update about what's going on with OpenRISC. A few highlights are upstreamed binutils support, multicore becoming a thing, atomic operations, and a new build system for System-on-Chips."
Seriously.
No, fool, I is Gains Atomic. [sounds like a great stage name]
It has 16I/Os, powered with 23V, uses 8W and costs 330 euros. Good enough for you?
It is not going to be fast at all, likely in the vicinity of a few hundred MHz. FPGAs are very slow; the 5-stage pipeline version of MIPS I made on one ran at ~80 MHz.
> Absolutely nothing over any of the well supported and understood open source MIPS implementations.
Ah! Read this ( http://jonahprobell.com/lexra.... ) and be cautious when re-implementing the MIPS ISA.
What are the advantages of openrisc?
It is free, so if you want to run a softcore, there are no license fees. If you can read Verilog, you can verify that there are no NSA backdoors.
What is the performance of such a softcore?
An FPGA softcore is going to run several times slower, and consume several times as much power, as a hardcore. If you need a small amount of computing, and most of your app is in the FPGA fabric, then that is reasonable, although you might be able to get by with an 8-bit softcore like PicoBlaze, or even roll your own mini 8-bit core with opcodes customized for your app (this is not that hard, and is a fun project if you are learning Verilog and ready to go beyond blinking LEDs). But if you are doing something compute intensive, you may want to look for something with an integrated hardcore.
Can I expect to have something usable?
That depends on what you are using it for.
MIPS may (or may not be) "open source", however it is not free to implement. Implement the latest MIPS ISA without a license agreement from MIPS and you'll be sued to smithereens. You won't be sued if you implement OpenRISC though.
Oolite: Elite-like game. For Mac, Linux and Windows
It is also possible to use the Verilog to make an ASIC if you go into production.
See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
"This is just another cause-we-can hobby project on the front page of Slashdot."
OpenRISC is far from a "cause-we-can" project.
Yea it certainly is, and I can't see it ever being anything else. It will never be used in anything useful, just people who want to tinker for the sake of tinkering. Schools have been teaching CPU design with other archs for years, and can continue to do so without OpenRISC.
3D Printing of OpenRISC...
The original 6502 had atomic operations. Read-modify-write operations on memory, such as bit shifting or adding or subtracting 1, would execute a read-write (old value)-write (new value) sequence. This protocol of not waiting between a read of a particular address and writing the new value would allow a memory controller to lock the bus by allowing only one device to write at once. This feature was removed in 65C02 in favor of read (and use)-read (and ignore while calculating)-write (new value), which is slightly safer for memory-mapped I/O but possibly less safe for synchronizing a CPU with other CPUs or DMA sources.
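As a rough illustration of the two bus sequences described above, here is a minimal Python sketch. It models only the three data-bus cycles of a read-modify-write instruction such as INC; opcode and operand fetches are left out, and the function names and variant labels are made up for this example.

```python
def rmw_bus_cycles(variant, addr, old):
    """Return the data-bus cycles of an INC on one memory location.

    Only the read-modify-write tail of the instruction is modelled;
    opcode and address fetches are omitted. The variant names are
    illustrative labels, not official part numbers.
    """
    new = (old + 1) & 0xFF  # 8-bit increment with wraparound
    if variant == "nmos6502":
        # read, write back the old value, then write the new value
        return [("read", addr, old), ("write", addr, old), ("write", addr, new)]
    if variant == "65c02":
        # read, dummy read while calculating, then write the new value
        return [("read", addr, old), ("read", addr, old), ("write", addr, new)]
    raise ValueError(f"unknown variant: {variant}")


def write_count(cycles):
    """Count how many writes a memory-mapped device would observe."""
    return sum(1 for kind, _, _ in cycles if kind == "write")


if __name__ == "__main__":
    for variant in ("nmos6502", "65c02"):
        cycles = rmw_bus_cycles(variant, 0x0200, 0xFF)
        print(variant, cycles, "writes:", write_count(cycles))
```

The extra write in the first sequence is what a memory-mapped I/O register would see as a spurious store, and the back-to-back read/write pair is what a memory controller could key a bus lock on.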
oh? what problem does it solve?
If you can read Verilog, you can verify that there are no NSA backdoors.
But is there a backdoor in your Verilog compiler? Normally, you might use David A. Wheeler's diverse double-compiling method to ensure beyond reasonable doubt that your compiler isn't backdoored. But diverse double-compiling doesn't work unless the compiler is written in the same language that it compiles. And I don't think the Verilog compiler is written in Verilog.
you might be able to get by with an 8-bit softcore like PicoBlaze
Wikipedia's article about PicoBlaze states that it's not free to use on anything but a Xilinx FPGA. So if you switch to Altera or go into production with an ASIC, you might have to switch to PacoBlaze and deal with any minor behavior differences.
"roll your own mini 8-bit core with opcodes customized for your app (this is not that hard"
Not that hard by Verilog standards. The sight of it tends to make software developers run in terror.
Or to be clearer, MIPS owns several patents on instructions in the ISA. Though I think some of them were worked around another way since the patent covers implementation.
But many other architectures are patented as well - x86 is covered by many patents (most owned between AMD and Intel and cross-licensed), which probably explains why a good chunk of embedded x86 only do the i486 ISA. (Excepting companies like Via who license the patents).
Does anyone have any idea why OpenRISC is big-endian? Considering that little-endian has pretty much won nowadays (Every major CPU is either little or bi endian) why would anyone want to release a big-endian cpu?
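For anyone unsure what the difference looks like in memory, here is a small Python sketch using only the standard `struct` module. The helper names are invented for this example and nothing here is specific to OpenRISC.

```python
import struct


def layout(value, byteorder):
    """Return the bytes of a 32-bit word as stored at increasing addresses."""
    fmt = ">I" if byteorder == "big" else "<I"
    return struct.pack(fmt, value)


def misread(value):
    """Store a word big-endian, then load it as if it were little-endian."""
    return int.from_bytes(layout(value, "big"), "little")


if __name__ == "__main__":
    word = 0x12345678
    print("big-endian:   ", layout(word, "big").hex())
    print("little-endian:", layout(word, "little").hex())
    print("misread:      ", hex(misread(word)))
```

Big-endian puts the most significant byte at the lowest address, which matches network byte order; the `misread` helper shows the byte swap that code has to do when data crosses between the two conventions.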
Another advantage of an open source softcore is that you can add your own application-specific opcodes. You could run your app in a profiler with the standard instruction set and identify the hot spots. If a big chunk of your CPU time is spent in a single tight loop, you could implement that code directly in FPGA fabric, and execute each iteration in a single clock tick with a custom instruction. For instance, let's say you need to run some sort of CRC or crypto, which involves shifting, masking and adding bits. That would be easy to code up in Verilog as a single instruction, which is then executed by extending OpenRISC with the new opcode. Then just use the "asm" feature of GCC to put that opcode in the inner loop of your C program. Depending on your app, it is possible that you could get better performance from a customized softcore than from a generic hardcore, like ARM or MIPS.
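To make the CRC case concrete, here is a minimal software sketch in Python of the kind of inner loop being described. The `crc32_step` function is the part one might imagine collapsing into a single custom instruction; that name is hypothetical and no such OpenRISC opcode is implied. The polynomial is the standard reflected CRC-32 one, and the result is checked against `zlib.crc32`.

```python
import zlib

POLY = 0xEDB88320  # reflected CRC-32 polynomial


def crc32_step(crc, byte):
    """Fold one input byte into the CRC: eight shift/mask/XOR rounds.

    In software this costs dozens of instructions per byte; in FPGA
    fabric the same logic is a fixed network of XOR gates.
    """
    crc ^= byte
    for _ in range(8):
        lsb = crc & 1
        crc >>= 1
        if lsb:
            crc ^= POLY
    return crc


def crc32(data):
    """Bitwise CRC-32 over a bytes object, built on crc32_step."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = crc32_step(crc, byte)
    return crc ^ 0xFFFFFFFF


if __name__ == "__main__":
    message = b"123456789"
    assert crc32(message) == zlib.crc32(message)
    print(hex(crc32(message)))
```

A profiler run over code like this would show nearly all the time inside `crc32_step`, which is the signal that the loop body is a candidate for a dedicated opcode.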
We're just about to open source (Apache-style license) our MIPS IV implementation. MIPS IV is over 20 years old, so there exists at least one implementation that is not covered by any patents. We can't guarantee that nothing in our implementation is patented, but the patents in your linked article have all expired now.
I am TheRaven on Soylent News
>The sight of it tends to make software developers run in terror. That's because it has very little to do with software programming.
I can't see it ever being anything else. It will never be used in anything useful
"Flextronics International and Jennic Limited manufactured the OpenRISC as part of an ASIC. Samsung use the OpenRISC 1000 in their DTV system-on-chips (SDP83 B-Series, SDP92 C-Series, SDP1001/SDP1002 D-Series, SDP1103/SDP1106 E-Series). Allwinner Technology are reported to use an OpenRISC core in their AR100 power controller, which forms part of the A31 ARM based SoC. ... TechEdSat, the first NASA OpenRISC architecture based Linux computer launched in July 2012, and was deployed in October 2012 to the International Space Station with hardware provided, built, and tested by ÅAC Microtec and ÅAC Microtec North America."
https://en.wikipedia.org/wiki/OpenRISC#Commercial_implementations
Is this free to implement?
And those patents, or more specifically the single patent about the unaligned load and store instructions on MIPS expired years ago. To be specific it expired in December 2006.
So while the patent was an issue back in ~2000 when OpenRISC was launched, it is no longer relevant, and you would be better off implementing a MIPS32 or MIPS64 core.
I would also point out that there are full open source implementations of the SPARC architecture, which never suffered from the patent problems of MIPS.
"oh? what problem does it solve?"
What the fuck is so hard to understand here? The answer is in the name of project: An opensource CPU core. There, was that so hard?
Besides being a snarky ass, what was the point of your post? It sounds as if you would rather spark a flame war than do some actual research, which would take, oh, let's say 5-10 minutes.
There are other open source cores but none of them are trying to provide a full blown CPU core that could potentially be used for mobile or desktop use. Most of them are for embedded use and are little more than a micro controller and lack an MMU.
You have not even answered the question with all your ranting and hot air. Again, what problem does an open source CPU format solve? I cannot think of one; my open source software works fine on x86, sparc, MIPS, ARM7 (and anyone interested can get the specs for most of those architectures). I'll even make a claim that open specs are good enough for a CPU, irrelevant whether the particular mask patterns are known.
I'll even make a claim that open specs are good enough for a CPU, irrelevant whether the particular mask patterns are known.
The problem that OpenRISC solves is an absence of free CPU IP. You do not consider an absence of free CPU IP to be a problem but others do consider it a problem and have created OpenRISC to solve that problem.
Exactly. This is very much a hardware thing - and if you want a processor embedded in your chip, it's because you want to run software. Spending time messing around with intricate hardware design is just going to divert you from the important tasks.
I cannot think of one; my open source software works fine on x86, sparc, MIPS, ARM7 (and anyone interested can get the specs for most of those architectures). I'll even make a claim that open specs are good enough for a CPU, irrelevant whether the particular mask patterns are known.
That is good enough for a software developer, but not for those of us who do hardware development. I have projects that need both a general purpose processor and an FPGA to deal with various combinatorial logic and counters independent of the CPU. In that case I might as well use a CPU on the FPGA too, and not need two different chips. The availability of a free-to-use CPU helps with this, especially if there are several different ones to choose from, so I can pick the one most appropriate or one that can be modified as needed.
Your post is about one step away from, "What good does solder do? I've never had to solder something when writing a program before."
Bro, you tryin to steal my sick gainz bro?
I would also point out that there are full open source implementations of the SPARC architecture, which never suffered from the patent problems of MIPS.
...but they do suffer from (a very poor implementation of) register windows.
Stick Men
Some random remarks.
-Whether an instruction set is copyrightable is still open to debate. A long time ago, probably more than 40 years, there was some fighting over mainframes between IBM and the clone manufacturers.
-MIPS patents. There were patents on the unaligned access instructions. These patents have long since expired. The "problem" with implementing an active instruction set is that new instructions are added regularly and new patents may be issued on how to implement these instructions. It is safe to imitate a 30-year-old CPU, but you must restrict yourself to the instructions defined at that time.
For example, there is a free implementation on OpenCores of the ARM2 instruction set, which is incompatible with current ARM CPUs; as such, it has little commercial value. There are certainly still active patents on AMD's 64-bit extension to x86, but the original 8086...80386 instruction set is old enough to be safe from patents.
-SPARC is almost free (there is a $99 unlimited "architecture licence"). This licence covers the instruction set, not the branding/logos.
-Brands are separate from patents. MIPS, and even SPARC prevent unlicenced use of their brand.
-The OpenRISC instruction set is not particularly insightful, so it is hard to defend against almost equally bland ones like MIPS, which already have all the software infrastructure: OS, compilers,...
-Maybe putting effort into the free RISC-V instruction set from UC Berkeley, which is a modernised/rationalised version of MIPS and all the modern RISCs, would make more sense than keeping OpenRISC alive.
Of course everybody knows now that register windows and delayed branches are bad ideas. ... nevertheless, like x86 has proved, a "suboptimal" instruction set does not prevent high performance implementations, as
Even at that time they knew at Sun that it was a hurried decision, due to the lack of good compilers.
Sun/Oracle and Fujitsu are delivering now.
I know a decent amount of VHDL and Verilog, but I know other software developers who are geniuses in it.
Take the people who wrote OpenRISC for example.
For a sufficiently complex netlist, how can you prove that the HDL compiler didn't insert an obfuscated backdoor? This is especially important if one of the HDL compiler's developers placed highly in an Underhanded C Contest.
How about embedding a CPU core into your ASIC design, without paying licensing fees to MIPS or ARM?
I agree with that.
I do FPGA work, though I usually tend to use an 8-bit rotating register file machine[1] I wrote myself. This has the advantage that the instruction packing is compact, the very opposite of RISC, and I can use the unused instruction codings as peripheral register reads and writes.
I admit that if I had a design that called for running a vmunix OS like Linux or NetBSD, I would consider OpenRISC, though I think nowadays I would use an HPS like the Altera Cyclone V SX with an embedded ARM Cortex A9 hardcore. If I was designing an ASIC[2], area and performance would be a larger concern than IP licensing, so I would probably still go with ARM or MIPS.
Usually though, I think along the lines of "anything I can do in C, I can do in Verilog", so I'm using small 8 to 18bit cpus to program sequencing and multiplexing, and anything computational, I'm realising in logic. In that realm of thinking, CPU complexity is just a waste of area, that could be better spent on application logic.
CPUs are simple; I wrote one in Verilog in an afternoon. What is good about OpenRISC and co. is that they include MMUs and cache hierarchies, which take longer to write and a LOT longer to validate.
[1] What this means is, the result is always stored in register 0 (through a mux), and registers 0..n-1 are moved to registers 1..n, this results in reduced instruction coding, as only the srcA and srcB registers need to be coded in the instruction, and maps perfectly from SSA compiler output. The registers aren't actually moved, instead the mux index base is incremented by one.
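A behavioural sketch of footnote [1] in Python, for anyone who finds code easier to follow than the description. This is only a model under stated assumptions: the class and function names are invented, the register count and the two operations are arbitrary, and the direction in which the base index moves is one possible convention.

```python
class RotatingRegisterFile:
    """Register file where every result lands in logical register 0.

    Nothing is copied on a write: only the base index moves, so the
    previous r0 becomes r1, r1 becomes r2, and the oldest value drops out.
    """

    def __init__(self, size=8):
        self.size = size
        self.phys = [0] * size  # physical storage
        self.base = 0           # physical slot currently acting as r0

    def read(self, index):
        # Logical register i lives `i` slots behind the base.
        return self.phys[(self.base - index) % self.size]

    def push_result(self, value):
        # Increment the mux base, then store; old values shift logically.
        self.base = (self.base + 1) % self.size
        self.phys[self.base] = value


def execute(rf, op, src_a, src_b, width=8):
    """Run one instruction; only the two source indices are encoded."""
    a, b = rf.read(src_a), rf.read(src_b)
    mask = (1 << width) - 1
    if op == "add":
        result = (a + b) & mask
    elif op == "xor":
        result = a ^ b
    else:
        raise ValueError(f"unknown op: {op}")
    rf.push_result(result)
    return result


if __name__ == "__main__":
    rf = RotatingRegisterFile(size=4)
    rf.push_result(3)
    rf.push_result(4)            # now r0 = 4, r1 = 3
    execute(rf, "add", 0, 1)     # r0 = 7, r1 = 4, r2 = 3
    print([rf.read(i) for i in range(4)])
```

Because the destination is implicit, an instruction word only needs the opcode plus two source fields, which is where the compact encoding comes from.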
[2] I've never designed an ASIC.
or even roll your own mini 8-bit core with opcodes customized for your app (this is not that hard, and is a fun project if you are learning Verilog and ready to go beyond blinking LEDs).
No doubt. I did this in an afternoon. It even worked first time.
Though I hazard the difference in complexity between a simple static stack machine or rotating register file machine and a complex pipelined machine is substantial. The difference in complexity between a CPU core and a virtual memory hierarchy is even more substantial.
Incidentally, the word size of a core doesn't change its complexity to implement. Writing a 32-bit core is just as simple as an 8-bit core; it just eats more resources, because all those nets are now wider. I could just as easily write a 256-bit stack machine as an 8-bit stack machine, and it could even have the identical ISA; it's just that it would synthesize to a massive behemoth, with probably terrible timing closure. It would also not be very useful, as most programs don't contain 256-bit data types (crypto excepted).
If you simply care about complex sequencing of logic elements, then all you need is a simple machine with minimal area coverage. If you instead want to do complex heavy lifting on CPU, then I would suggest you are better off looking at an FPGA with an embedded ARM core. Both Xilinx and Altera make a range of FPGAs with embedded Cortex A9, and Lattice make one with an embedded Cortex M4.
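The point about word size can be sketched in software too. Below is a toy stack machine in Python where the data width is just a parameter, much as it would be a parameter in an HDL module. The three-instruction ISA and all names are invented for this illustration and do not describe any real core.

```python
def run(program, width):
    """Execute a tiny stack program with `width`-bit wraparound arithmetic.

    The instruction set is identical for every width; only the mask
    changes, just as only the net widths would change in hardware.
    """
    mask = (1 << width) - 1
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(args[0] & mask)
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append((a + b) & mask)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append((a * b) & mask)
        else:
            raise ValueError(f"unknown op: {op}")
    return stack


if __name__ == "__main__":
    program = [("push", 200), ("push", 100), ("add",)]
    print("8-bit:  ", run(program, 8))    # sum wraps modulo 256
    print("256-bit:", run(program, 256))  # same program, no wrap
```

The description is the same size either way; what changes with the width is the cost of the result, which in hardware means area and timing.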
It's not even hard to write a pipelined RISC machine, I did one in two weeks in my spare time. It even worked the first time. ;)
Writing one that's remotely efficient (in terms of speed and area) is harder, but not even that is very hard.
Writing compiler backends, Linux (and other OS) ports, SoCs around the CPU core, and maintaining all that, is perhaps not hard, but time consuming.
OpenRISC is a fairly efficient RISC machine, with a good software ecosystem and an active development community that's completely transparent.
To me, that's an advantage. And as far as I know, a pretty unique combination.
That would be awesome. It would be even awesomer if it optionally included some useful wide SIMD execution units or DSP instructions. Do you have an estimate for size (effective gate count/transistor count)?
We're using about 30% of a Stratix IV, a bit more with an FPU. We've also got a smaller version (no TLB, smaller caches) and multicore / multithreaded variants that are larger. We run at 100MHz (pass timing at 120-150MHz depending on the features enabled, but 100MHz gives some headroom when experimenting).
and for a couple bucks you can buy an 8 or 16 bit cpu and slap it on a board with your custom FPGA, you're doing it wrongly.
eh, most of my life was engineering physical things including controllers. You take a couple buck CPU and slap it on a board with your FPGA; in 99.9999% of cases making your own fucking CPU is a waste of time and money. It's like a developer who says "I need to write my own web server from scratch to run my php code"
And yet we have multiple open source web servers, each with varying levels of complexity, feature sets and usage cases.