Slashdot Mirror


Toward An FSF-Endorsable Embedded Processor

lkcl writes about his effort to go further than others have and actually get a processor designed for Free Software manufactured: "A new processor is being put together: one that is FSF-Endorsable and contains no proprietary hardware engines, yet an 800MHz 8-core version would, at 38 GFLOPS, be powerful enough on raw GFLOPS performance figures to take on the 3GHz AMD Phenom II x4 940, the 3GHz Intel i7 920 and other respectable mid-range 100 Watt CPUs. The difference is: power consumption in 40nm for an 8-core version would be under 3 watts. The core design has been proven in 65nm, and is based on a hybrid approach, with its general-purpose instruction set being designed from the ground up to help accelerate 3D Graphics and Video Encode and Decode. An 8-core 800MHz version would be capable of 1080p30 H.264 decode, and have peak 3D rates of 320 million triangles/sec and a peak fill rate of 1600 million pixels/sec. The unusual step in the processor world is being taken to solicit input from the Free Software Community at large before going ahead with putting the chip together. So have at it: if given carte blanche, what interfaces and what features would you like an FSF-Endorsable mass-volume processor to have? (Please don't say 'DRM' or 'built-in spyware')." There's some discussion on arm-netbook. This is the guy behind the first EOMA-68 card (currently nearing production). As a heads-up, we'll be interviewing him live, in a style similar to the Woz interview (although intentionally this time), next Tuesday.

53 of 258 comments

  1. DRM by queazocotal · · Score: 4, Interesting

    DRM, in some aspects - trusted computing - can be a positive thing.
    My ideal system would have a root key I can set, such that without software signed by it, the machine is a rock.

    1. Re:DRM by thrift24 · · Score: 3, Informative

      The reality is that signed executables are going to interact with unsigned data during bootup or normal operation, and an exploit to run unsigned code can be triggered at that point.

      For example, the original Xbox could be convinced to run unsigned code through exploited game saves, and then system files (fonts/audio db) could be replaced with corrupted versions meant to trigger an exploit on bootup. This is how soft modding was performed on the Xbox.
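
A toy sketch of the point made above, under loud assumptions: an HMAC stands in for a real public-key signature, and `ROOT_KEY`, `load_save` and the 8-byte buffer layout are all hypothetical. This is not how the actual Xbox exploits worked in detail. It only illustrates that verifying the executable says nothing about the data that executable later parses.

```python
import hashlib
import hmac

ROOT_KEY = b"hypothetical-root-key"  # assumption: stand-in for an owner-set root key


def sign(blob: bytes, key: bytes = ROOT_KEY) -> bytes:
    # HMAC-SHA256 stands in for a real public-key signature scheme.
    return hmac.new(key, blob, hashlib.sha256).digest()


def verify(blob: bytes, tag: bytes, key: bytes = ROOT_KEY) -> bool:
    return hmac.compare_digest(sign(blob, key), tag)


def load_save(save: bytes) -> bytes:
    # Model of signed-but-buggy code: an 8-byte name buffer followed by a
    # 4-byte control word. The length byte from the save file is trusted.
    memory = bytearray(8) + b"SAFE"
    length = save[0]
    for i in range(length):          # missing check: length <= 8
        memory[i] = save[1 + i]
    return bytes(memory[8:12])       # the control word after parsing


game_code = b"signed game executable"
tag = sign(game_code)

benign_save = bytes([4]) + b"hero"
crafted_save = bytes([12]) + b"AAAAAAAA" + b"PWND"

code_ok = verify(game_code, tag)           # the code itself checks out
control_benign = load_save(benign_save)    # control word left alone
control_crafted = load_save(crafted_save)  # control word overwritten by save data
print(code_ok, control_benign, control_crafted)
```

The signature check passes in both cases because the save file was never part of what was signed; only a bounds check inside the signed code would have stopped the overwrite.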

  2. Scientific Computing by simonbp · · Score: 5, Interesting

    IMHO, they really need to push this for scientific computing initially, as those buyers tend to buy in bulk and are not very binary-dependent. They are claiming it is so low power (2.7 W) that it would be easy to put an array of, say, eight of them on a 1U motherboard for 64 cores.

    1. Re:Scientific Computing by korgitser · · Score: 3, Funny

      imagine a beowulf cluster of those!

      --
      FCKGW 09F9 42
    2. Re:Scientific Computing by mako1138 · · Score: 2

      As long as we're comparing mysterious numbers*, let's take a closer look.
      Future Chip:
      38 GFLOPS / 2.7W = ~14 GFLOPS/W
      Tesla K20x:
      3950 GFLOPS / 235W = ~16.8 GFLOPS/W
      Radeon 7970:
      3790 GFLOPS / 280W = ~13.5 GFLOPS/W

      So I'm not seeing a power advantage here. More questions: does the chip do double precision, and what's the rate? What's the memory bandwidth? Is there support for ECC/scrubbing, which is essential for Big Deal calculations? (The 7970 doesn't support ECC. The Tesla does, and it had better given the amount of money you pay for it.) I'd imagine the Future Chip would be a cheaper solution, but you're starting from scratch with the compilers when everyone else has a major head start.

      So while I think a FSF Principles chip is a good idea, pitching it for scientific computing is a stretch.

      *Future Chip numbers probably do not include memory power consumption, and are likely an optimistic extrapolation from the dual-core silicon. Radeon result is the unholy combination of AMD's published single-precision FLOPS and the max power consumption from Anandtech's review. Tesla numbers are marketing numbers combined with TDP.
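
The efficiency comparison above is simple division, and can be reproduced with a short sketch. The GFLOPS and wattage figures are the ones quoted in the comment, not measurements; the dictionary and function names are only illustrative.

```python
# (peak GFLOPS, watts) as quoted in the comment above; none of these are measured here.
chips = {
    "Future Chip": (38.0, 2.7),
    "Tesla K20x": (3950.0, 235.0),
    "Radeon 7970": (3790.0, 280.0),
}


def gflops_per_watt(gflops: float, watts: float) -> float:
    return gflops / watts


efficiency = {name: gflops_per_watt(g, w) for name, (g, w) in chips.items()}

# Print from most to least efficient.
for name, value in sorted(efficiency.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:12s} {value:5.1f} GFLOPS/W")
```

All three land within roughly 25% of each other, which is the commenter's point: on these headline figures there is no order-of-magnitude power advantage.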

    3. Re:Scientific Computing by plus_M · · Score: 2

      InfiniBand. That's what every scientific computing cluster I've used has had for multi-node parallel computations. Most parallelized scientific computing applications support MPI over InfiniBand.

  3. An almost unbelievable breakthrough if true by fnj · · Score: 2

    I have always wondered why it is assumed that separate CPU and GPU are somehow the most efficient use of silicon. It just seemed counterintuitive to me. If the proposed processor is as efficient as claimed, it looks like I was right to wonder. This absolutely annihilates Intel and AMD on a performance per watt basis.

    1. Re:An almost unbelievable breakthrough if true by muon-catalyzed · · Score: 3, Interesting

      Hopefully the FSF also patents it, so no troll can extort license fees for using the technology. In fact, the FSF should patent it all, make the blueprints available RFC-style, and not bother with anything else.

  4. just a little skeptical of those numbers by dywolf · · Score: 5, Insightful

    ok more than a little.

    --
    The guy who said the election was rigged won the presidency with the second-most votes.
    1. Re:just a little skeptical of those numbers by lkcl · · Score: 3, Interesting

      tell me about it. please share your concerns. this is not being sarcastic: i need to know. i need to know what the right questions to ask are, because i don't know.

  5. No thanks by betterunixthanunix · · Score: 4, Interesting

    Can we please move away from x86? That architecture is horribly outdated, loaded down with things that sort-of made sense in the 1970s. Today's x86 CPUs are just dressed up RISC machines; let's free up some of that chip space and just use RISC.

    If you want to run x86 binaries, use a dynamic translation tool.

    --
    Palm trees and 8
    1. Re:No thanks by CajunArson · · Score: 3, Insightful

      Today's ARM architecture is just a dressed-up CISC architecture; let's move away from ARM's lame attempts at copying AVX with NEON and just use the real thing!

      (You see how the door swings both ways there? Trust me, if any architecture designer from the early 1990s were frozen in a block of ice, thawed out today and then shown the ARMv8 ISA, he would never in a million years call it "RISC".)

      --
      AntiFA: An abbreviation for Anti First Amendment.
    2. Re:No thanks by lkcl · · Score: 4, Interesting

      Can we please move away from x86?

      yes please!

      That architecture is horribly outdated, loaded down with things that sort-of made sense in the 1970s. Today's x86 CPUs are just dressed up RISC machines; let's free up some of that chip space and just use RISC.

      this team have come from the perspective of what makes a good GPU, then turned it into a CPU. it's about as far from x86 as you can possibly get. luckily they've done the hard part of porting at least one OS (android) so have proven the tools, the compiler, the kernel, everything.

      with linux now being the main OS it's hard for me to even remember that windows and x86 were relevant at one point. not that i'm ruling out the possibility of MS porting windows to this chip: if they want to, that's great: they'll just have to bear in mind that there will be no DRM so they won't be able to lock everyone out.

      If you want to run x86 binaries, use a dynamic translation tool.

      who was it.... i think it was ICT who put 200 special instructions into the Loongson 2H, which allow it to accelerate-emulate the most common x86 instructions; they got 70% of the main processor speed.

    3. Re:No thanks by K.+S.+Kyosuke · · Score: 2

      Trust me, if any architecture designer from the early 1990's were frozen in a block of ice, thawed out today and then shown the ARMv8 ISA, he would never in a million years call it "RISC"

      That still doesn't make it any less true that it's far preferable to hang yourself than to try to write a well-performing x86 compiler backend. With ARM? I'm not so sure.

      --
      Ezekiel 23:20
    4. Re:No thanks by VortexCortex · · Score: 2

      if any architecture designer from the early 1990's were frozen in a block of ice, thawed out today and then shown the ARMv8 ISA, he would never in a million years call it "RISC"

      Perhaps not, they'd be dead, yes? However, if instead of frozen in ice they were merely kept alive for two short decades...

    5. Re:No thanks by Anonymous Coward · · Score: 5, Insightful

      Yes, we can move away from x86.
      No, it isn't a good idea.

      It's time to put this one to rest.
      It's been a few decades, and we've seen the argument go from theory, through practice, to its conclusion today.

      x86 (and its, er, extensions/evolutions) IS the better general purpose arch. But not for the reasons anyone conceived of. I think it's best put this way.

      1. RISC (for example) is very good at running good code.
      2. Most code is bad. (No really, it's awful. Ask any programmer)
      3. x86 processors, it turns out, are very good at running bad code.

      Many other arches were created under the premise that good code could be created for them automatically. Turns out that compilers that can do this are like unicorns. They don't exist. It's an NP-hard problem.

      It's what killed Itanium. The magic compilers never turned up. The amount of developer effort required to write good software isn't worth it.

      *Why is most code bad you ask? Easy. Programming, put crudely, is a bullshit art.
      Just ask Dijkstra. (Well, not anymore. He's dead now.) Programs are math. Few programs, however, are proven to be "correct" mathematically; it's impractical for most applications. Sure, you have rules you call "Practices" that tend to generate better code... But everyone knows how code is really developed nowadays. Lay it down, slap it around until the show stoppers are reduced to a bearable frequency, and patch up anything you missed after it ships.

      I'm not saying this approach is necessarily bad. It has advantages. It's very fast, and you can get a lot of useful work out of it. If your idea or application is good or novel or productive enough you can put up with some bugs and at the end of the day you'll end up ahead. If you set out to write a program that's mathematically provable from start to finish, your competitors will have buried you years before your first release.

    6. Re:No thanks by Richard_J_N · · Score: 3, Interesting

      How about implementing just a few of the most common C-library functions in dedicated hardware. For example, atoi(), strlen(), or printf(). Although the software routines are highly optimised, they still take hundreds to thousands of cycles. Dedicated libc functions would require a significant amount of chip die space, BUT, they would be really power-efficient - powered off most of the time, and simply used when needed. Imagine being able to use these functions as single-cycle commands... even if the core ran at 100MHz, the performance would be amazing. Essentially it lets us trade a few hundred thousand transistors (now very cheap) for a few mW (still rather valuable).

    7. Re:No thanks by bzipitidoo · · Score: 2

      Conditional instructions are cool. The x86 instruction set isn't.

      Are there any x86 instructions that are slow to emulate

      Yes. Lots of them. Not only are these instructions slow, they're useless. No one needs the ASCII or decimal adjust instructions AAA, AAD, AAS, AAM, DAA, or DAS anymore, and they were never much use to start with. There have been a few cases in which these instructions were cleverly used for other than their intended purpose, but those are rare. Then there's the REP with CMPSB, CMPSW, SCASB, and SCASW instructions. They're useless for string searches; we have much better string search algorithms than that. SCAS in particular is a legacy of C strings. Its main use is to search for the terminating null. Nice, except that the terminating null never was a good idea to start with, and we've been pulling away from it. They're useful for string comparison, except that we have lots of ways to avoid having to do a nasty old string comparison.

      LOOP could have been more useful, except they tied it to a single register, the same register that REP needs. Besides, it doesn't save much: a DEC plus JNE does the same thing. There is also CALL and RET, and PUSH, POP, PUSHF, and POPF. The idea of subroutines and stacks is fine, but this implementation pigs out on valuable registers (a longstanding criticism of x86 is that they didn't put in enough general purpose registers), and stacks can be implemented just fine with more general purpose instructions. CALL can be done with a store and jump, and RET can be done with an indirect jump that fetches the address stored earlier. Stacks are so 1970s in thinking. The x87 stuff is even more fixated on organizing around a stack. We have gobs of memory now, but this PUSH and POP work on one register at a time. Why? So a subroutine can save only the registers it uses? In the 1970s, every byte was precious. Now, have a LOADALL and STOREALL instruction with a more general pointer increment, not one that must use only one register pair, SS:SP. Don't worry if a few registers that didn't need saving got saved anyway, and let the CPU get on with the real code instead of forcing it to pipeline 8 or more PUSH instructions.

      Instructions like XLAT and LODS are more of those overspecific, too-limited-to-be-much-use instructions. The function they perform is very useful, but it's better done more generally with an indirect MOV. The instructions for manipulating flags, LAHF and SAHF, and CLC, CLI, CLD, CMC, STC, STD, and STI are rather silly. Have a register devoted to the flags, and manipulate them with all the general purpose instructions that work on any register, rather than waste opcode space on them. Another dumb instruction is TEST, which is an AND that doesn't keep the result; it only sets flags. Totally unnecessary if there are plenty of registers.
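
The claim that "CALL can be done with a store and jump, and RET can be done with an indirect jump" can be illustrated with a toy interpreter. This is a minimal sketch of a made-up machine (the opcodes, the `link` register and `run` are all hypothetical), loosely in the spirit of the link-register convention used by ARM's `BL` / `BX LR`; it is not a model of any real ISA.

```python
def run(program, max_steps=1000):
    """Interpret a toy program where calls use a link register, not a stack."""
    regs = {"acc": 0, "link": 0}
    pc = 0
    for _ in range(max_steps):
        op, arg = program[pc]
        if op == "halt":
            return regs
        if op == "add":        # acc += immediate
            regs["acc"] += arg
            pc += 1
        elif op == "call":     # store the return address, then jump
            regs["link"] = pc + 1
            pc = arg
        elif op == "ret":      # indirect jump through the stored address
            pc = regs["link"]
        else:
            raise ValueError(f"unknown opcode: {op}")
    raise RuntimeError("step limit exceeded")


program = [
    ("call", 3),     # 0: call the subroutine at address 3
    ("add", 1),      # 1: runs after the subroutine returns
    ("halt", None),  # 2: stop
    ("add", 10),     # 3: subroutine body
    ("ret", None),   # 4: return to the address held in link
]

result = run(program)
print(result)
```

One caveat the sketch makes visible: with a single link register, a nested call would overwrite the return address, so non-leaf subroutines still have to save `link` somewhere in memory, which in practice usually ends up being a stack.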

      The x86 is absolutely chock full of 1970s cruft that isn't much used anymore, but which must be dragged around for the odd compatibility need. Worse, it wasn't even all that good a design for its time!

      --
      Intellectual Property is a monopolistic, selfish, and defective concept. It is "tyranny over the mind of man"
    8. Re:No thanks by n7ytd · · Score: 2

      How about implementing just a few of the most common C-library functions in dedicated hardware. For example, atoi(), strlen(), or printf(). Although the software routines are highly optimised, they still take hundreds to thousands of cycles. Dedicated libc functions would require a significant amount of chip die space, BUT, they would be really power-efficient - powered off most of the time, and simply used when needed. Imagine being able to use these functions as single-cycle commands... even if the core ran at 100MHz, the performance would be amazing. Essentially it lets us trade a few hundred thousand transistors (now very cheap) for a few mW (still rather valuable).

      Yeah, but how do we decide which functions those are? And why C functions? And once we hard-code those functions into silicon, we have to jump through extra hoops to change their behavior.

      All three of your examples make a weak argument for this. atoi() is out of favor, since it doesn't detect errors like the strtol() function does. strlen() has no safety or bounds checking, and printf() is horribly complex.

      BTW, some instructions in the x86 family are very specific for things exactly like this already. For your strlen() example, the SCASQ instruction and friends do something oh-so-close.
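
To show how close the SCAS family comes to strlen(), here is a simplified Python model of the byte variant, `REPNE SCASB`, with the direction flag clear. Everything here is an illustrative assumption (the function names, modelling memory as a `bytes` object, ignoring segments and the other flags); it is a sketch of the well-known idiom, not an emulator.

```python
MASK32 = 0xFFFFFFFF


def repne_scasb(memory: bytes, edi: int, al: int = 0, ecx: int = MASK32):
    """Scan bytes from memory[edi] until one equals AL or ECX reaches zero."""
    zf = False
    while ecx != 0:
        zf = memory[edi] == al    # SCASB: compare AL with the byte at [EDI]
        edi += 1                  # direction flag clear, so EDI moves forward
        ecx = (ecx - 1) & MASK32  # REP prefix: count down ECX
        if zf:                    # REPNE: stop once the comparison is equal
            break
    return edi, ecx, zf


def strlen_via_scas(memory: bytes, start: int = 0) -> int:
    # Classic idiom: AL = 0, ECX = -1, then length = NOT(ECX) - 1.
    _, ecx, _ = repne_scasb(memory, start)
    return (~ecx & MASK32) - 1


length = strlen_via_scas(b"hello\x00junk")
print(length)
```

Like strlen() itself, the scan has no bounds checking: if the buffer holds no terminating zero byte, this model simply runs off the end and raises `IndexError`.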

  6. Re:x86 - NOT!!!!! by fnj · · Score: 5, Insightful

    I couldn't care less if it is x86 compatible (I assume it is emphatically not). I'm sure the FSF does not care, either. I would use this in a heartbeat for my main desktop, and since I haven't had any significant dealings with Windows in at least 8 years, all I need is a free POSIX OS (probably Linux) and a C/C++ compiler.

  7. Those performance numbers are BS by CajunArson · · Score: 5, Informative

    Those performance numbers are pure fantasy. First off, the 38 GFlops is undoubtedly referring to single precision operations while the x86 processors mentioned in TFS are doing that much in *double* precision mode. Second off, the 38 GFlop number is a simple arithmetic estimate of what the magic chip could do IFF every functional unit on the chip operated at 100% perfect efficiency. Guess what: a real memory controller that could keep the chip fed with data at that rate will use > 3 watts all by itself. This chip won't have a real memory controller though, so you can bet the 38 GFlop performance will remain a nice fairytale instead of a real product.

    --
    AntiFA: An abbreviation for Anti First Amendment.
    1. Re:Those performance numbers are BS by godrik · · Score: 3, Insightful

      Indeed, high gigaflops is easy, useful high gigaflops is hard. You can easily build a processor that only supports float addition and nothing else, with a 1024-bit SIMD register clocked at 4 GHz. And voila, you get 128 GFLOP/s per core. Problem is: it is useless.

      The question is not how many adds or muls you can do per second in an ideal application for your architecture. The question is how many adds or muls (or whatever you need to measure) you can do per second on a real application.

      For instance, the Top500 uses LINPACK, which measures how fast one can solve a dense system of linear equations. That problem is only of interest to a small number of people.
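
The add-only processor above is a paper calculation, and the same arithmetic can be written out as a sketch. The function name and the assumption of 32-bit floats are mine; the second figure simply back-solves what the summary's 38 GFLOPS would imply per core per cycle if it were a pure peak number.

```python
def peak_gflops(simd_bits: int, element_bits: int, clock_ghz: float,
                ops_per_lane_per_cycle: int = 1) -> float:
    """Theoretical peak: every SIMD lane busy on every cycle."""
    lanes = simd_bits // element_bits
    return lanes * ops_per_lane_per_cycle * clock_ghz


# The commenter's hypothetical core: 1024-bit SIMD, 32-bit floats, 4 GHz, adds only.
add_only_core = peak_gflops(1024, 32, 4.0)

# Assumption: 38 GFLOPS is a peak figure for 8 cores at 0.8 GHz.
implied_flops_per_core_cycle = 38.0 / (8 * 0.8)

print(add_only_core, implied_flops_per_core_cycle)
```

Neither number says anything about sustained throughput on a real workload, which is the distinction the comment is drawing.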

    2. Re:Those performance numbers are BS by CajunArson · · Score: 4, Insightful

      unless you consider 1333mhz 32-bit DDR3 not to be a real memory controller?

      Thanks for filling in that detail since I didn't know the precise specs (and for proving me right). To reiterate: No, this thing does not have a real memory controller compared to the 128 bit (2 channel 64-bit) or 192 bit (3 channel 64-bit) memory controllers in the AMD and Intel chips, respectively, that are mentioned in TFS.
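
The bus-width argument can be put into rough numbers. This sketch assumes "1333mhz" means 1333 million transfers per second and, purely to isolate the effect of width, applies that same rate to the 128-bit and 192-bit cases; the thread does not state the actual memory speeds of the AMD and Intel parts.

```python
def peak_bandwidth_gbs(mega_transfers_per_s: float, bus_bits: int) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_bits / 8
    return mega_transfers_per_s * 1e6 * bytes_per_transfer / 1e9


RATE_MTS = 1333.0    # assumption: same transfer rate for every bus width
PEAK_GFLOPS = 38.0   # claimed peak from the summary

bandwidth = {bits: peak_bandwidth_gbs(RATE_MTS, bits) for bits in (32, 128, 192)}

for bits, gbs in bandwidth.items():
    # FLOPs that must be performed per byte fetched before memory stops
    # being the bottleneck at the claimed peak rate.
    flops_per_byte = PEAK_GFLOPS / gbs
    print(f"{bits:3d}-bit: {gbs:6.2f} GB/s, {flops_per_byte:4.2f} FLOPs/byte needed")
```

Under these assumptions the 32-bit interface peaks at about 5.3 GB/s, so code would need to do roughly 7 floating-point operations for every byte it reads to stay near 38 GFLOPS.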

      You can go on and on about some busy-loop that you were able to code that gets all those gigaflops. I can get a 386 to tell me the result of 100 quadrillion quad-precision add-muls where the only operands are zero in less than a second too.. but it isn't useful work.

      Trust me, if a chip even remotely like the one you are describing could do all that useful computational work in less than 3 watts using a previous generation process, then it would already have been deployed in supercomputers years ago and this wouldn't be some pie in the sky FSF project.

      I have no problem with a hobby project to build a CPU with an open architecture, but frankly hyperbole and outright dishonesty about performance expectations are not doing you or anyone else in the project any favors. Being "open" should include being honest & realistic first and foremost.

      --
      AntiFA: An abbreviation for Anti First Amendment.
    3. Re:Those performance numbers are BS by AdamHaun · · Score: 5, Interesting

      Forget the performance numbers, the whole thing is bullshit:

      * The proposal is dated December 2, 2012 for an advanced kitchen sink SoC with silicon in July 2013? Really?

      * Their never released to market CPU design that beats an ARM on one video decoding benchmark is ready to go, except they need to move it to a new process, double the number of cores, and speed it up by 30%. Trivial, I'm sure.

      * This bit here:

      What's the next step?

      Find investors! We need to move quickly: there's an opportunity to hit
      Christmas sales if the processor is ready by July 2013. This should be
      possible to achieve if the engineers start NOW (because the design's
      already done and proven: it's a matter of bolting on the modern interfaces,
      compiling for FPGA to make sure it works, then running verification etc.
      No actual "design" work is needed).

      The design is done! They just have to, you know, grab their perfectly-working peripheral IPs from unstated sources, "bolt them on" to their heavily-modified CPU, and then compile for FPGA. And maybe some timing simulations for their new 40nm process, but I'm sure that won't turn up any problems. And "verification, etc." (aka the part where you actually make it work). And fixing any problems found in silicon. But no *actual* design work is needed.

      I have spent the last three months in my day job on a team of a dozen people writing design verification test cases for a new SoC. Fuck you for talking like that's nothing.

      * They're going to hit "Christmas sales"? So despite being a real honest for-profit multi-million-selling product, we swear, they're still targeting a consumer shopping season. Hint: you want your chip to go into other products. Products sold at Christmas time are designed long before Christmas. Probably more than six months before, i.e. July 2013. Oops.

      * No mention of post-silicon testing, reliability studies, or even whether they've got a test facility lined up, or what kind of resources they need for long-term support. I said it when OpenCores pulled this crap, and I'll say it again. Hardware is not software. You have to think about this stuff. Yield and reliability are what determine whether other companies buy your stuff and whether you make money from it.

      Let me offer some advice to anyone who wants to change the semiconductor world overnight with the magic of open source: start small. Really small. Even Linus Torvalds didn't start out planning to conquer the world. Maybe you could start by trying to get open source IP blocks into commercial products. Once there's a bench of solid, field-tested designs, *then* we can talk about funding an attempt to put it all together. But coming out of nowhere and asking for $10 million is not the way to start. Just ask OpenCores -- their big donation drive got them a grand total of $20 thousand.

      --
      Visit the
    4. Re:Those performance numbers are BS by AdamHaun · · Score: 5, Insightful

      pay attention 007: we're aiming for mid-2013

      Yes, that's what I said:

      * The proposal is dated December 2, 2012 for an advanced kitchen sink SoC with silicon in July 2013? Really?

      Perhaps my phrasing was unclear. I am skeptical of a six-month development process.

      also, bear in mind: the core design's already proven.

      By who? To what specs (temperature, voltage, operating life)? Using what methodology?

      mid-2013, whilst pretty aggressive, is doable *SO LONG AS* we *DO NOT* do any "design" work. just building-blocks, stack them together, run the verification tools, run it in FPGAs to check it works, run the verification tools again... etc. etc.

      You know you can't go straight from RTL to silicon, right? You need timing sims and physical layout. Those are not trivial and they cannot be totally automated.

      the teams we're working with know what they're doing. me? i have no clue, and am quite happy not knowing: this is waaay beyond my expertise level and time to learn.

      Okay, here's the part that confuses me. You came up with an idea, talked to other people with expertise about doing it, and it sounds like you know who's working on it. All of that is fine. What I don't understand is why you are acting as the leader/spokesman for a project you know almost nothing about. Who are these other groups? The link at the bottom of your proposal is to a no-name Chinese semiconductor company that formed last year and has no products listed. Are they doing the RTL, layout, and verification? Who's doing the silicon testing? What foundry will you use?

      The reason I'm being so harsh here is because you're asking for a lot of money with very little credibility. There is nothing in your proposal, your CV, or your comments to suggest that you are competent to work on a project like this. So who's doing the work? Why aren't their names on the proposal? Who has the experience and leadership to make sure the project actually gets done? Why are you "quite happy not knowing" what they're doing when you're the one trying to secure funding?

      If you come back here in 2013 with a working chip I'll be the first to apologize, but right now I see very little reason to take this seriously.

      --
      Visit the
    5. Re:Those performance numbers are BS by CajunArson · · Score: 4, Insightful

      First of all: Lots of non-x86 high-performance computers have similar memory controller layouts. Look at high-end SPARC or Power architecture systems.

      Second of all: Thanks for proving me right with your screed about how ARM chips don't have good memory controllers. Guess what: you're right! They don't! And guess what: The Cortex-A15 is the first ARM chip capable of beating a 4-year-old Atom when clocked north of 1.5 GHz! So that's the type of performance that even the supposedly miraculous ARM gets with its architecture and a similar memory controller! You are now claiming to be insanely smarter than everyone at ARM and Intel simultaneously... if chips could be designed and built based solely on arrogance & ego, you'd put ARM & Intel out of business by next Tuesday.

      So basically you have been trolling this thread calling everybody who has pointed out flaws in the grandiose promises that you have put forth "007" in a smarmy and condescending manner while presenting zero facts to back up your arguments and contradicting yourself at every turn.

      From your annoying and repetitive use of "007", do you perchance speak with a British accent? Do you appear in infomercials at 2AM pushing whatever fake product of the day some insomniac can buy for $19.95? Because that's exactly how you come across in these discussions, and if you actually are associated with this project and aren't just a troll then I'd highly recommend that the FSF immediately disavow this project before they end up getting sued when you make off with somebody's money.

      --
      AntiFA: An abbreviation for Anti First Amendment.
    6. Re:Those performance numbers are BS by LordLimecat · · Score: 3, Insightful

      well, tell you what, rather than accusing, why don't you ask me to ask them

      It's not a matter of asking. If someone could match even a 2-gen old i7 design on 3 watts, they would have done so by now, undercut Intel, and made zillions. They can't, because Intel processors are really good and their R&D budget dwarfs the budget of most US states, not to mention they own their own fabs and are 1-2 generations ahead of literally everyone else in process scale.

      Even without deep technical knowledge, it doesn't pass the smell test.

    7. Re:Those performance numbers are BS by LordLimecat · · Score: 2

      it's clear that you're used to the x86 world

      And there aren't any processors AFAIK outside of the x86 / x64 world that can match Intel and AMD designs in raw performance-per-watt. Trying to claim otherwise is dishonest, and as parent mentioned, if it were true the top supercomputers wouldn't be wasting their time on Intel and AMD parts.

    8. Re:Those performance numbers are BS by AdamHaun · · Score: 2

      Thanks for the info! I had a feeling EOMA-68 was nonsense too, but I stopped reading after discovering that A) his first big hardware project was developing an "industry standard", and B) they had to change the name from EOMA/PCMCIA because it wasn't actually compatible with PCMCIA.

      The only thing I might be inclined to worry about is the possibility that he might sucker gullible people into donating to his obviously doomed project. (I'm not quite cynical enough to believe he's a scammer, but intent doesn't matter when the money's been flushed and donors can't ever get it back.)

      Yeah, that was why I commented in the first place. There are too many overly optimistic software people here to let this sort of thing slide.

      p.s. I also work for a fabless semi company. HATE YOU if you work for a direct competitor. (okay, not really ;)

      Fabless, heck, I work for TI! We have plenty of fabs. Although we like foundries too. Everyone likes foundries these days since process development is insanely expensive. I spent the last five years doing product engineering and embedded flash process development/testing, and recently moved on to applications engineering. I am intimately familiar with how much work it takes to do the stuff that these proposals gloss over, and become very annoyed when it is not taken seriously. :-)

      --
      Visit the
    9. Re:Those performance numbers are BS by n7ytd · · Score: 2

      I am intimately familiar with how much work it takes to do the stuff that these proposals gloss over, and become very annoyed when it is not taken seriously. :-)

      Seems like the GP belongs to the "I don't understand what that guy does, so it must be easy" crowd.

  8. Re:And no proprietary software either by lkcl · · Score: 4, Informative

    If this processor is going to be designed and licensed under GPLv3 - I guess one won't be able to build any license-compatible proprietary software for it either. Curious - but count me out :)

    ah interesting. no, it wouldn't be. i believe there are two separate misunderstandings here.

    first: i did actually look some time ago at LEONv..... v2 i think it is, which is LGPL licensed i think by Gaisler Research but the amount of work needed to turn it into a modern GPU/VPU-competitive processor would be too costly. then there is the stuff on http://opencores.org/ but it's not really ready for prime-time - i've been keeping an eye on the projects there for quite some time [none of them are SMP capable for example]

    instead, i kept hunting, spoke to Tensilica about their core (which is superb btw!), talked to Synopsys about their core (ARC), and even came up with a way to do software-interrupt-driven SMP (yes i ran it by alan cox on LKML!). when this current design popped up, and i saw both its capabilities and that they are willing to respect the GPL regarding the toolchain, i jumped at the chance.

    second misunderstanding is over design of *hardware* impacting what *software* it can run. it would be necessary to have a modified version of the GPL, stating "all and any software programs running on this hardware *must* be GPL licensed". the impact that this would have would be extremely problematic, as well as being rather fascist and not in the spirit of free software at all.... and, also, as it would be a modified version of the GPL, it wouldn't *be* the GPL, so could not be FSF-Endorsed.

    with that as background, to answer the question directly: this is a proprietary design just like all other proprietary designs, using off-the-shelf completed and *tested* hard macros (including the core processor itself albeit only under the MVP Programme), where there is no restriction of any kind on the software that can be run on that processor, be it free software or proprietary software.

    anyone can play, in other words.

  9. HDMI / Licensing by lobiusmoop · · Score: 2

    I know Allwinner did a separate version of their A10 chip without HDMI (A13) to avoid heavy licensing costs, would the HDMI push the cost of the chip up much?

    --
    "I bless every day that I continue to live, for every day is pure profit."
    1. Re:HDMI / Licensing by lkcl · · Score: 2

      would the HDMI push the cost of the chip up much?

      I doubt very much that the people who control the HDMI spec would allow an EFF-endorsed CPU to do this anyway -- the EFF has no interest in enforcing DRM, and HDCP pretty much requires you implement it end to end.

      I'm not sure you could reconcile those two views.

      funny you should mention this. i raised it with Dr Stallman because the same sort of thing occurred to me: why support DRM?? well... his answer was: the DRM in HDMI is so utterly broken that it's as if it didn't matter. therefore, he's okay with it.

      which i find absolutely hilarious. DRM is okay, as long as the keys are available, one way or the other [thus making the DRM irrelevant, one way or the other]. this is primarily what the fuss over the GPLv3 is about, because of the endemic tivoisation that occurred a few years ago [and is still ongoing].

  10. Random number generator by WaffleMonster · · Score: 2

    I want a REAL cryptographic quality random number generator based on thermal noise or some other quantum mumbo jumbo.

    https://www.eff.org/rng-bug

    Let's at least make the spooks have to work for a living :)

  11. Vaporware? by WoOS · · Score: 4, Interesting

    From TFA:

    >The deadline:
    > July 2013 for first mass-produced silicon
    >
    >The cost:
    > $USD 10 million

    This poster has either no idea or is dreaming. In 6 months he will not have an SoC through potentially several tape-outs, having first done System Engineering, Design, Synthesis, Layout, Verification, Validation, Documentation, ... and seemingly all without an existing organization. Or are SoC manufacturers lately doing short-term build-to-order processors? And the 10 million are not going to cover the necessary cost for all of the above. The masks alone might be that expensive depending on the number of tape-outs necessary (which - without an existing organization and working design flow - will be a lot).

    1. Re:Vaporware? by lkcl · · Score: 3, Informative

      From TFA:

      >The deadline:
      > July 2013 for first mass-produced silicon
      >
      >The cost:
      > $USD 10 million

      This poster has either no idea or is dreaming.

      both. i have no clue - that's why i posted this article online, as a way to solicit input and to double-check things - and i'm dreaming of success.

      In 6 months he will not have an SoC through potentially several tape-outs, having first done System Engineering, Design, Synthesis, Layout, Verification, Validation,

      what i haven't mentioned is that one of my associates (my mentor) used to work for LSI Logic, and he later went on to be Samsung's global head of R&D. he knows the ropes - i don't. we've been in constant communication, and also in touch with some people that he knows - long story but we have access to some of the best people who *have* done this sort of thing.

      Documentation,

      ahh, my old enemy: Documentation. [kung fu panda quote. sorry...] - yes, this is probably going to lag. at least there will be source code which we know already works. not having complete documentation has worked out quite well for the Allwinner A10 SoC, wouldn't you agree?

      also, because this is going to be a Rhombus Tech Project, the CPU will *not* be available for sale separately. it will *ONLY* be available as an EOMA-68 module. no arguments over the hardware design. no *need* to do complex hardware designs. the EVB Board will *be* the "Production Unit" - just in a case, instead.

      so by deploying that strategy, Documentation is minimised. heck, most factories in China have absolutely no clue what they're making. it might as well be shoes or handbags, for all they know. heck, many of the factories we've seen actually *make* shoes and handbags, and their owners have gone "i know, let's diversify, let's make tablets". you think they care about Documentation? :) ... ok, i know what you mean.

      ... and seemingly all without an existing organization.

      yeah. it's amazing what you can do if you're prepared to say "i don't know what i'm doing" and ask other people for help rather than try to keep everything secret, controlled and "in-house". my associates are tearing their hair out, i can tell you :)

      Or are SoC manufacturers lately doing short-term build-to-order processors? And the 10 million are not going to cover the necessary cost for all of the above. The masks alone might be that expensive depending on the number of tape-outs necessary (which - without an existing organization and working design flow - will be a lot).

      well, because i know nothing, i've asked people who do know and have a lot of experience. the procedure we'll be following is to get an independent 3rd party - one that partners with the foundry - and get them to do the verification, even if the designers themselves have run the exact same tools. if it then goes wrong, we can tell them to fix it... *without* the extra cost of another set of masks. a kind of insurance, if you will.

      but the other thing we are doing is: there will be *no* additional "design". it's a building-block exercise. the existing design is already proven in 65nm under the MVP Programme: USB-OTG works, DDR3/1333mhz works, RGB/TTL works, the core works, PWM works, I2S works, SD/MMC works and so on. all we're doing is asking them to dial up the macros to put down a few more cores, and surround it with additional well-proven hard macros (HDMI, USB3, SATA-II).

      does that sound like a strategy which would, in your opinion, minimise the costs and increase the chances of first time success?

    2. Re:Vaporware? by WoOS · · Score: 2

      > Yes, this is probably going to lag. at least there will be source code which we know already works.
      > not having complete documentation has worked out quite well for the Allwinner A10 SoC, wouldn't you agree?

      I don't know the A10 with the euphemistic name, but the typical SoC MCU I do know has documentation running to thousands of pages. Most of it covers internal blocks, not external connections, which might see a reduced need if you deliver it only on a board, although then you need to document the board.
      An SoC MCU is not a PC CPU. It has lots of internal (I/O) modules which all need documentation. And that documentation normally 'ripens' while select customers get engineering samples of the MCU and, for the privilege of getting them, have the fun of suffering through and reporting all the inconsistent or incomprehensible parts which get into the documentation because it is just a bunch of individual module descriptions forced together.

      > it's a building-block exercise. the existing design is already proven in 65nm under the MVP Programme ...
      > all we're doing is asking them to dial up the macros to put down a few more cores, and surround it with additional well-proven hard macros
      If I understood you correctly, you want to shrink it to 40nm. Then there is no proven design, as a shrink normally means a new library.
      Also you should have your mentor at Samsung put you in contact with one of their SoC design leads and have him tell you how 'easy' it is to just "dial a few macros" and connect them. Any new thing you add to an existing SoC has the chance of causing ripple effects, be it problems with your bus architecture (e.g. not enough ports on your bus for the new cores), a larger power supply (internal to external, linear to switching), timing violations because the die size grew, and so on.

      At the risk of doing a Bill Gates: Open Source SW is useful because every halfway intelligent person can extend it and make use of it within a few days of installing a development environment. Open Source semiconductor designs on the other hand are not, because the market access barriers in that area are not the knowledge of the design but (the cost of) the technology and the people needed to execute it and make something useful of it. Until nanotechnology delivers there is no "Brew your own core in the backyard".

  12. free formats by vlm · · Score: 2

    hardware support for free formats, as opposed to non-free?

    --
    "Science flies us to the moon. Religion flies us into buildings." - Victor Stenger
  13. Also by Sycraft-fu · · Score: 4, Insightful

    Compare it to a more modern processor. You want floating point performance? Take a look at a Sandy/Ivy Bridge. My 2600k, which I have set to run at 4GHz, gets about 90GFlops in Linpack. The reason is Intel's new AVX extension, which really is something to write home about for DP FP. Ivy Bridge is supposedly a bit more efficient per clock (I don't have one handy to test).

    If you are bringing out a processor at some point in the future, you need to compare to the latest products your competitors have, since that is realistically what you face. You can't look at something two generations old, as the 920 is, and say "Well we compete well with that!" because if I'm looking at buying your new product, it is competing against other new products.
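For a rough sanity check of the numbers in this thread, here is a minimal sketch of the usual peak-FLOPS arithmetic (cores x clock x FLOPs per cycle per core). The figure of 8 double-precision FLOPs per cycle per core for Sandy Bridge with AVX (one 4-wide add plus one 4-wide multiply per cycle) and the assumption that the 2600k runs all 4 cores at 4GHz are mine, not the poster's; the summary also does not say whether its 38 GFLOPS is single or double precision, so the last figure is only an implied ratio.

```python
def peak_gflops(cores: int, clock_ghz: float, flops_per_cycle: float) -> float:
    """Theoretical peak GFLOPS: cores x clock (GHz) x FLOPs per cycle per core."""
    return cores * clock_ghz * flops_per_cycle


# Assumption: Sandy Bridge with AVX retires 8 DP FLOPs per cycle per core.
sandy_peak = peak_gflops(cores=4, clock_ghz=4.0, flops_per_cycle=8)

# Measured Linpack figure quoted in the comment above.
linpack_measured = 90.0
linpack_efficiency = linpack_measured / sandy_peak

# The summary's claim: 38 GFLOPS from 8 cores at 800 MHz.
# Dividing out cores and clock gives the FLOPs per cycle per core it implies.
implied_flops_per_cycle = 38.0 / (8 * 0.8)

if __name__ == "__main__":
    print(f"Sandy Bridge peak (assumed): {sandy_peak:.0f} GFLOPS")
    print(f"Linpack efficiency: {linpack_efficiency:.0%}")
    print(f"Implied FLOPs/cycle/core for the proposed chip: {implied_flops_per_cycle:.2f}")
```

Under those assumptions the 90 GFLOPS Linpack result is about 70% of theoretical peak, which is a reminder that a peak figure like 38 GFLOPS and a measured figure are not directly comparable.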

    1. Re:Also by PhrostyMcByte · · Score: 2

      The summary is building expectations so much that I can't help feeling this is a massive flop (yup, I did that) waiting to happen.

      I'd be really impressed if they did match the performance of the 920, even if it'll probably be somewhere between 5-10 years old by the time this Free CPU sees production and gets into consumer hands. That's quite a complex, performant CPU right there to match. But the summary has so many holes, I really have a hard time believing they'll get anywhere near the 920 for general-purpose computing.

      Compare it to a modern processor? Intel's Haswell architecture, coming out mid-2013, has theoretical performance of 973 GFLOPS of single-precision and 486 GFLOPS of double-precision. And those numbers don't include the on-die GPU performing work in OpenCL simultaneously.

      I'm all for a Free CPU design and really wish them well, but naming names and saying they'll be comparable opens them up to this kind of skepticism fair and square.

  14. Requirements by gr8_phk · · Score: 2

    Off the top of my head:

    0) A proper MMU and at least 1Meg of cache
    1) 64bit - If not, there will be a need for yet another version at some point. Just do this.
    2) Double precision floating point in hardware (for + - * / and preferably rsqrt)
    3) GCC support.
    4) LLVM support
    5) LLVM-Pipe for OpenGL support
    6) It would be nice if some instructions were optimized for running virtual machines.

    I haven't looked into what makes sense for #6, but with all the VMs around it would be nice to have them run efficiently.

    1. Re:Requirements by lkcl · · Score: 2

      Off the top of my head:
       

      always the best way :)

      0) A proper MMU and at least 1Meg of cache

      it's got 64k I & D 1st level, yes to the proper MMU, and the dual-core version has 256k 2nd-level (just enough). they reckon for 8-core that'll have to be increased.

      1) 64bit - If not, there will be a need for yet another version at some point. Just do this.

      yes. wellll aware of this :) have to be scheduled for the next version unfortunately.

      2) Double precision floating point in hardware (for + - * / and preferably rsqrt)

      it must have. i'll ask though.

      3) GCC support.

      ah no. this design is too different for gcc to handle. their compiler expert - someone with over 15 continuous years expertise in compiler design - chose open64 instead (which used gcc's front-end at some point, and so the whole compiler chain is *entirely* GPLv2 licensed).

      4) LLVM support

      don't know! good question.

      5) LLVM-Pipe for OpenGL support

      shouldn't need it... he said. i'll have to ask

      6) It would be nice if some instructions were optimized for running virtual machines.

      good point! i'll ask. in the mean-time (and esp. if it's not), i recently looked at LXC. replaced a set of 5 XEN instances in about 3 hours flat. first one was a bit hairy, the rest were almost a cut/paste by-rote job. it's going well. thoroughly recommend it.

      I haven't looked into what makes sense for #6, but with all the VMs around it would be nice to have them run efficiently.

  15. free fabrication process by Dishwasha · · Score: 2

    So will this 100% free processor follow a 100% free fabrication process? What is the use in being worried about dependencies on proprietary vendors' architectures in order to support 3rd, 4th generation processors when the ability to replace 3rd, 4th generation processors with an equivalent part requires production through a proprietary vendor manufacturing processes?

    1. Re:free fabrication process by BitZtream · · Score: 2

      I made it this far down the page before saying it, but I can't hold back any more.

      You have absolutely no clue what you're doing and because of that, if you're leading this project, I doubt any of it exists.

      You're trying to sell vaporware.

      Go sod off you damn troll.

      --
      Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
    2. Re:free fabrication process by lkcl · · Score: 3, Insightful

      I made it this far down the page before saying it, but I can't hold back any more.

      You have absolutely no clue what you're doing

      that's right - i don't. that's why i'm asking peoples' input.

      and because of that, if you're leading this project, I doubt any of it exists.

      that's right: it doesn't. the idea is to get it made, with as little risk as possible, using building blocks that have been proven as much as is possible.

      anything that's in the "planning" phase doesn't exist until it actually exists. what's wrong with that? if everyone followed the line you're proposing, nobody would ever make anything, would they?

  16. Re:x86 - NOT!!!!! by VortexCortex · · Score: 4, Funny

    All I need is a free Posix OS (probably linux) and a C/C++ compiler.

    You also need a text editor for your hosts file!

    Fools. Both of you. A text editor and C compiler are required by POSIX.

  17. Re:Feature Requests, Now that you asked by lkcl · · Score: 5, Informative

    "So have at it: if given carte blanche, what interfaces and what features would you like an FSF-Endorseable mass-volume processor to have?"

    thank you for taking me literally! really appreciated!

    Standard size chip socket, with adapter springs and guides for using off the shelf cooling implements (like zalman fans, and watercooling), for other CPUs.

    ah. this is going to be a 15mm x 15mm BGA with only around 320 pins. it's tiny. ok, that might have to be revisited now that i thought about doing an 8-core monster - 3 watts in a 15 x 15mm package is hellishly hot.
    i'm still debating whether it should have dual 32-bit DDR3 lanes. even so, that only adds an extra... 75 or so pins, bringing it up maybe to 19 x 19 mm.

    need PCI and PCI express, preferably at least 24 lanes, hopefully as many as 48 lanes.

    ahhh... PCI express is a bug-bear. that many lanes would, on their own, turn this into a 12 to 30 watt part: right now we're aiming for a different market. i'm happy to be steered in a different direction if it can be shown that it's a genuinely good idea, with a high chance of return on investment.

    Behind this, fast northside/southside buses to keep up with the following. I think AMD open sourced hypertransport, so front side bussing should not be an issue.

    ah this is an embedded processor: they don't have northbridge/southbridge buses [at all]. those are reserved for CPUs at the 10+ watt market.

    If you're still mulling over the instruction set, a built-in crypto processing chip would ROCK. Implement Intel's AES-NI or something similar, plus more for twofish, serpent, and other fairly mainstream, modern, unbroken Free/Open encryption algorithms. Then add hash instructions for the entire SHA family of hashes, MD6, whirlpool, tiger, RIPEMD, and GOST

    ok - this is a general-purpose processor that *happens* to have been designed to be capable of doing a GPU and a VPU's job. hmmm... i wonder whether their instruction set can do crypto primitives.. hmmm.... yeah, that's a great question to ask. i'll get back to you on that.

    GOOD USB 3 support, with legacy support for 1 and 2. Not only do I want some ports on the back, I want at least 3-4 banks of header pins on a theoretical motherboard for front panel devices and ports. They should be USB 1,2,3. Solid high speed memory controller at a premium.

    definitely going to have 1x USB-OTG, probably 2x USB2-HOST, and at least one USB-3.

    Universal SATA support for revisions 1, 2 and 3 (1.5 Gb/s, 3.0 Gb/s and 6.0 Gb/s respectively), built in RAID controller. eSATA would help too.

    i'm reluctant to push this IC towards 6gb/sec - it'd be by far and above the fastest bit of I/O on the chip. RAID i'd be concerned about pushing up the cost for the mass-volume uses [which wouldn't use it]. eSATA is _great_. i'd forgotten about that.
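To put the SATA line rates being discussed into payload terms, here is a minimal sketch. It relies on SATA using 8b/10b encoding on the wire, so every 10 line bits carry one data byte; the function name is mine, and the result is an upper bound that ignores framing and protocol overhead.

```python
def sata_payload_mb_per_s(line_rate_gbps: float) -> float:
    """Upper-bound payload rate in MB/s for a SATA link.

    With 8b/10b encoding, 10 bits on the wire carry 8 data bits (1 byte),
    so payload bytes/s = line bits/s / 10.
    """
    line_bits_per_s = line_rate_gbps * 1e9
    return line_bits_per_s / 10 / 1e6


if __name__ == "__main__":
    for revision, rate in (("SATA 1", 1.5), ("SATA 2", 3.0), ("SATA 3", 6.0)):
        print(f"{revision}: {rate} Gb/s on the wire, "
              f"at most {sata_payload_mb_per_s(rate):.0f} MB/s of payload")
```

So the jump from SATA-II to 6 Gb/s is a jump from at most 300 MB/s to at most 600 MB/s of payload on that one interface.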

    scalable audio chipset capable of up to 8.1 surround, Stereo input, S/PDIF and all the other great audio features.

    SPDIF - i'd not *entirely* forgotten about that - will remember to make a mental note. audio i would like to rely on the processor itself for that sort of thing (for basic audio - headphones and the like), otherwise handing off to a standard I2S/AC97 audio IC for cases where people really want more complex audio. there are 3 I2S interfaces i think.

    so, yeah - i want audio to be done more like the TI McBSP. DMA-driven, but use the main processor for audio handling. keep it simple.

    DDR3 RAM, or something comparable.

    already done. 1333mhz. bit concerned personally about the power consumption of 1333mhz, i know that 800mhz is about 0.3 watts for example: 1333mhz is starting to get to 1 maybe 1.5 watts all on its own!
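As a rough guide to what those memory speeds buy, here is a minimal sketch of peak DDR bandwidth. It assumes "1333mhz" and "800mhz" mean DDR3-1333 and DDR3-800, i.e. 1333 and 800 million transfers per second, and uses the 32-bit lane width mentioned earlier in this comment; the function name is mine, and real sustained bandwidth will be lower than these peaks.

```python
def ddr_peak_gb_per_s(mega_transfers_per_s: float,
                      bus_width_bits: int,
                      channels: int = 1) -> float:
    """Theoretical peak bandwidth in GB/s: transfers/s x bytes per transfer x channels."""
    bytes_per_transfer = bus_width_bits / 8
    return mega_transfers_per_s * 1e6 * bytes_per_transfer * channels / 1e9


if __name__ == "__main__":
    # Assumption: one or two 32-bit DDR3 lanes, as debated above.
    for speed in (800, 1333):
        single = ddr_peak_gb_per_s(speed, 32)
        dual = ddr_peak_gb_per_s(speed, 32, channels=2)
        print(f"DDR3-{speed}: {single:.2f} GB/s single lane, {dual:.2f} GB/s dual lane")
```

On those assumptions a single 32-bit lane peaks at about 3.2 GB/s at 800 and about 5.3 GB/s at 1333, which is the bandwidth being traded against the extra power.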

    Unlocked bootloader with firmware m

  18. Re:Feature Requests, Now that you asked by lkcl · · Score: 2

    split the graphics chipset into another PCI-E board, and sell it separately, that works with x86.

    in x86-land, yes. in ARM-land, yes. MIPS, funnily enough no: look up MIPS64-ASE-3D. ingenic jz4760 and below: no (look up X-Burst).

    this chip is more like MIPS-with-3D-ASE, or Ingenic-with-XBurst. you *can't* separate the GPU from the CPU: they're one and the same. ok, you could... but you'd end up with two identical processors connected by some sort of fast bus... why bother? why not just double the number of cores?

  19. Re:x86 - NOT!!!!! by lkcl · · Score: 2, Informative

    So you're volunteering to write the compiler, right?

    the team's done it already.

    And porting Linux to a completely new architecture?

    and that, too. and android on top of that.

    relax - it's been taken care of. come on - think, 007. would i *really* have put this up as a proposal if the compiler and the linux port hadn't already been done? doh! :)

  20. Re:And no proprietary software either by lkcl · · Score: 3, Informative

    Hmm, one problem I have with the full GPL is that it *is* by design rather intent on spreading itself virally and to the exclusion of other legitimate models, and thus a restriction on what software the hardware would be allowed to run would be unfortunately in keeping with the GPL.

    you are absolutely, absolutely dead wrong. waaayyyy off base.

    I agree that that would be excessive, but then I think that the full GPL is generally excessive.

    You may guess that I prefer to license my stuff under BSD licences to allow fully commercial uses. B^>

    Rgds

    Damon

    and how's that working out in the android community? you've seen the list of GPL violations as people mistake "android equals linux", yeah? it's a serious problem, and it's why i started the whole rhombus-tech initiative: to get free software developers involved right from the beginning in the mass-volume industry, right the way through to sales in hypermarket retail stores. the "dream" if you will is for free software people to be able to walk into a supermarket and go "fuckin'A! i helped write the software for that! you wanna buy one of these, grandma, i can replace the OS in no time, with something that i can manage remotely for you".

    you have to remember that the BSD license was designed and written at a time when everyone trusted (because they knew them personally) everyone else in the industry. *everyone* shared source code. then fuckers like apple came along and went "thank you very much. BYE". at one point, microsoft's NT Team took the TCP/IP BSD-licensed stack, and put it directly into MSRPC (because winsock was so shit). it's almost 20 years later that Wine have finally reverse-engineered MSRPC. i really don't understand people who don't understand why the GPL is so necessary, i really don't.

  21. Re:x86 by jones_supa · · Score: 2

    If you are using proprietary code on an open source OS (eg. Flash on Linux) then x86 code execution would be nice, but in the long run - the ability (and documentation) to create an open source flash version has been available for years now. If anything - having a x86 proprietary Linux flash player has hampered development of FOSS flash implementations as there is less of a perceived need for it.

    Quartus, Maya, MATLAB... There are many more closed-source packages than just Flash. And all the upcoming Steam games for Linux will be binary x86.

  22. Re:x86 by Anonymous Coward · · Score: 2, Insightful

    But with various advances and lessons learned in chip and PC architecture, it makes sense that an 800MHz processor could take on a 3GHz processor and kick its butt.

    No, it doesn't actually. Not when the 3GHz processor is one of the world's most efficient (in terms of realized computational power per clock cycle) designs in existence. Not when the effort required to reach that pinnacle of complexity and performance is... astonishing. (Think: 5 year long design projects.)

    Each discrete processor instruction in x86 land still takes several clock cycles to execute. (I know, pipelining and multiple instructions are being processed at all times assembly line style so the effective instructions per cycle is different.)

    You don't have any clue anyways. Throughput on common x86 instructions often exceeds 1 per cycle on modern advanced x86 core designs. It hardly matters that the start-to-finish latency for individual instructions is many cycles due to pipelining; without pipelines no design could operate even as fast as 800 MHz.
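The latency-versus-throughput point can be made concrete with a minimal sketch of an idealised pipeline. The depth of 14 stages and the issue width of 4 are illustrative assumptions, not figures for any particular x86 core, and the model ignores stalls, cache misses and branch mispredicts.

```python
import math


def pipeline_cycles(n_instructions: int, depth: int, issue_width: int = 1) -> int:
    """Cycles for an ideal pipeline to finish n instructions.

    The first group needs `depth` cycles to travel through the pipeline;
    every later group of `issue_width` instructions completes one cycle
    after the previous one.
    """
    if n_instructions == 0:
        return 0
    groups = math.ceil(n_instructions / issue_width)
    return depth + groups - 1


def instructions_per_cycle(n_instructions: int, depth: int, issue_width: int = 1) -> float:
    """Effective throughput of the ideal pipeline."""
    return n_instructions / pipeline_cycles(n_instructions, depth, issue_width)


if __name__ == "__main__":
    depth = 14  # hypothetical pipeline depth
    print("latency of one instruction:", pipeline_cycles(1, depth), "cycles")
    print("scalar IPC over 1000 instructions:",
          round(instructions_per_cycle(1000, depth), 3))
    print("4-wide IPC over 1000 instructions:",
          round(instructions_per_cycle(1000, depth, issue_width=4), 3))
```

Each instruction still takes 14 cycles from start to finish in this model, yet throughput approaches the issue width once the pipeline is full, which is why per-instruction latency says little about sustained performance.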

    But if you combine current technology and design it from the ground up to do the kinds of things we do today, it would make sense that it would use less power and fewer cycles per instruction.

    This only makes "sense" if you're an armchair theorizer who has no real insight into the problem space. In reality, the worst problem with x86 as of x86-64 is that its instruction encoding is somewhat difficult to efficiently decode, due to the variable length instructions. A nicer encoding (even one with variable length, so long as it made it easy to detect total length from the first 2 bytes instead of having to scan the whole instruction like you do with x86) would basically save some power. But not a lot. There isn't any revolution lurking the way you naively want to believe.

    The reason we aren't doing all that well today is that x86 things are crippled into doing things the x86 way because they are still needed to run x86 software.

    So, while all other things are changing, why not take the opportunity and update the processor, OS and software along with the style of computing? I know Microsoft's answer is to adapt x86 Wintel into other forms. No one wants this other than Microsoft...

    Just how old are you? I'm thinking you can't possibly be very old if you aren't aware that all the way back in 1991 the Apple-IBM-Motorola consortium created PowerPC, which was going to eat the lunch of all of the x86 CPUs by being a clean new architecture designed on scientific RISC principles. Apple replaced 68K with PPC. Motorola ported Windows NT to PPC and touted how in the brave new world people would flock to PPC for far better performance which x86 could never provide.

    It took all of about 5 minutes for the NT-on-PPC plans to flop. There wasn't a real performance advantage there, so no third party software vendors were interested in porting application software, so no users wanted to buy PPC NT machines just so they could wank off about how pure and wonderful their CPUs were. The first PPC Macs shipped in 1994. Apple was reasonably successful with it for a while, because Mac users had little choice but to switch and PPC performance at least generally kept up with x86 for a while, but eventually PPC began falling behind and Apple was forced to switch.

    The problems with x86 weren't fatal in 1991, and they aren't fatal 20 years later either. A clean sheet design could indeed be slightly better, but not radically better.

  23. Confused - "licensing of the modern interfaces"? by daboochmeister · · Score: 2

    I'm confused ... I rtfa'ed, and it says in the project costs section that of the $10m needed, "$1.5m will be for licensing of the modern interfaces such as DDR3, HDMI, SATA-II, USB-3 and RGMII." If those modern interfaces need to be licensed that way, don't they violate FSF endorsability, ipso facto? Or, is it that the licensing terms for such are compatible with e.g. GPLv3?

    --
    "Ahh! I see you're in that indeterminate Schrodinger state where - oh, uh ... never mind." Dave Bucci