Slashdot Mirror


19-Year-Old's Supercomputer Chip Startup Gets DARPA Contract, Funding

An anonymous reader writes: 19-year-old Thomas Sohmers, who launched his own supercomputer chip startup back in March, has won a DARPA contract and funding for his company. Rex Computing is currently finishing up the architecture of its final verified RTL, which is expected to be completed by the end of the year. The new Neo chips will be sampled next year, before moving into full production in mid-2017. The Platform reports: "In addition to the young company’s first round of financing, Rex Computing has also secured close to $100,000 in DARPA funds. The full description can be found midway down this DARPA document under 'Programming New Computers,' and has, according to Sohmers, been instrumental as they start down the verification and early tape out process for the Neo chips. The funding is designed to target the automatic scratch pad memory tools, which, according to Sohmers is the 'difficult part and where this approach might succeed where others have failed is the static compilation analysis technology at runtime.'"

4 of 150 comments

  1. Re:By Neruos by trsohmers · · Score: 4, Interesting

    We actually have very good reasons to say why this is a very different kind of VLIW, and have found the reason why other VLIW chips have had such static scheduling issues. Hope we can convince you and everyone else soon enough.

  2. Re:Not sure whats more impressive... by captnjohnny1618 · · Score: 4, Interesting

    I'm burning some mod points to post this under my username, but it's totally worth it. THIS is the kind of article that should be on Slashdot!

    Can you elaborate on the programming structure/API you guys are envisioning for this? (it's cool if you can't, I'd understand :-D). Also, what particular types of problems are you guys targeting your chips to solve or to what areas do you envision your chips being especially well suited? Also, who do you think has done the best nitty-gritty write up about the project so far? I'd love to hear what you think is the best technical description publicly available. Can't wait to learn more as the project grows.

    Although I'm not a programmer or CS person by training, I do GPGPU programming (although not BLAS-based stuff) almost exclusively for my research and enjoy it because once you understand the differences between the GPU and CPU it just becomes a question of how to best parallelize your algorithm. It'd be AMAZING to see the memory bandwidth and power usage specs you guys are working towards under a programming structure similar to what we currently see with something like CUDA or OpenCL. Any plans for something like that or am I betraying my hobbyist computing status?

    Finally, if you ever need any applications testing, specifically in the medical imaging field, feel free to get in touch. ;-)

  3. I'm a pro in the field. This doesn't scan. by Brannon · · Score: 4, Interesting

    Please explain to me simply how you get 10x in compute efficiency over GPUs--these chips are already fairly optimal at general purpose flops per watt because they run at low voltage and fill up the die with arithmetic.

    GPUs have excellent memory bandwidth to their video RAM (GDDR*), but they have poor IO latency & bandwidth (PCIe limited), which is the main reason they don't scale well.

    We've heard the VLIW "we just need better compilers" line several times before.

    Thus far this sounds like a truly excellent high school science fair project, or a slightly above average college engineering project. It is miles away from passing an industrial smell test.

  4. Re:Not sure whats more impressive... by trsohmers · · Score: 5, Interesting

    1. My personal favorite programming models for our sort of architecture would be PGAS/SPMD style, with the latter being the basis for OpenMP. PGAS gives a lot more power in describing and efficiently using shared memory in an application with multiple memory regions. Since every one of our cores has 128KB of our scratchpad memory, and all of those memories are part of a global flat address space, every core can access any other core's memory as if it is part of one giant contiguous memory region. That does cause some issues with memory protection, but that is a sacrifice you make for this sort of efficiency and power (but we have some plans on how to address that with software... more news on that will be in the future). The other nice programming model we see is the Actor model... so think Erlang, but potentially also some CSP-like stuff with Go in the future (and yes, I do realize they are competing models).
    If you want to get the latest info as it comes out, sign up for our mailing list on our website!
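
    To make the flat address space in point 1 concrete, here is a minimal sketch of how a global address could map onto a core and an offset within that core's scratchpad. The only figure taken from the comment is the 128KB per core. The back-to-back layout, the function names, and the core count are assumptions for illustration, not Rex Computing's actual scheme.

```python
# Illustrative only: the real Neo address layout is not described in this thread.
# Assumption: each core's scratchpad occupies one consecutive 128KB slice
# of a single flat global address space, ordered by core id.
SCRATCHPAD_BYTES = 128 * 1024  # 128KB per core, as stated in the comment


def split_address(global_addr, num_cores):
    """Map a flat global address to (core_id, local_offset)."""
    if not 0 <= global_addr < num_cores * SCRATCHPAD_BYTES:
        raise ValueError("address outside the global scratchpad space")
    return divmod(global_addr, SCRATCHPAD_BYTES)


def join_address(core_id, local_offset):
    """Inverse mapping: (core_id, local_offset) to a flat global address."""
    if not 0 <= local_offset < SCRATCHPAD_BYTES:
        raise ValueError("offset outside a single scratchpad")
    return core_id * SCRATCHPAD_BYTES + local_offset
```

    Under this assumed layout, a core reading "someone else's" memory is just an ordinary load from an address whose upper part names a different core, which is also why nothing stops one core from overwriting another's data.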