Network wins over disk...
on RAMdisk RAID? · Score: 2, Interesting
...but only if you can deal with the OS latency. My very rough
understanding says any networking based on the OSI model is going to
pay a sufficiently large penalty in OS latencies that remote memory
probably won't be any faster than a good local disk subsystem.
However, if you can get rid of that latency, you can win BIG.
Since the questioner is looking at using commodity hardware with a
commodity OS using a commodity networking protocol, my gut feeling is
that (s)he doesn't have a prayer. It is a cool idea, but latencies are
likely to be too high.
The /. dreamers don't need to give up all hope, however. :) There is
relevant work in the academic literature, using specialized hardware
and software of course. The work I'm familiar with is from Hank Levy's
group at UW. To sum up, based on what I remember from a class I took
back in '98 from Mike Feeley (first author on said paper; also did his
PhD thesis on the topic):
The motivating example came from Boeing. They had a bunch of CAD
workstations all with lots of RAM (by the standards of the day).
However, looking at any nontrivial part of the design required more
memory than any single workstation. Paging to disk was S-L-O-W. So
why not use the frequently idle memory on the other workstations?
The result of the UW work was a sort of global memory management,
with paging to remote workstations in the cluster as well as to disk.
Using memory on the remote workstations was significantly faster than
using the local disk.
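To make the "remote memory first, disk last" idea concrete, here is a
toy sketch in Python. It is not the UW algorithm (I don't remember the
details of their replacement policy); the names Node, GlobalPager,
page_out and page_in are made up for illustration, and "pick the peer
with the most idle pages" is just an assumed placeholder policy.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A workstation in the cluster with some idle page frames (hypothetical)."""
    name: str
    free_pages: int
    pages: dict = field(default_factory=dict)


class GlobalPager:
    """Toy pager: evict to idle remote memory when possible, else to disk."""

    def __init__(self, peers):
        self.peers = peers
        self.disk = {}        # stand-in for the local swap device
        self.location = {}    # page_id -> peer name or "disk"

    def page_out(self, page_id, data):
        # Assumed policy: use the peer with the most idle memory.
        candidates = [p for p in self.peers if p.free_pages > 0]
        if candidates:
            peer = max(candidates, key=lambda p: p.free_pages)
            peer.pages[page_id] = data
            peer.free_pages -= 1
            self.location[page_id] = peer.name
        else:
            # No idle memory anywhere in the cluster: fall back to disk.
            self.disk[page_id] = data
            self.location[page_id] = "disk"
        return self.location[page_id]

    def page_in(self, page_id):
        where = self.location.pop(page_id)
        if where == "disk":
            return self.disk.pop(page_id)
        peer = next(p for p in self.peers if p.name == where)
        peer.free_pages += 1
        return peer.pages.pop(page_id)
```

The point of the structure is only that disk becomes the last resort
rather than the first stop for an evicted page.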
So what about latency from the network stack? IIRC (and it has been
five years since I talked to Mike about this...) they used Myrinet.
In some sense Myrinet is basically DMA to remote workstations. One
Myrinet node issues a write request in software, which includes the
source address in memory for the data to be copied, a target node in
the cluster, and the target memory address on the target node. The
Myrinet hardware on the local workstation does DMA from the source
memory location and fires the data over fibre to the remote
workstation, whose Myrinet card dutifully does DMA into the memory
locations specified by the sender. This is very fast, but not the
stuff traditional general-purpose computing has been made of.
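For those who think better in code, here is a rough simulation of the
remote write described above, in Python. Everything here is
hypothetical: WriteRequest, Node, Fabric and remote_write are names I
made up, not a real Myrinet API, and the explicit length field is my
assumption (the request as I described it only names a source address,
a target node and a target address). Real hardware does this without
the OS touching the data; the sketch only shows what the request
carries and where the bytes end up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteRequest:
    src_addr: int      # where the data lives in the sender's memory
    length: int        # assumed field: number of bytes to copy
    target_node: str   # which node in the cluster receives the data
    target_addr: int   # where the data lands in the target's memory


class Node:
    def __init__(self, name, mem_size):
        self.name = name
        self.memory = bytearray(mem_size)


class Fabric:
    """Stands in for the NICs plus the fibre between them."""

    def __init__(self, nodes):
        self.nodes = {n.name: n for n in nodes}

    def remote_write(self, sender, req):
        src_end = req.src_addr + req.length
        if req.src_addr < 0 or src_end > len(sender.memory):
            raise ValueError("source range outside sender memory")
        target = self.nodes[req.target_node]
        dst_end = req.target_addr + req.length
        if req.target_addr < 0 or dst_end > len(target.memory):
            raise ValueError("target range outside target memory")
        # Sender side: "DMA" the bytes out of local memory.
        payload = bytes(sender.memory[req.src_addr:src_end])
        # Receiver side: "DMA" the bytes straight into the named address.
        target.memory[req.target_addr:dst_end] = payload
```

Note that the sender chooses the destination address. That is exactly
what lets the receiving side skip the usual protocol stack, and also
why this is not how general-purpose networking normally works.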
First off, I highly recommend you read some of the ASYNC proceedings. (ASYNC = Nth International Symposium on Asynchronous Circuits and Systems, for N=[1,8]) The best stuff in async tends to get published there. I think they've always published through IEEE, so you can go there, or just do some Google searching.
There are no async design flows which you can just drop in to replace Cadence or whatever your current synchronous flow is. There is also a lot of literature (see ASYNC proceedings) on what async people have to do to get commercial synthesis tools to play nice at all. Cadence-style synthesis just isn't there. (Which could be seen as a good thing, depending on your view of the current ack-bassward synchronous flows...)
Your goal sounds close enough to the Manchester ARM work that their tools might work for you. First, read their papers on their ARM work; the devil is always in the details. (Publishing is like sales: you bury the ugly details in the fine print. :) If the experts at Manchester ran into a problem somewhere, odds are good you will too. Second, make sure you completely buy into their sect's interpretation of async before you jump in. They aren't the only sect in town.
I'm academically descended from the QDI culture in SoCal (Caltech, USC, whatever their startup is called these days, ...), so personally I would go a different route. You should read up on what the QDI folks are doing. They've built their own (in many areas proprietary) flow, but the tools they use are readily available, so depending on the complexity of your project you might be able to cook up your own, similar flow with the same tools.
My last point is a word of warning. As much as I like async and would like to see it more widely adopted, you are taking a big plunge, and if this is anything more than a course project, a big risk. Paraphrasing what a man who is much smarter than I said at the last ASYNC: If you read between the lines in the ASYNC proceedings you will see that the async community is basically a small (order hundreds) group of generally very smart people who are highly motivated. As often as not, they get things done by being highly motivated smart people. As a community, we haven't really demonstrated that our techniques to date will work for anyone but highly motivated smart people in the async community.
I have two degrees in CS, and am currently pursuing a third. I do VLSI design, often down to the level of drawing the polygons as they will be fabricated on the chip.
The curriculum for CS/CE/ECE varies widely from department to department. Often the program is what you make of it. When I was an undergraduate I decided I liked hardware, but didn't like the EE curriculum at my school, so I chose to do hardware as a CS major. YMMV.
I believe that the export of technical assistance (in this case, fixing bugs) with crypto is also prohibited. The corporate world (RSA, etc.) would have set up this sort of thing long ago otherwise.
Brian
What does it mean to you?
Look before you leap.
Brian