Remote Direct Memory Access Over IP
doormat writes "Accessing another computer's memory over the internet? It might not be that far off. Sounds like a great tool for clustering, especially considering that the new motherboards have gigabit ethernet and a link directly to the northbridge/MCH."
Not to mention easy access to sensitive information in emails, documents, and PIMs that the user is currently running and that are resident in memory.
Seriously though... this is where Scott McNealy's vision of "The Network is the Computer" comes even closer to reality.
The security implications are staggering.
How do we lobby for port number 31337 for the RDMA protocol?
How small a thought it takes to fill a whole life
ShowEQ needed something like that to be able to read the encrypted memory and display it on their Linux client. I'm not sure if they actually implemented it, but it'd be a great reverse engineering tool!
This feature has been available for a while now, but using a dedicated link rather than IP. Sun call it Remote Shared Memory and it's mainly used for database clusters.
Accessing another computer's memmory over the internet
Maybe you should watch less japanese kiddie porn cartoons and learn to read and write english you retard.
I take it that error code 500 will be used when the DIMM or controller is fried?
surely this would be a piece of cake to do?
think about it:
- you can make a program read from anywhere in memory.
- you can make a program write to anywhere in memory.
- you can send information back and forward over the internet.
so, computer 1 asks for a memory address from computer 2, and can then read or write to it by sending back a command.
what is even slightly exciting here?
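The scheme described in this comment can be sketched in a few lines. This is only an illustration of the software-only approach: the wire format (`HEADER`, `OP_READ`, `OP_WRITE`) and the helper names are hypothetical, and the "remote" side is a thread on the same machine. Note that both CPUs handle every request here, which is the overhead other comments in this thread say RDMA is meant to remove.

```python
import socket
import struct
import threading

MEMORY = bytearray(64)  # stand-in for "computer 2's" memory; size is arbitrary

# Hypothetical wire format: 1-byte opcode, 4-byte offset, 4-byte length.
HEADER = struct.Struct("!BII")
OP_READ, OP_WRITE = 0, 1

def recv_exact(sock, n):
    """Read exactly n bytes, or raise if the peer closes early."""
    data = bytearray()
    while len(data) < n:
        part = sock.recv(n - len(data))
        if not part:
            raise ConnectionError("peer closed connection")
        data += part
    return bytes(data)

def serve(sock):
    """Computer 2: answer read/write requests until the client disconnects."""
    try:
        while True:
            op, offset, length = HEADER.unpack(recv_exact(sock, HEADER.size))
            if offset + length > len(MEMORY):
                break  # refuse out-of-bounds access by hanging up
            if op == OP_WRITE:
                MEMORY[offset:offset + length] = recv_exact(sock, length)
                sock.sendall(b"\x00")  # one-byte acknowledgement
            else:
                sock.sendall(bytes(MEMORY[offset:offset + length]))
    except ConnectionError:
        pass
    finally:
        sock.close()

def remote_write(sock, offset, data):
    """Computer 1: ask the peer to store data at offset."""
    sock.sendall(HEADER.pack(OP_WRITE, offset, len(data)) + data)
    recv_exact(sock, 1)

def remote_read(sock, offset, length):
    """Computer 1: ask the peer for length bytes at offset."""
    sock.sendall(HEADER.pack(OP_READ, offset, length))
    return recv_exact(sock, length)

client, server = socket.socketpair()
worker = threading.Thread(target=serve, args=(server,), daemon=True)
worker.start()
remote_write(client, 8, b"hello")
result = remote_read(client, 8, 5)
client.close()
worker.join()
```

The bounds check is the only "security" in this sketch, which is roughly the concern raised elsewhere in the thread.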
Sharing memory is not necessary in distributed programming if the variables are kept mostly local and each computer works mainly with what it has stored in its local memory. This is very applicable to renderfarms, where the acceleration scheme itself works very well for distributed rendering: methods such as the grid subdivide into cells, each of which can be stored and evaluated on a single computer with its local memory. Only a central computer is needed to control these nodes and store the output, which is of very limited size and needs little computation.
Checking out my form of escapism.
> Microsoft ultimately is expected to support RDMA
> over TCP/IP in all versions of Windows
Can you see it coming? The ultimate Windows root exploit!! Hmm... I guess someone has to go tell them. Otherwise they won't notice it until it's too late...
Seriously, how do you dare to enable this kind of access?!?
I tried something like this a while ago -- I wanted to mount an NFS-exported file via loopback and use it as swap.
The file in question actually resided in a RAM drive on another machine on the LAN.
I couldn't get it to work in the 45 minutes or so I messed around with it. I'm not sure if Linux was unhappy using an NFS-hosted file for swap, or what exactly the problem was, but I did get some funny looks from people to whom I explained the idea (ie, to determine whether the network would be faster than waiting for my disk-based swap).
Of course, this was back when RAM wasn't cheap...
Somebody get that guy an ambulance!
iWarp has been around for a few years and I think is getting deprecated by a newer system. Just a way of getting *that* much more speed by avoiding unnecessary context switches. Datacenter stuff mostly but is general enough that it could be dropped on a lot of current stuff (AFAIK).
What is music when you despise all sound?
You read my mind.
The article says that Microsoft is part of this "consortium".
What kind of problems will develop once virus & worm writers, and spammers get access to this mechanism?
Of course, if DRM (digital restriction management) comes along, at least it will give a back door into the system.
That would be the first port I would firewall off...
Brings up interesting ideas of ways to prank your friends & enemies though.
0100 lea edi, dma://foo.example.com:b8000h
0103 mov al, 65
0105 mov ecx, 2000
010a rep stosb
010b jmp 100
g=100
Microsoft products have had this "feature" for a while now. Esp. IIS.
VI Architecture
It's very interesting that using memory over the network is very much the same problem as cache coherency amongst processors. If you have multiple processors, you don't want to have to go out to the slow memory when the data you want is in your neighbors cache... so perhaps you grab it from the neighbor's cache.
Similarly, if you have many computers on a network, and you are out of RAM, and your neighbor has extra RAM, you don't want to page out to your slow disk when you can use your neighbor's memory.
NUMA machines are somewhere in between these two scenarios.
There are lots of problems: networks aren't very reliable, there's lots of network balancing issues, etc. But it's certainly interesting research, and can be useful for the right application, I guess.
Disk is slow, though... memory access time is measured in ns, disk access time is in ms... that's a 1,000,000x difference. So paging to someone else's RAM over the network can be more efficient.
I don't have any good papers handy, but I'm sure you can google for some.
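The arithmetic behind the ns-versus-ms comparison can be checked directly. The memory and disk figures below are just the orders of magnitude quoted in the comment; the network round-trip figure is an assumption added for illustration, not something stated in the thread, and real hardware varies widely.

```python
import math

# Representative orders of magnitude only.
MEMORY_ACCESS_S = 1e-9          # "measured in ns", per the comment
DISK_ACCESS_S = 1e-3            # "in ms", per the comment
NETWORK_ROUND_TRIP_S = 100e-6   # assumption: LAN round trip of about 100 microseconds

def orders_of_magnitude(slow, fast):
    """Return how many powers of ten separate two access times."""
    return round(math.log10(slow / fast))

disk_vs_memory = orders_of_magnitude(DISK_ACCESS_S, MEMORY_ACCESS_S)
network_vs_memory = orders_of_magnitude(NETWORK_ROUND_TRIP_S, MEMORY_ACCESS_S)
disk_vs_network = DISK_ACCESS_S / NETWORK_ROUND_TRIP_S
```

Under these assumed numbers, remote RAM is still far slower than local RAM, but it sits between local RAM and disk, which is the case the comment is making for paging over the network.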
-- Erich
Slashdot reader since 1997
All of this is available in the Infiniband Spec... now if someone would just build it... and then we could all buy it.
-- oh.... so..... sleeeeeepy.
Servers will very soon be equipped with Infiniband (http://www.infinibandta.org/). Infiniband has dedicated support for RDMA. This includes efficient key mechanisms, which minimize operating system involvement (which would be context switches each time) and low latency. Bandwidth available right now is 2.5 GBit/s and higher bandwidth can be anticipated very soon.
Security does not mean 100% exploit-proof, it means it secures your information/services given certain desired lengths of protection and certain operating conditions.
While M$ is probably not going to get this one right, it doesn't mean that someone can't. This *is* a desirable feature for some applications, and it is possible to make a secure environment (where secure is defined for the application), and make it seamless as well. That is the whole goal of network security professionals.
If anything, the fact that people already know what kinds of "old" exploits this may be vulnerable to means that we are already headed in the right direction.
I'm absolutely amazed how stupid the average Slashdotter is. First, they don't know that this sort of technology is decades old. I was using remote memory on an image generator back in the early 1980's, and it was nothing new then. Then, they cry about security implications when THERE ARE NONE.
Slashdot is on life support. Somebody needs to pull the plug.
FreeBSD already supports gdb over firewire using
the firewire bridge ability to DMA to/from any
location of memory. Very handy for remote kernel
debugging.
First, what the headline would have you believe has been invented is making it appear as though the RAM of one machine is really the RAM of another machine. This technology has been around and used for quite some time in clustered/distributed/parallel computing communities since at least the 1980s.
If you look at a brief summary of the spec, http://www.rdmaconsortium.org/home/PressReleaseOct 30.pdf, you'll find that all that's happening is that more of the network stack's functionality has been pushed into the NIC. This prevents the CPU from hammering both memory and the bus as it copies data between buffers for various layers of the networking stack.
I'll also note that the networking code in the Linux kernel was extensively redesigned to do minimal (and usually no) copying between layers, leaving very little advantage to pushing this into hardware.
Please, folks, don't drink and submit!
This article defines NUMA as
which seems to cover all of this.
Searched the web for mammory...
Did you mean: mammary
Next time, resist the temptation to use the rotten mind!
It seems to me that this is all about a few tweaks to the protocol that let NICs use DMA much more efficiently. It's not about letting apps coming from the network use arbitrary memory blocks. It means programs like apache will be a bit faster because one can program the NIC to pull data directly from the buffer set aside for network access rather than having the CPU do such work. This is about UDMA for networks, not an insanely stupid backdoor.
Marxism is the opiate of dumbasses
Actually, I think it's "memorium," if you're referring to the Nirvana song. It's good to know we have some cultured Slashdotters. :-
I'm sure you can already pretty much do this with Windows/IIS... of course Microsoft probably doesn't know about it...
Anyone?
...er, or rather, "memoriam." M-W: "in memory of -- used especially in epitaphs"
This post is the epitome of why there should be the option of complete post obliteration on slashdot...
-------
"In times of universal deceit, telling the truth becomes a revolutionary act."
-- George Orwell
1. ssh root@remote-machine /proc/kcore in remote-machine
2. read from and write to
So where is the use of that? And shared memory emulation over a network is also a decades old technology.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted and ignored otherwise.
More off loading of resources on PDA's and Tablets... Could help reduce the cost to the point they are like stickypads... throwaway.....
---- Booth was a patriot ----
I was wondering when we would see more of the network becoming the system bus for the computer. Sun, IBM, HP, and others have been working toward this type of architecture where a network serves as the interconnect for cpu's, ram, disks. Was a good read, but left me wanting more... Soon. Why is it so rare to see good stories make the frontpage on /.?
Fnord.sig
Allowing one to access the memory of a remote computer over an IP network. Several programs have presented this useful feature including BIND DNS server, Sendmail MTA and of course MS IIS web service. The technology is called "buffer overflow" and has been used by many individuals for "fun and profit"^H^H^H^H^H^H^H^H^H^H their computing needs. The ultimate guide to using this great feature has been seen here
Why is it we can figure out more and more ways to use broadband but not get it deployed, practical, and cheap?
Check out http://www.systran.com for their "Shared Common RAM NETwork" products....
This would only be a slightly different transport...
I think that shared memory across a network is doable but, like all initial attempts, bugs will exist. But the benefits of having shared memory like this outweigh the drawbacks of having a hard problem to solve.
I hate liberals. If you are a liberal, do not reply.
Scott McNealy said that, but the vision was implemented by others. CMU's Mach (1985), Andrew Tanenbaum's Amoeba (1986), and Plan 9 (1987) were OSes that made a network into a computer.
To be fair, Sun does have ChorusOS, but that seems to have died the death (i.e. gone Sun Public Source) despite Scott's best intentions.
Sigmentation fault - core dumped
I agree, as long as by "this post" you meant your own. Now STFU.
The approach you describe relies on CPU intervention on both ends of the connection. The article describes an approach that is much closer to the actual hardware than simply opening a ssh connection. I hope this clears the issue up for you!
I hate liberals. If you are a liberal, do not reply.
In what way is this technique better than swapping over NFS?
www.vanheusden.com - home of Multitail, HTTPing, CoffeeSaint, EntropyBroker, rsstail, bsod, listener, nagcon, nagi
They'll require the evil bit to be set for any malicious connections
IB is dead before it even started. How many companies pulled out of support? Only one or two companies will be making the chip.
3GIO ( or whatever Intel is calling it ) RapidIO and Hypertransport will fill the void.
The proc device serves a two-level directory structure. The first level contains numbered directories corresponding to pids of live processes; each such directory contains a set of files representing the corresponding process.
The mem file contains the current memory image of the process. A read or write at offset o, which must be a valid virtual address, accesses bytes from address o up to the end of the memory segment containing o. Kernel virtual memory, including the kernel stack for the process and saved user registers (whose addresses are machine-dependent), can be accessed through mem. Writes are permitted only while the process is in the Stopped state and only to user addresses or registers.
The read-only proc file contains the kernel per-process structure. Its main use is to recover the kernel stack and program counter for kernel debugging.
The files regs, fpregs, and kregs hold representations of the user-level registers, floating-point registers, and kernel registers in machine-dependent form. The kregs file is read-only.
The read-only fd file lists the open file descriptors of the process. The first line of the file is its current directory; subsequent lines list, one per line, the open files, giving the decimal file descriptor number; whether the file is open for read (r), write, (w), or both (rw); the type, device number, and qid of the file; its I/O unit (the amount of data that may be transferred on the file as a contiguous piece; see iounit(2)), its I/O offset; and its name at the time it was opened.
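The man page text above describes Plan 9's proc device. Linux exposes a loosely similar per-process `mem` file at `/proc/<pid>/mem`, where the file offset is likewise a virtual address. The sketch below assumes a Linux host and only reads the calling process's own memory; the function name is made up for illustration, and it returns `None` wherever the file is missing or access is denied.

```python
import ctypes

def read_process_memory(address, length, pid="self"):
    """Return `length` bytes at virtual address `address`, or None if unavailable."""
    try:
        with open(f"/proc/{pid}/mem", "rb", buffering=0) as mem:
            mem.seek(address)       # the file offset is the virtual address
            return mem.read(length)
    except OSError:
        return None                 # not Linux, or access denied

# Put known bytes at a known address, then read them back through the mem file.
buffer = ctypes.create_string_buffer(b"resident in memory")
snapshot = read_process_memory(ctypes.addressof(buffer), len(buffer.value))
```

Combined with a remote shell, this is the kind of CPU-mediated access that a later reply contrasts with RDMA.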
There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
Can you eat it?
Down with Saudi Arabia!!!
"Imagine a Beo-(clobber mangle clobber mangle)..$%@$%@$@%$!"
-jc
... when we can just plant our code in your memory directly.
(ok, ok, there should be some serious security with remote memory. I couldn't resist.)
It's Linux, damnit! Pay no attention to renaming attempts by self-aggrandizing blowhards.
I wonder why no one pointed that out before. This might enable (more or less) DRI to remote X-server...
Imagine running accelerated X-server and being able to start [insert your favourite 3D game] on a remote host.
Of course _if_ 1Gbps is enough.
Really? Seems odd that my company keeps selling it. Sun is using it in their next systems. Network Appliance is using it today. Intel is selling systems with it. Chips are made by Mellanox, Agilent, and Fujitsu. Sun and IBM are doing chips for internal use. There are still a good number of IB vendors as well. See InfiniCon Systems (where I work), TopSpin, InfiniSwitch, Voltaire, and JNI. There are also a bunch of folks using IB in embedded environments.
The high performance computing market and database clustering market are really interested in IB. 10 Gb ethernet is many times more expensive and without a TOE that can do 10 Gb, no one can really use it except for switch to switch. 10 Gb fibre channel isn't quite out yet.
Also, many companies are still paying people to spend a big chunk of their time working on InfiniBand Trade Association Working Group issues.
Finally, none of the technologies you mentioned solve clustering issues because they are for inside the box, not from box to box.
-- soldack
The amount of book-keeping required to keep this thing going makes it a non-starter. And as for scaling, forget it.
The sad truth is that it's common knowledge that this is the least efficient principle for distributed systems. This technique is usually the fall-back position if nothing else works.
TCAP-Abort
I cannot wait to see some raw test numbers off of this on many different scenarios.
The only thing with this is providing redundancy and backups....if there's any latency then, nothing gets done. How much overhead is really involved in checking the data, checking the connection, restarting/etc.
Did people learn nothing from MS Access databases, where you've got a bunch of people accessing this database file directly instead of through a server? It didn't scale very well, network usage was excessive, and every once in a while the database would get irreparably corrupted, often when any one of the computers accessing it crashed.
With direct memory access, we're going to have the same problem, plus garbage collection is going to be a pain. Memory locking will especially pose difficulties, and may require several round trips between the client and server. If a computer crashes, parts of memory in any computers it was accessing could be left in an intermediate state. And hackers and virus writers will see it as a dream come true.
Keep server code on the server.
I've been using user-mode virtual memory driver that uses IP (UDP actually) for 8 years. It's really a simple technology, and I'm sure it's been done before my work as well.
Parts of the internet run over dry copper. With this system you can have the telephone company install a twisted pair at a cost of about $30 bux per link between any _reasonable_ pair of locations and then you can hook up whatever you want, also _within reason_. This allows one to run, say, DSL or MVL or whatever you want.
AFAIK there is no equivalent offering for fibre, and one really needs fibre to be able to do anything interesting.
Now - if dry fibre did exist then it would make a great deal of sense to rent it and drop in some 100baseT to fibre drivers. These cost under $1000 bux for many models and can drive up to, say, 75 km.
Fibre costs about as much as copper anyway - to buy and install. If the phone company can make a bux renting dry copper at say $25 per month - then they should be able to make the same bux renting dry fibre. Imagine - 100 Mb/sec ethernet across town for say 50 bux/month. Attractive? I think so!
http://now.cs.berkeley.edu/Xfs/xfs.html
xFS: Serverless Network File Service
Clients pony up some RAM that is devoted
to a distributed file system, with fail
over, redundancy, migration, etc.
...because haxoring those buffer overflow exploits is just too damn hard.
... I was toying for some time with the reverse idea, replacing network with memory access, having one processor write at processor speed to a special memory location, and another reading at processor speed from a special memory location, the two being interconnected. The idea being to explode gigabit ethernet speed limits
First off, this is not a network shared memory scheme. RDMA could be used to implement one very efficiently though.
It will not allow arbitrary access to your memory space. In fact, it would prevent a great number of buffer overflow exploits.
The best analogy is the difference between PIO and UDMA modes of your IDE devices (or any device). This is all about offloading work from your CPU. It is moving the TCP/IP stack from the kernel to the network card for a very specific protocol.
Here's how RDMA would work layered over (under?) HTTP.
- browser creates GET request in a buffer
- browser tells NIC address of buffer and who to send it to.
- NIC does a DMA transfer to get buffer. OS not involved
- NIC opens RDMA connection to webserver
- server NIC has already been told by the webserver which buffer it should put incoming data in
- webserver unblocks once data in buffer and parses it.
- webserver creates HTML page in second buffer.
- webserver tells server NIC to do a RDMA transfer from buffer to browser host
- client NIC takes data and puts it in browser buffer
- browser unblocks, parses HTML and displays it.
All of this with minimal interaction with the TCP/IP stack. RDMA just allows you to move a buffer from one machine to another without a lot of memory copying in the TCP/IP stack.
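The steps above can be mimicked with a toy model, purely to show the idea of data landing in a buffer the application registered beforehand. `ToyNic` and its methods are hypothetical names, not a real driver or RDMA verbs API, and everything runs in one process with no actual network or hardware involved.

```python
class ToyNic:
    """Hypothetical stand-in for an RDMA-capable NIC; not a real driver API."""

    def __init__(self):
        self.regions = {}  # key -> writable view of an application buffer

    def register(self, key, buffer):
        """The application tells its NIC where incoming data should land."""
        self.regions[key] = memoryview(buffer)

    def rdma_write(self, peer, key, payload):
        """Place payload straight into the peer's registered buffer."""
        region = peer.regions[key]
        if len(payload) > len(region):
            raise ValueError("payload larger than registered buffer")
        region[:len(payload)] = payload
        return len(payload)

client_nic, server_nic = ToyNic(), ToyNic()

request_buffer = bytearray(64)   # webserver's pre-registered receive buffer
response_buffer = bytearray(64)  # browser's pre-registered receive buffer
server_nic.register("request", request_buffer)
client_nic.register("response", response_buffer)

# Browser builds a GET request; its NIC moves it into the server's buffer.
sent = client_nic.rdma_write(server_nic, "request", b"GET / HTTP/1.0\r\n\r\n")
request = bytes(request_buffer[:sent])

# Webserver builds a page; its NIC moves it into the browser's buffer.
page = b"<html>hello</html>"
returned = server_nic.rdma_write(client_nic, "response", page)
received = bytes(response_buffer[:returned])
```

Only buffers that were explicitly registered can be written to, and oversized payloads are rejected, which mirrors the point that this is not arbitrary access to a machine's memory.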
In fact, the RDMA protocol could be emulated completely in software. It would probably have a small overhead versus current techniques but would still be useful. Just imagine real RDMA on the server and emulated RDMA on the clients (cheaper NIC). The server has less overhead and most clients have cycles to spare!
Sounds like a great tool for clustering, especially considering that the new motherboards have gigabit ethernet and a link directly to the northbridge/MCH.
There's just one problem with that... ethernet (even GigE) is *not* a good connection for clustering. Sure, the bandwidth is semi-decent, but the *latency* is the killer. Instead of a processor waiting a number of nanoseconds for memory (as with local memory), it'll end up waiting as much as milliseconds. That may not sound like much, but from nano to milli you jump six orders of magnitude!
steve
Oh, you're not stuck, you're just unable to let go of the onion rings.
..."See, we TOLD you it was a feature!" Microsoft will also sue the researchers working on this project, citing they Innovated this years ago.
CAn'T CompreHend SARcaSm?
We can also start wrapping processor instructions in XML and transmit them via SOAP, in order to create more interoperability between different machine architectures! Remember, we already have IP over XML :-)
That's what the whole thing sounds like to me...
Don't drink and su! antidisestablishmentariazationally
If you get a Myrinet cluster and run IP over it I think it uses the GM kernel driver which does exactly DMA remote access. The NIC has to be smart enough to handle this of course.
:). But what of security?
Cplant style clusters do this as well. They also provide an API called Portals which revolves around RDMA. Portals, incidentally, is being used in the Lustre cluster filesystem and is implemented in kernel space for that project. It can use TCP/IP I believe but it's not real RDMA.
*sigh* some day all NICs will be smart enough to not interrupt the CPU to do data delivery. That would rock
Don't know myself.
"The SkyNet funding bill is passed. The system goes online on August 4th, 1997. Human decisions are removed from strategic defense. SkyNet begins to learn at a geometric rate. It becomes self-aware at 2:14am Eastern time, August 29th. In a panic, they try to pull the plug...And, Skynet fights back."
... this would be nice in two applications on the low end. It would be nice to be able to recycle older hardware that has a too-small RAM limit now, perhaps with a pci card gizmo. Or to be able to upgrade to better quality RAM. It would also be nice in a home or business lan situation where you could have several relatively cheap dumb nodes but a server that has or is capable of using a huge amount of ram and serving that along with various files.
but it can be made to do it using a patch, see the contributions on openmosix.org
From a quick read of the article, RDMA isn't really much different from AIO (asynchronous I/O), which is currently being integrated into the next great Linux kernel (or maybe the one after that). Thus, I expect all of you to become instant converts.
Having pandered to the 'dude-if-its-not-Linux-its-not-worth-talking-about' crowd, I think this whole idea is pretty sound. We're not really talking about unrestricted remote access to memory, but remote DMA--like in your sound card--which allows for chunks of memory to be transferred without the explicit involvement of the process(or). The performance implications of this are similar to using sendfile() to serve static Web pages. It's just another way to lob buffers around the network.
Since this is a proposal for enhancing Ethernet, I gather there's some sort of hardware basis for why this is such a great idea, since it seems pretty darn obvious that it's been done before in software. But I don't think they got into that in the article, except for the mention of the offload chips (but how is that really any different from using an SSL accelerator? why is this technology tied to Ethernet?).
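For readers who have not met the sendfile() comparison made in this comment, here is a minimal sketch using Python's `socket.sendfile`, which uses the operating system's sendfile call where available and otherwise falls back to ordinary sends. The file contents, the `serve_static` helper, and the local socket pair standing in for a client connection are all illustrative assumptions.

```python
import socket
import tempfile

def serve_static(path, connection):
    """Send a whole file over a connected socket; return the byte count sent."""
    with open(path, "rb") as source:
        # Where the OS supports it, the kernel moves file data to the socket
        # without copying it through a buffer in this process.
        return connection.sendfile(source)

with tempfile.NamedTemporaryFile(suffix=".html") as page:
    page.write(b"<html>static page</html>")
    page.flush()
    server_side, client_side = socket.socketpair()
    with server_side, client_side:
        sent = serve_static(page.name, server_side)
        body = client_side.recv(4096)
```

The analogy is loose: sendfile avoids copies between the kernel and the application on one host, while the RDMA proposal discussed here is about letting the NIC place data in application buffers across hosts.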
I have seen Dell 2650s hit over 800 megabytes (6.4 Gb) per second running MPI over InfiniBand using large buffer sizes. The limit is pretty much the PCI-X 133 MHz interface we are on. I suspect that with PCI-X DDR and PCI Express, we will be able to get a lot closer to 10 Gbit.
-- soldack
Um.. how about Apple's computers.... they've been doing onboard gigabit for a few years haven't they?
Blessed be he who reads this post, Cursed be he who tells my boss.
Windows has had this feature where stuff in my ram
could be read by anyone who sent me an Outlook
attachment...
Or is this something different?
it is only after a long journey that you know the strength of the horse.
This gives a whole new meaning to remote exploit.
1. Set up a ramdisk on a machine with lots of RAM. /dev/nd0
2. Set up a network block device to export said ramdisk.
3. Set up client using nbd-client to talk to server with network block device.
4. swapon
5. profit!!!
Using NFS for disk-based swap is possible but silly since you incur the extra overhead. NBD works on a plain vanilla TCP connection and avoids touchy issues like memory vs. packet fragmentation. If you have a gigabit ethernet card with zero-copy support in the driver, then you are in business.
Have another go at it! It's fun
Fuck Beta. Fuck Dice
There are already similar solutions if your intention is to cluster.
http://www.vmic.com/
...a Beowulf cluster of this!
1)Get into one machine behind firewall.
2)Sniff database's possibly encrypted RDMA setting your account to zero balance.
3)...
4)Profit!!! (Replay the message setting your account balance back to zero before you get billed.)
Copyright Violation:"theft, piracy"::Anti-Trust Violation:"thermonuclear price terrorism"<-Overly dramatic language.
DIPC, Distributed Interprocess Communication has been around for years. It's presently a Linux kernel mod which allows standard IPC through TCP/IP, so you can use shared memory and semaphores between computers. It is implemented so an application merely has to turn on one more bit in an IPC request to activate DIPC, and then the configuration file handles where the data goes.
I accessed the memory of a Honeywell Alpha-Delta 3000 with a 12Ga. once.
Diplomacy is the art of saying "Nice doggie" until you can find a rock. Will Rogers
Well, sort of...
"Back in the day", I wrote a virtual memory handler for my Amiga's accelerator card (which had a 68030 and MMU). Meanwhile, some friends of mine had developed this networking scheme that involved wiring the serial ports of our Amigas together in a ring, which allowed us to have a true network without network cards.
Then came the true test: I configured my virtual memory to use a swapfile located in a friend's RAM-disk (he had way more memory than I did), fired up an image editor, opened a large image, and lo and behold: I was swapping at a whopping 9600 bytes per second! The fact that every packet had to pass through multiple other machines (because of the ring-nature of the network) didn't make it any faster either...
Back in the mid 1980's somebody implemented network connection hardware that used N-way cache coherency for a memory to memory shared data segment. The CPU did not do anything to keep the data synchronized between all the systems.
It was notable because the guys who implemented it were unaware that you could not do that without the overhead of network protocol to slow everything down to a crawl but they made it work anyhow.
It was called 'membus' or something similar.
Anybody remember this or is it just the drugs again?
best regards,
buck
duh - reinvented, actually.
open (SIG, "</dev/zero"); $sig = <SIG>; close SIG;