More Effective Use of Shared Memory on Linux

Posted by ScuttleMonkey on Monday November 14, 2005 @12:02AM from the sharing-is-caring dept.

An anonymous reader writes "Making effective use of shared memory in high-level languages such as C++ is not straightforward, but it is possible to overcome the inherent difficulties. This article describes, and includes sample code for, two C++ design patterns that use shared memory on Linux in interesting ways and open the door for more efficient interprocess communication."

SysV IPC is obsolete by bogolisk · 2005-11-14 00:06 · Score: 4, Informative

some1 should tell the authors to rtfm.

$ man shm_open

--
Bogus
1. Re:SysV IPC is obsolete by maxwell+demon · 2005-11-14 00:46 · Score: 5, Funny
  
  The authors here use a static key of 0x1234...
  
  Well, that should be a safe choice, because no sane person would use 0x1234, therefore this key is still unused. :-)
  
  --
  The Tao of math: The numbers you can count are not the real numbers.
2. Re:SysV IPC is obsolete by Anonymous Coward · 2005-11-14 01:47 · Score: 5, Funny
  
  0x1234? Amazing! That's the combination on my luggage!
shmem (soon in Boost!) by Cyberax · 2005-11-14 00:18 · Score: 4, Informative

There is a great C++ library for shared memory support: SHMEM. It can place complex objects and STL-like containers in shared memory. And it is crossplatform (POSIX and Windows are supported).

And it will soon (hopefully) be a part of Boost!
1. Re:shmem (soon in Boost!) by maxwell+demon · 2005-11-14 00:40 · Score: 5, Interesting
  
  It can place complex objects and STL-like containers in shared memory.
  
  Depends on your definition of "complex objects".
  
  From the documentation:
  
  Virtuality forbidden
  
  This is not an specific problem of Shmem, it is a problem for all shared memory object placing mechanisms. The virtual table pointer and the virtual table are in the address space of the process that constructs the object, so if we place a class with virtual function or inheritance, the virtual function pointer placed in shared memory will be invalid for other processes.
  
  Basically, I would have been surprised if they had found a solution for that. But I guess it cannot be portably solved. Instead, the system would have to be prepared for it. I could imagine that objects in a shared library (so the same code is guaranteed to be shared to both processes) could be placed in shared memory, if the compiler/runtime system provided the means for it (say, instead of the pointer to a VMT, it would contain an offset into the constant data section of the shared library, and something to identify the library with, say a system-wide unique active library index which is generated by the dynamic linker).
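A workaround sometimes used for the vtable problem described above, and not the shared-library scheme the comment imagines, is to keep only plain data in shared memory and dispatch on a stored type tag instead of a virtual table pointer. The sketch below is a minimal illustration under that assumption; `ShapeKind`, `Shape`, and `area` are hypothetical names, and the objects here are not actually placed in a shared segment.

```cpp
#include <type_traits>

// Hypothetical tag identifying the "dynamic type". It is an ordinary integer,
// so it means the same thing in every process that maps the memory.
enum class ShapeKind : int { Circle, Rect };

// Plain data only: no virtual functions and no pointers, so there is
// nothing process-specific stored in the object.
struct Shape {
    ShapeKind kind;
    double a;  // radius for Circle, width for Rect
    double b;  // unused for Circle, height for Rect
};

static_assert(std::is_trivially_copyable<Shape>::value,
              "Shape must be raw-copyable to be safe in shared memory");
static_assert(std::is_standard_layout<Shape>::value,
              "Shape must have a predictable layout");

// Dispatch happens in each process's own code, keyed by the tag,
// rather than through a pointer stored inside the object.
double area(const Shape& s) {
    switch (s.kind) {
        case ShapeKind::Circle: return 3.141592653589793 * s.a * s.a;
        case ShapeKind::Rect:   return s.a * s.b;
    }
    return 0.0;  // unknown tag
}
```

The cost is that every "virtual" operation needs its own switch, which is roughly the bookkeeping the compiler would otherwise do for you.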
  
  --
  The Tao of math: The numbers you can count are not the real numbers.
Re:C++ has bigger memory issues by ObsessiveMathsFreak · 2005-11-14 00:40 · Score: 5, Insightful

In fact, forget it; just use an actual OO language instead.

C++ is an actual Object Oriented language, which is of course half the problem.

If you mean a pure OO language like Java, in which everything is an object except for primitives and it takes ten classes and wrappers just to read a file, well then C++ isn't exactly an Object Oriented language as such. Perhaps you mean Smalltalk or the like.

I tell you what though, C++ is still around after all this time. With all the hype surrounding Java, Perl, C#, Python, etc., etc., etc., C++ programmers are still there beavering away with the god-awful syntax Stroustrup left them with. Even after all the improvements, all the innovation and all the additional research into computer languages, for a hell of a lot of tasks there is no real alternative to C++.

I don't say this as a C++ fanboy, even though I am "somewhat" fond of the language when it is used properly, and not in garbled and unreadable line noise. I say this simply as a statement of fact. There is still no successor to C++.

I don't want garbage collection so much as I want a cleanup and rationalisation of the syntax. GC would be nice, but forcing more readable code would be even better.

--
May the Maths Be with you!
This is nothing new by Anonymous Coward · 2005-11-14 00:46 · Score: 4, Interesting

You've been able to do this for a while using process-shared mutexes and condition variables, which allow you to do the same things you could do with pthreads and shared memory. The tradeoff is that you get better performance by avoiding syscalls to do IPC, but it's less robust. If you get a segfault, you have to assume that the shared memory is in an unknown state and either shut down or restart everything. The other processes can (or will be able to) detect this once robust futex support is in Linux. Idiot programmers will of course ignore this and continue to use the corrupted memory anyway, just like they do now with SysV semaphores used as mutexes with the SEM_UNDO option to allow the semaphore to auto-reset if a process exits without resetting it.
Anyway, old stuff. Wake me up when you start talking about the newer tricks with shared memory.
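The process-shared mutex idea above, including detection of a dead lock holder, can be sketched roughly as follows. This assumes a glibc that provides the POSIX robust-mutex calls (`pthread_mutexattr_setrobust`, `pthread_mutex_consistent`), which postdate the comment; `SharedState` and `detects_dead_owner` are hypothetical names, and the "repair" step is a placeholder for whatever recovery a real program needs. Build with something like `g++ -pthread`.

```cpp
#include <pthread.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cerrno>

// Hypothetical layout of the shared region: a lock plus the data it guards.
struct SharedState {
    pthread_mutex_t lock;
    int value;
};

// Returns true if the parent saw EOWNERDEAD after the child exited
// while still holding the lock.
bool detects_dead_owner() {
    void* mem = mmap(nullptr, sizeof(SharedState), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return false;
    SharedState* st = static_cast<SharedState*>(mem);
    st->value = 0;

    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);  // usable across processes
    pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);     // report dead owners
    pthread_mutex_init(&st->lock, &attr);
    pthread_mutexattr_destroy(&attr);

    pid_t pid = fork();
    if (pid < 0) {
        munmap(mem, sizeof(SharedState));
        return false;
    }
    if (pid == 0) {
        pthread_mutex_lock(&st->lock);
        st->value = 42;  // pretend this is a half-finished update
        _exit(1);        // die without unlocking
    }
    waitpid(pid, nullptr, 0);

    int rc = pthread_mutex_lock(&st->lock);
    bool owner_died = (rc == EOWNERDEAD);
    if (owner_died) {
        st->value = 0;                        // placeholder repair of the shared data
        pthread_mutex_consistent(&st->lock);  // mark the mutex usable again
    }
    if (rc == 0 || owner_died) pthread_mutex_unlock(&st->lock);

    pthread_mutex_destroy(&st->lock);
    munmap(mem, sizeof(SharedState));
    return owner_died;
}
```

Ignoring the `EOWNERDEAD` return and carrying on with the data as-is is exactly the mistake the comment warns about.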
Hardware-enforced sharing: OLD HAT by VernonNemitz · 2005-11-14 00:51 · Score: 4, Interesting

Quite a few years ago, there was a brief popularity of something called VRAM (video ram) that had memory cells specifically designed with one input line and TWO output lines. The idea was that the part of the hardware needing to construct an image for the screen ONLY needed to read memory, while the system responsible for creating the image needed both read and write access. Ever since then, I've wondered why they don't use this kind of memory in multi-processor systems, for communication between processors, such that Processor A has read/write access to a block of VRAM, to give info to Processor B (it has read-access only), while Processor B has read/write access to a different block of VRAM, to give info to Processor A (it has read-access only).
Doors by Anonymous Coward · 2005-11-14 01:06 · Score: 5, Interesting

I'm surprised no-one has mentioned Solaris Doors. Doors is an IPC mechanism whereby the first process (client) can hand off any residual time in its timeslice to the second process (server), so short IPC calls take much less time: no timeslice time is discarded, and there is no wait for the server process to be scheduled (since it uses the client's timeslice).
Re:C++ has bigger memory issues by theCoder · 2005-11-14 01:22 · Score: 4, Interesting

C++ already has a garbage collector. Just allocate your objects on the stack instead of the heap:
    void foo()
    {
        SomeObject obj;
        // other code
        // poof -- obj is deallocated automatically, even if an exception is thrown
    }
I work on a project that has tens of thousands of C++ classes, and very few "new" and "delete" operations (more "new" than "delete" because we have a class that manages reference counting like a heap garbage collector would do).

People who think they always need to "new" objects in C++ have spent way too much time using Java.
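To make the stack-allocation point concrete, here is a small sketch showing that a stack object's destructor runs even when the function exits through an exception. `Tracker`, `risky`, and `live_after_exception` are hypothetical names used only for this illustration.

```cpp
#include <stdexcept>

// Hypothetical class that counts how many instances are currently alive.
struct Tracker {
    static int live;
    Tracker() { ++live; }
    ~Tracker() { --live; }
    Tracker(const Tracker&) = delete;
    Tracker& operator=(const Tracker&) = delete;
};
int Tracker::live = 0;

void risky(bool fail) {
    Tracker t;  // lives on the stack, no new/delete involved
    if (fail) throw std::runtime_error("boom");
    // t is destroyed here on the normal path ...
}   // ... and during stack unwinding on the exceptional path

// Returns the number of Tracker objects still alive after risky() throws.
int live_after_exception() {
    try {
        risky(true);
    } catch (const std::runtime_error&) {
        // by the time we get here, t has already been destroyed
    }
    return Tracker::live;
}
```

The same mechanism is what makes lock guards and file wrappers release their resources without explicit cleanup code.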

Here's another hint -- pass objects to functions as const references:
    void foo(const SomeObject& obj)
    {
        // code
    }
This way, a copied object isn't allocated for the passing (no memory at all is in fact allocated). The biggest drawback is you can only call "const" methods on the object, but this is outweighed by not using pointers. Not that I don't like pointers, they just increase the complexity and should be used prudently. And as my .sig says, be sure to free those mallocs!
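A quick way to see the difference between the two passing styles is to count copy-constructor calls. The sketch below uses hypothetical names (`Counted`, `by_value`, `by_const_ref`) and only illustrates the claim about copies, not any particular project's code.

```cpp
// Hypothetical class that counts how often it is copy-constructed.
struct Counted {
    static int copies;
    Counted() = default;
    Counted(const Counted&) { ++copies; }
};
int Counted::copies = 0;

void by_value(Counted) {}             // the parameter is a fresh copy of the argument
void by_const_ref(const Counted&) {}  // the parameter refers to the caller's object

// Number of copies made by one call that passes an existing object by value.
int copies_made_by_value() {
    Counted c;
    int before = Counted::copies;
    by_value(c);
    return Counted::copies - before;
}

// Number of copies made by one call that passes an existing object by const reference.
int copies_made_by_const_ref() {
    Counted c;
    int before = Counted::copies;
    by_const_ref(c);
    return Counted::copies - before;
}
```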

--
"Save the whales, feed the hungry, free the mallocs" -- author unknown
Re:10 fold speed improvement - Dekkers mutex ! fas by Anonymous Coward · 2005-11-14 01:33 · Score: 5, Informative

Yes, some algorithms are worth remembering...

This one is worth remembering as one to avoid -- it's based on the idea of a busy-wait. Look at the while(test) { /* do nothing */ } loop and the outer while loop. This should not be done. Semaphores might be slower in the specific case, but overall system performance will benefit from using best practices.

There's a reason this algorithm lies at rest in academic journals: it's only useful as a teaching tool.
There are better ways by photon317 · 2005-11-14 02:32 · Score: 4, Informative

A lot of shared memory synchronization and/or caching problems can be solved on Linux through the effective use of a few simple things:

1) shm_open (for separately started processes that need to coordinate in shared memory), or mmap(MAP_SHARED|MAP_ANONYMOUS) for a process that will fork children which need to communicate/share between themselves and the parent.

2) Use 's "atomic_t" integer type within that shared memory array (atomic_t* my_shm_array = mmap(....)). The atomic_t type has several functions defined in that header for atomic read, write, increment, etc. for the Linux hardware platform at hand. On most sane (cache-coherent) SMP architectures, reading and writing are already atomic operations, so this basically devolves to just setting and getting integers like normal (with a little bit of syntactic sugar (struct { volatile int val }) to make sure the C compiler doesn't optimize things away that it shouldn't). And you can implement a whole lot of sane algorithms using nothing but shared memory integer reads and writes with no locking or special atomic increment ops.

3) If you need more advanced or complex locking on the shared memory for synchronization, use Linux's futexes. They're in the man pages, and they're really fast.
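The first two points above can be sketched in a few lines. Note the substitutions: this uses C++'s `std::atomic<int>` in place of the kernel's `atomic_t`, and it uses an atomic increment rather than plain reads and writes, so it is an illustration of the general idea rather than the exact recipe. `count_across_processes` is a hypothetical helper name.

```cpp
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>
#include <atomic>
#include <new>

// Forks `children` processes that each add `per_child` to a single counter
// living in anonymous shared memory. Returns the final count, or -1 on failure.
int count_across_processes(int children, int per_child) {
    void* mem = mmap(nullptr, sizeof(std::atomic<int>), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return -1;

    // Construct the atomic in place inside the shared mapping.
    std::atomic<int>* counter = new (mem) std::atomic<int>(0);

    // A lock-based fallback would keep its lock outside the mapping,
    // so only lock-free atomics are safe to share between processes.
    bool ok = counter->is_lock_free();

    for (int c = 0; ok && c < children; ++c) {
        pid_t pid = fork();
        if (pid < 0) {
            ok = false;  // stop forking, but still reap the children we have
        } else if (pid == 0) {
            for (int i = 0; i < per_child; ++i)
                counter->fetch_add(1, std::memory_order_relaxed);
            _exit(0);    // child: leave without running the parent's cleanup
        }
    }

    while (wait(nullptr) > 0) {
        // reap every child before reading the result
    }

    int total = counter->load();
    munmap(mem, sizeof(std::atomic<int>));
    return ok ? total : -1;
}
```

For example, `count_across_processes(4, 10000)` should return 40000 when all forks succeed. For processes that are started separately, the mapping would come from `shm_open` plus `mmap` on the returned descriptor instead of `MAP_ANONYMOUS`.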

--
11*43+456^2
yeah, fast, and 10-fold chance of odd failures by Krischi · 2005-11-14 02:36 · Score: 5, Informative

Yeah, this algorithm is fast. Too bad that it does not work. This kind of design is a common mistake by people who do not understand the intricacies of multithreaded programming. In short, it fails miserably when the CPUs are allowed to reorder loads and stores, a.k.a. pretty much any modern CPU. You need a memory barrier between setting and testing of a shared variable.

Google for Dekker's algorithm and memory barrier - you will find better explanations of the problem there than I could type up in my limited time here right now.
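As a rough illustration of the barrier point, here is Dekker's algorithm written with C++ `std::atomic` variables. Their default sequentially consistent ordering supplies the barriers that plain loads and stores lack; this is a standard-library substitute for hand-placed fences, not the code from the post being criticized. `DekkerLock` and `run_dekker_threads` are hypothetical names, and the lock only works for exactly two participants with ids 0 and 1.

```cpp
#include <atomic>
#include <thread>

class DekkerLock {
public:
    void lock(int me) {
        const int other = 1 - me;
        wants_[me].store(true);                // seq_cst store, acts as a full barrier
        while (wants_[other].load()) {         // cannot be reordered before the store
            if (turn_.load() != me) {
                wants_[me].store(false);       // back off while it is the other's turn
                while (turn_.load() != me) std::this_thread::yield();
                wants_[me].store(true);
            }
            std::this_thread::yield();         // be polite while spinning
        }
    }
    void unlock(int me) {
        turn_.store(1 - me);                   // hand priority to the other side
        wants_[me].store(false);
    }
private:
    std::atomic<bool> wants_[2]{{false}, {false}};
    std::atomic<int> turn_{0};
};

// Two threads each add `iterations` to a plain, non-atomic counter
// that is protected only by the lock. Returns the final count.
long run_dekker_threads(int iterations) {
    DekkerLock lock;
    long counter = 0;
    auto work = [&](int me) {
        for (int i = 0; i < iterations; ++i) {
            lock.lock(me);
            ++counter;
            lock.unlock(me);
        }
    };
    std::thread a(work, 0);
    std::thread b(work, 1);
    a.join();
    b.join();
    return counter;
}
```

If the atomics were replaced with ordinary variables, or with relaxed ordering, the store to `wants_[me]` could be reordered after the load of `wants_[other]`, and both threads could enter the critical section at once.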
Preying on the non-comp SCI mods, I see. by Inoshiro · 2005-11-14 04:54 · Score: 4, Insightful

"How many people know about this? Nobody! I never read about it anywhere. I invented it myself years ago, .."

Turn to page 55 of your OS design and implementation by Tanenbaum. See where he says, "For a discussion of Dekker's algorithm, see Dijkstra (1965)."? How do you get through a proper comp sci honours degree to the point where you can take a masters and then a PhD without reading Dijkstra?

How about you crack open that copy of Operating Systems (4th ed) by William Stallings, which has a discussion of concurrency and Dekker's on pages 208-213? How can you get past a 2nd/3rd-year introductory operating systems class without having gone over this topic?

You are a troll. A troll preying on the fact that most of the moderators here have no idea about computer science, and have not taken a whiff of a real operating systems class.

For the record, Peterson's algorithm (published in 1981) is a much simpler solution to your problem. It's on page 56 of the Tanenbaum book, and also discussed in Stallings on page 213. There's a new 5th edition of the Stallings book, but the index will take you to the correct chapter/page in short order.
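For comparison, Peterson's algorithm is short enough to sketch in full. This version is a C++ rendering with sequentially consistent `std::atomic` variables, not a transcription from either textbook; `PetersonLock` and `run_peterson_threads` are hypothetical names, and it supports exactly two participants with ids 0 and 1.

```cpp
#include <atomic>
#include <thread>

class PetersonLock {
public:
    void lock(int me) {
        const int other = 1 - me;
        interested_[me].store(true);  // announce intent
        turn_.store(other);           // give way if both arrive together
        // Wait only while the other side is interested and still has priority.
        while (interested_[other].load() && turn_.load() == other)
            std::this_thread::yield();
    }
    void unlock(int me) {
        interested_[me].store(false);
    }
private:
    std::atomic<bool> interested_[2]{{false}, {false}};
    std::atomic<int> turn_{0};
};

// Two threads each add `iterations` to a plain counter guarded by the lock.
long run_peterson_threads(int iterations) {
    PetersonLock lock;
    long counter = 0;
    auto work = [&](int me) {
        for (int i = 0; i < iterations; ++i) {
            lock.lock(me);
            ++counter;
            lock.unlock(me);
        }
    };
    std::thread a(work, 0);
    std::thread b(work, 1);
    a.join();
    b.join();
    return counter;
}
```

Like Dekker's algorithm, it still busy-waits, so it is mainly of interest as a teaching tool.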

--
--
Internet Explorer (n): Another bug -- that is, a feature that can't be turned off -- in Windows.