Why Isn't X11 Thread-Safe?
blackcoot asks: "I've just spent a couple of very frustrating days trying to figure out what 'unexpected async reply' means and fixing it. The problem is a result of the fact that X11 simply isn't designed to handle events from more than one thread at a time. Why? Given that more and more often, people are writing multi-threaded GUI applications, are there fundamental design decisions in X11 that make receiving events from multiple threads simultaneously impossible? Or was the protocol never designed to handle concurrent updates? More to the point, is there an easy way in Qt (short of deriving a new widget for every widget and overriding its paintEvent to lock the library first, paint, then unlock as Trolltech's docs seem to suggest) to make this problem go away?" I'm not sure if things have been done in recent revisions of XFree to fix this problem, but this message, from February of last year, might help some of you out that are suffering from this problem. Any ideas if this problem has been fixed in recent versions of XFree?
This is the same approach taken by Swing ("lightweight" layer on top of Java AWT). Events fired by a GUI object are run in a GUI thread. For side effects from a non-GUI object, a convenience class is provided to push events into the GUI thread. Basically to make a GUI call from a non-GUI thread you throw a work request onto a queue which the Swing thread processes at an appropriate time. -- Jack
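The work-queue pattern described above can be sketched in a few lines of Python. This is only an illustration of the idea, similar in spirit to Swing's SwingUtilities.invokeLater; the names work_queue, invoke_later, gui_loop and run_demo are hypothetical and belong to no real toolkit.

```python
import queue
import threading

# Hypothetical names throughout: this imitates the pattern, not Swing's API.
work_queue = queue.Queue()

def invoke_later(job):
    """Callable from any thread: hand a work request to the GUI thread."""
    work_queue.put(job)

def gui_loop(seen_thread_ids):
    """The single 'GUI thread': runs queued work requests one at a time."""
    while True:
        job = work_queue.get()
        if job is None:              # sentinel: shut the loop down
            break
        seen_thread_ids.add(threading.get_ident())
        job()

def run_demo(n_workers=4):
    painted = []                     # stands in for state only the GUI thread touches
    seen_thread_ids = set()
    gui = threading.Thread(target=gui_loop, args=(seen_thread_ids,))
    gui.start()

    def worker(n):
        # A non-GUI thread never modifies 'painted' itself; it queues a request.
        invoke_later(lambda: painted.append(n))

    workers = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    work_queue.put(None)             # queued after every work request
    gui.join()
    return painted, seen_thread_ids
```

Every append happens on the one GUI thread, so the shared state needs no lock of its own; the queue is the only synchronisation point.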
Xlib was made thread safe in the X11R6 release in 1994, but only if you initialize the locks it needs to do it properly via XInitThreads.
I'm going to explain myself more clearly because it's obvious from reading this thread that there is a LOT of confusion out there.
X11 is a protocol. Xlib is a C library that provides an API to the protocol. It is important to understand this distinction. Applications and toolkits do not have to use Xlib - they could generate X11 protocol streams directly, or they could use an Xlib replacement - but as there's nothing really wrong with Xlib nearly everybody uses it.
X11 is by design a client-server protocol. The client opens up a socket (UNIX socket, TCP/IP socket, etc) to the server. The client then sends multi-byte "messages" down the socket to tell the server to do stuff. For example, there is a message to draw a line. Each message has a few bytes to identify the command, then a bunch more bytes describing parameters to the command. The "line" message has one parameter describing which Window to draw to, one parameter for the Graphics Context (colour, line style, etc), and several parameters for the X,Y coordinates of the line.
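To make the "few bytes of command, then parameter bytes" idea concrete, here is a minimal Python sketch that packs a line-drawing message into bytes. The layout and the opcode value are made up for illustration and are NOT the real X11 wire encoding; encode_line, decode_line and OP_LINE are hypothetical names.

```python
import struct

# Hypothetical layout for illustration only; NOT the real X11 wire encoding.
OP_LINE = 1                      # made-up opcode
LINE_FORMAT = "<BxxxIIhhhh"      # opcode, 3 pad bytes, window, gc, x1, y1, x2, y2

def encode_line(window, gc, x1, y1, x2, y2):
    """Pack a 'draw line' message: command byte first, then its parameters."""
    return struct.pack(LINE_FORMAT, OP_LINE, window, gc, x1, y1, x2, y2)

def decode_line(data):
    """What the receiving side does: split the bytes back into fields."""
    opcode, window, gc, x1, y1, x2, y2 = struct.unpack(LINE_FORMAT, data)
    return opcode, window, gc, (x1, y1), (x2, y2)

msg = encode_line(window=0x400001, gc=0x400002, x1=10, y1=20, x2=110, y2=220)
print(len(msg), msg.hex())
```

The receiver can only decode the message if all of its bytes arrive contiguously and in order, which is the property the rest of this post is about.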
Now imagine a threaded X11 client. Also imagine for the sake of argument that the client is generating X11 messages directly or is using a non-thread-safe Xlib. The client pseudo-code looks something like this:
Now remember that X11 is a protocol - a byte stream - so what is actually happening is that each thread is generating a sequence of bytes. The bytes look something like this:
Because these byte streams are both being fed down the same socket, and because the application is not thread safe, the resulting stream looks like this:
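As an illustrative stand-in (not the poster's original listing), the following Python sketch simulates two unsynchronised writers sharing one socket. The four-byte "messages" LINE and ARC! are made up to keep the output readable, and the strict turn-taking in interleave is an assumption chosen so the result is reproducible; real thread scheduling would mix the bytes unpredictably.

```python
from itertools import zip_longest

# Made-up 4-byte "messages"; real X11 requests are binary and longer.
LINE_MSG = b"LINE"
ARC_MSG = b"ARC!"

def interleave(a, b, chunk=1):
    """Simulate two unsynchronised writers taking turns on one socket,
    each sending `chunk` bytes before the other cuts in."""
    pieces_a = [a[i:i + chunk] for i in range(0, len(a), chunk)]
    pieces_b = [b[i:i + chunk] for i in range(0, len(b), chunk)]
    out = bytearray()
    for pa, pb in zip_longest(pieces_a, pieces_b, fillvalue=b""):
        out += pa + pb
    return bytes(out)

print(interleave(LINE_MSG, ARC_MSG))           # b'LAIRNCE!'
print(interleave(LINE_MSG, ARC_MSG, chunk=2))  # b'LIARNEC!'
```

Neither output contains an intact LINE or ARC! message, so a server reading this stream has no way to tell where one request ends and the next begins.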
It's an absolute mess! The X server gets very confused - it thinks the client has gone haywire - and so nothing works. There are only two solutions to this problem.
#1 is to make all messages ATOMIC. This is simply impossible for sockets. You can make it work by getting rid of sockets and forcing all X11 clients to use a messaging IPC - and this IPC might even use sockets at the lowest layer - but it's impossible to retrofit it to sockets. The messaging approach has been used by Berlin, GDI, and a bunch of other windowing systems.
#2 is to force all multi-threaded X11 clients to impose their own locking. Each thread shares a lock for the protocol stream. Threads cannot proceed until they have gained the lock and for efficiency they should release the lock as quickly as possible. This is the approach that Xlib has used since X11R6. Each thread uses XLockDisplay and XUnlockDisplay which are two new calls provided by Xlib. The change to the pseudo-code from before is trivial.
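A minimal Python sketch of the shared-lock idea follows. It is not Xlib code: threading.Lock merely plays the role that XLockDisplay and XUnlockDisplay play for a real display connection, and stream, send and run are hypothetical names, with a bytearray standing in for the socket.

```python
import threading

stream = bytearray()               # stands in for the shared socket
display_lock = threading.Lock()    # plays the role of the display lock

def send(message):
    """Write one whole message while holding the lock, then release it."""
    with display_lock:
        for byte in message:       # byte at a time, to invite interleaving
            stream.append(byte)

def run():
    """Start one thread per message and return the combined stream."""
    stream.clear()
    threads = [threading.Thread(target=send, args=(m,))
               for m in (b"LINE", b"ARC!")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return bytes(stream)

print(run())   # either b'LINEARC!' or b'ARC!LINE', never a mixture
```

The order of the two messages still depends on which thread wins the lock, but each message always reaches the stream in one piece.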
With this simple change in place your multi-threaded X11 client is now perfectly compatible with all X11 servers. The combined protocol stream is not confusing: the ARC and LINE messages are sequential rather than munged together.
Now the reason I think there is confusion here is that people are asking "Why Can't X11 be Multi Threaded?". The question is nonsensical. Socket protocols are not threadable. It's impossible to do this. It is very helpful here to understand that X11 is a LOT like other client-server protocols such as HTTP. In fact the analogies with HTTP are strong. HTTP has a client called the "web browser". HTTP has a server such as "Apache". The client opens a TCP/IP socket to the server. Messages begin with the multi-byte string GET /page HTTP/1.0. Optional bytes can follow describing additional HTTP functionality. The only real difference to HTTP is that X11 is PERSISTENT and has SERVER SIDE STATE. There are also minor differences such as the protocol is binary instead of text.
Now you can have a multi-threaded web browser, and you can also have a multi-threaded X11 client. You can have a multi-threaded web server, and you can also have a multi-threaded X11 server. But you can't have a multi-threaded HTTP protocol stream. Similarly you can't have a multi-threaded X11 protocol stream. It doesn't make any sense to even ask for this. As I showed before, it would be like a web browser requesting two URLs from a single server, but generating an HTTP "request" that looked like this.
The solution is to serialise the HTTP commands in the web browser. The way to do this is with a serialisation library with locking. This is the same approach used by X11 with Xlib, provided by the XLockDisplay and XUnlockDisplay primitives.
You can reasonably argue that X11 wouldn't have this problem if it was a messaging protocol instead of a multi-byte stream protocol. That's the design decision that was made for X11, and I personally think it's a non-issue. There are other issues with the X11 protocol - it's quite heavy, many of the messages are limited or outdated, and some of the server-side state is useless - but the fact that it is a BYTE STREAM protocol instead of a MESSAGING protocol is I think a non-argument. People seem to focus very heavily on it as the "reason that X11 sucks" but I think these people simply haven't investigated how the alternatives work. Eventually everything becomes a byte stream: it's just a design decision as to how early you make the conversion.
i'm working on a real time computer vision system. capture runs in its own thread, firing off imageArrived events which end up being executed in the capture thread (a subtlety of qt that i was unaware of). this imageArrived event gets plugged into whichever listeners are interested, the idea being to allow multiple paths of processing the same image. i attempted to do this using timers, having capture events posted periodically and then occasionally refreshing. this was very unsatisfactory for two reasons: 1) it looks awfully slow even though i know that it's running very quickly underneath and 2) (more importantly) this method will only ever use a single processor. as i expand to do more interesting processing, i'm going to be forced to use multithreading because a single processor will not be able to do it all in the allotted amount of time, however, multiple processors could because of how much of these processes are easily parallelizable. for a rough guesstimate of how much processing is involved, there are four classifiers per pixel (three based on chrominance or one based on luminance and two based on chrominance plus a classifier that combines the output of the three other classifiers), 640x480 pixels 30 frames per second for a grand total of 37 odd million classifications per second. add to this the possibility of yuv to rgb conversion, and contraction/expansion filtering to clean up noise and you can quickly see how the time adds up. this is before i even begin to do anything useful with the images --- this is just removing the boring pixels. so far, this is all integer arithmetic on arrays that are "cache friendly" in their layout. the really cpu intensive stuff comes later (this is all preprocessing in terms of my application)... updating classifier's background models, extending classification to foreground, background, and things that have been merged into the background (i.e. have stayed still long enough, either items removed from the scene, added to the scene or moved within the scene).
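The "37 odd million" figure can be checked with a few lines of Python, using only the numbers quoted above (four classifiers per pixel, 640x480 pixels, 30 frames per second). The variable names are just for illustration.

```python
classifiers_per_pixel = 4
width, height = 640, 480
frames_per_second = 30

# Classifications needed every second, before any other processing.
per_second = classifiers_per_pixel * width * height * frames_per_second
print(per_second)  # 36864000
```

That is a little under 37 million classifications per second, which matches the rough estimate.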
for those who were wondering, i did figure it out with a number of critical sections synchronized indirectly on xlib's mutex. performance is actually better threaded than with earlier single threaded prototypes (mostly because i am now able to start processing an image while i start getting the next one). looking back, i realize that most of my frustration is the result of the vast majority of my gui programming experience being done on windows in applications that were fundamentally stupid to thread. as the message that cliff pointed out notes, this particular quirk of x is not very well documented and has the potential to be very counter intuitive to people doing this for the first time.
anyways, thanks a lot for the help. for those who are curious, my goal is to release the source for the framework and sample application within a month.
Go look at KParts - KDE embedding. One window, one application, as far as the user is concerned, but subwindows are controlled possibly by different processes on different machines.
How do you think window managers work? The X server doesn't know from window managers - the way you prevent multiple window managers is by checking for atoms on the root window. Remember, conceptually any client of a server can do something with any window - you just need a way to get the window ID.
Wrong. The X server already handles multiple simultaneous connections. Whether the X client does or not is the client's choice. Lots of clients for other systems handle multiple connections (your web browser, for one). Um, add the hedge for some applications and I'll agree with you. Otherwise you're wedged in a one-track design mind. You're making your problem fit your design, rather than your design fit your problem. There's any number of interactive applications that make lots of logical sense to be multi-threaded. Take your typical movie player with its nifty visual feedback. One thread for putting the frames up. One thread to respond to user actions. The richer your interaction, the more application semantics are involved, the more likely that arbitrarily splitting the app in half along an arbitrary "UI/App" line is just not going to work. MVC has a similar problem. They're both cookie-cutter designs that pretend that every interactive app is structured the same way at the top level.
Go read papers on the "eXene" system in the programming language "ML". A pervasively multi-threaded X client library. One widget - one thread. It makes the widget code very easy to understand - you don't have to split your code into a bazillion little callbacks. You don't have to arbitrarily time-slice things that are conceptually continuous. Things like while (buttonIsDown) followTheMouse() work just fine. Ever have to break up a callback into multiple functions, triggered by timers, just so the app didn't appear to "freeze" while you were off doing something time-intensive? Multi-threading interaction can make the code much easier to maintain because you don't have to worry about "starving" parts of the application for events while busy working on others - the thread scheduler handles pre-emption for you. And eXene isn't some hot new thing. It dates from the early 90's.
Go back even earlier and find some of James "Java" Gosling's earliest work - NeWS. NeWS clients wrote multi-threaded PostScript to draw on the display.
Events, timer callbacks and the like are all just ways of simulating something continuous with discrete code - go look at TBAG from Sun and then the follow-on Fran from Microsoft Research. Forcing discretizations of continuous phenomena into an arbitrary serialization is just a way to kludge around a poor understanding of parallel activity.
It isn't the app - it's the libraries the app is trying to use. They're a poor fit to the abstraction blackcoot would like to use.
-----
Klactovedestene!