Why Isn't X11 Thread-Safe?
blackcoot asks: "I've just spent a couple very frustrating days trying to figure out what 'unexpected async reply' means and fixing it. The problem is a result of the fact that X11 simply isn't designed to handle events from more than one thread at a time. Why? Given that more and more often, people are writing multi-threaded GUI applications, are there fundamental design decisions in X11 that make dealing with receiving events from multiple threads simultaneously, impossible? Or was the protocol never designed to handle concurrent updates? More to the point, is there an easy way in Qt (short of deriving a new widget for every widget and overriding it's paintEvent to lock the library first, paint, then unlock as Trolltech's docs seem to suggest) to make this problem go away?" I'm not sure if things have been done in recent revisions of XFree to fix this problem, but this message, from February of last year, might help some of you out that are suffering from this problem. Any ideas if this problem has been fixed in recent versions of XFree?
What I do is create a GUI thread which handles all interface to the GUI. All of my other threads then fire their events to that thread through an interface that IS threadsafe. The GUI thread then synchronously handles the GUI events. This abstraction away from your actual GUI API allows you to more easily port your app to another platform because all of your X11 specific code is contained in one place.
I'm going to explain myself more clearly because it's obvious from reading this thread that there is a LOT of confusion out there.
X11 is a protocol. Xlib is a C library that provides an API to the protocol. It is important to understand this distinction. Applications and toolkits do not have to use Xlib - they could generate X11 protocol streams directly, or they could use an Xlib replacement - but as there's nothing really wrong with Xlib nearly everybody uses it.
X11 is by design a client-server protocol. The client opens up a socket (UNIX socket, TCP/IP socket, etc) to the server. The client then sends multi-byte "messages" down the socket to tell the server to do stuff. For example, there is a message to draw a line. Each message has a few bytes to identify the command, then a bunch more bytes describing parameters to the command. The "line" message has one parameter describing which Window to draw to, one paramter for the Graphics Context (colour, line style, etc), and several parameters for the X,Y coordinates of the line.
Now imagine a threaded X11 client. Also imagine for the sake of argument that the client is generating X11 messages directly or is using a non-thread-safe Xlib. The client pseudo-code looks something like this:
Now remember that X11 is a protocol - a byte stream - so what is actually happening is that each thread is generating a sequence of bytes. The bytes look something like this:
Because these byte streams are both being fed down the same socket, and because the application is not thread safe, the resulting stream looks like this:
It's an absolute mess! The X server gets very confused - it thinks the client has gone haywire - and so nothing works. There are only two solutions to this problem.
#1 is make all messages ATOMIC. This is simply impossible for sockets. You can make it work by getting rid of sockets and forcing all X11 clients to use a messaging IPC - and this IPC might even use sockets at the lowest layer - but it's impossible to retrofit it to sockets. The messaging approach has been used by Berlin, GDI, and a bunch of other windowing systems.
#2 is to force all multi-threaded X11 clients to impose their own locking. Each thread shares a lock for the protocol stream. Threads cannot proceed until they have gained the lock and for efficiency they should release the lock as quickly as possible. This is the approach that X11R5 (and X11R6) have used. Each thread uses XLockDisplay and XUnlockDisplay which are two new calls provided by Xlib. The change to the pseudo-code from before is trivial.
With this simple change in place your multi-threaded X11 client is now perfectly compatible with all X11 servers. The combined protocol stream is not confusing: the ARC and LINE messages are sequential rather than munged together.
Now the reason I think there is confusion here is that people are asking "Why Can't X11 be Multi Threaded?". The question is nonsensical. Socket protocols are not threadable. It's impossible to do this. It is very helpful here to understand that X11 is a LOT like other client-server protocols such as HTTP. In fact the analogies with HTTP are strong. HTTP has a client called the "web browser". HTTP has a server such as "Apache". The client opens a TCP/IP socket to the server. Messages begin with the multi-byte string GET /page HTTP/1.0. Optional bytes can follow describing additional HTTP functionality. The only real difference to HTTP is that X11 is PERSISTENT and has SERVER SIDE STATE. There are also minor differences such as the protocol is binary instead of text.
Now you can have a multi-threaded web browser, and you can also have a multi-threaded X11 client. You can have a multi-threaded web server, and you can also have a multi-threaded X11 server. But you can't have a multi-threaded HTTP protocol stream. Similarly you can't have a multi-threaded X11 protocol stream. It doesn't make any sense to even ask for this. As I showed before, it would be like a web browser requiring two URLs from a single server, but generating an HTTP "request" that looked like this.
The solution is to serialise the HTTP commands in the web browser. The way to do this is with a serialisation library with locking. This is the same approach used by X11 with Xlib, provided by the XLockDisplay and XUnlockDisplay primitives.
You can reasonably argue that X11 wouldn't have this problem if it was a messaging protocol instead of a multi-byte stream protocol. That's the design decision that was made for X11, and I personally think it's a non-issue. There are other issues with the X11 protocol - it's quite heavy, many of the messages are limited or outdated, and some of the server-side state is useless - but the fact that is a BYTE STREAM protocol instead of a MESSAGING protocol is I think a non-argument. People seem to focus very heavily on it as the "reason that X11 sucks" but I think these people simply haven't investigated how the alternatives work. Eventually everything becomes a byte stream: it's just a design decision as to how early you make the conversion.