Why Isn't X11 Thread-Safe?

Posted by Cliff on Thursday January 16, 2003 @08:15AM from the correcting-a-potential-underlying-problem dept.

blackcoot asks: "I've just spent a couple very frustrating days trying to figure out what 'unexpected async reply' means and fixing it. The problem is a result of the fact that X11 simply isn't designed to handle events from more than one thread at a time. Why? Given that more and more often, people are writing multi-threaded GUI applications, are there fundamental design decisions in X11 that make receiving events from multiple threads simultaneously impossible? Or was the protocol never designed to handle concurrent updates? More to the point, is there an easy way in Qt (short of deriving a new widget for every widget and overriding its paintEvent to lock the library first, paint, then unlock as Trolltech's docs seem to suggest) to make this problem go away?" I'm not sure if things have been done in recent revisions of XFree to fix this problem, but this message, from February of last year, might help those of you who are suffering from this problem. Any ideas if this problem has been fixed in recent versions of XFree?

Swing does this by Jack Greenbaum · 2003-01-16 10:40 · Score: 3, Informative

This is the same approach taken by Swing (the "lightweight" layer on top of Java AWT). Events fired by a GUI object are run in a GUI thread. For side effects from a non-GUI object, a convenience class is provided to push events into the GUI thread. Basically, to make a GUI call from a non-GUI thread you throw a work request onto a queue which the Swing thread processes at an appropriate time. -- Jack
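
A minimal Python sketch of the queue idea described above. It is not Swing's API: the names `invoke_later`, `gui_loop`, and `label_text` are hypothetical, and a plain list stands in for widget state. Worker threads never touch that state directly; they put work requests on a queue, and a single "GUI" thread runs them in arrival order.

```python
import queue
import threading

# Hypothetical stand-in for widget state that only the GUI thread may touch.
label_text = []

work_queue = queue.Queue()
_STOP = object()  # sentinel telling the GUI loop to exit


def invoke_later(func, *args):
    """Called from any thread: hand a work request to the GUI thread."""
    work_queue.put((func, args))


def gui_loop():
    """Runs in the single GUI thread and executes requests in arrival order."""
    while True:
        item = work_queue.get()
        if item is _STOP:
            break
        func, args = item
        func(*args)


def worker(name):
    """A non-GUI thread that asks the GUI thread to update the label."""
    for i in range(3):
        invoke_later(label_text.append, f"{name}:{i}")


gui = threading.Thread(target=gui_loop)
gui.start()
workers = [threading.Thread(target=worker, args=(n,)) for n in ("a", "b")]
for t in workers:
    t.start()
for t in workers:
    t.join()
work_queue.put(_STOP)  # queued after every work request, so nothing is lost
gui.join()
```

Requests from different workers may interleave with each other, but each one runs to completion on the GUI thread before the next starts, and each worker's own requests stay in order.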
It has been for almost a decade by acoopersmith · 2003-01-16 12:18 · Score: 2, Informative

Xlib was made thread safe in the X11R6 release in 1994, but only if you initialize the locks it needs to do it properly via XInitThreads.
Lots of Confusion! Long Explanation. by nathanh · 2003-01-16 14:24 · Score: 4, Informative

I'm going to explain myself more clearly because it's obvious from reading this thread that there is a LOT of confusion out there.
X11 is a protocol. Xlib is a C library that provides an API to the protocol. It is important to understand this distinction. Applications and toolkits do not have to use Xlib - they could generate X11 protocol streams directly, or they could use an Xlib replacement - but as there's nothing really wrong with Xlib nearly everybody uses it.
X11 is by design a client-server protocol. The client opens up a socket (UNIX socket, TCP/IP socket, etc) to the server. The client then sends multi-byte "messages" down the socket to tell the server to do stuff. For example, there is a message to draw a line. Each message has a few bytes to identify the command, then a bunch more bytes describing parameters to the command. The "line" message has one parameter describing which Window to draw to, one parameter for the Graphics Context (colour, line style, etc), and several parameters for the X,Y coordinates of the line.
Now imagine a threaded X11 client. Also imagine for the sake of argument that the client is generating X11 messages directly or is using a non-thread-safe Xlib. The client pseudo-code looks something like this:
Thread A: while (1) { XDrawLine(); }
Thread B: while (1) { XDrawArc(); }

Now remember that X11 is a protocol - a byte stream - so what is actually happening is that each thread is generating a sequence of bytes. The bytes look something like this:
Thread A Byte Stream: LINE Display1 0 0 100 100 LINE Display1 0 0 100 100 ...
Thread B Byte Stream: ARC Display1 0 0 50 360 ARC Display1 0 0 50 360 ...

Because these byte streams are both being fed down the same socket, and because the application is not thread safe, the resulting stream looks like this:
Combined Byte Stream: LINE Display1 ARC 0 0 Display1 0 100 100 0 50 360 LINE Display1 0 0 LINE Display1 0 100 100 0 50 360 ...

It's an absolute mess! The X server gets very confused - it thinks the client has gone haywire - and so nothing works. There are only two solutions to this problem.
#1 is make all messages ATOMIC. This is simply impossible for sockets. You can make it work by getting rid of sockets and forcing all X11 clients to use a messaging IPC - and this IPC might even use sockets at the lowest layer - but it's impossible to retrofit it to sockets. The messaging approach has been used by Berlin, GDI, and a bunch of other windowing systems.
#2 is to force all multi-threaded X11 clients to impose their own locking. The threads share a single lock for the protocol stream. A thread cannot proceed until it has gained the lock, and for efficiency it should release the lock as quickly as possible. This is the approach that Xlib has used since X11R6. Each thread uses XLockDisplay and XUnlockDisplay, two calls provided by Xlib (the locks have to be initialised first with XInitThreads). The change to the pseudo-code from before is trivial.
Thread A: while (1) { XLockDisplay(); XDrawLine(); XUnlockDisplay(); }
Thread B: while (1) { XLockDisplay(); XDrawArc(); XUnlockDisplay(); }

With this simple change in place your multi-threaded X11 client is now perfectly compatible with all X11 servers. The combined protocol stream is not confusing: the ARC and LINE messages are sequential rather than munged together.
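
The difference between the munged and the sequential stream can be reproduced with a small Python simulation. This is only a sketch under stated assumptions: it does not talk to an X server, list items stand in for protocol bytes, `interleave` fakes an unlucky scheduler, and `threading.Lock` plays the role of the XLockDisplay/XUnlockDisplay pair. The helper names are hypothetical.

```python
import threading
import time

LINE = ["LINE", "Display1", "0", "0", "100", "100"]
ARC = ["ARC", "Display1", "0", "0", "50", "360"]


def interleave(a, b):
    """Simulate an unlucky scheduler: alternate tokens from two writers."""
    out = []
    for x, y in zip(a, b):
        out += [x, y]
    return out


def send_locked(stream, lock, message, count):
    """Write `count` whole messages, holding the lock for each one."""
    for _ in range(count):
        with lock:  # stands in for XLockDisplay() ... XUnlockDisplay()
            for token in message:
                stream.append(token)
                time.sleep(0)  # invite a thread switch in mid-message


def split_messages(stream):
    """Cut the stream back into 6-token messages, as a server would."""
    return [stream[i:i + 6] for i in range(0, len(stream), 6)]


# Without a lock: tokens from both messages end up mixed together.
garbled = interleave(LINE, ARC)

# With a lock: two real threads, but every message arrives intact.
stream = []
lock = threading.Lock()
threads = [
    threading.Thread(target=send_locked, args=(stream, lock, LINE, 50)),
    threading.Thread(target=send_locked, args=(stream, lock, ARC, 50)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
messages = split_messages(stream)

print(garbled[:4])
print(all(m in (LINE, ARC) for m in messages))
```

The order in which LINE and ARC messages appear in the locked stream is still up to the scheduler; the lock only guarantees that each message is written as a unit.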
Now the reason I think there is confusion here is that people are asking "Why Can't X11 be Multi Threaded?". The question is nonsensical. Socket protocols are not threadable. It's impossible to do this. It is very helpful here to understand that X11 is a LOT like other client-server protocols such as HTTP. In fact the analogies with HTTP are strong. HTTP has a client called the "web browser". HTTP has a server such as "Apache". The client opens a TCP/IP socket to the server. Messages begin with the multi-byte string GET /page HTTP/1.0. Optional bytes can follow describing additional HTTP functionality. The only real difference to HTTP is that X11 is PERSISTENT and has SERVER SIDE STATE. There are also minor differences such as the protocol is binary instead of text.
Now you can have a multi-threaded web browser, and you can also have a multi-threaded X11 client. You can have a multi-threaded web server, and you can also have a multi-threaded X11 server. But you can't have a multi-threaded HTTP protocol stream. Similarly you can't have a multi-threaded X11 protocol stream. It doesn't make any sense to even ask for this. As I showed before, it would be like a web browser requiring two URLs from a single server, but generating an HTTP "request" that looked like this.
GET GET /pag/index.htmle.asp HTTP/HTTP1.1.00

The solution is to serialise the HTTP commands in the web browser. The way to do this is with a serialisation library with locking. This is the same approach used by X11 with Xlib, provided by the XLockDisplay and XUnlockDisplay primitives.
You can reasonably argue that X11 wouldn't have this problem if it was a messaging protocol instead of a multi-byte stream protocol. That's the design decision that was made for X11, and I personally think it's a non-issue. There are other issues with the X11 protocol - it's quite heavy, many of the messages are limited or outdated, and some of the server-side state is useless - but the fact that it is a BYTE STREAM protocol instead of a MESSAGING protocol is I think a non-argument. People seem to focus very heavily on it as the "reason that X11 sucks" but I think these people simply haven't investigated how the alternatives work. Eventually everything becomes a byte stream: it's just a design decision as to how early you make the conversion.
a little clarification... by blackcoot · 2003-01-16 16:27 · Score: 3, Informative

i'm working on a real time computer vision system. capture runs in its own thread, firing off imageArrived events which end up being executed in the capture thread (a subtlety of qt that i was unaware of). this imageArrived event gets plugged into whichever listeners are interested, the idea being to allow multiple paths of processing the same image. i attempted to do this using timers, having capture events posted periodically and then occasionally refreshing. this was very unsatisfactory for two reasons: 1) it looks awfully slow even though i know that it's running very quickly underneath and 2) (more importantly) this method will only ever use a single processor. as i expand to do more interesting processing, i'm going to be forced to use multithreading because a single processor will not be able to do it all in the allotted amount of time; multiple processors could, because so much of this processing is easily parallelizable. for a rough guesstimate of how much processing is involved, there are four classifiers per pixel (three based on chrominance, or one based on luminance and two based on chrominance, plus a classifier that combines the output of the three other classifiers), 640x480 pixels, and 30 frames per second, for a grand total of 37 odd million classifications per second. add to this the possibility of yuv to rgb conversion, and contraction/expansion filtering to clean up noise and you can quickly see how the time adds up. this is before i even begin to do anything useful with the images --- this is just removing the boring pixels. so far, this is all integer arithmetic on arrays that are "cache friendly" in their layout. the really cpu intensive stuff comes later (this is all preprocessing in terms of my application)... updating classifier's background models, extending classification to foreground, background, and things that have been merged into the background (i.e. have stayed still long enough, either items removed from the scene, added to the scene or moved within the scene).
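
The classification estimate above can be checked with a few lines of Python. The variable names are just labels for the figures quoted in the post.

```python
classifiers_per_pixel = 4
width, height = 640, 480
frames_per_second = 30

# Classifications needed for one frame, then for one second of video.
per_frame = classifiers_per_pixel * width * height
per_second = per_frame * frames_per_second

print(per_frame)   # 1228800
print(per_second)  # 36864000
```

That is about 36.9 million classifications per second, which matches the "37 odd million" figure.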

for those who were wondering, i did figure it out with a number of critical sections synchronized indirectly on xlib's mutex. performance is actually better threaded than with earlier single threaded prototypes (mostly because i am now able to start processing an image while i start getting the next one). looking back, i realize that most of my frustration is the result of the vast majority of my gui programming experience being done on windows in applications that were fundamentally stupid to thread. as the message that cliff pointed out notes, this particular quirk of x is not very well documented and has the potential to be very counter intuitive to people doing this for the first time.

anyways, thanks a lot for the help. for those who are curious, my goal is to release the source for the framework and sample application within a month.
Multi-threading is GOOD [was Re:What I do] by nellardo · 2003-01-16 18:05 · Score: 3, Informative

There's absolutely no need for a display system like X to allow multiple application threads to concurrently receive events and/or update the application's display.
First, you should restrict such broad generalizations to an X client in order to be even remotely correct. The X Server doesn't know anything about the process or thread structure of its clients. The clients may not even be on the same machine. The X Server gets a socket connection and responds to messages on that socket. If something is sending good messages down that socket, the X Server doesn't care if the something is one process, ten threads, or ten million threads.
Go look at KParts - KDE embedding. One window, one application, as far as the user is concerned, but subwindows are controlled possibly by different processes on different machines.
How do you think window managers work? The X server doesn't know from window managers - the way you prevent multiple window managers is by checking for atoms on the root window. Remember, conceptually any client of a server can do something with any window - you just need a way to get the window ID.
It would just add unnecessary complexity, making it harder to debug and maintain.
Wrong. The X server already handles multiple simultaneous connections. Whether the X client does or not is the client's choice. Lots of clients for other systems handle multiple connections (your web browser, for one).
Also, having a single GUI thread is a good design pattern.
Um, add the hedge for some applications and I'll agree with you. Otherwise you're wedged in a one-track design mind. You're making your problem fit your design, rather than your design fit your problem. There's any number of interactive applications that make lots of logical sense to be multi-threaded. Take your typical movie player with its nifty visual feedback. One thread for putting the frames up. One thread to respond to user actions.
The richer your interaction, the more application semantics are involved, the more likely that arbitrarily splitting the app in half along an arbitrary "UI/App" line is just not going to work. MVC has a similar problem. They're both cookie-cutter designs that pretend that every interactive app is structured the same way at the top level.
All GUI work is handled by one thread, all application logic is (potentially) handled by other threads. The delineation between the application's UI and logic makes it much easier to maintain.
Go read papers on the "eXene" system in the programming language "ML". A pervasively multi-threaded X client library. One widget - one thread. It makes the widget code very easy to understand - you don't have to split your code into a bazillion little callbacks. You don't have to arbitrarily time-slice things that are conceptually continuous. Things like while (buttonIsDown) followTheMouse() work just fine. Ever have to break up a callback into multiple functions, triggered by timers, just so the app didn't appear to "freeze" while you were off doing something time-intensive? Multi-threading interaction can make the code much easier to maintain because you don't have to worry about "starving" parts of the application for events while busy working on others - the thread scheduler handles pre-emption for you.
And eXene isn't some hot new thing. It dates from the early 90's.
Go back even earlier and find some of James "Java" Gosling's earliest work - NeWS. NeWS clients wrote multi-threaded PostScript to draw on the display.
Events, timer callbacks and the like are all just ways of simulating something continuous with discrete code - go look at TBAG from Sun and then the follow-on Fran from Microsoft Research. Forcing discretizations of continuous phenomena into an arbitrary serialization is just a way to kludge around a poor understanding of parallel activity.
So yeah, blackcoot shouldn't complain that X is broken, I'd say whatever app {s}he is writing needs to get fixed.
It isn't the app - it's the libraries the app is trying to use. They're a poor fit to the abstraction blackcoot would like to use.

--
-----
Klactovedestene!
1. Re:Multi-threading is GOOD [was Re:What I do] by mike_sucks · 2003-01-16 19:48 · Score: 3, Informative
  
  First, you should restrict such broad generalizations to an X client in order to be even remotely correct.
  
  Ah, I thought that was exactly what I implied. I was talking about the application, which is an X client. Of course the server needs to be thread and process safe - it does display multiple applications at once.
  
  Wrong.
  
  You're telling me multi-threaded apps and libraries are not harder to write, debug and maintain than single-threaded ones? Sorry: *you're* wrong. Even if you use a language which is designed with threading in mind (which X isn't), it adds a *lot* of complexity.
  
  Also, having a single GUI thread is a good design pattern.
  Um, add the hedge for some applications and I'll agree with you. Otherwise you're wedged in a one-track design mind.
  
  Hey, I said it is a good design pattern. I didn't say it is a golden hammer. Of course you only apply a pattern when it fits.
  
  You're making your problem fit your design, rather than your design fit your problem. There's any number of interactive applications that make lots of logical sense to be multi-threaded.
  
  No, I'm not. Nor am I saying that applications should be single threaded. What gave you that idea?
  
  There's no reason why an application needs to have a multi-threaded GUI. This is not to say that the application should not be multi-threaded; clearly that can be very useful. The reason is that user interaction is effectively serialized. A user rarely, if ever, provides multiple sources of input simultaneously. In cases when they do, it's usually supplementary to the interaction already occurring - a modifier. So there is rarely any need to process user input and update the display in multiple threads because there is only one thing going on at a time.
  Go read papers on the "eXene" system in the programming language "ML". A pervasively multi-threaded X client library. One widget - one thread. It makes the widget code very easy to understand - you don't have to split your code into a bazillion little callbacks.
  
  Damn, that sounds truly awful. What about user events that traverse multiple widgets? How do you synchronise them all? What about the scheduling overhead when someone just drags a mouse over the app's UI and fifty threads are woken up almost simultaneously? Most workstations are still uniprocessor based.
  
  And I don't see how that avoids the bazillion little callbacks issue. You're splitting your code up into a bazillion little threads instead. In any case, the callbacks aren't an issue if you've designed and written your code properly.
  
  Ever have to break up a callback into multiple functions, triggered by timers, just so the app didn't appear to "freeze" while you were off doing something time-intensive?
  
  No, I just use a non-GUI thread to do the intensive work, so that the GUI thread is free to play with the user.
  
  Multi-threading interaction can make the code much easier to maintain because you don't have to worry about "starving" parts of the application for events while busy working on others - the thread scheduler handles pre-emption for you.
  
  Yeah, you "just" need to worry about synchronization, deadlocking, and other concurrency issues instead. Muuuuuuch easier. But what you said above made no sense to me (perhaps I need more coffee) - can you explain this in more detail?
  
  Forcing discretizations of continuous phenomena into an arbitrary serialization is just a way to kludge around a poor understanding of parallel activity.
  
  That would be the case if user interaction was a parallel activity, but unfortunately it is not.
  
  It isn't the app - it's the libraries the app is trying to use. They're a poor fit to the abstraction blackcoot would like to use.
  
  Ah, so you obviously know more about the app than I do, because I don't see any evidence in the article to support that statement.
  
  /mike
  
  --
  -- "So, what's the deal with Auntie Gerschwitz et all?"
2. Re:Multi-threading is GOOD [was Re:What I do] by rpeppe · 2003-01-17 03:53 · Score: 2, Informative
  
  Yeah, you "just" need to worry about synchronization, deadlocking, and other concurrency issues instead. Muuuuuuch easier. But what you said above made no sense to me (perhaps I need some more coffee) - can you explain this in more detail?
  
  It depends on the thread abstractions that are used for synchronisation and thread communication. The most commonly used abstractions today (semaphores, locks, etc) date from the 1970s; there are much better ways to do it!
  
  One way derives from a mathematical notation created by Tony Hoare, called CSP. There is one unit of thread communications and synchronisation, called a channel. It's like a rendezvous point that allows a value to be passed between threads. If one thread tries to send a value on a channel, it will block until another thread tries to read from the channel (also, reading from the channel will block until another thread tries to send on it).
  
  This scheme is incredibly versatile, easy to use and cheap. There are also some tools that can aid in automatic verification of software built in this way. It's true that it's possible to deadlock in concurrent systems, but it's almost always possible to structure the system in such a way that it's deadlock-free by construction. For instance, if my program is structured as a one-way pipeline, it's impossible to deadlock.
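
A rough Python sketch of the rendezvous channel and the one-way pipeline described above. This is not Limbo or any real CSP library: the `Channel` class and the `produce`/`square` stages are hypothetical, built from the standard `queue` and `threading` modules to mimic the blocking behaviour.

```python
import queue
import threading


class Channel:
    """Unbuffered channel: send() returns only after a recv() took the value."""

    def __init__(self):
        self._slot = queue.Queue(maxsize=1)
        self._taken = threading.Semaphore(0)
        self._senders = threading.Lock()

    def send(self, value):
        with self._senders:        # one sender at a time
            self._slot.put(value)
            self._taken.acquire()  # block until a receiver has the value

    def recv(self):
        value = self._slot.get()   # block until a sender offers a value
        self._taken.release()
        return value


DONE = object()  # end-of-stream marker


def produce(out):
    """First pipeline stage: emit 0..4, then the end marker."""
    for n in range(5):
        out.send(n)
    out.send(DONE)


def square(inp, out):
    """Second stage: read from one channel, write squares to the next."""
    for v in iter(inp.recv, DONE):
        out.send(v * v)
    out.send(DONE)


a, b = Channel(), Channel()
threading.Thread(target=produce, args=(a,)).start()
threading.Thread(target=square, args=(a, b)).start()

# The main thread is the final stage of the pipeline.
results = list(iter(b.recv, DONE))
print(results)  # [0, 1, 4, 9, 16]
```

Each stage only ever waits on the stage before it or after it, so data flows in one direction and no stage can end up waiting on itself.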
  
  Concurrency at this level in a GUI application can greatly enhance the simplicity and maintainability of a program. This is because it's generally much easier to write a straightforward piece of imperative code than encode the same thing as a state machine, e.g.
  
  while (buttons != 0) { (buttons, point) = <-mouse; drawat(point) }
  
  (where <- receives from a channel), versus:
  
  callback(buttons, point) { if (state == DRAGGING) { if (buttons != 0) drawat(point); else state = NOTDRAGGING; } }
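
The two styles can be put side by side in Python. This is a sketch rather than a translation: `queue.Queue` stands in for the mouse channel, a list of points stands in for `drawat`, the event data is made up, and both versions are restructured slightly so that they draw exactly the same points.

```python
import queue
import threading

# Made-up mouse events as (buttons, point); buttons == 0 means released.
events = [(1, (0, 0)), (1, (1, 1)), (1, (2, 3)), (0, (2, 3))]


def drag_thread(mouse, drawn):
    """Thread style: block on the channel; the loop itself is the state."""
    buttons, point = mouse.get()
    while buttons != 0:
        drawn.append(point)
        buttons, point = mouse.get()


class DragCallback:
    """Callback style: the same logic with the state stored between calls."""

    def __init__(self):
        self.dragging = True  # assume the drag has already started
        self.drawn = []

    def __call__(self, buttons, point):
        if self.dragging:
            if buttons != 0:
                self.drawn.append(point)
            else:
                self.dragging = False


# Thread version: events arrive over a queue.
mouse = queue.Queue()
thread_drawn = []
t = threading.Thread(target=drag_thread, args=(mouse, thread_drawn))
t.start()
for ev in events:
    mouse.put(ev)
t.join()

# Callback version: an event loop calls back once per event.
cb = DragCallback()
for buttons, point in events:
    cb(buttons, point)

print(thread_drawn == cb.drawn)
```

In the thread version the "am I dragging?" question is answered by where the code is in the loop; in the callback version it has to be stored in a field and checked on every call.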
  
  You say:
  That would be the case if interaction was a parallel activity, but unfortunately it's not.
  
  But it is! Yes, the user themselves only contributes one thread to the activity, but the program itself is often dealing with multiple activities at the same time; for instance updating itself in response to network activities or updating graphics on a time-step basis.
  
  The most important thing it gives you, in my experience, is the sense of control. As a separate thread, you are free to structure your application in a way directly appropriate to the task being solved. In a callback system, you are at the mercy of the caller; you can't just wait for an event, then do the next thing, you have to encode your current state, return, and wait to be called back, whereupon you have to figure out where you just were!
  
  For a language that exemplifies this, see Limbo, the language of choice in the Inferno environment. No problems with thread unsafe graphics there!
3. Re:Multi-threading is GOOD [was Re:What I do] by Alex Belits · 2003-01-17 07:21 · Score: 3, Informative
  
  And how does it help considering that the only usable general-purpose language is C?
  
  Really, threads exist for one reason -- because OS developers write shitty schedulers, and because applications programmers don't understand their own data models and write shitty libraries. Most of the things that "innovative" threads libraries do are various ways to implement serial communications between processes with asynchronous (or synchronous to asynchronous in your example) handling of events/messages. This is what pipes are for -- and yes, they work between threads, too. Just not in Windows.
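
For what the pipe suggestion looks like in practice, here is a small Python sketch of two threads in one process talking over an ordinary pipe. The `producer`/`consumer` names and the "event N" messages are made up for illustration; only `os.pipe`, `os.fdopen`, and `threading` from the standard library are used.

```python
import os
import threading

read_fd, write_fd = os.pipe()
received = []


def producer():
    """Write three newline-terminated messages, then close the write end."""
    with os.fdopen(write_fd, "w") as w:
        for i in range(3):
            w.write(f"event {i}\n")
    # Closing the write end is what signals end-of-stream to the reader.


def consumer():
    """Read messages until the writer closes the pipe."""
    with os.fdopen(read_fd) as r:
        for line in r:
            received.append(line.rstrip("\n"))


threads = [threading.Thread(target=consumer), threading.Thread(target=producer)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(received)  # ['event 0', 'event 1', 'event 2']
```

The pipe is a byte stream, so the newline acts as the message boundary here; the reader sees the messages in the order they were written.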
  
  --
  Contrary to the popular belief, there indeed is no God.