XFree & Rendering
Keith Packard from SuSE (and a member of the XFree core team) is doing A New Rendering System For X. You can check here to see what he has done so far. Looks great! Help is needed, so if you can lend a hand, please do. Keith is going to show some more rendering stuff at LWE.
Now that this is posted to Slashdot, I thought people might be interested to know how this extension relates to Libart, which, after all, shares many of the same goals of providing high performance 2D rendering.
Libart and the Render extension are both fairly low-level mechanisms for 2D rendering. Libart nails down the data structures and buffer layouts (24-bit packed RGB for most work), which has desirable consequences for both performance and simplicity. On the other hand, Render is really optimized for the case where there is hardware support for the primitives.
In many ways, Render is even lower level than Libart. For example, Libart handles vector paths, stroke outlines, miters, line dashing, and so on. Render only handles low-level primitives such as triangles and trapezoids. To render more complex shapes, you have to tessellate them into triangles client-side before sending them to the screen.
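To make "tessellate them into triangles client-side" concrete, here is a minimal sketch of the simplest case: splitting a convex polygon into a triangle fan. The function names are made up for illustration; neither Libart nor Render is being quoted here, and real paths (concave, self-intersecting, stroked) need a much more careful tessellator.

```python
def fan_triangulate(polygon):
    """Split a convex polygon into triangles sharing its first vertex.

    `polygon` is a list of (x, y) vertices in order. Only valid for
    convex polygons; this is an illustrative helper, not a library API.
    """
    if len(polygon) < 3:
        return []
    anchor = polygon[0]
    return [(anchor, polygon[i], polygon[i + 1])
            for i in range(1, len(polygon) - 1)]


def triangle_area(tri):
    """Area of one triangle, used here only to sanity-check the result."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    return abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)) / 2.0


square = [(0, 0), (4, 0), (4, 4), (0, 4)]
triangles = fan_triangulate(square)
```

The triangles cover the same area as the original square, which is the property the server relies on when it only ever sees triangles.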
Neither of these projects currently provides an ideal high-level API for application developers. While it will be possible to write directly to the Render layer, it's probably too much of an impedance mismatch for most applications. Similarly, the current Libart API doesn't provide any way to use the hardware acceleration that Render provides.
Thus, I'm working on a slightly higher level API for Libart, one with basically the same semantics as the existing API, but with the actual representations of data structures abstracted. Thus, a Libart+ "shape" can be a bezier path, a vector path, an SVP (sorted vector path), or a pile of triangles optimized for the Render extension. Similarly, a "picture" can be a packed RGB or RGBA buffer as now, or may be a handle to a pixel buffer actually living on the video card, accessible through the Render extension.
I want this to be a beautiful API for applications to write to. The main advantage is that it can automatically negotiate which functions should be performed server-side and which client-side. For example, no video cards today can actually deal with bezier paths directly. If this ever changes, then Keith can add the relevant stuff to the Render extension, I can add the code to Libart+ to recognize it, and apps will automatically win.
Further, this architecture provides for richer functionality than Render provides, without loss of performance in the common case. Right now, I'm adding the PDF 1.4 blend modes (Multiply, Overlay, Hard Light, Soft Light, etc) to Libart. These blending modes just get implemented in software.
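For readers who haven't met these blend modes, here is a rough sketch of three of them for a single color component in the 0..1 range, following my understanding of the separable blend functions in the PDF 1.4 transparency model (cb is the backdrop, cs is the source). This is not Libart code, the function names are invented for the example, and Soft Light is left out because its formula is more involved.

```python
def multiply(cb, cs):
    """Multiply: the result is never lighter than either input."""
    return cb * cs


def screen(cb, cs):
    """Screen: the complement of multiplying the complements."""
    return cb + cs - cb * cs


def hard_light(cb, cs):
    """Hard Light: multiply or screen, chosen by the source value."""
    if cs <= 0.5:
        return multiply(cb, 2.0 * cs)
    return screen(cb, 2.0 * cs - 1.0)


def overlay(cb, cs):
    """Overlay: Hard Light with backdrop and source swapped."""
    return hard_light(cs, cb)
```

Each of these is a few arithmetic operations per component, which is why doing them in software is reasonable.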
Lastly, I have to say that I think Keith is making exactly the right decisions to keep Render low-level, lean, and simple. I have great confidence that he will ship something soon that provides useful access to hardware acceleration. It is refreshing to see well-engineered stuff coming out of the X world again, after many years of bloated, nearly useless crap from the X consortium and The Open Group. Gambai!
LILO boot: linux init=/usr/bin/emacs
Representing the range of brightness of a screen as 8 bits of intensity is actually very efficient and almost exactly matches human visual capability, if you use an exponential version where brightness(N+1)/brightness(N) is a constant for all 8-bit values N. This means brightness is a function pow(N,G) where G is a constant called "gamma". If you normalize N to be 0-1 then G is about 2.2.
Quite by accident this brightness function is exactly what the cheap CRTs on PCs produce when fed the linear output of the D/A converters. So the hardware is actually perfect.
If the screen has a gamma value of G the correct math for adding images A and B so the result is the sum of brightnesses is pow(pow(A,G)+pow(B,G),1/G). The proper math for multiplying A by a mask M is pow(pow(A,G)*M,1/G).
Fortunately the data is 8 bits and thus both the above functions can be provided by 256x256 lookup tables. You can also use 256-entry tables for the two pow functions. Assuming G==2 may allow a lot of other math tricks as well, and is plenty accurate for vision.
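As a sketch of the lookup-table idea, the two formulas above can be precomputed like this. The table names, the clamping of sums that exceed full brightness, and the choice to express the mask M as an 8-bit value (0 to 255 meaning 0.0 to 1.0) are my assumptions for the example, not anything specified above.

```python
GAMMA = 2.2  # assumed display gamma, as in the discussion above


def to_brightness(v):
    """8-bit pixel value -> brightness in 0..1, i.e. pow(v, G)."""
    return (v / 255.0) ** GAMMA


def to_pixel(b):
    """Brightness -> 8-bit pixel value, i.e. pow(b, 1/G), clamped."""
    b = min(max(b, 0.0), 1.0)
    return int(round(255.0 * b ** (1.0 / GAMMA)))


# ADD_TABLE[a][b] = pow(pow(a, G) + pow(b, G), 1/G)
ADD_TABLE = [[to_pixel(to_brightness(a) + to_brightness(b))
              for b in range(256)] for a in range(256)]

# MASK_TABLE[a][m] = pow(pow(a, G) * M, 1/G), with M = m / 255
MASK_TABLE = [[to_pixel(to_brightness(a) * (m / 255.0))
               for m in range(256)] for a in range(256)]
```

Note that adding a pixel to itself does not double its value: ADD_TABLE[128][128] comes out well below 255, because doubling the brightness is a much smaller step in pixel values.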
The biggest problem with their design is that they specified premultiplied alpha. This prevents the above math from being applied correctly. Gamma also introduces questions about exactly what value you premultiply by.
I would recommend that they reformat their stuff to describe everything in an unpremultiplied area, and that all math is in intensities, not numeric values. The gamma is a constant built into the server and frozen at 2.2 or 2.0.
There can be a "premultiplied" flag in a picture. This should be used to fix the "bad premultiply" that is produced by most (all?) 3D renderers, where the colors are in a gamma space but have been linearly multiplied by the alpha. To unpremultiply these, replace R with A ? pow(pow(R,G)/A, 1/G) : R, do the same for the green and blue channels, leave A alone, and continue with the composite.
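Here is that unpremultiply step written out, reading the formula as pow(pow(R,G)/A, 1/G) with all values normalized to 0..1. The function name and the default gamma of 2.2 are assumptions for the sketch; it simply transcribes the rule given above and makes no claim about what any renderer or server actually does.

```python
def fix_bad_premultiply(r, g, b, a, gamma=2.2):
    """Apply A ? pow(pow(C, G) / A, 1/G) : C to each color channel.

    All inputs are in 0..1. Alpha is returned unchanged, and a fully
    transparent pixel (a == 0) is passed through as-is.
    """
    if a == 0:
        return (r, g, b, a)

    def fix(c):
        return (c ** gamma / a) ** (1.0 / gamma)

    return (fix(r), fix(g), fix(b), a)
```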
I disagree with your desire to keep the pixel values and the raster ops visible.
Yup, I was wrong on that one; raster ops are replaced with compositing operators on color values.
It may be important to support "premultiplied" alpha compositing
Reading Jim Blinn's work along with Porter & Duff convinced me that premultiplied alpha was the "one true representation". You can still use non-premultiplied alphas if you like; the extension can be twisted to make that work, it's just that premultiplied alpha is the "obvious" way to do things.
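To show why premultiplied alpha is attractive, here is a small sketch of the Porter & Duff "over" operator on premultiplied RGBA tuples in the 0..1 range. The helper names are invented for the example and this is not the extension's code; the point is only that every channel, alpha included, gets the same one-line formula.

```python
def premultiply(r, g, b, a):
    """Convert a non-premultiplied color to premultiplied form."""
    return (r * a, g * a, b * a, a)


def over(src, dst):
    """Porter-Duff 'over': result = src + dst * (1 - src_alpha).

    Both arguments are premultiplied (r, g, b, a) tuples.
    """
    k = 1.0 - src[3]
    return tuple(s + d * k for s, d in zip(src, dst))


# Half-transparent red over opaque blue.
result = over(premultiply(1.0, 0.0, 0.0, 0.5), (0.0, 0.0, 1.0, 1.0))
```

With non-premultiplied colors the same operation needs a division by the resulting alpha, which is the kind of special-casing the premultiplied form avoids.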
I think you *should* use IEEE 32-bit numbers and put the "transformation" into the server.
Maybe someday, but I'm sticking with client-side tessellation for now. The number of 2D geometric objects drawn on the screen is minuscule today; maybe when we're drawing thousands it will make sense to move some of that across the wire. The neat part is that we can do that transparently to the application by tessellating in the library for servers which don't support the new stuff.
Having the transformation in the server is necessary for supporting font rasterizers and to take advantage of hardware acceleration for transforming images.
Font rasterization is now the client's problem. The extension does support affine transformations of images using fixed point coordinates; it's hard to imagine what floating point would add to that.
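As an illustration of what an affine transform in fixed point looks like, here is a sketch that assumes a 16.16 format (16 integer bits, 16 fraction bits). The format, the matrix layout, and the helper names are assumptions for the example; the reply above only says "fixed point coordinates".

```python
FRAC_BITS = 16          # assumed 16.16 fixed-point format
ONE = 1 << FRAC_BITS


def to_fixed(x):
    """Float -> fixed point."""
    return int(round(x * ONE))


def to_float(f):
    """Fixed point -> float."""
    return f / ONE


def fixed_mul(a, b):
    """Multiply two fixed-point numbers, keeping the fixed-point scale."""
    return (a * b) >> FRAC_BITS


def transform_point(m, x, y):
    """Apply the affine matrix ((a, b, tx), (c, d, ty)) to point (x, y).

    Every value, matrix entries and coordinates alike, is fixed point.
    """
    (a, b, tx), (c, d, ty) = m
    return (fixed_mul(a, x) + fixed_mul(b, y) + tx,
            fixed_mul(c, x) + fixed_mul(d, y) + ty)


# Scale by 2, then translate by (1, 0.5).
matrix = ((to_fixed(2), 0, to_fixed(1)),
          (0, to_fixed(2), to_fixed(0.5)))
px, py = transform_point(matrix, to_fixed(3), to_fixed(4.25))
```

With 16 fraction bits a coordinate is resolved to 1/65536 of a pixel, which gives a feel for why floating point adds little for on-screen geometry.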
I do think paths should be in the server, because otherwise people will write wrapper libraries that will be slower than doing it in the server.
Given that hardware requires tessellation to triangles, I believe there isn't a significant performance advantage to doing that stage in the server. The key here is getting the huge hardware acceleration speed increase and not worrying too much about the small speedups from shifting work from one general purpose CPU to another.
I do not want the Windoze-style 12-parameter call to specify a font,
By moving the rasterization to the client side, we can experiment with lots of different font naming schemes to see what we like; there is no longer a performance penalty for using non-"native" fonts.
I just wrote my first FreeType application today; that API seems much nicer than either Windows or XLFD.
Again, I STRONGLY disagree with the conventional wisdom that the only fast way to do images is to have the program know the details of the hardware.
With a color-based rendering model, it's now obvious how to transform color data from one format to another. The extension transparently converts image data.
Make a new type of "gc" that includes the damn window!
It's called a Picture.