To bring up another point
by Anonymous Coward · Score: 4
I think Carmack makes an excellent point about X's rendering pipeline. Under the X Window System, the Xclients (any X program) update the display by writing data to the X Display Server via a socket.
While this is great for doing things over a network, it limits performance, especially performance where huge chunks of data must be thrown around (like in higher resolutions/depth) on the local machine.
UNIX (LOCAL) sockets speed this up by about 100% (or more or less, depending on the implementation), but it's still fundamentally flawed.
Each write to the X Server requires a trip through the kernel. This introduces more overhead than it should, especially if you want very high performance.
It's a simple fact. While the X Server remains a separate process that can only be updated via a local socket, performance will be gated.
I'm not an expert on The X Window System, so I really don't know what a viable solution would entail. I'm just throwing this out: What if we were to make an extension to Xlib that would ask the X Server if it would allow the Xclient to mmap part of the framebuffer (its client area) or just mmap the entire thing? All subsequent Xlib calls would write to the framebuffer.
This introduces a whole mess of issues, but we need to do SOMETHING. Anyone want to open up discussion? Or tell me how wrong I am? :)
--Michael Bacarella
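To make the cost being described concrete, here is a rough sketch, not a benchmark and not real Xlib. It assumes a local socket pair can stand in for the client/server link and an anonymous mmap can stand in for a mapped framebuffer. The function names `draw_via_socket` and `draw_via_mmap` are made up for illustration.

```python
import mmap
import socket


def draw_via_socket(pixels: bytes, chunk: int = 4096) -> int:
    """Push pixels through a local socket pair; return the number of send calls."""
    client, server = socket.socketpair()  # stands in for the X client/server link
    received = bytearray()
    calls = 0
    try:
        for offset in range(0, len(pixels), chunk):
            request = pixels[offset:offset + chunk]
            client.sendall(request)  # at least one system call per request
            calls += 1
            # The "server" drains each request so the socket buffer never fills.
            remaining = len(request)
            while remaining:
                data = server.recv(remaining)
                received += data
                remaining -= len(data)
    finally:
        client.close()
        server.close()
    assert bytes(received) == pixels
    return calls


def draw_via_mmap(pixels: bytes) -> mmap.mmap:
    """Copy pixels into a mapped region; no system call per drawing request."""
    framebuffer = mmap.mmap(-1, len(pixels))  # anonymous mapping, NOT a real device
    framebuffer[:] = pixels  # plain memory copy once the mapping exists
    return framebuffer
```

With the default 4096-byte chunk, `draw_via_socket(bytes(65536))` goes through 16 send calls (plus the matching receives), while `draw_via_mmap` pays for the mapping once and then just copies memory.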
Re:To bring up another point
by nathanh · Score: 3
I'm not an expert on The X Window System, so I really don't know what a viable solution would entail. I'm just throwing this out: What if we were to make an extension to Xlib that would ask the X Server if it would allow the Xclient to mmap part of the framebuffer (its client area) or just mmap the entire thing? All subsequent Xlib calls would write to the framebuffer.
You're too late, it's already been done. It's called DGA (Direct Graphics Access) and has been part of modern X servers for a while now. With a DGA application you get *equal* performance when compared to FB or SVGALIB.
The obvious examples of well written DGA apps include XMAME and SNES9X.
DGA even works exactly like you suggested. It lets you mmap the framebuffer.
The MIT-SHM solution others have pointed to is simply not good enough. You still get bogged down in the X socket, so there are a couple of wasted copies involved, and performance degrades fairly noticeably. DGA is the better solution.
UNIX (LOCAL) sockets speed this up by about 100% (or more or less, depending on the implementation), but it's still fundamentally flawed.
Please be careful to note that although sockets are "fundamentally flawed" (in that sockets are always going to reduce performance), the concept of X isn't fundamentally flawed. SGI uses mmap'd ring buffers for local clients, avoiding all the issues with system call overhead. SGI manages to retain the benefits of the X abstraction without sacrificing performance. They just used cleverer X-server code.
Remember that all good systems will sacrifice some speed for a good abstraction. Even Linux is guilty of sacrificing that extra 10% performance to keep the nice UNIX abstraction. X is a really nice abstraction, so don't blame it for losing a few percentage points of performance.
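For anyone wondering what an mmap'd ring buffer looks like, here is a minimal sketch. It is not SGI's code. It assumes a single writer and a single reader, uses an anonymous mapping in one process, and skips the memory barriers a real cross-process version would need. The class name `RingBuffer` and the layout (two 64-bit counters followed by the data area) are my own choices.

```python
import mmap
import struct

INDEX = struct.Struct("<Q")            # one 64-bit counter
WRITE_AT, READ_AT, DATA_AT = 0, 8, 16  # hypothetical layout of the mapping


class RingBuffer:
    """Byte ring buffer living entirely inside one mapped region."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.mem = mmap.mmap(-1, DATA_AT + capacity)  # zero-filled mapping

    def _get(self, at: int) -> int:
        return INDEX.unpack_from(self.mem, at)[0]

    def _set(self, at: int, value: int) -> None:
        INDEX.pack_into(self.mem, at, value)

    def write(self, data: bytes) -> int:
        """Copy in as much of data as fits; return the number of bytes taken."""
        written, read = self._get(WRITE_AT), self._get(READ_AT)
        n = min(len(data), self.capacity - (written - read))
        start = written % self.capacity
        first = min(n, self.capacity - start)  # bytes before the wrap point
        self.mem[DATA_AT + start:DATA_AT + start + first] = data[:first]
        if n > first:                          # remainder wraps to the front
            self.mem[DATA_AT:DATA_AT + n - first] = data[first:n]
        self._set(WRITE_AT, written + n)       # writer only touches its own counter
        return n

    def read(self, max_bytes: int) -> bytes:
        """Return up to max_bytes of pending data and mark it consumed."""
        written, read = self._get(WRITE_AT), self._get(READ_AT)
        n = min(max_bytes, written - read)
        start = read % self.capacity
        first = min(n, self.capacity - start)
        out = self.mem[DATA_AT + start:DATA_AT + start + first]
        out += self.mem[DATA_AT:DATA_AT + n - first]
        self._set(READ_AT, read + n)           # reader only touches its own counter
        return out
```

The point is that neither `write` nor `read` enters the kernel once the mapping is set up. The client fills the buffer and the server drains it at its own pace.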
Also, the XFree86 team could really do with a lot more coders. X is easily as complicated as a UNIX kernel, if not more so, but they have a lot fewer people working on XFree86 than work on the Linux kernel! There are a lot of very cool ideas that X can do - stuff invented by SGI - that the XFree86 group would like to do, but without good coders these ideas will never be implemented. If you want a real project to get your teeth into then XFree86 is challenging: drivers aren't the only things XFree86 members work on! And, if you really like 3D stuff and OpenGL, then now is the right time to help work on XFree86!
Why is there a warning that there's an article on slashdot that "has some kind of techie stuff" included in it? I mean, I'm so used to the page that I probably wouldn't notice immediately, but the masthead does still read "News for Nerds", right?
I mean, we have to take the occasional break from making fun of Bill "Richboy" Gates and beating up on Jesse Berst to still do geek-type stuff on occasion, don't we? I'd think so.
Now that I'm done abusing Rob (for now), here's my constructive observation:
I've always argued that games were vital to Linux because of the user base and notoriety that they bring, as well as enhancing my ability to kill time at the office. This article, however, points out another big advantage that I hadn't really considered before: Linux games spur technical enhancements.
I feel stupid for not adding this to my list of "Why I like games" before, but it's up there next to number one right now. Hey, you could even argue that games are what drive us to get better and better computers: I mean, who needs a 450MHz K6-2 to run Word Perfect or that other word processor from those guys in Washington?
----
-- Every year during my review, I just pray the words "slashdot.org" aren't mentioned.
I read all of the posts (so far), and I didn't have a clue what half of them were talking about. It's great. I love it. I spent a couple hours last night surfing around trying to figure this stuff out. Some of the posts are wrong, but they're still good because they got me thinking.
We need more stories like this on /., instead of all the World Dominion fluff and mindless flaming.
TedC
The 3DLabs Permedia2 is a _cheap_ OpenGL board -- performs about like a Voodoo1, and has excellent GL under Win95/98/NT -- fully compliant and everything. A decent choice for a business or low end graphics workstation. I think you can get one for $40, maybe even $20.
3DLabs has a variety of much newer, more powerful boards you can learn about at their website. I'm sure some other people make GL boards -- Gloria comes to mind. All $$$$!
Under Linux you can use GL on your Voodoo1 ($30) or Voodoo2/Banshee ($80), supported thru Mesa (not as fast as it could be, since it's Mesa on top of Glide). Under Windoze, you can use 3Dfx's OpenGL ICD, which is really only a subset of GL implemented for Quake (and therefore not really any good for graphix work), and is also slow since it's going thru M$ function calls (yuck). Many other boards have GL ICDs for Windoze, and nVidia is supposed to be writing one for the TNT for Linux.
Yes, we have done shared memory transport with Accelerated-X for quite a while now. However, for 2D it just doesn't make a lot of sense anymore. If you have a good 2D core implementation you'll already get very close to the maximum without this type of transport (assuming that you use the MIT-SHM extension for images).
For 3D that's a different story. OpenGL does support something called a direct rendering context. This is a GLX context that has semantics that allow libGL.so to be implemented in a way that it talks directly to hardware. In any case I feel that it would be foolish to expose an API that allows talking to hardware to a programmer. It's way too complex and gives too much opportunity to screw things up (not intentionally, but hey, show me a bugfree piece of HW). Having OpenGL there and letting libGL.so do the talking to the hardware makes way more sense.
- Thomas
With all this talk of sockets and network operation, I'd like to remind everyone that that is only now just *one* possibility for accessing the display. Let's not forget that under Linux 2.2 we can access the display directly via /dev/fb* devices.
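Roughly what direct access through a framebuffer device looks like, as a sketch only. The helper names (`pixel_offset`, `put_pixel`, `open_framebuffer`) are made up here, and the geometry (bytes per scanline, bytes per pixel, total size) is passed in by hand as an assumption. A real program would query it from the driver, for example with the FBIOGET_VSCREENINFO ioctl.

```python
import mmap


def pixel_offset(x: int, y: int, line_length: int, bytes_per_pixel: int) -> int:
    """Byte offset of pixel (x, y) in a linear framebuffer."""
    return y * line_length + x * bytes_per_pixel


def put_pixel(fb, x: int, y: int, color: bytes, line_length: int) -> None:
    """Write one pixel; color holds exactly one pixel's worth of bytes."""
    start = pixel_offset(x, y, line_length, len(color))
    fb[start:start + len(color)] = color


def open_framebuffer(path: str, size: int) -> mmap.mmap:
    """Map a framebuffer device such as /dev/fb0 (needs suitable permissions).

    size is assumed to match the device's real geometry.
    """
    with open(path, "r+b") as device:
        return mmap.mmap(device.fileno(), size)
```

For trying the drawing code without touching a real device, an anonymous mapping works as a stand-in: `fb = mmap.mmap(-1, 16 * 4)` gives a pretend 4x4 display at 4 bytes per pixel, and `put_pixel(fb, 1, 2, b"\xff\x00\x00\x00", 16)` writes at byte offset 36.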
Most programs like things such as acceleration from video cards, at which point no sane programmer wants app->video card
With modern video cards, this is completely reasonable. The TNT has channels that allow up to 127 programs to communicate directly with the hardware. IIRC, the privileged program (X server or kernel) creates a channel by specifying a clipping region and a 64k region of memory that the client can mmap(). The TNT (with help from the kernel - it needs an interrupt handler) manages the interaction between all these clients.
There are papers out there that describe how good graphics cards should be designed. A good graphics card allows for app->video with accelerated (2d and 3d) features. It is nice to see that these cards are starting to show up for ~$100.
It's called the MIT shared memory extension. Most programs that do intensive graphics stuff (TV players, the gimp, etc.) already use it. X and the program share a section of memory. The program writes to it, and then notifies X when it's done. Then X simply uses that area of memory for moving things to the card. The kernel isn't really involved at all except for setting up the shared memory area and for the pipe protocol to notify the X server. This also allows the X server to properly do clipping, and things like that.
-- They laughed at Einstein. They laughed at the Wright Brothers. But they also laughed at Bozo the Clown. -- C. Sagan
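The shared-memory flow described above (write pixels into shared memory, send a small notification, let the server copy and clip) can be sketched like this. This is not the real MIT-SHM API, which lives in Xlib (XShmCreateImage, XShmAttach, XShmPutImage) and uses System V shared memory segments. Here an anonymous mmap stands in for the segment, one byte stands in for one pixel, and the `REQUEST` layout and function names are invented for the example.

```python
import mmap
import socket
import struct

REQUEST = struct.Struct("<4I")  # hypothetical request: dst_x, dst_y, width, height


def client_put_image(segment, sock, pixels: bytes, x: int, y: int, w: int, h: int) -> None:
    """Write the image into shared memory, then send only a tiny notification."""
    segment[:len(pixels)] = pixels
    sock.sendall(REQUEST.pack(x, y, w, h))  # 16 bytes, however large the image is


def recv_exact(sock, size: int) -> bytes:
    """Read exactly size bytes from a stream socket."""
    data = b""
    while len(data) < size:
        part = sock.recv(size - len(data))
        if not part:
            raise ConnectionError("peer closed the connection")
        data += part
    return data


def server_put_image(segment, sock, screen: bytearray, screen_w: int, screen_h: int) -> None:
    """Handle one request: copy from shared memory to the screen, clipped to its edges."""
    x, y, w, h = REQUEST.unpack(recv_exact(sock, REQUEST.size))
    visible = max(0, min(w, screen_w - x))   # clip at the right edge
    for row in range(h):
        dst_y = y + row
        if dst_y >= screen_h:                # clip at the bottom edge
            break
        src = row * w
        dst = dst_y * screen_w + x
        screen[dst:dst + visible] = segment[src:src + visible]
```

Only the 16-byte request crosses the socket. The pixel data itself never does, and the server still gets to decide what is actually visible.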
Can be faster in some situations
by JamesHenstridge · Score: 2
If we are talking about a higher level graphics API like OpenGL, going through X (using GLX) can sometimes increase speed.
Consider the situation where the OpenGL driver for a card needs to do a bit of non-trivial preprocessing of the GL commands before sending them off to the graphics card (the extreme of this is software rendering). Now AFAIK GL is not thread safe, so this processing would occur in the same thread of execution as the program.
Now with GLX and a multiprocessor machine, the render preprocessing would occur in a separate process, in effect giving your program a separate render thread that could run on the other processor.
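The shape of that arrangement, as a toy sketch: the application queues commands and carries on, while a separate consumer does the preprocessing. A thread and a queue stand in for the X server process and the GLX connection here, and `preprocess` is a made-up placeholder for the driver's work. In CPython a thread will not actually occupy a second processor for pure-Python work, so this shows only the structure, not the speedup.

```python
import queue
import threading


def preprocess(command: str) -> str:
    """Hypothetical stand-in for the driver's per-command preprocessing."""
    return command.upper()


def render_server(commands: queue.Queue, rendered: list) -> None:
    """Consume commands until the None sentinel arrives."""
    while True:
        command = commands.get()
        if command is None:
            break
        rendered.append(preprocess(command))


def run_frame(frame_commands: list) -> list:
    """Queue a frame's commands for the server and wait for it to finish."""
    commands = queue.Queue()
    rendered = []
    server = threading.Thread(target=render_server, args=(commands, rendered))
    server.start()
    for command in frame_commands:
        commands.put(command)  # returns at once; the application keeps going
    commands.put(None)         # end of frame
    server.join()
    return rendered
```

For example, `run_frame(["glBegin", "glVertex3f", "glEnd"])` hands three commands to the consumer and returns them in the same order after preprocessing.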
As for X connections being slow, the X protocol spec defines how information should be sent across the network. For local connections (e.g. :0.0), the data can be sent using the most efficient method to the X server. If you know of a faster method than unix domain sockets/pipes, the XFree86 team may be interested in your input.
Please look at Precision Insight. This is funded by Red Hat, and gives a similar rendering architecture to SGI's on sensible hardware. This is going into XFree86 4.0, which is rumoured to be coming out in June sometime.
The only programmer writing code specifically for the TNT should be the programmer of the OpenGL library.
What's a good/cheap OpenGL card (AGP)...???
--
Alan L. * Webmaster of www.UnixPower.org
Alanp