Hardware Based XRender Slower than Software Rendering?
Neon Spiral Injector writes "Rasterman of Enlightenment fame has finally updated the news page of his personal site. It seems that the behind the scenes work for E is coming along. He is investigating rendering backends for Evas. The default backend is a software renderer written by Raster. Trying to gain a little more speed he ported it to the XRender extension, only to find that it became 20-50 times slower on his NVidia card. He has placed some sample code on this same news page for people to try, and see if this is also experienced on other setups."
last time I checked all graphix cards need drivers to enable their acceleration.
He didn't really get too far into that, but it would be interesting to see how feasible it is to do all the 2D rendering using OpenGL, encapsulated by some layer, like his Evas.
Has anyone done that? Any interesting results? One would think that there's a lot of potential here...
And what does that have to do with the questions and issues raised in the story?
I have used both ATI and NVIDIA (and 3dfx, and Matrox, but staying relevant). Generally the NVIDIA cards I have owned have been vastly outperformed by the ATI cards right off the bat, without tweakage. (This is under Linux, mind you.) Even with tweakage, in my experience, you rarely get the full potential from your card.
I hate sigs.
Sorry, but that was the funniest thing I've read all month.
Is XRender really accelerated? I thought that most Render operations were still unaccelerated on most video cards, and how and if they could be accelerated was still an open question. Maybe the real problem here is Render's software rendering code?
main(c,r){for(r=32;r;) printf(++c>31?c=!r--,"\n":c<r?" ":~c&r?" `":" #");}
The fact that he's working on it again. A sudden change of heart, apparently. I can remember slashdotters posting hate about this guy because he was claiming the Linux desktop was dead; claims like E having failed and that being his way of taking it.
Happy New Year, it's 1984!
Keith, please explain this! This shouldn't happen.
Also how many more years will it take before Linux can compete with OSX? 5 more years? Maybe 10? We have forever and a day you know.
If you use Linux, please help development of Autopac
graphics cards work quickly because they cut every corner that can possibly be cut. It makes sense that they would run computer software slower.
I'm more interested in using them for specific calculations. Imagine if one of these things was accidentally embued with the ability to factor gigantic numbers. The AGP slot is just an excuse to keep us from beowulfing them over PCI-X
You can't judge a book by the way it wears its hair.
Irix.
IrisGL or OpenGL (I think OpenGL is based on IrisGL, so Irix probably now uses OpenGL) is used extensively in Irix, for both 2D and 3D.
A solution to the problem with music today
This is when I predict Xrender will be complete and Linux will be set to compete with OSX and Windows Longhorn in terms of rendering.
I suggest that Rasterman just forget about Xrender and use directfb or opengl.
If you use Linux, please help development of Autopac
Here is the entry from the driver README:
Following that option, this one is noted:
This is too long. Can someone give me a summary?
It may be big and bloated, but at least it's slow.
I'm an American. I love this country and the freedoms that we used to have.
Is this the same person who some time ago said that: "Windows has won. Face it. The market is not driven by a technically superior kernel, or an OS that avoids its crashes a few times a day. Users don't (mostly) care. They just reboot and get on with it. They want apps. If the apps they want and like aren't there, it's a lose-lose. Windows has the apps. Linux does not. Its life on the desktop is limited to nice areas (video production, though Mac is very strong and with a UNIX core now will probably end up ruling the roost). The only place you are likely to see Linux is the embedded space." Slashdot article is also available here: http://slashdot.org/articles/02/07/20/1342205.shtml?tid=106
What Apps can I not run under Linux?
My browser works, most of my games work, Photoshop works, Microsoft word works,
Do your research, Wine, Transgaming, Crossoveroffice
If you use Linux, please help development of Autopac
"Also how many more years will it take before Linux can compete with OSX? 5 more years? Maybe 10? We have forever and a day you know."
Why? Are you in a rush to be somewhere?
I actually have a mod point and I'm not going to spend it here. I'd give an underrated if it wasn't so long. I know that's the joke and parts _are_ funny, but more shock value than witty commentary, IMO.
Hey and at least this comment will make someone browse at -1.
Acquiescence leads to obliteration
yes, it is the same guy.
just because he thinks market wise linux doesn't have a chance in taking over the desktop doesn't mean that he doesn't want to have a badass wm / suite of libs for himself (and anyone else who wants to use it)
Also, the only reason it's taking so long is that they won't fork. There are millions of developers whom Red Hat, SuSE, Lindows, etc. would love to pay to develop Xrender. You think Keith Packard is the only developer in the world qualified to do this? No, he's not, and neither is Carl Worth, but until there is a fork, everything goes through this core group of developers who decide everything.
It's a management issue more so than lack of developers or lack of money. Believe me, if Transgaming can get money, XFree could get about x10 that amount of money. Mandrake has 15,000 subscribers paying $60 a year or something.
This isn't about money, it's not about lack of programmers, it's about management. The developers argue and fight over stupid stuff on mailing lists, there are only two developers working on Xrender, and these developers seem overworked because they are doing so many other projects.
It's more complicated than it seems.
Xwin is not an official fork, at least I was told that it wasn't a fork; it was more of a threat of a fork. I am wishing and hoping they DO fork and then accept money somehow so we can pay developers to write this very important code.
If you use Linux, please help development of Autopac
There has been some work on using graphics cards for computation. The tough part is figuring out how to rephrase your algorithm in terms of what the GPU can handle. You'd expect matrix math to work out but people have tried to implement more interesting algorithms too. :-)
- Amit
What, so now they've got rendering backends in Evangelions?
Here's the rasterman article at linux and main where he does indeed say what you have posted. Your post sounds like flamebait, but upon reading the links, you're right.
Normally, he would answer some questions or comments posted about something he has written, but he will be out of town for at least a few days.
I highly doubt he meant for this to get wide-spread exposure beyond developers of Enlightenment or X. Since it has, this is a good opportunity. I'll make this clear for anyone that didn't catch it, raster WANTS XRENDER TO BE FASTER! If there is a way to alter configuration or to recode the benchmark to do so, he wants to know about it.
Rather than posting questions about his configuration (which he can't answer right now), grab the benchmarks that he put up and get better results.
Now back to your regularly scheduled trolling...
Dude, you have read some P.J. O'Rourke I see!
Nice mixing of his themes.
HBI's Law: Frequency of calling others Nazis is directly correlated with the likelihood of the accuser being Communist.
This is fuckin' hilarious.
There's an example from back in the 80's that still probably serves as a good engineering reference for people working on hardware/software driver issues.
In those days of yore (only in the computer industry can one refer to something 20 years ago as "yore"...) there was the Commodore 64. It retains its place as a pioneering home computer in that it offered very good (for the time) graphics and sound capability, and an amazing 64K of RAM, in an inexpensive unit. But then came its bastard son...
The 1541 floppy disk drive. It became the storage option for a home user once they became infuriated enough with the capabilities of cassette-tape backup to pony up for storage on a real medium. Unfortunately, the 1541 was slow. Unbelievably slow. Slow enough to think, just maybe, there were little dwarven people in your serial interface cable running your bits back and forth by hand.
Now, a very unique attribute of the 1541 drive was that it had its own 6502 processor and firmware. Plausibly, having in effect a "disk-drive-coprocessor" would accelerate your data transfer. It did not. Not remotely. Running through a disassembly of the 6502 firmware revealed endless, meandering code to provide what would appear, on the surface, to be a pretty straightforward piece of functionality: send data bits over the data pin and handshake it over the handshake signal pin.
As the market forces of installed base and demand for faster speed imposed themselves, solutions to the 1541 speed problem were found by third party companies. Software was released which performed such functions as loading from disk and backing up floppies at speeds that were many, many times faster than the 1541's base hardware and firmware could offer.
The top of this particular speed-enhancement heap was a nice strategy involving utilizing both the Commodore 64's and the 1541's processors, and the serial connection, optimally. Literally optimally. Assembly routines were written to run on both the 64 and the 1541 side to exactly synchronize the sending and receiving of bits on a clock-cycle by clock-cycle basis. Taking advantage of the fact both 6502's were running at 1 MHz, the 1541's code would start blasting the data across the serial line to the corresponding 64 code, which would pull it off the serial bus within a 3-clock-cycle window (you could not write the two routines to be any more in sync than a couple 6502 instructions). This method used no handshaking whatsoever for large blocks of data being sent from the drive to the computer, and so, in an added speed coup, the handshaking line was also used for data, doubling the effective speed.
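To make the "handshake line doubles as a data line" trick concrete, here is a rough sketch in Python rather than 6502 assembly. It only models the bit packing: each time slot carries two bits, one on each line, so a byte needs four slots instead of eight. The bit order, the choice of which line carries which bit, and the function names are all assumptions made for illustration; the cycle-exact timing that made the real fast loaders work is not modeled at all.

```python
def split_byte(byte):
    """Split one byte into four 2-bit pairs, most significant pair first."""
    return [(byte >> shift) & 0b11 for shift in (6, 4, 2, 0)]


def send(data):
    """Drive side: yield one (data_line, handshake_line) bit pair per time slot."""
    for byte in data:
        for pair in split_byte(byte):
            yield (pair >> 1) & 1, pair & 1


def receive(slots):
    """Computer side: sample both lines each slot and rebuild the bytes."""
    out = bytearray()
    current = 0
    count = 0
    for hi, lo in slots:
        current = (current << 2) | (hi << 1) | lo
        count += 1
        if count == 4:  # four 2-bit slots make one byte
            out.append(current)
            current, count = 0, 0
    return bytes(out)


payload = b"LOAD"
slots = list(send(payload))
assert receive(slots) == payload
print(len(slots), "slots for", len(payload), "bytes")
```

With one data line the same payload would need eight slots per byte, which is where the doubling comes from.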
The 1541 still seems pertinent as an example of a computer function that one would probably think would best be done primarily on a software level (running on the Commodore 64), but was engineered instead to utilize a more-hardware approach (on the 1541), only to be rescued by better software to utilize the hardware (on both).
There are probably still a few design lessons from the "ancient" 1541, for both the hardware and the software guys.
~ Whence do you come, slayer of men, or where are you going, conqueror of space?
The numbers being reported for this benchmark are at best questionable--yeah, like that's new. The imlib image is composed off-screen and then rendered at the last moment to the display. The Xrender, non-off screen, version has the penalty of having to upgrade the physical display so frequently. If you make imlib2 render the image to the screen *every* draw, you end up getting results very similar to the Xrender on-screen display. Now, the fact that the Xrender off-screen display is so poor *is* a concern.
As far as I know, only the Matrox G400 card has good hardware render acceleration. NVidia's support is still experimental and rather poor. Render is still considered experimental, and speed is not yet considered to be very important. Full accelerated support is planned for XFree86 5.
a "poor" "server-only OS"... unlike IIS, which is clearly the bees testicles?
riiiiight.
I agree that X is slow, sucks etc etc, but fuck man - get your arguments sorted out. say that theres no apps, blahblahblah, thats more convincing than the shit you just spouted about it being a poor server OS.
I worked on 2D & 3D libs a while back for a graphics company. Among the biggest problems at the time was that each different output device had its own feature set, implemented slightly differently. Every designer had their own ideas of what would be 'cool' in their graphics engine, which tended to follow the latest progress in the field.
General purpose graphics libraries such as ours ended up spending more time dealing with the cool features than the features saved. For example, if a plotter had a 2D perspective transform built in, was it better to do the 3D projection ourselves and just feed it untransformed vectors, or map the 3D in such a way as to allow the 2D processing of the plotter to help out? This might require pre-computing sample data.
Also, since the plotter had 2D transforms we had to do a lot more work, including reading the plotter's status and inverting the plotter's transform matrix to make sure that the resulting output didn't end up outside the plotter's viewport.
A code analysis found that over 90% of the code and 90% of the processing time was spent preventing and dealing with input errors and handling compatibility issues.
Nowadays, it's harder in many ways with a wide variety of hardware based texturing and other rendering - do we do the lighting model ourselves, or let the HW do it? It may depend on whether we're going for speed and 'looks' or photometric correctness.
It's easier to be a result of the past, but more fun to be a cause of the future! http://www.spacefinancegroup.com/
Anybody else read that as "XBender"?
--
"Outlook not so good." That magic 8-ball knows everything! I'll ask about Exchange Server next.
You're a fucking asshole.
When was the last time you used Linux? In 2000? It's changed A LOT since 2000.
If you use Linux, please help development of Autopac
and I noticed something strange. For those of you who can't or won't try Rasterman's benchmark yourself, the program runs six different tests, each of which uses a different scaling technique. Each of the six tests is run on the three different test platforms: XRender onscreen, XRender offscreen, and Imlib2. Imlib2 is also written by Rasterman, and is part of Enlightenment.
Here are the test scores from one of the rounds -
*** ROUND 3 ***
Test: Test Xrender doing 2* smooth scaled Over blends
Time: 196.868 sec.
Test: Test Xrender (offscreen) doing 2* smooth scaled Over blends
Time: 196.347 sec.
Test: Test Imlib2 doing 2* smooth scaled Over blends
Time: 6.434 sec.
Now for the strange thing. For the first platform, I watched as the program drew the enlightenment logo thousands of times in the test window, as you would expect. For the second test, it took about the same amount of time, but drew offscreen, again, as the test's name would indicate. However, for the imlib2 test, it also didn't draw anything in the test window.
I got the impression (perhaps wrongly?) that Imlib2 would actually draw to the screen as well. Since it doesn't change the screen, I have no way of telling if imlib2 is doing any drawing at all.
So, I'm digging into the benchmark's code... I'll let you guys know what I find.
from benchmarks? In case you haven't been keeping up with the latest news, nvidia have been shown to be uber cheaters going back a long way.
This reminds me of the experience of WindowFX, a 3d transparency/animation tool made by Stardock. They included hardware 'acceleration' as a settable option, but for most cards it was anything but an option, ran at 1fps.
The exception being the G400, then the Radeon, and only very recently (on Windows) the GeForce. It's entirely an issue of how well the drivers are implemented, and since many of these 2d acceleration functions aren't widely used they're often overlooked in favor of the (traditionally) common case. I'd expect that NVidia hasn't been lobbied so heavily to make its Linux drivers support these functions, as it took months for Stardock to lobby them to alter the Windows drivers to do the same.
I guess the answer is that I'm not surprised to hear something like this, and there is hope even if it's small hope that it will get better.
Careful design of both the pixel and vertex programs, and intelligent scene decomposition, i.e. look at what's presently on your screen and ask: "What's the minimum I can send to get this?" Same with changes (caching can help here).
Also you can get the 2D part to help out. I'm currently researching what other parts (if any) I can use to help out in this task.
A lot of people are questioning the results claimed by Rasterman; however try downloading the thing and running it for yourself. I see the same trend that Rasterman claims when I do it.
My system: Athlon 800, nVidia 2-GTS.
Drivers: nVidia driver, 1.0.4363 (Gentoo)
Kernel: 2.4.20-r6 (Gentoo)
X11: XFree86 4.3.0
I've checked and:
The benchmark consists of rendering an alphablended bitmap to the screen repeatedly using Render extension (on- and off-screen) and imlib2. Various scaling modes are also tried.
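For anyone unsure what an "Over" blend actually computes, here is a small sketch of the Porter-Duff Over operator on a single pixel. It assumes premultiplied alpha (which, as far as I know, is what the Render extension works with) and floating-point channels in the range 0.0 to 1.0; the function names are made up for this example and have nothing to do with the benchmark's own code.

```python
def premultiply(r, g, b, a):
    """Convert a straight-alpha colour to premultiplied form."""
    return (r * a, g * a, b * a, a)


def over(src, dst):
    """Porter-Duff Over for premultiplied (r, g, b, a) tuples in 0.0..1.0.

    Each channel is: result = src + dst * (1 - src_alpha).
    """
    inv = 1.0 - src[3]
    return tuple(s + d * inv for s, d in zip(src, dst))


half_red = premultiply(1.0, 0.0, 0.0, 0.5)
opaque_blue = premultiply(0.0, 0.0, 1.0, 1.0)
print(over(half_red, opaque_blue))
```

The benchmark does this per pixel for every blit, and in the scaled rounds it has to resample the source as well, which is where the extra cost comes in.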
When there's no scaling involved, the hardware Render extension wins; it's over twice as fast. That's only the first round of tests though. The rest of the rounds all involve scaling (half- and double-size, various antialiasing modes). For these, imlib2 walks all over the Render extension; we're talking three and a half minutes versus 6 seconds in one of the rounds; the rest are similar.
I'm not posting the exact figures since the benchmark isn't scientific and worrying about exact numbers isn't the point; the trend is undeniable. Things like agpgart versus nVidia's internal AGP driver should not account for the wide gap.
Given that at least one of the rounds in the benchmark shows the Render extension winning, I'm going to take a stab at explaining the results by suggesting that the hardware is probably performing the scaling operations each and every time, while imlib2 caches the results (or something). The results seem to suggest that scaling the thing once and then reverting to non-scaling blitting would improve at least some of the rounds; this is too easy, however, since while it helps the application that knows it's going to repeatedly blit the same scaled bitmap, not all applications know this a priori.
- Andrew
client/server setup is a superior way of designing a windowing environment.
X11 uses unix sockets (or optionally slower, less secure TCP) and shared memory.
Win32 uses shared memory and messaging.
MacOS X... I don't know for certain, I hope it uses Mach kernel messages.
QNX Photon uses qnx kernel messages and shared memory.
The real difference is the layer at which the windowing system exists. In the case of X11, MacOS X and Photon, the windowing system is just another process.
In Win32 it's a kernel thread (as far as I know). But still, you're sending messages from one place to another and constructing windows based on them.
Client/Server is the natural way to build a multi-application graphical environment.
Of course there are "fake" environments which amount to an embedded video driver and some library to draw widgets (most DOS GUI apps are like this).
“Common sense is not so common.” — Voltaire
If Tyranny and Oppression come to this land,
it will be in the guise of fighting a foreign enemy. -James Madison
How this got modded as Interesting and not Flamebait is beyond me. As far as Raster's coding skill goes, Let's see you write something akin to Enlightenment, or edje or evas, or any number of other apps and libraries. Now, in true open source spirit, did you even install his test app and submit results to help make anything better?
He's been working on it constantly. Never stopped.
The problem is in *sending* the graphics commands to the hardware. If you're manually sending quads one at a time, I found that for 16x16 squares on screen, it's faster to do it in software than on a GeForce 2 (that was what I had at the time - this was a few years back). Think about it:
:)
== Hardware ==
Vertex coordinates, texture coordinates and primitive types are DMA'd to the video card. The video card finds the texture and loads all the information into its registers. It then executes triangle setup, then the triangle fill operation - twice (because it's drawing a quad).
== Software ==
Source texture is copied by the CPU to hardware memory, line by line.
Actual peak fill rate in software will be lower than hardware - but if your code is structured correctly (textures in the right format, etc) - there's no setup. The hardware latency loses out to the speed of your CPU's cache - the software copy has the same complexity as making the calls to the graphics card.
The trick is to *batch* your commands. Sending several hundred primitives to the hardware at the same time will blow software away - especially as the area to be filled increases. Well.. most of the time, but it really depends on what you're doing.
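A toy cost model makes the batching argument easier to see. Every number below is invented purely for illustration (none of it was measured on a GeForce 2 or anything else), and the three function names are hypothetical. The only point is the shape of the result: a fixed per-submission cost dominates when small quads are sent one at a time, and nearly vanishes when they are sent together.

```python
# All timings are made-up assumptions, not measurements from any real card.
HW_SETUP_US = 10.0         # assumed fixed cost of one submission to the hardware
HW_FILL_US_PER_PX = 0.001  # assumed hardware fill cost per pixel
SW_FILL_US_PER_PX = 0.01   # assumed software fill cost per pixel (no setup cost)


def hardware_one_at_a_time(n_quads, px_per_quad):
    """Each quad pays the full submission cost."""
    return n_quads * (HW_SETUP_US + HW_FILL_US_PER_PX * px_per_quad)


def hardware_batched(n_quads, px_per_quad):
    """One submission cost is shared by the whole batch."""
    return HW_SETUP_US + n_quads * HW_FILL_US_PER_PX * px_per_quad


def software(n_quads, px_per_quad):
    """Slower per pixel, but nothing to set up."""
    return n_quads * SW_FILL_US_PER_PX * px_per_quad


quads, pixels = 500, 16 * 16
for name, cost in (
    ("hardware, one at a time", hardware_one_at_a_time),
    ("software", software),
    ("hardware, batched", hardware_batched),
):
    print(f"{name}: {cost(quads, pixels):.0f} us")
```

With these assumed constants, software beats unbatched hardware for 16x16 quads, and batched hardware beats both, which matches the ordering described above.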
XRender is a new extension with only a reference implementation in XFree86. The point is to experiment with an API prior to freezing it. I know this may come as news to people who have grown up on Microsoft software, but real software developers first try out various ideas and then later start hacking it for speed. It would be quite surprising, actually, if it were faster than a hand-tuned client-side software implementation.
It will be a while until XRender beats client-side software implementations. Furthermore, you can't just take a client-side renderer and hack in XRender calls and expect it to run fast--code that works efficiently with a client-server window system like X11 needs to be written differently than something that moves around pixels locally.
With a few tweaks it FLIES!
http://osnews.com/story.php?news_id=1905
I ran the benchmark on a VIA MVP3 motherboard with AMD K6-3+ 400 MHz CPU and GeForce2 MX 400 video card. With RenderAccel option enabled, the unscaled test runs two times faster with XRender, but when that option is set to "false" in XF86Config, the results are as follows:
Test: Test Xrender doing non-scaled Over blends
Time: 16.234 sec.
Test: Test Xrender (offscreen) doing non-scaled Over blends
Time: 16.108 sec.
Test: Test Imlib2 doing non-scaled Over blends
Time: 1.932 sec.
That was with hardware acceleration disabled. I was surprised that enabling the hardware acceleration speeds the first test up by that much (up to 32 times actually).
I also confirm the previous posters' findings that Imlib2 tests don't draw anything onscreen.
Last time I checked the word "graphics" lacked an 'x'.
After installing imlib2, and running render_bench's 'make', it gives me the following:
cc -g -I/usr/X11R6/include `imlib2-config --cflags` -c main.c -o main.o
main.c: In function `xrender_surf_new':
main.c:67: `PictStandardARGB32' undeclared (first use in this function)
main.c:67: (Each undeclared identifier is reported only once
main.c:67: for each function it appears in.)
main.c:67: warning: assignment makes pointer from integer without a cast
main.c:69: `PictStandardRGB24' undeclared (first use in this function)
main.c:69: warning: assignment makes pointer from integer without a cast
main.c: In function `xrender_surf_blend':
main.c:153: `XFilters' undeclared (first use in this function)
main.c:153: `flt' undeclared (first use in this function)
main.c:154: `XTransform' undeclared (first use in this function)
main.c:154: parse error before `xf'
main.c:156: `xf' undeclared (first use in this function)
main.c: In function `main_loop':
main.c:439: `XFilters' undeclared (first use in this function)
main.c:439: `flt' undeclared (first use in this function)
make: *** [main.o] Error 1
It seems to do this at the same speed, whether or not I have render acceleration enabled.
"Nine times out of ten, starting a fire is not the best way to solve the problem." - my wife
So what??? You think it is so easy and fast to develop a window manager as complex as E with limited resources and developers, after hours in your free time...?
Somebody should actually say THANKS to Rasterman, because he has done a great job on E and other stuff as well!!!
E was "the thing" which got me into Linux, when I saw it for the first time I thought: "This is damn cool! It's beautiful... ooo... and it talks to me! I wanna try it"
So stop trolling or do better if_u_r_so_fucking_mighty.. damn stupid trolls!
Sounds like he has investments, or some other financial interests, in embedded linux. It really isn't realistic to say desktop linux is over at a time when it's never been so popular. Maybe "desktop linux profits" aren't so hot, but linux wasn't designed to make money with anyways. And maybe "desktop linux as #1 popular desktop" isn't seeming very likely either. But I see no reason why within the next few years we can't get a decent amount of application, game, and hardware support for linux. I'm doing fine right now, and since I never had ties to msoffice, photoshop etc, the native linux solutions currently available are fine for me. It's just people's habits and windows app training that trap most of them. Also, as an nvidia card owner, linux gaming is good enough to fill all my gaming needs. There is a big difference between "windows has won", and "linux has lost".
True genius is grasping a situation like a piece of fruit, and piercing it just right so that it drains dry.
While I know Space cadet is digressing, let me answer your question...
>> And what does that have to do with the questions and issues raised in the story?
Just a shot in the dark, but... maybe the guy Space Cadet is talking about is the same guy who raised the questions and issues?
Hope this helps. (/sarcasm)
Sounds like he has investments, or some other financial interests, in embedded linux.
Indeed.
Maybe "desktop linux profits" aren't so hot, but linux wasn't designed to make money with anyways.
On the contrary, I allege that a lot of money is going to be made on Linux desktops. Support, mass deployment, customization, life cycle extension... so the money will be made in corporate space, and that's the way it should be. That's where the money in Linux servers is made at the moment.
Also, as an nvidia card owner, linux gaming is good enough to fill all my gaming needs.
The thing is, gaming doesn't matter all that much, again because of the corporate focus. I'm looking forward to the time when every OpenGL game will have Linux binaries on the same CDs. But meanwhile, home users can just dual boot to their war3z windows installations.
Save your wrists today - switch to Dvorak
Yup. And of course XRender knows that, and has decided to punish him.
Not only does he want a good wm and some good libraries, he wants those libraries to be portable to embedded devices, especially since that's where he thinks part of the future for Linux is. The drawing library, Evas, has been ported to a number of devices.
// file: mice.h
#include "frickin_lasers.h"
On my linux/itanium system (using NVIDIA's 1.0.4431/IA64 driver), xrender is always much faster than imlib. Here are the results:
Test: Test Xrender doing non-scaled Over blends
Time: 0.085 sec.
Test: Test Xrender (offscreen) doing non-scaled Over blends
Time: 0.095 sec.
Test: Test Imlib2 doing non-scaled Over blends
Time: 4.893 sec.
Test: Test Xrender doing 1/2 scaled Over blends
Time: 0.028 sec.
Test: Test Xrender (offscreen) doing 1/2 scaled Over blends
Time: 0.033 sec.
Test: Test Imlib2 doing 1/2 scaled Over blends
Time: 1.344 sec.
Test: Test Xrender doing 2* smooth scaled Over blends
Time: 0.328 sec.
Test: Test Xrender (offscreen) doing 2* smooth scaled Over blends
Time: 0.370 sec.
Test: Test Imlib2 doing 2* smooth scaled Over blends
Time: 28.058 sec.
Test: Test Xrender doing 2* nearest scaled Over blends
Time: 0.323 sec.
Test: Test Xrender (offscreen) doing 2* nearest scaled Over blends
Time: 0.370 sec.
Test: Test Imlib2 doing 2* nearest scaled Over blends
Time: 20.745 sec.
Test: Test Xrender doing general nearest scaled Over blends
Time: 0.780 sec.
Test: Test Xrender (offscreen) doing general nearest scaled Over blends
Time: 0.855 sec.
Test: Test Imlib2 doing general nearest scaled Over blends
Time: 45.613 sec.
Test: Test Xrender doing general smooth scaled Over blends
Time: 0.611 sec.
Test: Test Xrender (offscreen) doing general smooth scaled Over blends
Time: 0.849 sec.
Test: Test Imlib2 doing general smooth scaled Over blends
Time: 74.811 sec.
And the results were pretty much the same. Using render was orders of magnitude slower on tests 2 - 7. I have a GeForce1 with the 1.0.4349 nvidia driver and haven't had the same trouble others have had with this option on, so I run with this extension on all the time.
t up...
Here are the results for the interested:
Available XRENDER filters:
nearest
bilinear
fast
good
best
Se
*** ROUND 1 ***
Test: Test Xrender doing non-scaled Over blends Time: 0.190 sec.
Test: Test Xrender (offscreen) doing non-scaled Over blends Time: 0.303 sec.
Test: Test Imlib2 doing non-scaled Over blends Time: 0.697 sec.
*** ROUND 2 ***
Test: Test Xrender doing 1/2 scaled Over blends Time: 10.347 sec.
Test: Test Xrender (offscreen) doing 1/2 scaled Over blends Time: 10.231 sec.
Test: Test Imlib2 doing 1/2 scaled Over blends Time: 0.315 sec.
*** ROUND 3 ***
Test: Test Xrender doing 2* smooth scaled Over blends Time: 207.028 sec.
Test: Test Xrender (offscreen) doing 2* smooth scaled Over blends Time: 205.275 sec.
Test: Test Imlib2 doing 2* smooth scaled Over blends Time: 5.695 sec.
*** ROUND 4 ***
Test: Test Xrender doing 2* nearest scaled Over blends Time: 164.460 sec.
Test: Test Xrender (offscreen) doing 2* nearest scaled Over blends Time: 166.281 sec.
Test: Test Imlib2 doing 2* nearest scaled Over blends Time: 4.119 sec.
*** ROUND 6 ***
Test: Test Xrender doing general nearest scaled Over blends Time: 313.187 sec.
Test: Test Xrender (offscreen) doing general nearest scaled Over blends Time: 310.261 sec.
Test: Test Imlib2 doing general nearest scaled Over blends Time: 11.444 sec.
*** ROUND 7 ***
Test: Test Xrender doing general smooth scaled Over blends Time: 477.511 sec.
Test: Test Xrender (offscreen) doing general smooth scaled Over blends Time: 474.695 sec.
Test: Test Imlib2 doing general smooth scaled Over blends Time: 17.290 sec.
(reformatted to get past the lameness filter)
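To put those figures next to the "20-50 times slower" claim in the story, the slowdown per round is just the on-screen Xrender time divided by the Imlib2 time. A quick sketch using only the numbers posted above (the dictionary layout and the `slowdown` helper are mine, not part of the benchmark):

```python
# (on-screen Xrender seconds, Imlib2 seconds), copied from the results above.
results = {
    "round 1, non-scaled": (0.190, 0.697),
    "round 2, 1/2 scaled": (10.347, 0.315),
    "round 3, 2* smooth": (207.028, 5.695),
    "round 4, 2* nearest": (164.460, 4.119),
    "round 6, general nearest": (313.187, 11.444),
    "round 7, general smooth": (477.511, 17.290),
}


def slowdown(xrender_sec, imlib2_sec):
    """How many times longer Xrender took than Imlib2 (below 1.0 means faster)."""
    return xrender_sec / imlib2_sec


for name, (xrender, imlib2) in results.items():
    print(f"{name}: Xrender/Imlib2 = {slowdown(xrender, imlib2):.1f}x")
```

Only the non-scaled round comes out below 1.0; every scaled round lands inside the 20-50x range quoted in the story.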
I'm happy I made the jump, but for most people they're just too scared to try another OS. Windows barely works for them - to try another "hacker" OS is too adventurous/hard. I have a bunch of people on Mozilla now....and they see it's good. Eventually they'll switch after one problem too many in Windows.
..........FULL STOP.
If you're getting unusual AND nice results, we should know WHY! What kernel, drivers (Nvidia?), Linux version, graphics card, processor, etc. Maybe we are all missing something obvious. Tell us, like in a scientific paper, the exact details, so we can VERIFY your results, and maybe YOU could help out the whole community.
..........FULL STOP.
Java 2D (the graphics backend, also used by swing) has been hardware accelerated on windows since ~1.4 by use of directx.
Now it seems that java 1.5 will use opengl on linux to achieve the same (or even better...;) graphics performance as seen on windows.
i.e.: http://www.javagaming.org/cgi-bin/JGNetForums/YaBB.cgi?action=display;board=jogl;num=1060784520;start=0#6
Decided to give it a try for my Matrox G400, but unfortunately, as soon as I ran his program, it died with a memory fault. It was apparently checking which XRENDER filters were available, then promptly died. All output I got was:
Available XRENDER filters:
Memory fault
This was on Debian sid. Anyone else get something similar?
Six sick
He said that over a year ago, however, when desktop Linux wasn't looking so hot. A large part of his point was that the desktop itself would be going away in the future, except as hackers' and enthusiasts' systems. In fact, he went on to state that if this is the case, Linux has a huge advantage over Windows, since Linux is not nearly as tied into the desktop as Windows is, and will have an easier time adapting to such a setting. So he ported his canvas library to run on embedded as well, without axing it for the desktop. Sounds to me like a wholly reasonable thing to do.
Six sick
The drivers from XIG accelerate the Render extension VERY WELL.
All others SUCK, except maybe matrox, but I have none of those to test.
3D means:
2D means:
Although 2D and 3D share some concepts, they are two entirely different things. As today's software requires both windowed graphics and full screen/windowed 3D graphics, graphics cards must have circuits for both 2D and 3D graphics. A 2D graphics hardware implementation is something very trivial and very cheap these days.
Therefore, I find it inappropriate to use 3D graphics for 2D rendering. It will certainly not speed up drawing operations, because 3D requires more steps than 2D, even if the Z coordinate is always 0. Why should Linux use 3D for 2D operations?
-1 Weak joke?
Got time? Spend some of it coding or testing
The 16MB Banshee EvilQueen sitting across the room maps three copies of its 16MB into main RAM (so 48MB total, plus maybe another 4MB for a busy X server); apparently each copy is mapped in a different way optimised for different ops.
Got time? Spend some of it coding or testing
Somebody mentioned below that imlib is probably caching the image, whereas Xrender is doing the transformation every time. So I thought I'd try the same caching approach with Xrender.
The first time the scale test is called, I render the image to an offscreen buffer with the correct transformations set. After that I just XRenderComposite to the screen from the offscreen buffer. The results (NVidia 4496, RenderAccel=true, GeForce2 MX, Athlon XP 1800+) for one test are:
*** ROUND 2 ***
Test: Test Xrender doing 1/2 scaled Over blends - caching implementation
Time: 0.126 sec.
Test: Test Xrender doing 1/2 scaled Over blends - original implementation
Time: 6.993 sec.
Test: Test Imlib2 doing 1/2 scaled Over blends
Time: 0.191 sec.
Which shows Xrender taking two-thirds the time of imlib.
My guess is that imlib is probably caching something. This is supported by the fact that Xrender is faster for the non-scaled composition in the original code.
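For anyone who wants to see the caching idea in isolation, here is a minimal sketch in plain Python rather than Xlib. Everything in it is a stand-in: `scale_half`, `CachedScaler` and `blit` are hypothetical names, a "picture" is just a list of pixel rows, and the dictionary plays the role of the offscreen buffer. It only illustrates "transform once, composite many times"; it is not the code from the benchmark. The last line checks the two-thirds figure against the timings quoted above.

```python
def scale_half(img):
    """Nearest-neighbour 1/2 downscale: keep every second pixel of every second row."""
    return [row[::2] for row in img[::2]]


class CachedScaler:
    """Runs the (expensive) transform once per key and reuses the result afterwards."""

    def __init__(self, scale_fn):
        self.scale_fn = scale_fn
        self.cache = {}   # key -> already-scaled image (stands in for the offscreen buffer)
        self.calls = 0    # how many times the transform actually ran

    def get(self, key, img):
        if key not in self.cache:
            self.calls += 1
            self.cache[key] = self.scale_fn(img)
        return self.cache[key]


def blit(dest, src, x, y):
    """Copy src into dest with its top-left corner at (x, y); no blending, no clipping."""
    for dy, row in enumerate(src):
        dest[y + dy][x:x + len(row)] = row


source = [[r * 8 + c for c in range(8)] for r in range(8)]
screen = [[0] * 16 for _ in range(16)]
scaler = CachedScaler(scale_half)

# Ten "frames": only the first one pays for the scaling.
for i in range(10):
    blit(screen, scaler.get("logo", source), i, i)

scale_calls = scaler.calls

# Cached Xrender time divided by Imlib2 time, using the numbers posted above.
ratio = 0.126 / 0.191
```

With the posted numbers the ratio comes out at roughly 0.66, which matches the "two-thirds" claim.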
Unless I'm mistaken, XRender is utilizing the 2D acceleration features of a graphics card for scaling, alpha blending, anti-aliasing, etc. It's not trying to do 2D graphics over 3D. Although if you think Linux shouldn't be doing that, then you really should look at Microsoft: they're moving to an entirely 3D desktop for Longhorn.
Who's saying they are/did?
BTW, there's also something called the Stencil Buffer, which is not completely accessible in most 2D APIs. It's very useful for your third aspect listed under 2D, "clip output." In that way, using a 3D API would be much more dynamic and faster than using the 2D API.
Basically, the 2D-only APIs are limited because some hardware accelerated features on the card are only available by using a 3D-enabled API, like OpenGL. Why waste some of the abilities of the card?
It just depends on what's most appropriate for the task at hand.
This ends up being even more true if you do any sort of complex compositing (eg: alpha blending, hardware accelerated mpeg / video, OpenGL windows, etc, etc). Enlightenment uses alpha channels, so it would be faster to composite in hardware than software. These sorts of operations are not accelerated at all on the 2d path, and have to be done in software.
Go check out Quartz Extreme at http://www.apple.com/macosx/jaguar/quartzextreme.html
Having used XFree86 and Quartz Extreme on the same graphics hardware, I can tell you there's no comparison. Quartz is much faster and much more capable.
http://slashdot.org/articles/02/07/20/1342205.shtml?tid=106
Happy New Year, it's 1984!
That's right, nobody has written anything akin to Enlightenment or Edje or Evas, and thank god. If Mr. Haitzler's coding skill were as good as it is claimed to be, maybe all those rewrites over the years wouldn't have been necessary. Meanwhile WindowMaker, which is almost as old, sails along with code that's clean as a whistle. Here again we see the responsibility-avoidance mentality of the open source advocate who, in response to criticism, has nothing better to say than "it's your fault for not helping" or "you couldn't do any better anyway."
And NEVER finished.
Config:
IBM T30 Laptop
2.0 GHz
768MB RAM
ATI 7500 M7 Chipset in Video 16MB RAM
XFree86-4.3.0-15.fdr.1 from fedora
1400x1500x16bit screen rez
RedHat 9.0 plus KDE 3.1.3 and other APT'able stuff
kernel-2.4.20-19.9
Available XRENDER filters:
nearest
bilinear
fast
good
best
Set up...
*** ROUND 1 ***
Test: Test Xrender doing non-scaled Over blends
Time: 0.016 sec.
Test: Test Xrender (offscreen) doing non-scaled Over blends
Time: 1.269 sec.
Test: Test Imlib2 doing non-scaled Over blends
Time: 0.515 sec.
*** ROUND 2 ***
Test: Test Xrender doing 1/2 scaled Over blends
Time: 10.245 sec.
Test: Test Xrender (offscreen) doing 1/2 scaled Over blends
Time: 4.779 sec.
Test: Test Imlib2 doing 1/2 scaled Over blends
Time: 0.180 sec.
*** ROUND 3 ***
Test: Test Xrender doing 2* smooth scaled Over blends
Time: 217.934 sec.
Test: Test Xrender (offscreen) doing 2* smooth scaled Over blends
Time: 124.811 sec.
Test: Test Imlib2 doing 2* smooth scaled Over blends
Time: 3.359 sec.
*** ROUND 4 ***
Test: Test Xrender doing 2* nearest scaled Over blends
Time: 163.273 sec.
Test: Test Xrender (offscreen) doing 2* nearest scaled Over blends
Time: 75.666 sec.
Test: Test Imlib2 doing 2* nearest scaled Over blends
Time: 2.147 sec.
*** ROUND 6 ***
Test: Test Xrender doing general nearest scaled Over blends
Time: 282.936 sec.
Test: Test Xrender (offscreen) doing general nearest scaled Over blends
Time: 166.580 sec.
Test: Test Imlib2 doing general nearest scaled Over blends
Time: 4.703 sec.
*** ROUND 7 ***
Test: Test Xrender doing general smooth scaled Over blends
Time: 0.009 sec.
Test: Test Xrender (offscreen) doing general smooth scaled Over blends
Time: 324.960 sec.
Test: Test Imlib2 doing general smooth scaled Over blends
Time: 10.415 sec.
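To make the log above easier to compare at a glance, here is a small sketch that divides the direct Xrender time by the Imlib2 time for each round. The numbers are copied from the log as posted; the dictionary layout and the names `results`, `ratios` and `slower` are just for illustration. The log has no round 5, so none is included, and the figures are taken at face value (including the very small Xrender time in round 7, which the log does not explain).

```python
# Seconds per round, copied from the posted log: direct Xrender vs. Imlib2.
results = {
    1: {"xrender": 0.016, "imlib2": 0.515},
    2: {"xrender": 10.245, "imlib2": 0.180},
    3: {"xrender": 217.934, "imlib2": 3.359},
    4: {"xrender": 163.273, "imlib2": 2.147},
    6: {"xrender": 282.936, "imlib2": 4.703},
    7: {"xrender": 0.009, "imlib2": 10.415},
}

# A ratio above 1 means Xrender took longer than Imlib2 in that round.
ratios = {rnd: t["xrender"] / t["imlib2"] for rnd, t in results.items()}

# Rounds in which Xrender was the slower of the two.
slower = [rnd for rnd, ratio in ratios.items() if ratio > 1.0]

for rnd, ratio in ratios.items():
    print(f"round {rnd}: Xrender/Imlib2 = {ratio:.3f}")
```

On this machine the scaled rounds 2, 3, 4 and 6 all land somewhere between 50 and 80 times slower for Xrender, while the non-scaled round 1 goes the other way.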
"The average user does not judge better by how secure or stable it is, they judge better by how it looks, how smooth and easy to use it is, and how much eye candy it has."
That's funny. A non-average user telling us what average users want. Do you really think that people enjoy virus, trojan, and worm attacks on their machines? Or crackers wandering through their data? Or BSODs on a regular basis? Your opinion is as worthless as a wooden nickel.
Just wondering, because it is not enabled by default. If I'm not entirely wrong, no X driver does hardware acceleration of the Render extension yet, except the binary-only nVidia driver. But since its hardware-accelerated Render extension is experimental, it is not on by default.
" Option "RenderAccel" "boolean"
Enable or disable hardware acceleration of the RENDER
extension. THIS OPTION IS EXPERIMENTAL. ENABLE IT AT YOUR
OWN RISK. There is no correctness test suite for the
RENDER extension so NVIDIA can not verify that RENDER
acceleration works correctly. Default: hardware
acceleration of the RENDER extension is disabled.
"
Well, yes, Windows uses shared memory, but there is more to the beast.
It has a set of high-level widgets which are probably mapped into the driver without going through several layers of libraries and vectorization. It has an optional vectorization/drawing library, and many calls are rerouted into the low-level accelerator functions of the driver. Comparing an XFree server with a WinXP installation on the same machine, Windows gives around 3-4 times the performance of XFree, or around 2-3 times if render acceleration is turned on. Face it: XFree in its current state is rather slow compared to Windows. I think there are many factors to the problem:
a) The network layer, which possibly drags things down
b) I'm not sure how good the internal threading of XFree is, but the fact that you need a separate font server to get threading between the X server and the font handling gives me the impression that this field is definitely a drag
c) Probably the entire X protocol is too bloated and needs a subset of high-level functions
d) The libraries, which are probably a real drag and the biggest cause of the problem
e) The accelerator usage is not really that good, as this article seems to show. I'm not sure how viable this is, but could it be that XRender is the wrong road anyway? I think there should be a clear split between a client install and a server; with a clean client install you can drop the entire network layer and add a clean and mean OpenGL-based extension with a well-redesigned high-level library to hook into.
Context switching is always an operation you want to avoid.
In addition to the OS switching among processes, the OpenGL driver has to perform context switching for the graphic hardware's registers, DMA addresses, etc.
Plus, modern hardware drivers are optimized for games, i.e. full-screen programs. How much context-switching performance do you need? Thus, less development effort, less performance.
The RenderAccel support in the NVIDIA drivers is not complete. That's why it's disabled by default. From the benchmark results, it seems that Render wins when there is no scaling involved, and loses badly when there is scaling involved. This suggests an obvious answer: scaling is not accelerated in the Render implementation on NVIDIA cards! As a result, when scaling is used, the card falls back on a software implementation that is obviously not as good as imlib2's very optimized software implementation. Now, why would NVIDIA bother with Render support if no scaling can be used? Because the only real user of Render at the moment is Xft, which uses it to composite AA text! Of course, the glyph bitmaps are not scaled, because that would reduce sharpness. So NVIDIA released a half-assed Render implementation (they do note, to be fair, that it's incomplete, and disable it by default) so the common case of AA text would get a speed-up!
A deep unwavering belief is a sure sign you're missing something...
Apple's OSX does all rendering through Quartz, (as PDFs) which is accelerated by OpenGL, and called QuartzExtreme.
:-)
That's not accurate. Quartz is really made of two parts: Quartz 2D and the Quartz Compositor.
The Quartz Compositor is responsible for compositing all the layers (desktop, windows, layers inside windows) on-screen. It offers Porter-Duff compositing, which was developed at Pixar more than 15 years ago. See this post from Mike Paquette for details. Mr Paquette is one of the main developers of Quartz. Quartz Extreme is "simply" an OpenGL implementation of Porter-Duff compositing, and modern graphics cards offer the primitives needed to do that very efficiently.
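For readers who have not met Porter-Duff compositing, here is a minimal sketch of its "Over" operator, the same operator the Xrender benchmark in the story exercises. It assumes premultiplied RGBA colours with components in the range 0.0 to 1.0; the function names `premultiply` and `over` are mine, and this is plain per-pixel arithmetic, not how Quartz or any driver actually implements it.

```python
def premultiply(r, g, b, a):
    """Convert straight RGBA into premultiplied RGBA (colour already scaled by alpha)."""
    return (r * a, g * a, b * a, a)


def over(src, dst):
    """Porter-Duff 'Over': result = src + dst * (1 - src_alpha), for every component."""
    src_alpha = src[3]
    return tuple(s + d * (1.0 - src_alpha) for s, d in zip(src, dst))


half_red = premultiply(1.0, 0.0, 0.0, 0.5)      # 50% translucent red
opaque_blue = premultiply(0.0, 0.0, 1.0, 1.0)   # fully opaque blue

# Translucent red layered over opaque blue.
result = over(half_red, opaque_blue)

# An opaque source completely hides whatever is underneath it.
hidden = over(opaque_blue, half_red)
```

Because every output component is one multiply and one add, this maps naturally onto the blending hardware that 3D cards already have, which is the point being made about Quartz Extreme.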
The Quartz 2D layer is what offers drawing primitives following the Postscript drawing model. The same drawing model is used with PDF (no surprise), Java2D and SVG (and Microsoft's GDI+?). This part is not HW accelerated. I am sure Apple is working on it, but it wouldn't surprise me if new HW will be required to make this possible. There is a strong incentive for card manufacturers to offer acceleration, since Longhorn is supposed to use GDI+ extensively. I doubt that such acceleration will fit in the traditional OpenGL/Direct3D rendering pipeline.
The Apple JVM team implemented HW accelerated Java2D drawing in their 1.3.1 JVM. Their 1.4 JVM doesn't offer it (1.4.1 was a massive rewrite for them, 1.3.1 was more of a quick port to OS-X using some of their "old" Carbon code). There were quite a few problems when HW acceleration was used. I hope they can and will wait for a system-wide Quartz-2D HW acceleration; it seems ludicrous to have the JVM team spend resources on an effort that will be wasted once Quartz2D is accelerated.
See Apple Marketing page, another post from Mike Paquette, and the presentation from Apple at SIGGRAPH about Quartz Extreme and OpenGL.
If that post doesn't end up rated +5 informative, I don't know what will!
Debian is so out of date it won't compile any modern software.
The network layer in X11 isn't an issue. By default, applications use Unix domain sockets, which are about as lightweight as Windows messaging.
Well, honestly, the fastest widget library is everyone's least favorite: Xaw. If you want to include all the 3rd party libraries as part of X and somehow blame it on X, then I really have no defense. Of course I could write a cool widget library tomorrow and then you could only complain about how everyone wasn't using my ultra fast/easy library.
In your post, you really seem to be talking about one particular implementation of X11: XFree86.
The X protocol is VERY fast to parse and extremely expandable. The bad part is that it is quite verbose. X protocol does have high-level functions, but not in the same way you think of them. Take the PostScript extensions for example. You can set up the set of vector operations to draw some button in postscript, send it to the server and draw it 50 times over in different places. Of course nobody uses these "new" (3 year old) extensions.
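To illustrate why the protocol is cheap to parse: every core X11 request starts with the same fixed four-byte header (one byte of major opcode, one byte of request-specific data, and a 16-bit length counted in 4-byte units), so a server can dispatch and skip requests without looking any further. The sketch below packs and parses a MapWindow request (core opcode 8). It assumes little-endian byte order, which in real X11 is chosen by the client at connection setup, and the window id is a made-up value; `parse_request_header` is a hypothetical helper, not part of any X library.

```python
import struct


def parse_request_header(buf):
    """Read the fixed 4-byte X11 request header from the start of buf.

    Returns (major_opcode, data_byte, total_length_in_bytes).
    """
    opcode, data, length_units = struct.unpack_from("<BBH", buf, 0)
    return opcode, data, length_units * 4  # the length field counts 4-byte units


# MapWindow: opcode 8, one unused byte, length 2 units, then the 32-bit window id.
# 0x00400001 is a hypothetical window id used only for this example.
request = struct.pack("<BBHI", 8, 0, 2, 0x00400001)

opcode, data, nbytes = parse_request_header(request)
```

The parsed length (8 bytes here) tells the server exactly where the next request begins, which is also what makes the protocol easy to extend with new opcodes.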
Libraries that are a drag? Are you referring to things like Qt, GTK, Motif, etc?
X's acceleration is broken down into a set of primitive operations. X doesn't take advantage of every possible acceleration that your hardware has. It tends to only support things that the X protocol is capable of expressing in a single command. But that's XFree86; there are other X11s that run on other hardware. I assume PC graphics cards' 2D acceleration is designed to be easy to interface with Win32, rather than being highly generalized. I've never written a Win32 driver before, so I don't really know.
“Common sense is not so common.” — Voltaire
And what's wrong with him expressing his opinion? I think he made a good point. Whether you agree or disagree does not disqualify him as a good programmer.
rename the benchmark 3dmark2003.exe
But a GUI is not composed of (overlapping) polygons.
This ends up being even more true if you do any sort of complex compositing (eg: alpha blending,
Alpha blending is a mistake for GUIs.
hardware accelerated mpeg / video
Windows does fine accelerated video, even with cards with 2d functionality only.
openGL windows, etc, etc).
Rendering OpenGL within a Window has nothing to do with the 3d hardware (especially the Window part). You can render 3d using the video card's 3d engine, but the rest of the GUI (which is composed of windows) does fine with 2D.
Enlightenment uses alpha channels, it would be faster to composite in hardware than software.
Translucency is wrong in a GUI. Believe me.
These sorts of operations are not accelerated at all on the 2d path, and have to be done in software.
You only mentioned two: alpha blending, which is irrelevant to GUIs, and hardware MPEG decompression, which has nothing to do with 3d.
Go check out Quartz Extreme at http://www.apple.com/macosx/jaguar/quartzextreme.html
I read it. Plain BS. Here is why:
Quartz uses the integrated OpenGL technology to convert each window into a texture, then sends it to the graphics card to render on screen
And why is that? Just to have drop shadows and translucency? I certainly don't want translucency, as it is a pain in the ass to work with translucent windows... Windows has very nice antialiasing (ClearType)... I don't need anything else, and I certainly don't need to make each window a texture and waste all the memory.
The graphics processor focuses on what it does best -- graphics -- freeing the Power PC chip to do more operations in the same amount of time
I agree that custom chips should handle trivial things, but why 3D? this functionality should have been incorporated in the 2D engine.
The new Quartz delivers device-independent and resolution-independent rendering of anti-aliased text, bitmap images and vector graphics
Windows does that already.
In addition, Quartz can both save and print transparency and the Preview application honors PDF file security. A "Save as PDF" button in the Print dialog streamlines PDF creation
What does a graphic engine have to do with file security ???
Jaguar adds more high-quality fonts to the already exquisite collection, so those who use non-Roman languages such as Arabic, Hebrew and Thai can communicate in their native tongue
Nothing Windows has not had for a long time now.
Basically, you don't seem to know how a GUI works. Since I have made a complete window system, I can tell you that it's a big waste of resources to redraw things into a texture, then use 3D to compose the screen.
Of course, you are a proud Mac owner. I can understand that (I just don't understand why you were modded as 4).
Having used Xfree86 and Quartz extreme on the same graphics hardware, I can tell you there's no comparison. Quartz is much faster and much more capable.
Then blame XFree86; don't pin it on the lack of 3D under Linux. XFree86 does not have font rendering as good as Windows or MacOS, and that is a problem. The other stuff you are mentioning is kids' stuff (alpha blending, although impressive, is like giving candy to children: good for psychological reasons, bad for anything else).
I hope you never need any csenon gas or play the ksylophone
8)
There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
a lot
dontcha know 8)
There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
Although you have this unreasonable user id and I might sound like a troll, I just have to reply to this:
:-)
There is no "way" of writing "proper" code. There is no path leading to good code. Only hunches and guesswork.
You know why?
Because computer programming is not a science. It's not a set of building blocks from which you can construct one uniform way of doing stuff. It's not electronics, it's not house building. It's the alchemist's job: turning junk into gold.
You, on some level, claim to have some sort of key or solution; you are the code alchemist, a guru purporting to have the answer.
There is no answer and there won't be in the near future. As with gold made out of junk: chemists today know how to do it, and it costs about 100 times what the gold produced is worth, but there is a way.
Someday, probably after we are gone for good, someone will find the way, giving us a tool with which good programs will be born. But that day is far, far away in the future, and all we can do now is decide which of the alchemists we want to be: the one giving false hope or the one actually coding.
I hope this wasn't too bad.
I'll give myself +5 insightful for this one
Total disagreement, for the most part.
1) Extensive changes were made to the 'standard' windows-style GL implementation to make this possible. First and foremost, the complete virtualization of VRAM and some rather brainy texture management inside the implementation, to enable windows with backing stores to be easily shifted into/out of VRAM.
2) 2D composition is a *large* part of windowing systems' work. Moving a window becomes moving a textured quad; screen position no longer comes into play, nor does blitting large volumes of data when windows open, close, or otherwise move. Eliminating that first element of composition, at the window level, means that you can eliminate the cost of operations that users *expect* to be responsive.
Regardless of how you feel about *anything* else, these two elements make possible what is a much more fluid windowing environment as a result. The ability to do anything else - whether we're talking about transparency effects, scaling window contents, or general eye-candy - is a usability issue. What this does do is enable a whole new set of possible UI features.
I'm not going to argue with you - you're entitled to your opinion. Perhaps you'll change your mind once Longhorn comes along and you're using it on your desktop; maybe you'll still think it's pointless then. I'll say this though: any user who's ever enabled and disabled Quartz Extreme on their desktop Mac will see the difference - and it doesn't matter whether you turn off transparency and shadowing effects to see that difference. Move that window, and you'll see it.
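A toy model of the "moving a window becomes moving a textured quad" point, for anyone who finds it easier to read as code. Each window's contents are drawn once into a backing store (standing in for a texture), and a move only changes the window's position; the compositor then redraws the screen back to front. The `Window` class and `composite` function are hypothetical, the "screen" is a grid of characters, and nothing here touches OpenGL or claims to mirror Apple's implementation.

```python
class Window:
    """A window whose contents live in a backing store that is rendered only once."""

    def __init__(self, x, y, width, height, fill):
        self.x = x
        self.y = y
        # Stand-in for a texture held in VRAM: one character per texel.
        self.texture = [[fill] * width for _ in range(height)]


def composite(windows, width, height, background="."):
    """Draw every window's texture at its current position, back to front."""
    screen = [[background] * width for _ in range(height)]
    for win in windows:
        for ty, row in enumerate(win.texture):
            for tx, texel in enumerate(row):
                sx, sy = win.x + tx, win.y + ty
                if 0 <= sx < width and 0 <= sy < height:
                    screen[sy][sx] = texel
    return ["".join(row) for row in screen]


a = Window(0, 0, 3, 2, "A")
b = Window(2, 1, 3, 2, "B")   # overlaps the corner of window a

before = composite([a, b], 6, 3)

# "Moving" the window touches two numbers; its contents are not redrawn or copied.
b.x, b.y = 3, 0
after = composite([a, b], 6, 3)

print("\n".join(before))
print()
print("\n".join(after))
```

The application behind window `a` is never asked to repaint the corner that `b` uncovered, which is the saving being described: the contents are already sitting in the backing store.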
-- A mind is a terrible thing.
But it can't be done on X-Windows!!! It's easy to do it on the Mac, because the Mac only has ONE LEVEL of windows, whereas in X-Windows each little rectangle is a Window. Surely you don't expect every little button on the screen to be its own texture. You would need a huge amount of VRAM for that.
That's why I say that it is impractical. A window moving a little bit smoother does not justify the use of 3D, especially under X-Windows.
No - we're talking about compositing here; you can pretty much make every little rectangle a separate bitmap and, quite frankly, get away with it if your GL implementation does the same kind of texture optimization that Apple went through. After all, they're already all little bitmaps, in many cases, being held around in memory and then composited. Smart AGP/VRAM usage means that having to worry about where this stuff is becomes a lot less important.
What you're doing is representing a 3D screen - even if you're removing all of the dimensionality of perspective from that. 3D hardware does this a lot better than just 'average', and Quartz Extreme users are the first to get it; Microsoft users of Longhorn will see something like it next, two generations behind where Mac users will be by the time Longhorn hits their desktops. Building a strong compositing engine using 3D operations is not only efficient, it's practical and effective.
I have no doubt that, even if you're unwilling to come round now to the idea, you'll change your tune in two years - at which point, you'll be there alongside everyone else screaming that Windows has been doing this for years now, so it's time to change XFree86. The GL work *alone* to make this kind of thing efficient and work well isn't insurmountable, but it's a great deal of work. The community can choose to see the benefits that these changes to GL would allow on Linux, and virtualize texture memory on all supported cards, or wait until everyone else's implementations are already doing it.
Someone wanted to know why Apple didn't get it added to 2D hardware accelerators: Simple. Apple, with 3% marketshare, does not call the shots. Microsoft does. Rest assured, now that Microsoft has seen fit to follow Apple into the land of accelerated composition on 3D hardware, we may indeed see some cool stuff come out of hardware; but let's not pretend that anyone other than Microsoft is in a position to do that dictation.
-- A mind is a terrible thing.
i guess that turned out to be a waste of time
Bullshit. OpenGL was originally designed as and still is a client-server model. That is why it has separate client and server state, why it has separate functions for manipulating client state and server state, and why functions to set vertex arrays bind data immediately on function call. That is the point at which the array is sent from the client to the server. That is also why nVidia introduced the extension to specify only a range of an array should be bound. The original thought behind OpenGL was that the graphics engine was the server, and the program was the client. Where these are physically located is irrelevant. The truly high-end monster graphics machines were always tightly-coupled networked clusters with multiple pipes and multiple heads, often in different rooms from where the programmer was working. High-bandwidth connections don't change the fact that they're client-server.
God, I truly despise fuckin' morons spouting bullshit about which they know nothing. And I'm quite curious how you can call 3600 fps anything less than "full speed". Local indeed! I suppose you've never seen a fully-accelerated remote GLX window, either?