Vastly Improved Raspberry Pi Performance With Wayland
New submitter nekohayo writes "While Wayland/Weston 1.1 brought support to the Raspberry Pi merely a month ago, work has recently been done to bring true hardware-accelerated compositing capabilities to the RPi's graphics stack using Weston. The Raspberry Pi foundation has made an announcement about the work that has been done with Collabora to make this happen. X.org/Wayland developer Daniel Stone has written a blog post about this, including a video demonstrating the improved reactivity and performance. Developer Pekka Paalanen also provided additional technical details about the implementation."
Rather than using the OpenGL ES hardware, the new compositor implementation uses the SoC's 2D scaler/compositing hardware which offers "a scaling throughput of 500 megapixels per second and blending throughput of 1 gigapixel per second. It runs independently of the OpenGL ES hardware, so we can continue to render 3D graphics at the full, very fast rate, even while compositing."
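As a rough sanity check on those throughput figures, here is a back-of-the-envelope sketch. It assumes a 1920x1080 display refreshed at 60 Hz and treats every surface as full-screen; neither assumption comes from the announcement, and the function name is made up for illustration.

```python
# Back-of-the-envelope: how many full-screen layers fit in the quoted
# throughput figures? Resolution and refresh rate are assumptions.

BLEND_PIXELS_PER_SEC = 1_000_000_000   # "1 gigapixel per second" (quoted)
SCALE_PIXELS_PER_SEC = 500_000_000     # "500 megapixels per second" (quoted)


def full_screen_layers(throughput, width=1920, height=1080, fps=60):
    """Return how many full-screen layers per frame the throughput covers."""
    pixels_per_layer_per_sec = width * height * fps
    return throughput / pixels_per_layer_per_sec


if __name__ == "__main__":
    print(f"blended layers: {full_screen_layers(BLEND_PIXELS_PER_SEC):.1f}")
    print(f"scaled layers:  {full_screen_layers(SCALE_PIXELS_PER_SEC):.1f}")
```

Under these assumptions the blender covers roughly eight full-screen layers per frame and the scaler roughly four.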
It's all about tradeoffs, and always has been.
Nothing has changed.
Either you write generic support which works everywhere and performs with mediocrity at best (e.g., standard Linux on a desktop), or, you optimize for a particular hardware platform and get more performance.
The thing with the RPi is that it's a low-power machine, so the generic mediocre performance is pretty awful and you need to specifically optimize to make it usable.
Great, more wayland propaganda. As if exploiting certain hardware features has anything to do with Wayland vs X11. Wayland: Breaking decades of backwards compatibility for no good reason.
Exactly. This article boils down to "wayland performance on pi went from suckass to very nice" which is mildly interesting but the implication that wayland rulez and X snoozes because of that is specious. There is no reason X couldn't see the same performance improvement if it too switched drivers.
When information is power, privacy is freedom.
X11: Being needlessly complex with today's use cases for no reason.
If X11 is so good, why isn't Android using it?
I've seen lots of special ports of packages made to take advantage of the RaspberryPi's gpu. X11 is the most conspicuous one left out. I want to hear you give me this reason.
The time when everything needed to be specifically ported to a machine to make it perform bearably or at all. How I missed having stuff not work without that extra length to go to.
On embedded hardware, that time never ended... And the rPi isn't really fast enough that you can just run in all software, or even with just the relatively feeble OpenGL hardware, and pretend.
You show me documentation on how to write an X video driver, and I'll friggin do it. There is no consistency between the drivers at all to even snag one as a "template". There is no article/paper that I could find that says "Here is how you develop an X video driver".
"When life gives you lemons, don't make lemonade. Make life take the lemons back!" -- Cave Johnson
Why doesn't any device that actually requires decent GPU throughput use it, including the Mac, the PS2/3/4, etc?
Why did those developers see fit to NOT use the freely available BSD-style code out there and spend their time writing their own rendering pipelines?
For fun?
I run: Windows, OS X, Linux, FreeBSD. Just because you have a hammer, doesn't mean everything is a nail.
As the video and Daniel's post explain, we don't lose backwards compatibility because we can host legacy X applications in a Wayland window using XWayland. We get all of the benefits of doing top-level composition in hardware, with none of the pain of writing (and maintaining) a hardware-accelerated X driver. Can you explain why anyone starting from a clean slate today would choose to accelerate X itself instead?
Amen. X seems to have the highest complexity to documentation ratio of any major software subsystem I've ever come across.
Things like low level OS frameworks and related drivers, which require low latency, high performance, and sane memory footprints, must be ported to the architecture in a language whose compiler/linker spits out native binaries. No python/java/.NET here, because the lower the hog is in the stack, the greater the impact on latency and performance it has.
Wayland is a perfect example of this as it sits very close to the hardware with a driver between it and each device. This concept will never change because at some point the software must speak to the hardware directly no matter how the hardware is designed. If anything, the decade of sandboxed APIs is a big reason why we need gigabytes of RAM and microwave-clocked CPUs to do basically the same things we were doing with desktops in the 90s with acceptable performance. The current situation on desktops (regardless of OS) is a sloppy waste of cycles that could either go into greater performance or power savings (or both, depending). Clean, efficient code is not, nor should it ever be, passé.
Yup. We know lots of people don't love the shiny (or love the speed more than the shiny), so we'll be providing the ability to turn off fades and scaled window browsing. Disabling fades has the nice side effect of removing 120Mpixels/s of blending, so you can have more windows on the screen before the back of the stack falls back to 30fps (for responsiveness the front of the stack will always run at 60fps regardless of scene complexity).
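A rough model of that trade-off, for the curious. It assumes full-screen 1080p windows, the 1 gigapixel per second blending figure from the summary, and a simple policy where the front `front` layers run at 60fps and the rest at 30fps. It also treats the fade cost as a constant overhead. The real scheduling policy is not described here, so `stack_cost` and `max_layers_at_60` are hypothetical helpers, not anything from the compositor.

```python
FADE_COST = 120_000_000        # pixels/s of blending removed by disabling fades (from the comment)
BLEND_BUDGET = 1_000_000_000   # pixels/s, blending figure quoted in the summary


def stack_cost(layers, front, width=1920, height=1080, fades=True):
    """Blending cost in pixels/s: `front` layers at 60fps, the rest at 30fps."""
    front = min(front, layers)
    per_frame = width * height
    cost = front * per_frame * 60 + (layers - front) * per_frame * 30
    return cost + (FADE_COST if fades else 0)


def max_layers_at_60(fades=True, width=1920, height=1080):
    """Largest all-60fps stack of full-screen layers that fits the budget."""
    budget = BLEND_BUDGET - (FADE_COST if fades else 0)
    return budget // (width * height * 60)


if __name__ == "__main__":
    print("all-60fps layers with fades:   ", max_layers_at_60(fades=True))
    print("all-60fps layers without fades:", max_layers_at_60(fades=False))
```

With these numbers, dropping fades raises the all-60fps limit from 7 full-screen layers to 8, which fits the "more windows before the back of the stack falls back" description above.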
The time when everything needed to be specifically ported to a machine to make it perform bearably or at all. How I missed having stuff not work without that extra length to go to.
On embedded hardware, that time never ended... And the rPi isn't really fast enough that you can just run in all software, or even with just the relatively feeble OpenGL hardware, and pretend.
Not to mention the Pi is only $35 and uses a few watts of power; you can't expect current laptop-class performance for that price.
The OP ignores the fact that incorporating this tech into the major Pi distros and projects is only work for the developers of said projects, not end users.
End users just wait for the next software update, and then they get vastly improved graphics performance.
I fail to see what on earth is wrong with a major advance in performance to a specific piece of hardware.
I just smell the acrid stench of cynicism wafting from the general direction of the OP.
99% of Linux users want desktop performance, not remote desktop performance. Put that legacy remote shit into a module if you want.
Only the State obtains its revenue by coercion. - Murray Rothbard
Especially considering that Pi would be a perfect example of a device that benefits from X11-style remote applications -- being based on a video decoder SoC, it has somewhat nice GPU but tiny CPU.
Contrary to the popular belief, there indeed is no God.
I'm not philosophically against clean fast code, but to your point my desktops are probably 98% CPU idle when doing a normal workload, and only really pick up when playing games, playing Flash, doing a compile, or running a development server and testing. The age of low level fast optimization is all but dead. For a brief time during the smartphone revolution, pathetic CPUs were a bottleneck, but with my N4, nothing I throw at it feels slow or choppy. It has 2GB of RAM IN A PHONE. Sure, limited-spec and fit-for-purpose devices will need fast low level access to optimize, but that takes time, and quite often we're finding that hardware is faster and cheaper than wasting time optimizing for the apex solution.
Take your question again: In 10 years when our entire assortment of devices has as much horsepower as my desktop computer does today, are we really going to need significantly tight processing? I'd say the better long term solution will be making development faster and hopefully more expressive.
Bye!
If you've been using Linux since 1.0 (I have since 1.2) and have never seen any X11 failings, you're either talking out of your arse or are completely blinded by unrelenting fanboy-ism.
I've seen plenty of X11 failings over the years, ranging from inability to change screen resolution on the fly for about the first decade, poor security, crashes in the video driver taking down the OS, various hacks to get things like multi-monitor or 3d support to work, etc.
Yes, some of those things have been "fixed" via various bodges, in much the same way that the average wannabe Nissan Silvia drifter will "fix" crash damage with a drill and some cable-ties.
High latency, low bandwidth, high security risk stuff like network transparency does not belong in the same process as the rendering engine. It certainly doesn't want to be running as root. Especially when the majority of people simply do not use it, and it can easily be retained via a daemon like every other platform uses.
I run: Windows, OS X, Linux, FreeBSD. Just because you have a hammer, doesn't mean everything is a nail.
Just because something is "possible" doesn't mean it is a good idea. The fact that, as per TFA, Wayland got 20% better power consumption BEFORE they took out a lot of unnecessary data copying should be reason enough for Linux people to sit up and take notice.
Mobile devices are the future, and a 20%-plus reduction in power consumption whilst improving performance is nothing to sneeze at.
I run: Windows, OS X, Linux, FreeBSD. Just because you have a hammer, doesn't mean everything is a nail.
I'm... sorry?
You think SysV init scripts are in any way, shape or form moderately acceptable?!
I have a very simple refutation to that -- the collection of run scripts behind this link.
Go ahead -- have a look. Keep in mind that systems using those mostly one-line scripts all provide not just startup/shutdown/status, but also the ability to auto-restart on failure and lack the propensity for race conditions that pidfile-based locking almost universally used by SysV scripts is so very, very prone to.
Holding up SysV init scripts as a thing that doesn't have to be changed... it beggars belief.
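For readers who have not seen that style of supervision, the idea can be sketched in a few lines. This is Python rather than a real run script, and `supervise` is a made-up helper, not part of any init system. The point it illustrates: the supervisor is the direct parent of the service, so it learns about exits by waiting on the child and needs no pidfile. This sketch only restarts on a non-zero exit, which is a simplification.

```python
import subprocess
import sys
import time


def supervise(cmd, max_restarts=3, delay=0.1):
    """Run cmd in the foreground and restart it when it exits non-zero.

    Returns the list of exit codes observed. A real supervisor would loop
    forever; max_restarts only keeps this sketch finite.
    """
    exit_codes = []
    for attempt in range(max_restarts + 1):
        # The child stays in the foreground, so the parent always knows
        # whether it is alive: no pidfile, no stale-lock race.
        result = subprocess.run(cmd)
        exit_codes.append(result.returncode)
        if result.returncode == 0:
            break  # clean exit: do not restart
        if attempt < max_restarts:
            time.sleep(delay)  # brief back-off before restarting
    return exit_codes


if __name__ == "__main__":
    # Hypothetical "service" that always fails, to exercise the restart path.
    failing = [sys.executable, "-c", "raise SystemExit(1)"]
    print(supervise(failing, max_restarts=2))
```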
Yes, and the cleaner API is everything. If backwards compat can be maintained (it is) and the codebase can be a lot cleaner (it is) and perform better (it does) then why are people so anti-X replacement?
Open source is supposed to be a meritocracy, yet with all the weston hate around here you'd certainly not get that impression every time a weston thread pops up.
I run: Windows, OS X, Linux, FreeBSD. Just because you have a hammer, doesn't mean everything is a nail.
Having used PCAnywhere, VNC, X11, ICA, RDP, and PCoIP - X11 rates last in terms of performance. It rates last in terms of features. And before someone says "oh but rootless mode!", RDP and ICA have been able to do that since 1996 or earlier. If X11 is gradually phased out something better will replace it. Like perhaps something developed in the last 15 years.
I run: Windows, OS X, Linux, FreeBSD. Just because you have a hammer, doesn't mean everything is a nail.
X11 on Linux is network-capable but can no longer really be classified as network-transparent. None of the main rendering engines for X11 on Linux are network-transparent.
The talk on the state of X11 and Wayland/Weston given by one of the lead developers is a bit of an eye-opener about just how munged up X11 is at this stage.
My complaint was simpler. Hot swap monitors in 2003.
In 2003 I could unplug a monitor from my PowerBook G4 and plug in a different monitor with a different resolution without causing anything other than window resizing (and even that was done mostly automatically).
I tried that with Linux in 2010, and not only did it crash X11, but the automatic tool that was supposed to handle it wouldn't restart. I didn't want to manually rewrite xorg.conf every time I wanted to plug in a different monitor (something I was doing several times a day).
To this day I miss aspects of transparent network windows. Remote desktop/VNC just are not the same. However, they are fast and stable compared to X over anything but a local 100Mbit LAN.
I truly wish someone would rewrite X from the ground up with some new ideas on how to do the network transparency.
i thought once I was found, but it was only a dream.
Arguably, the fact that this specific hack had to take place is a bad sign, just not one specific to Wayland, or even (particularly) to the rPi.
There are attempts, at least, at standardizing the interfaces for the sorts of features that this Wayland modification used to get better performance on the Pi; but they certainly aren't anywhere near where OpenGL is in terms of adoption, and so the compositing and windowing modification had to be made specifically for the 'DispManX' API used exclusively on these Broadcom VideoCore parts.
At least this isn't a situation where applications have to have much platform-specific knowledge; but it's always nice to keep platform-specifics abstracted in some standard way as low in the pile as you can.
You have no idea what I've done mate. I've also used X11 over 10 megabit ethernet and it's still crap.
I run: Windows, OS X, Linux, FreeBSD. Just because you have a hammer, doesn't mean everything is a nail.
Network transparency, if relevant, will be provided by an additional daemon, where it does not have to run as root and does not become part of the local rendering pipeline. Networks are slow; any minimal theoretical performance impact incurred by moving the networking outside will be imperceptible next to the latency of even 10 gigabit Ethernet.
I run: Windows, OS X, Linux, FreeBSD. Just because you have a hammer, doesn't mean everything is a nail.
The age of low level fast optimization is all but dead.
I keep thinking that, but then keep running into situations where I have to optimize things. My coworker has been optimizing a piece of code for the last two weeks because our customers find it too slow, and this is on a 64-bit i7 with 16 gigs of RAM (some image processing stuff). There will always be things that need optimization.
"First they came for the slanderers and i said nothing."
At least until Moore's Law ends. Dunno when it will happen but assuming the continued survival of the human race there will come a time when our computers are not becoming more powerful with each generation.
For the short and moderate term you can risk relying on Moore's but in the long run all good things come to an end.
Except said 700MHz machine is running a fairly modern and high end GPU.
The processor in it was designed for media tanks and media players - think Roku, WDTV, AppleTV, Popcorn Hour, and other such devices. The CPU load for those things is low (just enough to display a UI and handle streaming the media to the GPU). The GPU is capable of handling decent 3D performance at 1080p resolution as well as video decode and other tasks.
It's why the processor is weak and why XBMC is stuttery, but when running a GPU-accelerated task like video playback, it can play 1080p video.
All these guys have done is exploit the power of the GPU, including the 2D accelerator, to improve performance even more. The less CPU devoted to graphics tasks, the more is left for your tasks that can't use the GPU.
If you think X11 was "recently broken", you're deluded. It's been a steaming turd for a very long time. And whilst you can't polish a turd, you can dump it in sparkly glitter and it will look a little better than before. But it's still a turd.
Just to be pedantic: yes, you can polish a turd; MythBusters busted that one.
https://www.youtube.com/watch?v=yiJ9fy1qSFI
And by the way, X11 is no turd.
---Saying GNOME 3 is better than Windows 8 is not so much a compliment as it is damning with faint praise.
When I used Linux as my desktop (well, laptop) OS on a day to day basis (maybe four or five years ago), the only option to add a second monitor (plug my computer into my TV) using X itself was to restart X. Since that kills all your apps, that is effectively a requirement to reboot your machine just to connect to a TV.
The only way I ever got around those issues was via nVidia's proprietary drivers and control panel. They, at least, could add additional displays without killing everything. Most of the time. Sometimes it just didn't work, or killed X.
I've not used Linux as anything but a headless server OS for several years now, so I can't claim that this situation still persists, but X forced me to deal with headaches that Windows and Mac OS had solved a decade earlier.
Then again, my experience trying to use the latest version of OS X on a macbook pro on a pro-grade video mixer last summer very annoyingly illustrated that nobody has got multi-display anywhere close to perfect yet. I think Windows gets closest (comparing my Macbook to my Windows desktop), but it has its own set of issues. Annoyingly, a lot of these issues are UI-level or configuration-level deficiencies, not technology-level issues like X has (or had, at least).
That's nice. I have remote machines on the end of shitty 512kbit satellite links in Africa. We have enterprise licensing for Windows so the costs aren't that bad. We need some level of Windows infrastructure in place in any case to handle Exchange (PHBs want it) and the various mining industry tools our company uses to get minerals out of the ground.
The point is thus: irrespective of what platform you run, X11 performance when compared even to offerings by Microsoft (RDP) is just blasted into the weeds.
That, my friends, should be a fucking embarrassment. X11 on 10 megabit Ethernet performs worse than RDP over 256kbit frame relay. It's a fucking disaster.
If I want to replace all my end users' desktops with dumb terminals, X11 simply isn't going to cut it.
Now, I'm not saying run Windows everywhere. I'm simply saying that by any metric you care to use, X11's remote performance is simply horrible. If Wayland starts a push to phase it out in favour of something that is actually usable over something slower than an Ethernet LAN, this is (long term) a GOOD thing.
I'll bet you X11 stalwarts are complaining about the need to convert to IPv6 as well? If not, why not?
I run: Windows, OS X, Linux, FreeBSD. Just because you have a hammer, doesn't mean everything is a nail.
Low level optimization is far, far from gone. It's just what you need to optimize that has changed. CPU number crunching is no longer the bottleneck. Memory is, and drawing things on a screen is essentially mostly memory operations. And we have special hardware to handle this for us to try and relieve the bottlenecks, so now the optimization is in how you use that hardware.
If you tried running a non-optimized CPU-only system on modern hardware, you'd go out of your mind because it was so sluggish.
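To put a number on "mostly memory operations", here is a rough estimate of the memory traffic from compositing on the CPU. The resolution, the 32-bit pixels, and the access pattern of one source read, one destination read and one destination write per blended pixel are all illustrative assumptions, not measurements.

```python
def blend_traffic_bytes_per_sec(layers, width=1920, height=1080,
                                bytes_per_pixel=4, fps=60):
    """Estimate memory traffic for alpha-blending `layers` full-screen surfaces.

    Each blended pixel is modelled as one source read, one destination
    read and one destination write (an assumed, simplified pattern).
    """
    accesses_per_pixel = 3
    return layers * width * height * bytes_per_pixel * accesses_per_pixel * fps


if __name__ == "__main__":
    for n in (1, 4, 8):
        gb = blend_traffic_bytes_per_sec(n) / 1e9
        print(f"{n} layers: {gb:.2f} GB/s")
```

Even one full-screen layer comes to about 1.5 GB/s under this model, before the CPU does any other work.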
It's ending right now. Clock speeds stalled years ago. Memory runs at a crawl compared to CPUs. We're just buying time by increasing parallelism now, but Amdahl's law is waiting around the corner to put a stop to that, too.
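For reference, Amdahl's law puts the speedup from n cores at 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel. A small sketch follows; the 90% parallel fraction is an arbitrary example, not a figure from this thread.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Speedup predicted by Amdahl's law for a given parallel fraction."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / cores)


if __name__ == "__main__":
    for cores in (2, 8, 64, 1024):
        print(f"{cores:5d} cores: {amdahl_speedup(0.9, cores):.2f}x")
    # The speedup can never exceed 1 / (1 - p), which is 10x for p = 0.9.
```

However many cores you add, the serial 10% caps the gain at 10x, which is the wall the comment is pointing at.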
The SysV init scripts have one huge advantage though: I can read/debug/understand them and all I need to know for that is a bit of sh(1) and coreutils. I have no use for shaving off 10s from the boot process and I don't start/stop services so fast that I could run into a race condition. I like being able to find out whether the service is today called bind9 vs named or httpd vs apache2 by simple filename completion.
Although your /. id is smaller by 3 orders of magnitude, I'll stick with scripts if you don't mind.
"solved" it. lol. If you have a very, very generous definition of "solved", sure.
I run: Windows, OS X, Linux, FreeBSD. Just because you have a hammer, doesn't mean everything is a nail.
Not at present, but we're expending quite a lot of effort on getting hardware-accelerated Webkit running at the moment; Wayland is a key enabler for this.
If you relocate the graphics rendering to the GPU and make it perform better, then using that system and sending the rendered data to clients means not only that performance improves, but also that the load on the server is reduced.
RDP has been doing "network transparent" viewing for a long time, and it's more than sufficient for all, so if we can improve things using this, we should. No need to run X just because it's X.
that provides acceptable web browsing performance with flat screen, keyboard and mouse will hasten the transition of the desktop market as we know it. The biggest problem with the Raspberry Pi for me has been the slow desktop experience. The impact of using the hardware accelerated graphics to eliminate this issue should not be underestimated.
Greed is the root of all evil.
Clock speed, Moore's law, what have they got to do with computing power?
Wirth's law is your enemy.
Bremermann's limit is waiting for you, Goaway.
Yes, yes it will be important to have optimized software in 10 years. Remember X was entirely usable (OMG I'm getting old) 20 years ago. That 150MHz Alpha workstation in the lab was amazingly fast, but here we are today talking about tweaking software for modern hardware 20 years later. And since Moore's law doesn't really work any more, we can expect little progress in terms of single core performance over the next 10 years. Besides, the big thing now is power efficiency to extend battery life and that requires efficient code as well. Optimization will always be important. Why use twice the core count when you can just write tighter code? That argument isn't going away.
Many of the X developers disagree with you.
Many of the X developers disagree with you.
And many of them agree with me. What's your point?
When information is power, privacy is freedom.