Why OpenSolaris Failed To Build a Community
xtaski writes "Ted Ts'o, one of the earliest Linux developers, points out some serious flaws in OpenSolaris. There is a severe lack of developers, for one. Apparently, after 3 years, the OpenSolaris 'developer community' is still struggling to get the proper tools for developers to develop! Ted also points out some other flaws which make it clear just how disconnected the executives at Sun are from what's really going on in their 'open source communities.' He notes, 'It was never ... Sun's intention to try to promote a kernel engineering community, or at least, it was certainly not a high priority for them to do so.'"
http://tytso.livejournal.com/
I think Sun underestimated the importance of casual users. A lot of times the people choosing an OS for a project (be it enterprise deployment, inclusion with hardware, or just use within IT) go with what they are familiar with and also what their current interests are. When Sun open sourced Solaris, there was a lot of interest from the Linux and BSD communities. A lot of those people decided to download a copy and give it a try. The difficulty these casual users had in grabbing an installable copy and getting it running easily was significant. A lot of people just said, "meh" and moved on. The last time I grabbed a developer preview I still had to fill out a bunch of forms with my personal data, then deal with Sun's "download manager," and then spend significant time getting it to install, even within a VM customized to run OpenSolaris in particular. That is still better than it used to be. To date I only have a success rate of about 50% in getting Solaris to install.
For most people I think it is just too much of a hassle, and all the developer momentum is on Linux. I guess when Sun thinks about open sourcing Solaris, they see it as a way to try to stop their hardware customers from moving away from Sun, which is fine, but it does little to leverage the real benefits of an OSS community the way Linux has for a long time.
Another issue with OpenSolaris for me was the installation. I'm a fairly experienced *nix user with years of SunOS, AIX, Linux, BSD, etc. under my belt, and a fairly competent programmer. I tried quite a few times to install OpenSolaris and there was always some major problem. I never did get a stable system working and finally gave up. That said, this all comes as no surprise to me whatsoever.
"Computers are a lot like Air Conditioners" "They both work great until you start opening Windows"
Linux had a community. It was the Minix community that was starting to have problems with patches, since the base code had a restrictive copyright and thus could not be freely redistributed. Patching patches with still more patches just got out of hand. The GPL that Linux used ended all that and allowed Linux to take off.
You have no idea what's coming with Project Indiana. Installs are vastly faster, and the product is much, much better. Take a look once it's released in May.
Any AHCI-based SATA controller will work. Marvell-based SATA controllers (models 88SX5081, 88SX5080, 88SX5040, 88SX5041, 88SX6081, and 88SX6041), nvidia nforce sata, silicon image 3124. AHCI and Marvell are good choices. There's also a ton of SAS support (e.g. LSI, Adaptec).
Nexenta works fantastically - I love it. I would definitely use it for any storage servers, or high availability servers that do your normal Apache/SQL/P* stack.
:)
However, for desktop and non-standard services, it still sucks. If it's not a web server, and it's not a storage server, don't use Nexenta, use Ubuntu Server. Or Debian if you know what you're doing.
So where does OpenSolaris fit in? It seems to be an OS lacking a niche.
Solaris >=2 is based on SunOS 5, which was derived from closed source System V. SunOS 4 was indeed based on some kind of BSD, but got killed by Solaris 2 a long time ago.
Solaris >=2 still contains plenty of BSD code. Furthermore, System V contains stuff derived from Berkeley as well.
Without BSD, Sun wouldn't even exist.
You know, when Sun started shipping opteron hardware, the sparc stuff didn't vanish into thin air. It's still very much alive and well.
BTW, what white-box linux platform competes with, say, the Sun Fire X4500 + Solaris?
http://www.youtube.com/watch?v=-zQ5RLAyA7w
Do daemons dream of electric sleep()?
"closing of part of mysql's source is just as big an indicator that they're not committed to being open."
You have that story all wrong. Nothing that previously was opensource is closing. MySQL has released open and closed-source products forever. The decision to make a native backup driver and compression/encryption as plugins to the open/public API had nothing to do with Sun's management. That was decided by MySQL prior to the acquisition.
There is 0 change there. It's an indicator of business as usual for MySQL.
I agree with that last bit, the article is way immature and inaccurate.
If you read the comments of one of the blogs cited, you will see OpenSolaris members clearing up the situation and showing how she was a bit hasty in her comments. At least that's my opinion.
Like signing an NDA for an OpenSolaris User Group meeting. Turns out, even by the blogger's own statement, that the NDA was for confidential information in case, for instance, something got left behind in the meeting room. Since OpenSolaris related information is obviously not "confidential" I don't see the big deal. But if she happens to notice a binder labeled "x5500 new petabyte server runs on 2watts and this is how we do it" then that would be covered by the NDA. From the comments it seems that this is not uncommon when using corporate/government facilities, though it is not always the case. Simon Phipps also seems to be eager to get it resolved.
There were some other issues as well. The funniest is that the original governing body that Sun set up to run the project in the period before the community had a chance to elect their own members was mostly composed of non-Sun employees, but when the community actually elected their own board members, it was mostly Sun employees.
The trademark issues might be a different matter and I can't really comment on them but it sounds like something that should be resolved.
I think this guy is just yelling "the sky is falling" because he wants it to.
I have to admit that I feel the same way. Oh sure, there are some nice things (Solaris Volume Manager, once you get the hang of it, is actually not bad though I still have some gripes), but on the whole it ends up feeling like I have to go and reinvent Linux from scratch just to get the system working like I think it should.
;)
Good thing I used to run Gentoo otherwise that kind of thing might actually tick me off.
"Just a fox, a whisper."
Incompetence?
Dtrace?
ZFS?
Zones / Containers?
Ultra SPARC T1, T2, T2+?
They took their source code and chip designs, opened it up with their version of opensource license, while keeping control of what gets put back into the distributions for the OpenSolaris and Solaris projects, and it's working - quite well.
If opensource were all on an even playing field, there would only be one opensource license.
Considering the numerous versions and variations, there are obviously some things that everyone just can't agree on for a licensing model.
Who is general failure, and why is he reading my hard drive?
Linux just works? Yeah, maybe for a small system running a simple app stack.
/dev/sde came from vs /dev/sdf. When you have 20 LUNs mapped to the same host from two different arrays, it's kinda important to know which drives come from which array and what the corresponding LUN numbers are. That said, most Linux admins I've talked to didn't have a clue about what I was talking about since they never had a SAN.
/proc and watch the fun start.
I had to set up an Oracle cluster. Thanks to Oracle's support policies, we could not use Solaris x86. Nor could we use RHEL5 (no Oracle 9i support), so RHEL4.6 it was. Should be easy, right? Well tested "enterprise" class Linux that can do everything the big boys can do.... We took the hardware we were going to use for Solaris and switched it to Linux. A pair of Sun x4600M2's, 128G of RAM, 4 dual-core AMDs. Sun fully supports Linux on this box and Red Hat lists it on their HCL. Should be easy.
The basic OS install was more or less easy, once we battled through the serial port redirection setup (guess most Linux users never used a serial port before. After all, why bother when the box sits under your desk). I still like serial ports over video for one major reason: issue resolution (when bad things happen, having that panic string saved by a console server can really save the day).
Ok, so the system was kickstarted and now it is time to set it up for use as an oracle DB. This is a production system, and we need lots of space (4TB) and High Availability. This means redundant connections to everything, mirroring and clustering.
Issue #1: multipathing drivers for the SAN. With Solaris, you just plug the thing into the SAN and all of the storage that the host has access to just shows up. Multipathing was instant and I didn't have to do jack. I could see what devices mapped to which physical array with a simple command. I didn't have to guess which array
Issue #2: dynamically adding LUNs. With Solaris, you just change the mapping on the array and the host picks it up and auto-creates the dev links. That was easy. On Linux? You've got to be kidding me... You get to echo some crazy strings into several spots in
Issue #3: IP Multipathing. With Solaris, dladm is used to create a bond (if it is going to the same switch and the switch supports bonding), or you use the built-in IP multipathing to do an active/failover setup if you are going to multiple switches. Very well documented and very easy to do. With Linux... yeah, bonding is a fun task. Need to go to multiple switches? No such luck, you are screwed. I eventually used VCS to take over the system's main IP and used its IPMultipathing agent to do the job for me. VCS on Solaris just hands the task off to mpathd since the OS already does it for you.
Issue #4: zones: don't get me started. I don't want to run another entire OS, I just want name space isolation, and chroot is so primitive it is not even funny. Zones give me everything I want with minimal overhead. It would have been nice to have since there are a few Oracle products that don't play nicely with clusters (*Warehouse Builder*) because they embed the host name everywhere. We could put it under Xen, but this is an app that moves huge amounts of data around, not exactly a good candidate for virtualization. Zones let us get around Oracle's brain-dead use of the hostname; no such luck with Linux.
Issue #5: 3rd party drivers vs the new kernel patch. If I install a 3rd party device driver in Solaris and upgrade the kernel, I don't have to rebuild/reinstall the driver. Linux (even Red Hat 4.x with their "back port") forces me to rebuild/reinstall every damn time. It's great if the driver is standard with the kernel, but if you need something outside of that (LSI multipathing drivers to get around #1 and 10G NIC drivers in my case), you are screwed. No wonder up2date ignores all of the kernel* RPMs by default.
Issue #6: What's the system doing? Solaris: `mdb -k` and dtrace. Linux: still trying
It seems Sun is doing everything in its power to alienate a developer community.
-Wouldn't let the OpenSolaris board call the project OpenSolaris. Probably a legal quagmire of their own creation. The consequences of that led to this resignation. http://mail.opensolaris.org/pipermail/ogb-discuss/2008-February/004488.html
-There's this gem, most of which I don't pretend to understand. The punchline is on the bottom. http://cryptnet.net/mirrors/texts/kissedagirl.html
-There's this gem, where even Ian Murdock links in suggesting the difficulty is happening above his level. http://ianskerrett.wordpress.com/2008/02/22/a-solution-for-suns-os-community-problems/#comment-17418
Here goes:
/proc files). Then we tried the mppBusRescan utility that comes with the LSI drivers. None worked because they were only operating at the SCSI level, not the FC port level. The missing magic? We have to first force a LIP on the FC ports by echoing more stuff into more /proc files. `cfgadm -c configure` is so much easier since it does everything in one shot, and that is only needed if you turn off the autoconfigure option.
:-) Talk about two vendors who have not really bothered to advance their OS in the past 10 years; Linux is awesome compared to those two. (Of course, my experience with those two has been helping the poor admins try and do half of what Linux can do.)
1) We did it as part of the finish script in the kickstart script. It was more than just adding it to the grub.conf, you also have to tweak a few other files to actually get the damn thing to start a login prompt on the serial port. Compared to solaris where I just add a console=ttya in the add_install_client script (yeah, we jumpstart everything) as a boot param and the installer takes care of the rest.
2) I ended up loading the LSI mpp drivers. Using a combination of Veritas DMP debug commands and the mppUtil command, we can map which LUN a device comes from. The vx commands give us the UUID of the LUN, and the mppUtil commands tell us what LUN has that UUID. Oh yeah, and we use VxVM for the fast resync capabilities since we run a campus cluster. ZFS offers a similar function of only copying the changed blocks over when (not if) something goes wrong. I hate waiting for a 2TB volume to fully resync, and so do the DBAs who then bitch about the performance hit.
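The two-step lookup described above (one tool gives device to UUID, the other gives UUID to array and LUN) is just a join on the UUID. A minimal Python sketch of that join follows. It does not parse real vx or mppUtil output; the function name, the dictionary layouts, and the sample UUIDs, array names, and LUN numbers are all made up for illustration.

```python
def map_devices_to_luns(device_uuids, lun_by_uuid):
    """Join two lookups on the LUN UUID.

    device_uuids: {device_name: uuid}, as you might collect it from the
                  volume manager side (hypothetical layout).
    lun_by_uuid:  {uuid: (array_name, lun_number)}, as you might collect
                  it from the multipath driver side (hypothetical layout).
    Returns {device_name: (array_name, lun_number)}; the value is None
    when a UUID has no match.
    """
    # Normalize case so "60A9..." and "60a9..." compare equal.
    normalized = {uuid.lower(): where for uuid, where in lun_by_uuid.items()}
    return {
        dev: normalized.get(uuid.lower())
        for dev, uuid in device_uuids.items()
    }


if __name__ == "__main__":
    # Hypothetical sample data, not real command output.
    devices = {"sde": "60A98000AAAA0001", "sdf": "60a98000bbbb0002"}
    luns = {
        "60a98000aaaa0001": ("array_a", 3),
        "60A98000BBBB0002": ("array_b", 7),
    }
    for dev, where in sorted(map_devices_to_luns(devices, luns).items()):
        print(dev, where)
```

The only design choice worth noting is the case normalization, since two tools rarely agree on how to print a UUID.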
3) We first tried the magic that redhat suggested (echo "bla" into a bunch of
3) 802.3ad is bonding, aka EtherChannel in the Cisco world (trunking to everybody else). I mentioned that we are running between two different switches and bonding is not supported in that config. This is the problem with Linux. I say I want to do something, and everybody jumps on the wrong answer. (Yes, Cisco will support it between 6509's running the not yet out Sup 1440's.... like I trust that to work; 10G is just easier for the bandwidth.)
4) linux-vserver... it's close, but zones takes it to the next level. And I hate patching the kernel, since you are just asking for something to not work when you are one of a very small group doing it. If I wanted to do that, I'd work for Red Hat/SUSE/Oracle/whoever. Yeah, it was fun in the early days of Slackware when you had to more or less build a new kernel from source to do anything, but I really don't have time for that anymore.
5) ps and top are good for easy problems. Here is one that those tools didn't spot: I had to figure out what was eating a system alive and sending it into a tailspin. Turns out that Oracle's enterprise agent was spawning thousands of sub jobs that lived and died in under a second. dtrace spotted this in an instant. top and ps will never see it. (Once you know the problem, strace will also spot it, but you first have to know what process is causing the problem.) Yeah, when somebody says the system is 'acting funny', I reach for top, ps, ptree and the like. If nothing shows up in 30 seconds, it's time to dig deeper. That requires tools like truss, snoop, dtrace and mdb (for the really nasty problems). I don't get paid more than everybody else in my group to solve easy problems. I get paid more because I get to solve the nasty ones.
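To illustrate why a periodic sampler like top or ps tends to miss jobs that live and die in under a second, here is a minimal Python sketch. It is a toy simulation, not a measurement of the Oracle agent described above: the function name, the 3-second polling interval, and the 50 ms lifetime are all assumptions picked for illustration.

```python
import math
import random


def fraction_seen_by_polling(lifetimes, poll_interval, horizon, seed=0):
    """Fraction of processes that are alive at one or more poll instants.

    lifetimes:     list of process lifetimes in seconds
    poll_interval: seconds between samples (polls happen at 0, T, 2T, ...)
    horizon:       processes start at a uniformly random time in [0, horizon)
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    seen = 0
    for life in lifetimes:
        start = rng.uniform(0, horizon)
        # First poll instant at or after the process starts.
        next_poll = math.ceil(start / poll_interval) * poll_interval
        if next_poll <= start + life:
            seen += 1
    return seen / len(lifetimes)


if __name__ == "__main__":
    short_jobs = [0.05] * 10_000  # assumed: jobs that live about 50 ms
    long_jobs = [10.0] * 10_000   # assumed: jobs that live 10 s
    print("short jobs seen:", fraction_seen_by_polling(short_jobs, 3.0, 600.0))
    print("long jobs seen: ", fraction_seen_by_polling(long_jobs, 3.0, 600.0))
```

For a job shorter than the polling interval, the chance of being caught by any one sampler is roughly lifetime divided by interval, which is why an event-driven tracer that fires on every process creation sees what a sampler does not.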
As for zfs, the first release was cool, but had a few 'issues' that needed to be sorted out. At least the sales engineers I talked to warned me to wait for an update release or two before I used it in production. That was then, but this is now, and it rocks. We now use it as the base FS for all of our zones and as a failover fs with Sun Cluster. The damn thing just works. I would not put oracle on top of it due to a few minor strange things oracle does, but for everything else it is great.
And don't feel so bad, I think poorly of HPUX and AIX too