Making ZFS and DTrace Work On Ubuntu Linux
New submitter Liberum Vir writes "Many of the people that I talk with who use Solaris-like systems mention ZFS and DTrace as the reasons they simply cannot move to Linux. So, I set out to discover how to make these two technologies work on the latest LTS release of Ubuntu. It turned out to be much easier than I expected. The ports of these technologies have come a long way. If you or someone you know is addicted to a Solaris-like system because of ZFS and DTrace, please, inquire within."
So what am I supposed to do about all the kernel panics and absurdly slow IO and transfer speeds?
The issue with ZFS and Linux has always been more about copyright than implementation.
I just looked at this article as my employer uses Debian and Ubuntu heavily and I've been pushing for ZFS on our file servers. There is no mention of ZFS version, the feature set available, or even a link to the source material.
There isn't much mention of how to use ZFS. I happen to know most commands, but I think this article would be difficult for a beginner even though it seems to be targeted at that demographic.
MidnightBSD: The BSD for Everyone
Just use Solaris? ZFS, DTrace, SMF, and about 100 other things you don't have on Linux (I've been a Solaris and Linux user since about 1994).
So there's a list of 10 steps to install zfs and that's it? Didn't do anything? zfs/zpool upgrade -v? zvols? zfs send/receive? snapshots? rollback? Scrub? Performance tests? Compression? Encryption? Can I export my pool from my Solaris 11 SPARC system and import it into linux, make some changes and then move it? L2ARC support? Separate ZIL support? Case sensitivity?
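For anyone wanting to go past "create a zpool", here is a rough sketch of the kind of smoke test the questions above point at. It only prints the plan rather than running it; the pool name `tank` and the `smoketest` dataset names are made-up placeholders, and whether every step works on a given Linux port is exactly what you would be finding out.

```shell
#!/bin/sh
# Dry-run sketch: prints a ZFS smoke-test plan instead of executing it.
# "tank", "smoketest" and "smoketest-copy" are hypothetical names.
POOL="${POOL:-tank}"

zfs_smoke_plan() {
    pool="$1"
    fs="$pool/smoketest"
    cat <<EOF
zpool upgrade -v
zfs create -o compression=on $fs
zfs snapshot $fs@before
zfs rollback $fs@before
zfs send $fs@before | zfs receive $pool/smoketest-copy
zpool scrub $pool
zpool status $pool
zfs get compressratio $fs
EOF
}

# Print the plan; review it, then run the lines by hand as root.
zfs_smoke_plan "$POOL"
```

Printing first and running by hand is deliberate: rollback and receive change data, so it seems safer to read the plan before touching a real pool.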
I know this isn't exactly a great comment, but is it at all possible for someone to make a judgement as to the value and truth of a submission before putting it up?
I've been running ZFS on FreeBSD for a few years and it's lived up to its promises, but I think I'll be migrating off of it. The problem is that I trusted Sun. They did some goofy things, but you knew where you stood with them. They release ZFS under an Open Source license? You could take them at face value and know that you were allowed to use it. But now that Oracle holds the reins, I have no desire to depend on any Sun-borne projects anymore. Yes, ZFS is Open Source. So was Java, and Google just spent roughly a bazillion dollars defending themselves for using something that looked like it. I can't afford to take on a case like that.
Other than the Oracle-owned btrfs, what ZFS alternatives are available and ready for use today?
Dewey, what part of this looks like authorities should be involved?
I always thought the hold up on ZFS and DTrace on linux was the fact the CDDL and GPL didn't play nicely with each other. It was never a technical reason.
I've been running both on FreeBSD for a couple years now. Still don't have any production machines with ZFS yet, but I've found DTrace to be a life saver on more than a few occasions.
"The problem with socialism is eventually you run out of other people's money" - Thatcher.
I use ZFS on Ubuntu 11.10 in "production" for my main workstation and fileserver with a 3x3TB raidz pool with an L2 ARC. I/O is blindingly fast, and it has been rock solid. It serves about 10 machines, and feels an order of magnitude faster than the md/lvm based xfs array it replaced.
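A layout like the one described (a raidz vdev plus an L2ARC cache device) could be created roughly as below. This is a sketch that only builds and prints the command; the device paths are hypothetical and the helper name is mine, not part of any ZFS tool.

```shell
#!/bin/sh
# Builds (and only prints) a "zpool create" command for one raidz vdev
# plus a cache (L2ARC) device. All device paths here are hypothetical.
raidz_with_cache_cmd() {
    pool="$1"
    cache="$2"
    shift 2
    # Remaining arguments are the raidz member disks.
    echo "zpool create $pool raidz $* cache $cache"
}

raidz_with_cache_cmd tank /dev/sde /dev/sdb /dev/sdc /dev/sdd
# prints: zpool create tank raidz /dev/sdb /dev/sdc /dev/sdd cache /dev/sde
```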
I write 10GbE drivers for Linux, MacOSX, FreeBSD and Solaris. I make heavy use of Dtrace for both debugging and performance analysis. I feel naked without Dtrace, and I've used the linux dtrace a few times for debugging. Unfortunately, I've never had dtrace run on linux for more than a few minutes without crashing a machine. This is not necessarily bad, and often just a few seconds is all I need. But I would never run linux Dtrace on any production machine, whereas I use it all the time under Solaris / FreeBSD and MacOSX and often have customers run Dtrace probes on those OSes to diagnose issues.
So an article lacking knowledge of the technologies, any sort of testing, anything beyond "make install" or "apt-get install", will make it to the Slashdot homepage? This person openly admits that they didn't test ZFS beyond creating a zpool, and they don't know enough about DTrace to try... anything.
As an aside, why was Linux capitalized, but Solaris was not?
- oZ
// i am here.
3) Partition the new drives.
9) .... “sudo zpool create zfs-blog raidz /dev/sdb1 /dev/sdc1”
Ha ha ha. You know part of the magic of ZFS is management of the entire disk drive. No partitioning.
Look: The ZFS on Linux project is a noble effort, and I am sure many Linux users will eventually benefit, and it will maybe be good enough for them to not switch to Solaris.
But none of this stuff is production quality yet on Linux, and the performance versus Solaris is questionable. Linux doesn't have the components to implement all the integrations and beneficial "layering violations" ZFS has on Solaris, and it won't for a long time. Sun spent over 10 years and tens of millions on development of Solaris and its filesystems, so I don't think it's reasonable to expect the same kind of polish on Linux for ZFS or DTrace at this point, and we don't see it. Lots of work and funding could change the situation, but for now the Linux implementation doesn't hold a candle to the Solaris implementation.
You can go ahead and add: COMSTAR, SMF, FMD, and an excellent native NFS server implementation, to the list of things Solaris has but Linux doesn't.
Even the Linux implementation of ZFS is less mature and sheds benefits of ZFS, including ease of management. 10 commands just to get set up? Geez.
With Solaris, you have ZFS out of the box and you just do "zpool create tank mirror c1t0d0 c1t1d0 mirror c1t2d0 c1t3d0".
No "partitioning": ZFS manages the disks, including disk cache and fault management.
You can be pretty darn sure zfsonlinux doesn't have the same level of FMA reporting / fault management capabilities.
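For comparison, the same two-mirror layout can be expressed on Linux by handing `zpool create` whole devices instead of partitions. The sketch below only assembles and prints the command; the `/dev/sdX` names are hypothetical, and the helper is illustrative rather than anything shipped with ZFS.

```shell
#!/bin/sh
# Builds (and only prints) a striped-mirror "zpool create" command from
# whole disks, pairing the devices two by two. Disk names are hypothetical.
mirror_pool_cmd() {
    pool="$1"
    shift
    if [ $# -lt 2 ] || [ $(( $# % 2 )) -ne 0 ]; then
        echo "need an even, non-zero number of disks" >&2
        return 1
    fi
    cmd="zpool create $pool"
    while [ $# -gt 0 ]; do
        cmd="$cmd mirror $1 $2"
        shift 2
    done
    echo "$cmd"
}

mirror_pool_cmd tank /dev/sdb /dev/sdc /dev/sdd /dev/sde
# prints: zpool create tank mirror /dev/sdb /dev/sdc mirror /dev/sdd /dev/sde
```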
No attribution to the original authors of the tools, and the dtrace is a one-line change to the original dtrace-for-linux port.
Why “sudo make all” when “make all” will do?
The DTrace integration is via a kernel module, so the license on DTrace is irrelevant.
There are a couple of interfaces in Linux that should be externalized for getting stack tracebacks into user space in a standard manner without caring about binary architecture (they are currently static). I've personally used a modified Linux with DTrace mods and these functions externalized, and it's rather stable and usable. Speculative tracing is also a lot better for finding the origin of some random errno in the kernel, or who in user space is calling gettimeofday() a bazillion times in order to time stamp X events.
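As a rough illustration of the "who is calling gettimeofday() a bazillion times" case, a per-process count is a one-line aggregation in D. The wrapper below just prints the one-liner for a given syscall name; it assumes the call is visible through the `syscall` provider on your platform, which is not guaranteed for gettimeofday everywhere, and the helper name is made up.

```shell
#!/bin/sh
# Prints a DTrace one-liner that counts calls to a syscall per process name.
# Assumption: the syscall is exposed by the "syscall" provider on this OS.
syscall_count_cmd() {
    printf "%s\n" "dtrace -n 'syscall::$1:entry { @[execname] = count(); }'"
}

syscall_count_cmd gettimeofday
# prints: dtrace -n 'syscall::gettimeofday:entry { @[execname] = count(); }'
```

Run the printed command as root and stop it with Ctrl-C; the aggregation is printed on exit, sorted by count.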
Obligatory disclosure: I was on the team that did the DTrace port to Mac OS X.
-- Terry
I've been using zfsonlinux for a few months now, ever since I migrated my file server from OpenSolaris to Ubuntu Server, and I've generally been pretty happy. It's been stable and fast (faster than osol was, anyhow). My only complaint is that mounting filesystems on boot seems eternally broken.
In previous versions, there was a config file option for a workaround, and the filesystems usually (but not always) got mounted on boot. Then that solution was removed in favour of an updated mountall package; unfortunately, this new solution never works. I'll boot the system, no filesystems mounted, but running mountall from the command prompt gets everything mounted OK... Sigh.
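One way to at least notice the problem after boot is to look for datasets that should be mounted but are not. The sketch below filters tab-separated `zfs list -H -o name,mounted,canmount,mountpoint` output read from stdin; the canned input and the dataset names are made up for illustration.

```shell
#!/bin/sh
# Reads "zfs list -H -o name,mounted,canmount,mountpoint" output on stdin
# (tab separated) and prints datasets that should be mounted but are not.
# Datasets with canmount other than "on", or with a "none"/"legacy"
# mountpoint, are skipped on purpose.
unmounted_datasets() {
    awk -F '\t' '$2 == "no" && $3 == "on" && $4 != "none" && $4 != "legacy" { print $1 }'
}

# Canned example input (hypothetical dataset names):
printf 'tank\tyes\ton\t/tank\ntank/home\tno\ton\t/tank/home\ntank/vol\t-\t-\t-\n' | unmounted_datasets
# prints: tank/home
```

On a real box the pipeline would be `zfs list -H -o name,mounted,canmount,mountpoint | unmounted_datasets`, followed by `zfs mount -a` if anything is listed.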
It's a must have for a sysadmin.
How many times have you had application folks or devs come by your desk and complain "things are running slow"?
Good to be able to find the root cause and sometimes propose a workaround under an hour.
Now that we are making the switch to Linux, people complain that we're not quite as fast as with Solaris when it comes to finding out why stuff is running slow.
So yeah, in the perfect case Linux might have a slight edge over Solaris in performance, but that doesn't matter when the tools to diagnose shitty in-house apps are missing.
I've also been using ZFS on Linux for a while on my Ubuntu based backup/media box. No problems so far, and the average transfer rate of a 100 GB disk image has been 50 MB/s from internal drive A to internal drive B (non RAIDed, Asus E35M1-I DELUXE Mini ITX with 8GB of mem and 2*Western Digital Caviar Green 2TB SATAIII 64MB). The CPU usage hits maximum while transferring, and ZFS also uses most of the RAM quite efficiently.
OpenIndiana has only made three "development releases" since 2010. It is not a production grade system, just a hobbyist system.
I did a quick test with 2 identical VMs on my desktop with Intel SSD, I installed the ubuntu-zfs as from the article and I installed btrfs-tools.
The VMs have 4 CPUs and 4GB of memory, 3 virtual disks.
The btrfs has RAID1 data and meta data, the ZFS setup used RAIDZ as in the article:
mkfs.btrfs -m raid1 -d raid1 /dev/vdb1 /dev/vdc1
(I needed to create the partitions, for some reason the ZFS version didn't want to work without it)
My quick stupid test, create a large file:
ZFS:
500+0 records in
500+0 records out
524288000 bytes (524 MB) copied, 16.8489 s, 31.1 MB/s
real 0m16.853s
user 0m0.000s
sys 0m0.480s
btrfs:
500+0 records in
500+0 records out
524288000 bytes (524 MB) copied, 15.232 s, 34.4 MB/s
real 0m15.234s
user 0m0.000s
sys 0m0.640s
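The MB/s figures above are just bytes divided by elapsed seconds, in units of 10^6 bytes. A small sketch to recompute them from the numbers in the output (the helper name is mine):

```shell
#!/bin/sh
# Recomputes throughput in MB/s (10^6 bytes per second) from a byte
# count and an elapsed time in seconds.
throughput_mbs() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1000000 }'
}

throughput_mbs 524288000 16.8489   # ZFS run   -> 31.1
throughput_mbs 524288000 15.232    # btrfs run -> 34.4
```

So in this single run btrfs finished roughly 10% faster, which is a small enough gap that one 500 MB write inside a VM probably shouldn't be read as more than a sanity check.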
New things are always on the horizon
Congrats... this is a good summary on getting these working under Ubuntu. I did the ZFS install "naked" (without a summary as good as this) with a 10.04 box about a year ago and it has run great guns. Now, having said that it's good for what I use it for which is a temporary location to dump my SQL backups to from a large email archive using dedupe prior to running it off to tape... and another zpool mounted as an archive VMFS volume through NFS to our VMware farm so we can archive decommissioned virtual machines for 30 days prior to deletion per our policy. I am not 100% convinced I would use it for anything production though; supportability is still an issue with this and as such I remain a little dubious whereas with most of our system I can call a vendor and have them fix it. As the storage admin I find this a great way to keep up with the demands for storage while having a relatively transparent way (for my admins) to put stuff into a place where it doesn't take up so much space.
Now having said that there are some caveats; as the zpool gets really large the ability to delete files becomes slower and slower when it's deduped. This is because a lot of database transactions take place to remove the files particularly when there's a lot of deduplicated blocks... and this problem is a lot worse under Ubuntu than it was under OpenSolaris (which is where I first played with ZFS). There are times also that when reading the SQL backups to dump them to tape it can make both storage pools unresponsive enough that VMware drops the NFS datastore and I have to manually remount them. Far less than perfect... but good enough for what we use it for.
I have recently taken a decommissioned physical server (a DL380 G5 with two processors and 16GB of RAM) and put OpenIndiana on it to play with ZFS some more and it is working fantastically well. In my tests, though it still has the slowdown issues, high utilization in one pool won't cause the other pool to grind to a halt when both are deduped. Also, it's been nice to (at least in test) create a ZVOL on my ZFS and present it through Fibre Channel to my VMware hosts as a potential replacement for the NFS volume on Linux (I have only Emulex cards, and I have yet to see a properly working Emulex target mode under Linux). So far my testing has gone marvelously and I have found dedupe rates to be about the same as the NFS mounted volume... though slightly lower. I suspect that's probably because the data isn't really block aligned all that well but it still saves me a bunch of storage when we have 30 almost identical virtual machines being archived! On the bright side there I have not yet seen utilization get so high on the OI box that it causes any significant issues or dropouts that cause VMware to complain; so far it's been rock solid. I may migrate my ZFS stuff to the OI box and get it off my Ubuntu box... but at the moment they're both working great and I have no complaints.
The big kernel lock has been gone for a few years.
ZFS likes 2GB of RAM, 1GB is the min.
I'm looking for a solution that runs on 512MB RAM and a 512MB CF disk and a Via c7 CPU.
All the "F*NAS" guys have cut off support for people like us.
Everyone forgets Crossbow and Zones ...... A virtual network stack in a separate, partitioned machine.
Run software, Bind, LDAP, AMP Stack, VirtualBox, all in a separate, zfs snapshot compatible system.
I can roll out a new Zone, with a predefined software stack in 5 minutes, fully scripted, fully partitioned, with external network access, plus load a new VM, and start it.
Shut it down, snap shot, and copy it for a new zone.
Rolling out a new vnic is simple and quick
dladm create-vnic -l ${phys_nic} ${new_nic}
add it to a zone
zonecfg -z ${zone_name} "set ip-type=exclusive; add net; set physical=${new_nic}; end"
zlogin ${zone_name} ipadm create-if ${new_nic}
zlogin ${zone_name} ipadm create-addr -T dhcp ${new_nic}/ipv4
Done!
For a VirtualBox just add the ${new_nic} to the machine
Need the vnic mac address to make this work
dladm show-vnic | grep ${new_nic}
get the mac address
zlogin ${zone_name} VBoxManage modifyvm ${vm_name} --nic(1-4) bridged --bridgeadapter(1-4) ${new_nic} --macaddress(1-4) ${new_mac} --nictype(1-4) 82543GC --nicpromisc(1-4) allow-all
And then start the machine back up.
I love the whole Solaris Software stack...... Just can't afford Oracle.
OpenIndiana is nice, but I worry about the long term ZFS issue: version 28 might be the final universal version.
Why on earth should I want to switch to Linux if I could use a Solaris-based OS for free?
It's not "just" ZFS and DTrace, it's the whole package of components and how they are designed to work together that Linux just can't match in server space.
Linux is great on embedded systems, when you need some esoteric device drivers, or when you need some specific software (often touted as "posix compliant") which won't compile under Solaris (guess whose fault that is...). But in the latter case there is a backport of brand-z (linux zone) available and KVM got integrated into the latest Illumos kernels too...
So I see people moving away from Linux to a free Solaris-based OS whose different flavors each have their own strengths which go far beyond what anything else has to offer...
dragonflybsd has hammer, i use it on my fileserver. It has snapshots, deduplication and online realtime backup as simple as hammer read | ssh host | hammer write sort of.
Best of all, it works with as little as 256mb ram. I even ran it with 64mb although some cleanup won't be done then. I never got why linux wouldn't support that. It could be in kernel instead of fuse. Although fuse is not bad (i like plan9!) it's kinda slow. Hammer is neat. Version 2 is on the way btw. Dragonfly is a very interesting bsd.
Have you tried one of the following replacements?
http://sourceware.org/systemtap/wiki/SystemtapDtraceComparison
I don't like the *BSD. They are very low on features, and, since cross-platform software needs to be built for the lowest common denominator, it holds back progress.
Here is a link to getting these listed technologies working together
Native ZFS
ISCSI
PROXMOX 2.0
KVM
http://blog.wanfuse.com/?p=22 check out other links at blog.wanfuse.com lots of good stuff!