Ask Slashdot: Linux and Fibre Channel Storage Systems
"The connection to the Clariion storage sub-system will be made using point-to-point Fibre-Channel via two Emulex LP8000 64-bit PCI host bus adapters over copper connections/cables. The FC interfaces and a Gb NIC will be plugged into a triple-peer 64-bit PCI bus. The whole thing should be able to sustain approximately 120 simultaneous MPEG feeds at 1.2Mbps/feed on our 100baseT LAN."
Eventually, we will be clustering several servers in order to increase the number of simultaneous feeds, and also plan to expand video-on-demand service (educational only, primarily for foreign language curriculum) to the dorms and other locations on campus, possibly the Internet at large if permission is granted from the video title's author (sorry guys, no free StarWars d/l's). Our initial storage capacity will be 180 GB (approx. 144 GB w/ RAID 3 striping). But we expect to scale up to 2 TB in the next two years.
Another advantage to using Linux would be its clustering support (under Caldera for example). We will need that capability soon, and Microsoft still seems to be dragging behind, with only a handful of 3rd party solutions available."
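For anyone who wants to check the numbers in the submission, here is a back-of-the-envelope sketch in Python. The feed count, per-feed bitrate, and capacities come from the post; the 4+1 RAID 3 group layout is only an assumption inferred from the 180 GB to 144 GB ratio, not something the submitter stated.

```python
# Back-of-the-envelope check of the figures quoted in the submission.
FEEDS = 120                # simultaneous MPEG feeds (from the post)
MBPS_PER_FEED = 1.2        # megabits per second per feed (from the post)
RAW_GB = 180               # raw capacity (from the post)
DATA_DISKS = 4             # ASSUMPTION: 4 data disks per RAID 3 group...
PARITY_DISKS = 1           # ...plus 1 parity disk, inferred from 144/180

def aggregate_mbps(feeds, mbps_per_feed):
    """Total outbound load in Mbit/s if every feed is active at once."""
    return feeds * mbps_per_feed

def usable_gb(raw_gb, data_disks, parity_disks):
    """Capacity left for data after parity overhead."""
    return raw_gb * data_disks / (data_disks + parity_disks)

total = aggregate_mbps(FEEDS, MBPS_PER_FEED)
print(f"aggregate load: {total:.0f} Mbit/s ({total / 8:.0f} MB/s)")
print(f"share of a 1000 Mbit/s link: {total / 1000:.1%}")
print(f"usable capacity: {usable_gb(RAW_GB, DATA_DISKS, PARITY_DISKS):.0f} GB")
```

The aggregate load is more than a single 100baseT link can carry, which is presumably why the server side uses a Gb NIC while each client sits on 100baseT.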
I guess I don't quite see WHY you would need to do such a thing... You could buy an awful lot of TVs and DVD players for that kind of cash. Put one in every classroom that might need it, or make them mobile. As for getting it into the dorms, I'm sure there is an existing cable TV system that could easily carry a few extra channels. Get some DVD players and video modulators and pay some student $6 an hour to swap discs when necessary. As a student myself I am sick of seeing my tuition money thrown away on useless crap... If we got along fine without it last semester, we can do without it this semester. A hundred grand could build a whole LAB of computers for students to use.
http://www.cs.wpi.edu/~claypool/mqp/raid/
Please moderate this up. They used the Emulex LP6000 not the 8000, but depending on how different the card is the driver may work or may be easily tweaked to work.
I wish you all the luck and love Linux but did you look into getting a Sun instead?
Err, he's using a gigabit Ethernet card, not multiple 100bt ethernet cards. So multiply your available bandwidth by 10. (And gigabit by definition is switched).
One thing to consider: NT is not good at handling large filesystems. According to many people, once you get past 100 gigs of storage NT becomes massively unstable (gee, and it isn't for less than that?!). Linux is not ideal here either, especially for video on demand, where you will run into the 2 GB file size limit.
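Whether the 2 GB ceiling actually bites depends on bitrate and title length. A rough sketch, assuming the limit comes from a signed 32-bit file offset and using the 1.2 Mbit/s rate from the submission; the 6 Mbit/s figure is a hypothetical higher-quality stream added only for comparison.

```python
# How much video fits in one file under a 2 GB per-file ceiling?
MAX_FILE_BYTES = 2**31 - 1       # ASSUMPTION: limit is a signed 32-bit offset
FEED_BPS = 1_200_000             # 1.2 Mbit/s per feed (from the submission)
HIGH_QUALITY_BPS = 6_000_000     # ASSUMPTION: hypothetical higher-bitrate stream

def max_seconds(max_bytes, bits_per_sec):
    """Longest constant-bitrate stream that fits in max_bytes."""
    return max_bytes * 8 / bits_per_sec

for label, rate in (("1.2 Mbit/s", FEED_BPS), ("6 Mbit/s", HIGH_QUALITY_BPS)):
    minutes = max_seconds(MAX_FILE_BYTES, rate) / 60
    print(f"{label}: about {minutes:.0f} minutes per file")
```

At the submitter's bitrate a single file holds several hours of video, so the limit matters mostly if bitrates go up or titles are concatenated into one file.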
The "safe" bet here, that would avoid the massive instability of NT when dealing with large data sets and the incapability of Linux, would be to go to Sun. You say you are in an educational environment. Sun has approximately 50% discounts in that environment for many of their products. In one case at CUNY, Sun quoted a system for the exact same price that an equivalent high end Linux system would have cost from VA Research or Penguin. Sun also has very good support in the educational environment, because they want to "seed" the field.
So what happens when the machine goes down?
What happens when something... blips.
What happens when you need to do system upgrades?
What happens when you need to do maintenance?
Building one big fancy server is -- in my opinion -- a really bad approach to solving this problem. Sure, I'd love to have a box like the one you describe. It sounds cool. But having the coolest box on the block does not necessarily build the best solution to the problem.
Suppose instead that you bought a small farm of rack-mount PCs. Equip each with a nice 10K RPM SCSI hard drive and a 100BaseTX network interface. If running Linux or *BSD, such boxen could get by with CPU and memory that is modest by today's standards. NT, well, will require more to do the same job. Such boxes can be put together for around $2k a pop. My guess is that you'll get much more aggregate horsepower building it out of a cluster of such boxes than the same amount of dollars would buy you in a big fancy server.
Then you simply make your software able to cause the video to stream off the box that has it. You might also add the capability for the boxen to automagically clone frequently used streams from their home machine to others and load-balance.
If a server goes down, you just restore its data off of backups onto some of the other ones and shuffle where things are homed.
If you need more capacity, to a pretty high degree, you just buy more inexpensive boxes.
This would of course require some clever software to make it all work. Your application sounds like one where some custom software is involved anyway, and it won't be that hard to do the stuff needed to pull off a clustered solution.
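A minimal sketch of the sort of "clever software" described above: a directory that knows which box holds which title, hands each new stream to the least-loaded holder, and clones a title to another box once its holder gets busy. Everything here (the `StreamDirectory` class, the box names, the `clone_threshold` parameter) is hypothetical and only illustrates the idea; a real system would also have to copy the files and notice dead boxes.

```python
class StreamDirectory:
    """Hypothetical title-to-box directory for a farm of small video servers."""

    def __init__(self, boxes, clone_threshold=3):
        self.load = {box: 0 for box in boxes}   # active streams per box
        self.replicas = {}                      # title -> set of boxes holding it
        self.clone_threshold = clone_threshold  # busy level that triggers a clone

    def add_title(self, title, home_box):
        self.replicas.setdefault(title, set()).add(home_box)

    def open_stream(self, title):
        """Return the least-loaded box holding the title; clone it if that box is busy."""
        holders = self.replicas.get(title)
        if not holders:
            raise KeyError(f"no box holds {title!r}")
        box = min(holders, key=lambda b: (self.load[b], b))
        self.load[box] += 1
        if self.load[box] >= self.clone_threshold:
            self._clone(title)
        return box

    def close_stream(self, box):
        self.load[box] -= 1

    def _clone(self, title):
        """Mark the least-loaded box without the title as a new replica."""
        spare = [b for b in self.load if b not in self.replicas[title]]
        if spare:
            target = min(spare, key=lambda b: (self.load[b], b))
            self.replicas[title].add(target)  # a real system copies the file here


directory = StreamDirectory(["box1", "box2", "box3"], clone_threshold=2)
directory.add_title("french101", "box1")
print([directory.open_stream("french101") for _ in range(3)])
```

Ties are broken by box name so the behaviour is deterministic; the third request lands on the freshly cloned replica instead of piling onto the home box.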
Sun is a great company, and will be around for a long time.
I'd suggest the Enterprise Server 4000, as you can fit 3 or 4 processor/ram boards in it, and still have enough slots available to talk to 8 or more storage arrays. You can also get your quad-fast-ethernet card, and it works like a champ.
Solaris is designed to work with lots of processors, lots of ram, and lots of disk space. You'll have much more success than you would clicking on stuff in Windows NT.
We have Enterprise Servers here running several hundred TB's of data for all sorts of things... oracle databases, simulation data files, testing data files... it works great. And you don't have to sit at the thing in order to administrate it.. just use your favorite method to export your display back to your local X machine (or use the text utilities).
-- Erich
Slashdot reader since 1997
Wouldn't a decent Sun box be better? Or maybe something from SGI -- they seem to be big into high bandwidth, multimedia stuff.
The "upkeep" for Linux can be so much more expensive than NT?
bull****.
My brother is NT-certified. He installs high-end NT hardware to monitor custom-built RTU's (real-time control units) that control things like oil pipelines and utility transmission lines. They do not take chances with these things, they use Compaq servers with NT pre-installed (thus avoiding the biggest problem with NT stability, buggy device drivers).
Yet every week a server crashes. For an oil pipeline, thousands of dollars worth of oil can flow through the pipeline, unmonitored, unbilled, while NT reboots and rebuilds its filesystem. Sometimes they even have to fly someone out to an oil rig in the Gulf of Mexico to get NT back running.
Point: If reliability is $$$, NT ain't it.
Meanwhile, our Linux servers just run. And run. And run. It was a pain to get some of the services going (NIS in particular was painful), but once they're going, they just run.
Workstations are even easier. I just installed Red Hat 6.0 straight off the disk, told it to use NIS for passwords, and voila. Then NFS-mount /usr/local and /home, and voila again. Every workstation is identical (DHCP-configured Ethernet), if one goes down a user can go to his neighbor's and log in and be right where he expects to be, and I can slide in an identical workstation and he can log in and be back at work, with his icons and etc. all where he left them. Where is the "upkeep so expensive" here? Takes fifteen minutes to install Red Hat 6.0, another five minutes max to set up the NFS mounts? Then I can use 'ssh' to install OS software updates on all the workstations with one command, while all local updates go into /usr/local and are thus immediately available? So where's the "so much more expensive than NT" here?!
-E
Send mail here if you want to reach me.
A plug for the company that pays my salary:
MountainGate sells a filesystem called "CentraVision" which sounds like it might be similar to what you're looking for. It is designed for streaming video and shared fibre channel RAID boxes; there are clients for Irix 6.2 through 6.5 and Windows NT. Partitions can be customized for specific types of media, and performance is top of the line. It is a distributed file system, so there is a lock manager but no server - clients load data directly from the drives. Thus it is scalable.
This system gets you two things: a filesystem capable of handling large amounts of data with grace and speed, and a copy-free way to share the same data between more than one video server.
If you're not going to use more than one video server, why not look into SCSI-3 instead of fibre channel? The whole point of fibre channel is that multiple workstations can have direct access to drives. If you don't need that, SCSI-3 gives you 160 MB/sec transfer rates (as opposed to fibre's 100 MB/sec), and SCSI equipment is generally cheaper and easier to find than fibre.
There is no linux port of CentraVision yet. I'm writing this in hopes that you'll phone up the marketing department at 800 556 0222 and ask for a Linux port, thus making it easier for us linux-friendly engineering types to get management to approve the project... :-)
(Just don't ask them about the MacOS port. That's my department, and I'm getting tired of them asking when it's going to be ready!)
-Mars
At work we've been looking into a method of setting up a large amount of fast network storage. We know the traditional methods of doing it (one way would be by attaching a RAID-5 unit to our Enterprise 450.)
We've been talking with Network Appliance. Their NetFilers look pretty good. They claim to be very fast, and are highly scalable. They natively speak both SMB (NT's file system) and NFS (for Unix). ALL they do is storage. You can't program them; they're highly specialized rackmount machines that tie into your NT domains (and presumably your NIS domains, though I didn't ask that) pretty much seamlessly. You can organize your available space as Unix, NT, or shared.
It has one very, very cool capability in the filesystem. You can take a snapshot on a given day/time, which basically means they copy all the inodes at that time. If anything changes subsequently, the original drive data is preserved, and only the new data is written. The inodes in the fresher filesystem are updated to point at the new data, where the old data is still there, being pointed to by the old inodes. It gives you an online revision history, which would be exceptionally nice right after one of those "*doh, I just did rm -rf /*" moments. Because it's just storing file deltas, it doesn't take as much space as you'd think.
:)
Another real advantage we see to it is that because it's a specialized device, if something else on the network needs to be taken down for maintenance (say, the backup server), the network storage is unaffected. The only thing that's at all likely to take your storage down is, well, storage problems. And it's fully hot swappable and running fibre channel internally. It looks really good.
I think you're talking around $100K - $120K or so for a midrange box with 150GB of storage. I believe they presently scale to 1.5TB, and are building ways of scaling further -- they claim much further. Tons of options for high speed networking on it, too.
It's just a thought -- instead of doing what you are doing, which is using PC hardware, switching to a specialized device, with its custom, highly reliable hardware, might be a very appealing solution -- as long as all you're doing is streaming the data off the server. If you need the server to fiddle with the bits before they arrive, you'd have to have another server in the way, which is probably defeating most of the purpose in a specialized unit anyway.
My $0.02. I don't have actual experience with these yet. If anyone does, please chime in.
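For the curious, the snapshot trick described above can be mimicked in a few lines. This is a toy sketch of copy-on-write in general, with made-up names (`SnapshotFS`, one block per file); it is an assumption-laden illustration, not a description of how Network Appliance actually implements it.

```python
class SnapshotFS:
    """Toy copy-on-write store: data blocks are written once and never overwritten."""

    def __init__(self):
        self.blocks = {}       # block id -> bytes, immutable once written
        self.inodes = {}       # live view: file name -> block id
        self.snapshots = {}    # snapshot label -> frozen copy of the inode table
        self._next_block = 0

    def write(self, name, data):
        """Store data in a fresh block and point the live inode at it."""
        block_id = self._next_block
        self._next_block += 1
        self.blocks[block_id] = data
        self.inodes[name] = block_id

    def delete(self, name):
        """Drop the live inode only; blocks referenced by snapshots survive."""
        del self.inodes[name]

    def snapshot(self, label):
        """Copy the inode table (cheap); no data blocks are copied."""
        self.snapshots[label] = dict(self.inodes)

    def read(self, name, snapshot=None):
        table = self.inodes if snapshot is None else self.snapshots[snapshot]
        return self.blocks[table[name]]


fs = SnapshotFS()
fs.write("thesis.txt", b"draft 1")
fs.snapshot("monday")
fs.write("thesis.txt", b"draft 2")   # new block; the old one is untouched
fs.delete("thesis.txt")              # the "doh" moment
print(fs.read("thesis.txt", snapshot="monday"))
```

The snapshot costs one copy of the inode table, and extra space is only consumed by blocks written after it, which is the "just storing file deltas" point. The sketch never frees blocks; a real filesystem would reclaim them once no snapshot references them.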
> Windows NT currently supports 4 GIGs of RAM.
> Linux only supports 1 GIG... but I have heard some people say that it supports 2.
Erroneous. NT and Linux both have the same split memory model, 2 gigs for user, 2 gigs for kernel. A patched kernel on NT allows a 3/1 split, as does a patch for Linux from SGI. It's quite disingenuous of Microsoft to push this line, and as it's more easily disproven than performance numbers, it's likely to bite them in the ass if they continue to.
I've finally had it: until slashdot gets article moderation, I am not coming back.
There are two places to look for Linux fibre channel support. First, at the University of New Hampshire, someone is the leading force behind fibre channel drivers. Second, at the University of Minnesota, the GFS project uses fibre channel for their Linux-based file system work.
Note, I work for EMC, which also makes large multi-terabyte storage systems.
You might be interested in GFS. It's a cluster filesystem. The idea is that a bunch of machines all talk to the same drive(s), instead of going through a single server.
http://www.globalfilesystem.com/
Generally, it is a good idea to have multiple servers in case one goes down, but it also makes sense to consolidate storage. Storage systems like the EMC Symmetrix or Clariion raid arrays are designed to *never* go down. (When the Clariion does go down, the EMC sales force will be there. :) EMC can even do live upgrades of the embedded code. (Imagine upgrading to the next Linux kernel without rebooting!)
The point is that you can't get the performance and reliability out of a small storage system that you can out of the enterprise storage systems. Of course, in some cases, you can build a system based on replication of the data, which for static data may work, but often as not, when all the costs are factored in, you're better off with a consolidated system.
Part of the fun of Ask Slashdot questions is that they not only answer the original question, but they explore all the related tangents.
So if you're interested in fibre channel, multi-terabyte storage systems, media servers, and such, there's probably going to be a lot of interesting stuff here.
And as to your point, Linux may be a good solution if they can figure out the server architecture. They're already talking about using a cluster of servers, so the CPU power isn't a big issue. I wouldn't assume that not knowing about the level of fibre channel support indicates that someone is clueless about Linux in general.
Rule #1 - what is the limiting factor?
Bandwidth? Machine costs? or your time?
It really puzzles me how people wish to skimp on the hardware then set themselves up for later hassles and risk of (expensive) failure.
Let's take a look at a baseline video server from SGI: an Origin 200 with MediaBase in, say, a 100 seat configuration, web management tools + 200 Gbyte FibreChannel and network bits would cost about $50K minimum upfront (guesstimate here based on educational discounts and extrapolation of bits and pieces we've purchased over the years) + 10% maintenance/year. Extra for their FailOver system.
OK, now the hand-rolled Linux version. You need to look for
a) streaming software (Darwin?) for multiprocessor
b) decent high-end file system (port SGI XFS?)
c) tuning the sucker for the best SCSI and network parameters
d) video library management software (none as yet, perhaps someone port SGI OpenVault?)
e) system management to monitor the whole thing
Let's assume you've got a collection of genius hackers putting in 1 man-year's worth on each task, working for nothing except glory; you can probably get it done for $20K and 5 man-years' worth of pizzas and coke.
Cheap at that price.
Rule #2 - If you don't know what you're doing, make sure you get damn good advice from people who've done it before.
Rule #3 - You pay peanuts, you get monkeys.
LL
You might want to post some benchmarks (I know, lies, damned lies, and benchmarks) comparing CLARiiON FC boxes to comparable EMC boxes before posting something like that.
I understand DG carries a bit of a reputation with it, but CLARiiON arrays are generally considered top of the line for the market they are in. CLARiiON also gives some features EMC boxes just don't. They may not be as high end (8 GB cache might not be available), but they compete very well in their market. If CLARiiON has a fault, it's more with their PR departments than their product itself. [This URL shows how NEC recently broke some record for best performance on a specified benchmark using CLARiiON boxes for storage: Press release.]
Also, the previous AC who said CLARiiON was porting their drivers and management software to Linux, pending a port of a third party tool; I wasn't aware such an operation was officially underway. A search of CLARiiON's web page shows no reference to the word "Linux" anywhere.
-SDog
Not representing or approved by my company or anybody else.
You have a machine here that is way into six figures and you want to try Linux to save money? Guy, the cost of NT is negligible in your case. By all means investigate your options, Linux, NT, Solaris, etc, but don't be an idiot and try to save an extra 10 or 20 grand. Choose the best OS (whatever that is) and you'll end up with the cheapest solution.
This is not the greatest sig in the world, this is just a tribute.
We have a .5TB netapp here. It is used for logs, home dirs, databases, ....
I'd bet that with the dedicated hardware, a netapp with a GigE interface would spool data to the net as fast or faster than a {Sun|NT|Linux} box would, regardless of caching. The dedicated hardware, and an OS that fits on a floppy would just plain do a better job.
However, you are still running RAID[45]. These have adequate read performance, but cannot exceed the speed of a single mirror. Since the server is to be primarily read dominated, RAID0+1 (striping and mirroring) may be a better choice. It still provides data integrity, and much better speed in this application. Although you need more disks to achieve a certain size volume, the controller hardware is cheaper, and generally more reliable.
With any decent stripe/mirror implementation, each mirror can satisfy read requests independently. Having 4 mirrors is like having 4 independent copies of the data. Need more bandwidth to the files, add another mirror....
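To put rough numbers on the "more disks" trade-off, here is a small sketch. The 144 GB usable target is from the original post; the 18 GB disk size and the 4+1 parity group width are assumptions chosen only to make the arithmetic concrete.

```python
import math

USABLE_GB = 144   # usable capacity target (from the original post)
DISK_GB = 18      # ASSUMPTION: size of a single disk
GROUP_DATA = 4    # ASSUMPTION: data disks per parity group (4+1)

def disks_for_parity_raid(usable_gb, disk_gb, data_per_group):
    """Disks needed when every group of data disks carries one parity disk."""
    data_disks = math.ceil(usable_gb / disk_gb)
    groups = math.ceil(data_disks / data_per_group)
    return data_disks + groups

def disks_for_mirrored_stripe(usable_gb, disk_gb, copies=2):
    """Disks needed for a stripe set duplicated `copies` times (RAID 0+1)."""
    return math.ceil(usable_gb / disk_gb) * copies

print("parity RAID (4+1):", disks_for_parity_raid(USABLE_GB, DISK_GB, GROUP_DATA))
for copies in (2, 4):
    print(f"RAID 0+1, {copies} copies:",
          disks_for_mirrored_stripe(USABLE_GB, DISK_GB, copies))
```

Each extra mirror adds a full stripe set of disks, so the read bandwidth gained per mirror is paid for in spindles rather than in controller complexity.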
Bottom line: If you want RAID4/5, and NFS or CIFS, the netapp is the way to go. It will be faster than any general purpose box with comparable equipment. If you want RAID0+1 or any other protocols, go somewhere else.
However, the original note specified GigE as the network connection with ~144 Mbit active load. Just about anything can supply that load without a problem. Unless you are planning to up the network bandwidth to 500+ Mbit, this discussion is headed in the wrong direction.
Eric Brandwine
An engineer is a person who solves a problem you did not know you had in a way that you do not u
Just thought you might want to make sure your company doesn't care about maxing out ram.
Windows NT currently supports 4 GIGs of RAM.
Linux only supports 1 GIG... but I have heard some people say that it supports 2.
For a high end video server, RAM is just as important as HD space or CPU power. NT may just be the better choice in this situation.
BTW, using NT for this solution won't increase your overall costs as much as you may think.
Often, the upkeep for Linux can be so much more expensive than NT (if you build your machine correctly) that it more than covers the initial cost of the OS.
Yes, that would work well. The kernel is granular enough that that can be done without a problem (in Linux 2.0.x, only one processor could be in the kernel at one time; in 2.2 each processor can be in separate parts of the kernel, e.g. SCSI support, file system, and TCP/IP stack). Now, if you were running multiple NICs or multiple SCSI adapters, only one processor could be in the SCSI driver at one time. I believe that in 2.3.7 they made the file systems, or at least ext2, much more granular: any number of processors can be in the file system drivers at one time. The TCP/IP stack and the block/SCSI drivers need this same kind of treatment.
Draw up a design document with all your requirements and then get talking to solution providers (be it IBM, SGI, Sun, etc etc etc or 3rd parties) and get them to present you with proposals and costs of their solutions... If what you are trying to do is something relatively new or is of a much bigger scale than done elsewhere, many of the big players will probably come in with a nice offer in return for being able to say... 'you want solution X, we know how to do it on a very large scale as demonstrated on site Y'.
The chances of getting that sort of help sound reasonable from the scale to which you are looking, and they will also probably give you a nice support package and good response as their pilot.
If you want to put in a system that lasts, go pro, otherwise it'll be you that won't last from spending too much time fixing/tweaking/everything else on the system.
just my 2p worth...
-~ Given a choice between two theories, take the one which is funnier. ~-
At Enterprise Rent-A-Car we have 128X5.3Tb storage systems and they are powered by Sun Solaris and NT. NT does just fine. The data never gets lost for any reason at all. If a disk goes down the system automatically picks up a spare and rebuilds the data with no problems. Our storage systems are made by Sun (or that's what it says on the massive cabs). Our data warehouse is one of the largest in the USA and it runs on Solaris and NT just fine.
I have been designing systems like the one you are attempting for a while now and must admit that making Linux do what you want it to is a beautiful thing. Unfortunately, this will depend on how much work you are willing to put into tweaking it into shape. I often get tempted time and again to cut my costs with PC technology, but in the end... I wuss out and go with midrange (E450 - E4500 ish) Sun boxes and Veritas Cluster Server. It's gawdawful expensive, but when the customer doesn't care about $, it goes. If you have time to tweak though... and limited cash.. then PC may be the way, and Linux has the reliability. Just remember, a PCI bus and a fast processor alone do not a bandwidth-slinging server make, no matter what Intel tries to tell you..!