Ask Slashdot: Linux and Fibre Channel Storage Systems
"The connection to the Clariion storage sub-system will be made using point-to-point Fibre-Channel via two Emulex LP8000 64-bit PCI host bus adapters over copper connections/cables. The FC interfaces and a Gb NIC will be plugged into a triple-peer 64-bit PCI bus. The whole thing should be able to sustain approximately 120 simultaneous MPEG feeds at 1.2Mbps/feed on our 100baseT LAN."
Eventually, we will be clustering several servers in order to increase the number of simultaneous feeds, and also plan to expand video-on-demand service (educational only, primarily for foreign language curriculum) to the dorms and other locations on campus, possibly the Internet at large if permission is granted from the video title's author (sorry guys, no free StarWars d/l's). Our initial storage capacity will be 180 GB (approx. 144 GB w/ RAID 3 striping). But we expect to scale up to 2 TB in the next two years.
Another advantage to using Linux would be its clustering support (under Caldera for example). We will need that capability soon, and Microsoft still seems to be dragging behind, with only a handful of 3rd party solutions available."
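The capacity figures in the question can be sanity-checked with a little arithmetic. The sketch below assumes a five-disk RAID 3 set of 36 GB drives (the post only gives the 180 GB and 144 GB totals, so the drive count and size are guesses that happen to match them) and uses decimal units throughout.

```python
def raid3_usable_gb(disks: int, disk_gb: int) -> int:
    """Usable capacity when one disk's worth of space is given over to parity."""
    return (disks - 1) * disk_gb


def hours_of_video(capacity_gb: float, feed_mbps: float) -> float:
    """Hours of video that fit, using decimal units (1 GB = 8000 megabits)."""
    seconds = capacity_gb * 8000 / feed_mbps
    return seconds / 3600


# Assumed layout: 5 x 36 GB drives = 180 GB raw (not stated in the post).
usable = raid3_usable_gb(disks=5, disk_gb=36)
print(usable, "GB usable")
print(round(hours_of_video(usable, 1.2)), "hours of 1.2 Mbps video")
```

Under those assumptions the initial array holds a few hundred hours of material, which gives a feel for how quickly the planned 2 TB would be needed.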
Some Issues...
Quick calculation: 100Mbps LAN environment / 1.2Mbps MPEG feed = 83.333 simultaneous feeds MAXIMUM at 100% LAN saturation (leaving each server NIC), which isn't very likely. Even if it's switched, you can at best expect 95%, most probably 90%... Therefore each NIC will yield ~75 feeds at once. So I suggest putting each of the server's NICs onto its own subnet, and then you have a choice: either plug all of those into a 100Mbps switch, or plug each NIC into its own 100Mbps hub, where each hub could only carry ~75 connections...
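The same calculation as a small sketch, in case anyone wants to plug in other numbers. Rates are in kbps and utilization in whole percent so the arithmetic stays in integers; the function names are made up for illustration.

```python
def max_feeds(link_kbps: int, feed_kbps: int, utilization_pct: int = 100) -> int:
    """Whole feeds that fit in the usable share of one link."""
    usable_kbps = link_kbps * utilization_pct // 100
    return usable_kbps // feed_kbps


def nics_needed(total_feeds: int, feeds_per_nic: int) -> int:
    """Ceiling division: NICs required to carry the requested feeds."""
    return -(-total_feeds // feeds_per_nic)


for pct in (100, 95, 90):
    print(f"{pct}% utilization: {max_feeds(100_000, 1_200, pct)} feeds per NIC")

per_nic = max_feeds(100_000, 1_200, 90)
print("NICs for 120 feeds:", nics_needed(120, per_nic))
```

At 90% utilization this reproduces the ~75 feeds per NIC above, so the 120 feeds in the question need at least two 100Mbps interfaces.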
For the most part, you are, unfortunately, ahead of your time a tad...
The problem is that Windows NT will bomb here too.
Agreed Linux isn't the solution. He needs to get a real computer, not Windows NT. SGI, Sun, etc. all have computers that will do what he wants. With Windows NT he's just as experimental as with Linux, he just can't fix it when it crashes (which it will do -- over, and over, and over again).
I'd have to second this. Good as Linux is, there's no question that Sun makes systems that can answer this need. True, the OS isn't as decked out as Linux is, but it has custom built hardware.
For something as high-end and core as this, either Sun or HP will provide something with 24/7 support.
NT might do fine at reading data from a storage system, but for streaming real-time video data it's as ill-suited as Linux. Somebody else mentioned an SGI system which was perfect. Why are we talking toy computers here?
Go with the guys made for this stuff: sgi.
I did some work for a company which used 2 NetApps. These were about 70GB apiece, I think. They were using NFS; one was mounted as /usr/local on all the SGI workstations, and I forget what the other one did. A good-size company with lots of people accessing the App, and I didn't notice any slowdowns when accessing /usr/local. Apparently, they're very stable...these were using uwSCSI RAID. They actually have their own UNIX-ish minimal OS, so the interface is familiar. Cool boxes. The snapshot feature is nice to use as an emergency backup, too.
Cost cutting is probably the worst reason to pick Linux. My group at the MIT AI Lab (http://www-swiss.ai.mit.edu/) has standardized on Debian, both on the desktop end and on the server end. We have a large enough budget that we don't care if we blow a few grand more or less on each computer. From the equipment you're buying, it also sounds like the license costs of NT aren't a big deal.
We choose Linux because it just works better. It's a faster server, easier to program for, better for the work we're doing on the client end, clusters nicely, and if push comes to shove, we can fix stuff that we don't like. It's a time-saving measure, and it lets us do more. We're also in academia, not industry. As such, we care about using open systems (and, unlike much of sold out academia of today, we also still care about keeping our own work open).
Cost isn't the reason to switch to Linux.
it is very expensive, especially since schools are always running out of money ;)
Here's a thought:
build a lab of linux boxes and use them to make network storage for the video and then use them as a cluster to serve up the video. Meanwhile, people could be using the terminals. You then reserve a certain number of cycles on each computer for serving up the video.
Just a thought.
SGI Mediabase FAQ:
http://www.sgi.com/software/mediabase/faq.html
Of course, projects like these are what drives Linux R&D. But from an economical perspective, one is better off with an "off-the-shelf" solution.
Don't know where you're living, but this page lists a few customers:
http://www.sgi.com/software/mediabase/success.html
I just want to know where he found a x86 motherboard with 64-bit pci. DPT has a 64-bit pci fibre channel card that was released a couple weeks ago, with linux drivers. But I can not find anyone that sells a x86 64-bit pci motherboard. I think there are some "server vendors" that have them, but I want to build the system myself instead of depending on the vendor and whatever fscked up proprietary configuration it has.
-dp dave@unif.com
I am a trained and experienced NT professional.
:-)
Windows NT is not the tool for this job.
The solution I would have used depends on how much time I had. I would personally have researched the possibilities of using Linux, but not used it right away.
I would have been thinking in the direction of Sun, IBM, HP, SGI and others.
This is a job for the big irons, but I would definitely have researched Linux, if only for the kick of it.
check out Concurrent Computer Corporation.
claims theoretical support for 4 gigs at http://www.freebsd.org/FAQ/FAQ51.html. Cdrom.com has some impressive stats too on their new box. The problem with trying ultra high-end stuff on PC hardware is the best you can really say to your boss is that it should work. Companies like Sun or SGI have already deployed hundreds (if not thousands) so you shouldn't end up with any surprises along the way.
Another alternative could be to buy a bunch of smaller (high end P2 and moderate memory+disk) clones and set a bunch of those up. New videos would be added to the servers in round-robin order and more popular videos could be placed on multiple servers. Now you would have one (or maybe more for reliability) redirectors which could redirect the request to the right server using mod_rewrite. Through the use of an external program (which would do a database lookup most likely), the user could be redirected to the proper server based on regular expression pattern grabbing from the URL. See the "External Rewriting Engine" part in the Rewrite guide. It is also possible to transparently proxy the request so the user never sees http://www19.example.com/ugly/url/movie.xyz, but you probably aren't http streaming your movies anyway.
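A rough sketch of the placement idea: new videos go to servers in round-robin order, and titles flagged as popular get extra copies. The server names, video names, and the `copies` parameter are all hypothetical; a real setup would keep this table in the database that the redirector consults.

```python
from itertools import cycle


def place_videos(videos, servers, popular=(), copies=2):
    """Map each video to the list of servers that hold a copy of it."""
    next_server = cycle(servers)          # endless round-robin over the servers
    placement = {}
    for video in videos:
        # Popular titles get several copies, never more than there are servers.
        wanted = min(copies, len(servers)) if video in popular else 1
        placement[video] = [next(next_server) for _ in range(wanted)]
    return placement


table = place_videos(
    ["a.xyz", "b.xyz", "c.xyz"],          # hypothetical titles
    ["www19", "www20"],                   # hypothetical server names
    popular={"b.xyz"},
)
for video, hosts in table.items():
    print(video, "->", ", ".join(hosts))
```

The redirector then only has to look up the requested title and pick one of the listed hosts.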
I currently run Netapp filers and I wholeheartedly agree with most of what was said, with a few exceptions. The price is the first one. We paid $88k for 180 gig Filer 740 including support contract. We do use it in our NIS, and it was set up in about 10 minutes. As far as streaming media support, any front end will do as long as the storage is handled by the back end. I think Linux would be a great choice, I have been using Linux since the first kernels and it is turning out to be a very stable and scalable solution. One note though, use the newest kernel you can get your hands on, they support SMP much better. And as far as quad processor boxes go, you can get them from Dell in the new 6350 chassis for around $15k with a gig of ram, I should know, I just bought eleven of them. Good luck on your project!
Ok, I can't think straight. If all you want is for your movies to be identified by a single ID then you could use something like this in mod_rewrite:
RewriteRule ^/movies/([^/]*)$
/cgi-bin/metagen?$1 [T=application/x-httpd-cgi,L,PT]
All on one line of course. What this does is take any request in the form "/movies/somename.xxx" and call "/cgi-bin/metagen?somename.xxx". Now, your metagen script can figure out what server the actual file is on (possibly looking at IP address too...) and generate the proper MIME/metafile response back to the browser to start the client which will contact the proper server for the movie. If needed, you could expand the URL string with another () section and add +$2 on the other side. It looks like this may not be needed at all because SCRIPT_URI/URL still contains the original request!
For the [] stuff, T sets the MIME type (forces server execution in this case), L stops processing RewriteRules here, and PT passes the request back into Apache because my cgi-bin is a ScriptAlias directory. Adjust your line as needed.
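For what the metagen script itself might look like, here is a minimal sketch. Everything in it is an assumption made for illustration: the lookup table stands in for the database, the host names are invented, and the MIME type is a placeholder for whatever metafile type your player actually registers.

```python
# Hypothetical catalogue: movie file name -> host that stores it.
MOVIE_HOSTS = {
    "intro.xyz": "www19.example.com",
    "lesson1.xyz": "www20.example.com",
}

# Placeholder; substitute the metafile type your video client expects.
METAFILE_TYPE = "application/x-example-metafile"


def metagen(query_string, hosts=MOVIE_HOSTS, mime_type=METAFILE_TYPE):
    """Build the CGI response (headers plus body) for one movie request."""
    host = hosts.get(query_string)
    if host is None:
        return ("Status: 404 Not Found\r\n"
                "Content-Type: text/plain\r\n\r\n"
                "unknown movie\n")
    # The body is a one-line metafile pointing the client at the right server.
    return (f"Content-Type: {mime_type}\r\n\r\n"
            f"http://{host}/movies/{query_string}\n")


print(metagen("intro.xyz"))
```

As a real CGI program it would read the name from the QUERY_STRING environment variable and write the returned string to standard output.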
Hmm... Microsoft were cheaper in more than one way back then, and still are the cheapest in one of them, namely quality. No matter how many billions of dollars they put in R&D and QA they can never match the QA of the opensource movement where *everyone* running a particular piece of software can read its source code and optionally fix a bug they find. I'm no kernel hacker, but I have submitted bug-reports *and* suggested fixes. A couple of them actually made it into the official kernel sources. :-)
Suggest a fix to microsoft and you get told that "in the next version that feature will be obsolete anyway" or something similar, just because they don't know where Bill wants to go tomorrow. And believe me, I was told *exactly that* when I reported a bug in MS Proxy Server 1.0 regarding usage limitations for certain groups of users on my network (I ended up using a linux box with squid and ip-masquerade, still runs as a charm). Heck he doesn't even know where he (or we) want to go today
/Anonymous c0warD
It burns me up to see cash spent on the fancy-shmancy cr*p. My tuition climbed to $5000 bucks this year. (That's a lot for canada!).
They put laptop ports in the middle of a hallway - no desks, just ports sticking out of the wall! What, am I supposed to crouch in the hall to check my mail?? How about letting brains drive the technology!
And now this useless WebCT business. What is wrong with straight forward lectures and discussion?
I work in a fibrechannel lab at a large corporation I'm sure you all have heard of... We do compatibility stuff on many OSes and architectures, but I don't think the drivers are there for Solaris/intel. It's the one OS glaringly missing... (we even have linux). Solaris/sparc, however, is a definite good solution... Clariion is even supported by Veritas without any fancy tweaking on most of their products.
Compaq does still make rather nice fibrechannel raid boxen... http://www.storage.digital.com will redirect you properly....
Rogue Wave? It's been ported for some time, and is part of the Portland Group compiler set.
This makes me laugh.
I use the win32 API every day. What a piece of shit. MS with all its billions produces win32, a grotesque, poorly documented, inconsistent sorry excuse for an API.
Your argument about how brilliant MS is doesn't hold any water with me.
You can try the XFS filesystem from SGI instead. It was released into Open Source by SGI a while back and it has proper journaling. This means that it doesn't have to fsck forever like Ext2. The max size I don't know though.
/Michael Sjölin [Framtidsfabriken]
Just to let you know, CLARiiON is an OEM to many companies, compaq being one of them.
And I happen to personally know that there is a major project going on in CLARiiON to develop Linux support. I have worked on some of the code myself.
A couple years back, a foaf compared netapp versus auspex versus a big ultra enterprise box w/ fiberchannel raid array as nfs server for a large usenet news spool with a few ultra1's as a pool of load-balanced nntp servers. Each ultra1 had its own dedicated fddi segment back to the nfs server in this test. The all-sun solution lost the race big time in both speed and ability to deal with stress. The netapp was fastest with low to moderate quantities of nntp clients hammering the pool of ultra1 nntp servers. At highest client loads, the netapp was overwhelmed. The auspex yielded pretty consistent throughput, almost as fast as the netapp under moderate loads but degraded more gracefully when we hammered the crap out of it with the highest loads. I dunno how well this scenario would compare to a streaming media server... YMMV
PS: This was back when, AFAIK, netapp was still using x86 pentium processors. I believe now their boxen may be alpha-based ????
fucking sucks. Look at their worthless AViiON piece of shit.
it uses intel pIII garbage and runs NT. how worthless.
Really, I am not trying to win a prize for the strangest suggestion of the day, but this is the sort of thing that mainframes excel at. They aren't too fast, they are a little odd, but they a) stay up forever, b) were dealing with 4+TB data sets fifteen years ago, c) have incredibly nice tools out the wazoo (Workload Manager, anyone?), and d) are built for bandwidth, not speed. Perhaps I am missing the point, but if you need to serve thousands and thousands of files at the same time, 24/7, then I would give IBM or Hitachi a call and start looking for 60 year old men with rayon ties and clipboards.
Some points:
IBM will negotiate on licences. They will also work with you to cut costs. AFAIK, though, there isn't any academic pricing for mainframes.
Look at the bottom line when IBM hits you with the licencing and upkeep fees -- mainframes are designed to stay up with no real trouble for years at a time. Also, mainframers are less expensive than you think, and they know how to run systems properly -- you can expect essentially no screw-ups from the old farts. This is nice.
Like UNIXes, mainframes come with lots of tools. No extra costs there.
You can cluster mainframes AND everything in the system is at least triple redundant AND you can swap everything (including CPUs) on the fly while it is running with no problems at all.
Mainframes, like the DG-UX stuff, are a job to keep running, and IBM service is pretty nice, too.
I love Linux a lot, but if you need a mainframe, you need a mainframe. It sounds like you need a mainframe.
The NetApp can also handle static http content as well.
:-)
I've used several NetApps for several years. In some situations, they will lock up, but NetApp is quick to fix serious problems and good at following up with you about _any_ problems so long as you have "autosupport" enabled as you're supposed to.
Even if one does reboot (for whatever reason) it only takes a few minutes (~2 in my experience) to be back up and slingin' data.
I've used them for web server storage, realaudio/video storage, distributed home directories (mainly web and admin staff), ORACLE database storage, and for USENET news storage. The news storage array is the only one that has ever shown anything near "unstable" (read: locks up or needs to be restarted every few months -- generally due to drive access faults.) The web storage array has been restarted a half dozen times in the past three years.
(I think the news array problems were drive related and not a problem with the NetApp itself.)
I have some limited experience with FC storage
so let me add a little tad.
Light Pulse is a nice card however it's
Tachyon based. Our company tried to qualify
them and they briefly popped on the web page
but customers complained about data corruption.
I do not know if they rectified that.
I would suggest Qlogics or ISP-2100A based
board. That at least would work. Driver for
Linux is being maintained.
DG array is a solid piece of equipment.
If you use JBOD type of config (which you
should, for the bandwidth sake), do not try
to save money on host adapters. Remember,
disk has about 12 MB/s rate off the magnetic
head (more on the outside, less on the inside).
Six or seven busy drives are all your loop
needs to saturate.
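A back-of-the-envelope version of that claim, assuming a nominal 100 MB/s
for the loop (the figure quoted for fibre elsewhere in this thread) and
ignoring protocol overhead. The 15 MB/s outer-track rate is a guess, used
only to show how the answer moves.

```python
def drives_to_saturate(loop_mb_s: int, drive_mb_s: int) -> int:
    """Smallest number of busy drives whose combined rate fills the loop."""
    return -(-loop_mb_s // drive_mb_s)   # ceiling division on integers


print(drives_to_saturate(100, 12))   # drives streaming at the quoted 12 MB/s
print(drives_to_saturate(100, 15))   # assumed faster outer-track rate
```

Either way the loop fills up with fewer than ten busy drives, which is the
point being made above.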
If you find it acceptable, please use real
optics, either SM/LW 9um or 62.5um MM/SW.
I did not like fiber before because it never
worked well and is a bitch to fix if breaks.
However I spent time in a Sun shop and they
use it everywhere with excellent result. For
a big installation fiber is a life saver
because it is much more resilient to noise
(remember - Gigabit speeds), and you never
risk to burn your ports with a floating ground.
Comments to other posts:
* One guy noticed that different drives
may work or not work. This is a sad truth.
FC-AL is not like 100base-T where you may
plug everything into everything and expect it
to work.
Drive manufacturers do test their drives but
their imagination is quite limited. Large number
of drives per loop, high I/O rates (IOPS),
use of both FC-AL ports at the same time -
all these conditions contribute to instability.
Then there are crappy enclosures of course.
* One poster claimed that 100base-T does not
go to the rated bandwidth. This is false, as
my personal experience suggests. However,
driving 1000base network is difficult indeed.
Its bandwidth is mostly CPU limited. 550MHz
Xeon is about what you need for one CPU-load
saving Gbit card and a leftover to do an MPEG
compression for a stream or two.
* Use of plain SCSI - this is definitely an
option to consider as FC-AL is not a magic
bullet. Basically, if your system is small
in size, say a rack or two - SCSI would work
fine. Up to four racks and 5m in size is a
borderline, when SCSI must be differential,
but FC-AL is too expensive. When your system
has 10 cabinets of storage, you better use FC
(Sun E6000 with 5,500 disks).
* Many posters suggest to look for Sun, SGI
and EMC, NA, etc.. This may be a good bet.
Many of these companies have special offerings
for video streaming. For example, Sun used to
sell so called "Media Server", which was the
only product in the Sun to employ RAID-4,
for predictable response times.
But I am afraid it would be damn expensive.
--P
That's not entirely true. I don't know about "real world" rates, but the Intel 82557 (rev. A -- the oldest of the Intel chips [no longer made]) has been proven to sustain 87Mbit/s.
I have personally brought a Cisco 7000 to tears with one of these creatures in a P120 base Linux machine. The poor Cisco didn't resume forwarding traffic until a few seconds after I disconnected the Linux box. (Granted, that was an '040 based Cisco. But I still thought it was funny.)
we were asked to design and collect data for stuff like this a while back. Pick up issues of computer graphics world and other graphics mags..they have loads of stuff on things like streaming video. Also look at research papers from multimedia events..some ppl actually build systems like these and test em and write about em in the papers...
SGI machines are purpose built for VERY large file systems and high bandwidth access. I saw one of their machines ziping (yes zipping) around a 200GB dataset the other day. No hiccups at all. The whole filesystem was several TB so no problem with IRIX support there.
Wasn't really flamebait.. I was just tired of seeing people use buzzwords as an "after the fact (bought hardware)" type deal...
Let's just think of it this way..
If you buy hardware you obviously have to research if your software can support it. He bought hardware for NT, researched for NT. Him doing this "Oh lets run linux because it's a BUZZWORD!! WOOOHOO!" is crap.. I listen to enough of it at work, I really don't want to see it promoted at Slashdot.. (Like I have any control)...
Other than that, I was in a really bad mood...
-D.A
(and no I don't work for Microsoft...)
(Sorry for being an AC, but revealing my /. identity will probably lead to someone finding out which company I'm talking about here, which again would be bad for me.)
That it does, but you'll also want some redundancy in storage, if you're going to have a solid production environment. I would suggest having a setup with dual independent storage servers.
Yet the EMC system where I work has problems often enough, also with data corruption. Not very nice, and no wonder EMC doesn't want to refer to this customer on their web site. As a side note, EMC doesn't have a product by the name "Clariion"; maybe you're thinking of someone else.
But as a customer you don't CARE about the sales force when you have problems, you want the problem FIXED, especially when you've spent a 7 digit amount of dollars on the system. Sales people don't fix anything. Tech people sometimes do.
My advice would be to go for some relatively better proven and (hopefully) more reliable technology than what EMC offers, such as Sun's StorEdge or Network Appliance's F-series.
I've got a problem. I'm trying to get Linux to install on a E3500. And the hard drives are not recognized because of the FCAL. Any solutions? Robert Elsner
They are running some modified version of NetBSD.
With requirements like that I've serious doubts that either Windows NT OR Linux can cut it. Based on the fact you're running on PC based hardware I'd say your best bet is Solaris.
Obviously part of the question is, which of the above three OS's can work with the hardware. We know NT can, it sounds like Linux might, and I would assume that Solaris can.
Right now with just 144GB, any of the OS's can cut it, but I've serious reservations about either NT or Linux being able to scale up into the Terabyte range and be able to still perform well.
Oh, and yes, I do have experience with LARGE fileservers. I play daily with servers that range from 300Gb to 1.6Tb.
We're currently working with Emulex to develop a driver for the Lp[68]000. The driver will be released under the GPL, targeted at 2.2.n kernels. This is part of the development effort for ZFS (The Zesty Filesystem). We don't have a solid timeframe for completion, but work continues apace, as Emulex seems very committed to solid Linux support.
-John Justice
justice@quantumres.com
Quantum Research Service
I guess i don't quite see WHY you would need to do such a thing... You could buy an awful lot of TV's and DVD players for that kind of cash. Put one in every classroom that might need it, or make them mobile. As for getting it into the dorms, I'm sure there is an existing cable tv system that could easily carry a few extra channels. Get some DVD players and video modulators and pay some student $6 an hour to swap discs when necessary. As a student myself I am sick of seeing my tuition money thrown away on useless crap... If we got along fine without it last semester, we can do without it this semester. A hundred grand could build a whole LAB of computers for students to use.
http://www.cs.wpi.edu/~claypool/mqp/raid/
Please moderate this up. They used the Emulex LP6000 not the 8000, but depending on how different the card is the driver may work or may be easily tweaked to work.
I wish you all the luck and love Linux but did you look into getting a Sun instead?
So what happens when the machine goes down?
What happens when something... blips.
What happens when you need to do system upgrades?
What happens when you need to do maintenance?
Building one big fancy server is -- in my opinion -- a really bad approach to solving this problem. Sure, I'd love to have a box like the one you describe. It sounds cool. But having the coolest box on the block does not necessarily build the best solution to the problem.
Suppose instead that you bought a small farm of rack-mount PCs. Equip each with a nice 10K RPM SCSI hard drive and a 100BaseTX network interface. If running Linux or *BSD, such boxen could get by with CPU and memory that is modest by today's standards. NT, well, will require more to do the same job. Such boxes can be put together for around $2k a pop. My guess is that you'll get much more aggregate horsepower building it out of a cluster of such boxes than the same amount of dollars would buy you in a big fancy server.
Then you simply make your software able to cause the video to stream off the box that has it. You might also add the capability for the boxen to automagically clone frequently used streams from their home machine to others and load-balance.
If a server goes down, you just restore its data off of backups onto some of the other ones and shuffle where things are homed.
If you need more capacity, to a pretty high degree, you just buy more inexpensive boxes.
This would of course require some clever software to make it all work. Your application sounds like one where some custom software is involved anyway, and it won't be that hard to do the stuff needed to pull off a clustered solution.
Sun is a great company, and will be around for a long time.
I'd suggest the Enterprise Server 4000, as you can fit 3 or 4 processor/ram boards in it, and still have enough slots available to talk to 8 or more storage arrays. You can also get your quad-fast-ethernet card, and it works like a champ.
Solaris is designed to work with lots of processors, lots of ram, and lots of disk space. You'll have much more success than you would clicking on stuff in Windows NT.
We have Enterprise Servers here running several hundred TB's of data for all sorts of things... oracle databases, simulation data files, testing data files... it works great. And you don't have to sit at the thing in order to administrate it.. just use your favorite method to export your display back to your local X machine (or use the text utilities).
-- Erich
Slashdot reader since 1997
Strange, how you never have moderator privs when you want them?
Your language doesn't worry me, I just didn't understand what you wrote.
Wouldn't a decent Sun box be better? Or maybe something from SGI -- they seem to be big into high bandwidth, multimedia stuff.
The "upkeep" for Linux can be so much more expensive than NT?
bull****.
My brother is NT-certified. He installs high-end NT hardware to monitor custom-built RTU's (real-time control units) that control things like oil pipelines and utility transmission lines. They do not take chances with these things, they use Compaq servers with NT pre-installed (thus avoiding the biggest problem with NT stability, buggy device drivers).
Yet every week a server crashes. For an oil pipeline, thousands of dollars worth of oil can flow through the pipeline, unmonitored, unbilled, while NT reboots and rebuilds its filesystem. Sometimes they even have to fly someone out to an oil rig in the Gulf of Mexico to get NT back running.
Point: If reliability is $$$, NT ain't it.
Meanwhile, our Linux servers just run. And run. And run. It was a pain to get some of the services going (NIS in particular was painful), but once they're going, they just run.
Workstations are even easier. I just installed Red Hat 6.0 straight off the disk, told it to use NIS for passwords, and voila. Then NFS-mount /usr/local and /home, and voila again. Every workstation is identical (DHCP-configured Ethernet), if one goes down a user can go to his neighbor's and log in and be right where he expects to be, and I can slide in an identical workstation and he can log in and be back at work, with his icons and etc. all where he left them. Where is the "upkeep so expensive" here? Takes fifteen minutes to install Red Hat 6.0, another five minutes max to set up the NFS mounts? Then I can use 'ssh' to install OS software updates on all the workstations with one command, while all local updates go into /usr/local and are thus immediately available? So where's the "so much more expensive than NT" here?!
-E
Send mail here if you want to reach me.
I read about how D. Becker and the folks at CESDIS used NIC binding to increase bandwidth (binding the NICs together into a virtual superNIC). Would it have a comparable effect to the binding of NICs to a processor in NT?
Just curious...
Codifex Maximus ~ In search of... a shorter sig.
You'd have to detail how this works for me to understand how you can be powered by solaris and nt on what sounds (to my reading) like the same set of storage arrays. I'm really confused about what the post really means.
-Peter
== Just my opinion(s)
Not with a 64 bit system and that sort of system size. If I had to be Intel/Linux I'd use GFS. (actually I'd prefer to scratch the whole thing and go with a Sun UltraEnterprise, but that's another matter entirely)
A plug for the company that pays my salary:
MountainGate sells a filesystem called "CentraVision" which sounds like it might be similar to what you're looking for. It is designed for streaming video and shared fibre channel RAID boxes; there are clients for Irix 6.2 through 6.5 and Windows NT. Partitions can be customized for specific types of media, and performance is top of the line. It is a distributed file system, so there is a lock manager but no server - clients load data directly from the drives. Thus it is scalable.
This system gets you two things: a filesystem capable of handling large amounts of data with grace and speed, and a copy-free way to share the same data between more than one video server.
If you're not going to use more than one video server, why not look into SCSI-3 instead of fibre channel? The whole point of fibre channel is that multiple workstations can have direct access to drives. If you don't need that, SCSI-3 gives you 160 MB/sec transfer rates (as opposed to fibre's 100 MB/sec), and SCSI equipment is generally cheaper and easier to find than fibre.
There is no linux port of CentraVision yet. I'm writing this in hopes that you'll phone up the marketing department at 800 556 0222 and ask for a Linux port, thus making it easier for us linux-friendly engineering types to get management to approve the project... :-)
(Just don't ask them about the MacOS port. That's my department, and I'm getting tired of them asking when it's going to be ready!)
-Mars
The Interphase Fibre Channel Adapter (5526) will have 2 different drivers for Linux support.
UNH has developed an independent driver that will be posted on their website, and I will be providing an Interphase supported driver soon.
The Interphase supported driver/hw is being used at the Univ. of Minn. GFS research project. Interested/serious beta site users can send email to mark@iphase.com.
EMC makes a solution for what you're trying to do called the Celerra Media Server.
http://www.emc.com
Great company, we're using them for backend storage for our NT file servers, as well as our HP Oracle servers. Makes it much easier to manage your file storage, very fast, etc.
http://www.plutotech.com/
With Linux on the other hand, you have developers that have the potential to create something better than Microsoft could ever dream up in a million years (even with their billions of dollars), however those developers don't have the millions of dollars to build and test Linux for many of the high-end solutions needed by companies/research groups. When a new technology emerges, unless that company wants to oppose Microsoft, they have to allow Microsoft to have access to the hardware to build their drivers, but most of them won't open up because they don't want their competitors to get their proprietary information from open source code in the kernel or drivers.
Most Linux solutions are created on the fly with real live systems. They don't have the option "Oh, let's spend 3 months building and testing these drivers/systems".
The next few years are going to be trying times for both Linux and NT. They both have the potential to win. Linux I know and hope will put up a fight more dreadful than anything Microsoft has ever seen. Then they will come to realize money is not everything.
We need to make a FAQ for Linux's limitations.
Linux's 2G file limit is caused by the VFS on 32-bit Linuxae. I think it doesn't apply to Alpha and UltraSparc, though I'd have to verify.
Matti Aarnio's Large File Summit patch takes care of this limitation on x86, PPC and MIPS.
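To put a number on the 2G limit: assuming it comes from a signed 32-bit file offset (the usual explanation, and an assumption here rather than something checked against the kernel source), the largest file is 2^31 - 1 bytes. The sketch also works out how long a single 1.2 Mbps MPEG stream would have to run before one file hits that ceiling.

```python
def max_signed(bits: int) -> int:
    """Largest value a signed integer of the given width can hold."""
    return 2 ** (bits - 1) - 1


def hours_until_limit(limit_bytes: int, feed_bits_per_s: int) -> float:
    """Playing time of a constant-bitrate file of exactly limit_bytes."""
    return limit_bytes * 8 / feed_bits_per_s / 3600


limit_32 = max_signed(32)
print(limit_32, "bytes with a 32-bit offset")
print(max_signed(64), "bytes with a 64-bit offset")
print(f"{hours_until_limit(limit_32, 1_200_000):.2f} hours of 1.2 Mbps video")
```

So individual titles at that bitrate would only run into the limit if they ran for several hours; the size of the whole filesystem is a separate question.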
At work we've been looking into a method of setting up a large amount of fast network storage. We know the traditional methods of doing it (one way would be by attaching a RAID-5 unit to our Enterprise 450.)
:)
We've been talking with Network Appliance. Their NetFilers look pretty good. They claim to be very fast, and are highly scalable. They natively speak both SMB (NT's file-sharing protocol) and NFS (for Unix). ALL they do is storage. You can't program them; they're highly specialized rackmount machines that tie into your NT domains (and presumably your NIS domains, though I didn't ask that) pretty much seamlessly. You can organize your available space as Unix, NT, or shared.
It has one very, very cool capability in the filesystem. You can take a snapshot on a given day/time, which basically means they copy all the inodes at that time. If anything changes subsequently, the original drive data is preserved, and only the new data is written. The inodes in the fresher filesystem are updated to point at the new data, where the old data is still there, being pointed to by the old inodes. It gives you an online revision history, which would be exceptionally nice right after one of those "*doh, I just did rm -rf /*" moments. Because it's just storing file deltas, it doesn't take as much space as you'd think.
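For anyone who wants to see the snapshot idea in miniature, here is a toy sketch in Python. It is only my reading of the description above (copy the inode table, never overwrite old blocks), not how the vendor actually implements it; the class and method names are made up for illustration.

```python
class SnapshotVolume:
    """Toy copy-on-write volume: an inode table maps file names to block ids."""

    def __init__(self):
        self.blocks = {}      # block id -> data (never overwritten in place)
        self.live = {}        # live inode table: name -> block id
        self.snapshots = {}   # snapshot label -> frozen copy of the inode table
        self._next_block = 0

    def write(self, name, data):
        # New data always goes to a fresh block; old blocks stay untouched.
        block_id = self._next_block
        self._next_block += 1
        self.blocks[block_id] = data
        self.live[name] = block_id

    def delete(self, name):
        # Only the live pointer goes away; the block itself remains.
        # (A real system would reclaim blocks that no snapshot references.)
        del self.live[name]

    def snapshot(self, label):
        # Copying the inode table is cheap: pointers only, no file data.
        self.snapshots[label] = dict(self.live)

    def read(self, name, snapshot=None):
        table = self.live if snapshot is None else self.snapshots[snapshot]
        return self.blocks[table[name]]


vol = SnapshotVolume()
vol.write("thesis.tex", "draft 1")
vol.snapshot("monday")
vol.write("thesis.tex", "draft 2")
vol.delete("thesis.tex")                      # the "doh" moment
recovered = vol.read("thesis.tex", snapshot="monday")
```

The live table no longer knows about the file, but the "monday" table still points at the old block, which is why only changed data costs extra space.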
Another real advantage we see to it is that because it's a specialized device, if something else on the network needs to be taken down for maintenance (say, the backup server), the network storage is unaffected. The only thing that's at all likely to take your storage down is, well, storage problems. And it's fully hot swappable and running fibre channel internally. It looks really good.
I think you're talking around $100K - $120K or so for a midrange box with 150GB of storage. I believe they presently scale to 1.5TB, and are building ways of scaling further -- they claim much further. Tons of options for high speed networking on it, too.
It's just a thought -- instead of doing what you are doing, which is using PC hardware, switching to a specialized device, with its custom, highly reliable hardware, might be a very appealing solution -- as long as all you're doing is streaming the data off the server. If you need the server to fiddle with the bits before they arrive, you'd have to have another server in the way, which is probably defeating most of the purpose in a specialized unit anyway.
My $0.02. I don't have actual experience with these yet. If anyone does, please chime in.
>> people seeing how much "linux sucks"..." Why?
>> If Linux sucks, why do you not want people to
>> see that, and share your attitude?
Actually, I think that is the only halfway sensible comment he made. While I agree with you about not covering up, it is also not a good idea to promote Linux for something it is not good at.
Now, I'm not an expert on Linux on high-end hardware like that, or Linux and video, but if Linux is not the way to go for this situation (which seems to be the overall tone of the responses I've read) then we shouldn't advocate Linux for this situation. Why set ourselves up for failure?
Or maybe I have no idea what I'm talking about... which very well may be the case 8 )
Steve
I've had moderate success using the Qlogic fibre channel card under Linux. The performance is excellent, and fibre channel offers a number of interesting advantages over old-school SCSI. But I'm not able to recommend this card at least, as its driver is far too buggy for a production environment. Quite often, the driver wouldn't pick up all the drives, or would time out when detecting drives. Sometimes the drives would stop responding altogether.
I agree whole-heartedly with some of the people here that a different hardware vendor would be favorable. SGI machines generally have much higher bandwidth bus architectures, thereby are more suited for your particular application. They may be expensive, but (I'm no expert here) 1 SGI machine may be worth 2 or more PC's doing the same streaming.
If you must use Linux with PC hardware for the streaming, I'd say go with old-school SCSI. The stability of the SCSI drivers is excellent. There are very large SCSI drives available now, so the 127 device chains of Fibre Channel become a minor selling point.
-Brendan
"Note, I work for EMC, which also makes large multi-terabyte storage systems."
When EMC came to my company to pitch your products about 4-5 months ago, I asked about Linux support. "What's Linux?" was the wrong answer, and yet, they gave that as an answer. At the time I told them they should seriously be looking into Linux as that was part of the decision to bring EMC into our organization.
I'm hoping things have changed at EMC....
> Windows NT currently supports 4 GIGs of RAM.
> Linux only supports 1 GIG... but I have heard some people say that it supports 2.
Erroneous. NT and Linux both have the same split memory model, 2 gigs for user, 2 gigs for kernel. A patched kernel on NT allows a 3/1 split, as does a patch for Linux from SGI. It's quite disingenuous of Microsoft to push this line, and as it's more easily disproven than performance numbers, it's likely to bite them in the ass if they continue to.
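The arithmetic behind the split, as a quick sketch. It only restates the numbers in the post (a 32-bit address space carved into user and kernel shares); the function name is made up, and it says nothing about how much physical RAM either OS can actually use.

```python
ADDRESS_BITS = 32
GIB = 2**30

total = 2**ADDRESS_BITS  # bytes addressable with 32-bit pointers


def split(user_gib):
    """Return (user, kernel) sizes in GiB for a given user-space share."""
    kernel_gib = total // GIB - user_gib
    return user_gib, kernel_gib


default_split = split(2)   # the 2/2 model described above
patched_split = split(3)   # the 3/1 split the patches are said to give
```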
I've finally had it: until slashdot gets article moderation, I am not coming back.
It's funny how this is an ask slashdot today, as I asked a question to a newsgroup yesterday. It seems like there is limited linux support, but it exists if you get the right card. It also seems like Fibre Channel is very bleeding edge tech -- it makes me wonder just how good the drivers are simply due to lack of usage. I did not know this when I bought a fibre drive by mistake (Don't ask). Now I have a nice 9 gig fibre drive, and nothing to use it with. Also, would you be able to get faster rates using a card and drive on a 64 bit PCI like one found in an alpha? Finally, any suggestions on where to get a cheap controller card?
I have to agree with a number of the other posters. We've explored using large Linux-based systems at my workplace for storage and internal video/music streaming, and in the end decided to go with a 20TB hierarchical storage system based on IBM storage and IBM ADSM HSM software, with a couple of high availability Sun servers. The biggest problem for a project like this, I'd say, is that you want to _guarantee_ that your machines can be up full time. NT can't do that because of stability issues with disk and I/O issues like this... Linux can't do it for the same reason, plus support issues. If you don't know how to do a project like this already, doing it on Linux isn't going to be the best idea.
I love Linux, but at the same time, I have to be realistic about its limitations. Linux (and every other operating system) isn't the best tool for every job...
Standard Disclaimer: I do _not_ speak for my employer.
---- noi non potemo aver perfetta vita senza amici -- Dante
The only reason that I didn't suggest SGI, who certainly do produce excellent products for this sort of thing, is my own worries about SGI's future in the UNIX market. Their strategies about NT in the desktop environment don't exactly engender the most confidence in their ongoing support of IRIX :) Admittedly, I do have a Challenge S and an Indy lashed on to my DSL though...
hee hee.
-s
Standard Disclaimer: I do _not_ speak for my employer.
---- noi non potemo aver perfetta vita senza amici -- Dante
We've been putting together a RAID system for our office, using Linux on Intel. SCSI is no problem -- Mylex and others have excellent controllers, and even the DPT stuff appears to work.
However, DPT Fibre Channel has been an unmitigated disaster. Unless you purchase DPT hard drives, you are basically on your own. The latest Seagate 18 GB Drives (ST318203) do not work at all ("oh, we haven't tried those yet"). Neither do the older Seagates (ST118202), despite tech support's assurances to the contrary ("We know they work, they've been tested"). Don't even get me started on the inconsistencies of what tech support claims ("we never told you that would/wouldn't work", usually as part of a "it's the drives, not the controller" refrain, and in direct contradiction to earlier statements, with the offensive implication that I am somehow lying about what I was told earlier.). Despite Seagate's extreme helpfulness, well above and beyond the call of duty, it looks grim. Unless you are feeling particularly masochistic, or enjoy having your project used to debug their hardware, I strongly suggest avoiding DPT for your project.
In fairness, there is a (slim) possibility that the controller is defective -- a replacement should be in later today or tomorrow. I will follow up with a note mentioning the success or lack thereof. If we are not successful getting the replacement DPT controller to work, we will probably return all of our DPT equipment and abandon fibre channel as too bloody a technology (still, even after two years) and look into a Mylex Ultra2-wide RAID solution instead.
The Future of Human Evolution: Autonomy
Subject says it all. After three weeks of struggle with this thing by tech support at DPT, Seagate, and myself, with all of us despairing of figuring out what was wrong, I did something I should have done much earlier. I swapped out the PC with another which was earmarked for another task, and which had a different motherboard. The Fibre Channel controller is now recognizing the drives and their capacity properly. 1+0 RAID arrays are building as I type this.
In light of the recent, extensive help the folks at DPT have given me, as well as the folks at Seagate, I retract most of my negative comments about their service, and all of the doubts I raised with respect to their product. It seems I had the misfortune of getting one grouchy tech on the day the question was posted to /. The other techs have been more helpful, and Jackie Wolf's efforts have been nothing short of heroic. Ditto for our mutual contact at Seagate. This project, which I was despairing of ever completing, is now back on track, albeit behind schedule.
In summary, the DPT Millenium Fibre Channel (dual port) controller is working fine with the Tyan DLUAN1836 Motherboard. It had problems with the ABIT BH6 Motherboard. Seagate disks work flawlessly, as long as they have the latest firmware (ST118202 v 0006 or later and ST318203 v 0002 or later).
The Future of Human Evolution: Autonomy
I am working with a major telecom company who has invested about $1 million in SGI hardware and configuration services, and I have to say that I am unimpressed, at least so far, with SGI's MediaServer thingie. Go with SUN.
If you go to sun's web site... there's some rather interesting news about their new servers. Apparently they're breaking some I/O records. I'm guessing this would be good to look into. Plus, don't the UE 10k's handle up to 128 processors? If I'm right... (not 100% sure) for the money you've spent... you could have a machine about 2-4x as powerful and about 256x more scalable... (depending on OS)
Linux is not 100% on big multi-processor machines. Big thing that was pointed out with the Mindcraft benchmarks was that Linux does not bind separate NICs to separate processors. The question is, can you bind your gigabit NIC to a different processor than your Fibre Channel setup? The major problem with Linux and multi-processor machines is that the kernel hackers don't HAVE quad processor Xeons :) And since only a small fraction of hardware support is paid and not just people writing drivers for what they have, unfortunately little work gets done on the big high-end machines :( Linux is not (yet) the answer for mega-machines like that, but for clusters of smaller machines, it's the way to go.
:)
Oh, and BTW, I realize I've said some things that may get me flamed (I mentioned the Mindcraft benchmarks!) but hey guys, I'm a Linux fanatic as much or more than most of you, but you gotta be realistic.
If I'm not back again this time tomorrow...
OK, you want video and you want TBs of the stuff.
You want it fast.
!!!! You don't want NT unless your hardware can get round the problem.
SGI is where it's at, people.
All the rest are like patches and fudges to make Linux do this, or research for the future, which is a good thing.
But SGI do this for a living.
THAT'S THE DIFFERENCE
a poor student @ bournemouth uni in the UK (a deltic so please dont moan about spelling but the content)
Err, no. Sun make sure that they get into unis so people know their hardware.
SGI make you pay, and well they should; if you want supercomputers, get an SGI.
OK, their software is top notch; you are just not familiar with it, that's all. An SGI user can walk up to it and expect it to be the same.
You have to learn a bit.
john jones
a poor student @ bournemouth uni in the UK (a deltic so please dont moan about spelling but the content)
"Often, the upkeep for Linux can be so much more expensive than NT"
What??? Care to explain that... what's the current per-hour rate at M$ for support?
You should be aware that Ext2 tops out at 1TB, though that may get fixed.
Also, when dealing with large Ext2 file systems, they take forever to fsck, mainly due to the number of inodes. You can help overcome this by reducing the number of inodes. For example, if you know that the average file is 10MB, then maybe only have one inode per MB (instead of one per 4K). You should also use sparse superblocks (which requires a 2.2 kernel).
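To put rough numbers on the inode advice, here is a small Python calculation. It assumes a 1TB filesystem (the limit mentioned above) and the 10MB average file size from the example; the bytes-per-inode ratio is the kind of value you would hand to mke2fs with its -i option. Treat it as back-of-envelope arithmetic, not a tuning recipe.

```python
KIB = 2**10
MIB = 2**20
TIB = 2**40


def inode_count(fs_bytes, bytes_per_inode):
    """Inodes created when one inode is reserved per `bytes_per_inode` of space."""
    return fs_bytes // bytes_per_inode


dense = inode_count(TIB, 4 * KIB)    # one inode per 4K
sparse = inode_count(TIB, MIB)       # one inode per MB
reduction = dense // sparse          # how many times fewer inodes fsck must walk

# Files that fit if the average file really is 10MB (assumption from the post).
files_needed = TIB // (10 * MIB)
```

One inode per MB still leaves about ten times more inodes than a disk full of 10MB files would need, while cutting the inode count by a factor of 256.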
Minor correction: EMC and Clariion are separate companies. With that in mind, my post makes a bit more sense. Hence, EMC's sales force is what you get when a competitor's system goes down. EMC's customer-service is what you get if EMC equipment has trouble.
Also, redundancy is the key to any high-end storage system. While the individual system will have redundancy (mirroring, raid, etc.), you also want remote mirroring to a separate storage system, possibly in another city for disaster recovery.
My obviously-biased viewpoint is that while you may have had some trouble with your EMC setup, in general, EMC is more reliable, faster, and more proven than solutions from any other company.
There are two places to look for Linux fibre channel support. First, at the University of New Hampshire, someone is the leading force behind fibre channel drivers. Second, at the University of Minnesota, the GFS project uses fibre channel for their Linux-based file system work.
Note, I work for EMC, which also makes large multi-terabyte storage systems.
You might be interested in GFS. It's a cluster filesystem. The idea is that a bunch of machines all talk to the same drive(s), instead of going through a single server.
http://www.globalfilesystem.com/
Generally, it is a good idea to have multiple servers in case one goes down, but it also makes sense to consolidate storage. Storage systems like the EMC Symmetrix or Clariion RAID arrays are designed to *never* go down. (When the Clariion does go down, the EMC sales force will be there. :) EMC can even do live upgrades of the embedded code. (Imagine upgrading to the next Linux kernel without rebooting!)
The point is that you can't get the performance and reliability out of a small storage system that you can out of the enterprise storage systems. Of course, in some cases, you can build a system based on replication of the data, which for static data may work, but often as not, when all the costs are factored in, you're better off with a consolidated system.
Part of the fun of Ask Slashdot questions is that they not only answer the original question, but they explore all the related tangents.
So if you're interested in fibre channel, multi-terabyte storage systems, media servers, and such, there's probably going to be a lot of interesting stuff here.
And as to your point, Linux may be a good solution if they can figure out the server architecture. They're already talking about using a cluster of servers, so the CPU power isn't a big issue. I wouldn't assume that not knowing about the level of fibre channel support indicates that someone is clueless about Linux in general.
Rule #1 - what is the limiting factor?
Bandwidth? Machine costs? or your time?
It really puzzles me how people wish to skimp on the hardware then set themselves up for later hassles and risk of (expensive) failure.
Let's take a look at a baseline video server from SGI, Origin 200 with MediaBase with say 100 seat configuration, web management tools + 200 Gbyte FibreChannel and network bits would cost about $50K minimum upfront (guesstimate here based on educational discounts and extrapolation of bits and pieces we've purchased over the years) + 10% maintenance/year. Extra for their FailOver system.
OK, now the hand-rolled Linux version. You need to look for
a) streaming software (Darwin?) for multiprocessor
b) decent high-end file system (port SGI XFS?)
c) tuning the sucker for the best SCSI and network parameters
d) video library management software (none as yet, perhaps someone port SGI OpenVault?)
e) system management to monitor the whole thing
Let's assume you've got a collection of genius hackers putting in one man-year's worth on each task, working for nothing except glory; you can probably get it done for $20K and 5 man-years' worth of pizzas and coke.
Cheap at that price.
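A rough sketch of that comparison in Python, for anyone who wants to plug in their own figures. The $50K, 10%/year, $20K and 5 man-years come from the guesstimates above; reading "10% maintenance" as 10% of the purchase price, the 3-year horizon and the $60K man-year rate are purely my assumptions.

```python
def sgi_cost(years, upfront=50_000, maintenance_pct=10):
    """Upfront price plus yearly maintenance as a percentage of that price."""
    return upfront + upfront * maintenance_pct * years // 100


def linux_cost(man_year_rate, hardware=20_000, man_years=5):
    """Hardware plus labour; a rate of 0 models hackers working for glory."""
    return hardware + man_years * man_year_rate


sgi_3yr = sgi_cost(3)
linux_for_glory = linux_cost(0)
linux_paid = linux_cost(60_000)        # hypothetical salary per man-year

# Labour rate at which the two options cost the same over 3 years.
break_even_rate = (sgi_3yr - 20_000) // 5
```

On these numbers the hand-rolled version only wins if a man-year costs less than about $9K.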
Rule #2 - If you don't know what you're doing, make sure you get damn good advice from people who've done it before.
Rule #3 - You pay peanuts, you get monkeys.
LL
IMHO:
You can rule out both Linux and NT right off the bat because PC hardware is not intended to do what you describe. It isn't that it can't do what you describe, it's just that it isn't going to do it reliably in any sort of cost effective way. Like previous posters I would suggest SGI, Sun, or maybe even IBM, Hitachi, HP, or a few others. Now you're into different OS because of the different hardware required.
Remember that PC's were/are designed as light to medium duty machines.
What you're doing sounds like something that's a lot of fun to do if you have a spare $quarter-million to tinker with. You'll be advancing the state of the art and I know you'll have a great time doing it. I would love to join you, it sounds like so much fun.
But what you're doing is also totally unnecessary. Maybe you need this thing for some big television or movie production but in that case you need the reliability of hardware designed to do this. But I have to agree with previous posts that your described intent doesn't justify the huge expense.....you could get by with lower cost solutions.
If you really are doing this just to advance the state of the art then the choice is simple: GO LINUX. You won't find Microsoft interested in helping you tweak NT to your specific needs on this project unless they get a major piece of the action. Nothing beats Linux for advancing the state of the art....and having fun while you're doing it.
Quit playing Monopoly with Bill. Switch to one of many non-Microsoft products today.
You might want to post some benchmarks (I know, lies, damned lies, and benchmarks) comparing CLARiiON FC boxes to comparable EMC boxes before posting something like that.
I understand DG carries a bit of a reputation with it, but CLARiiON arrays are generally considered top of the line for the market they are in. CLARiiON also gives some features EMC boxes just don't. They may not be as high end (8 GB cache might not be available), but they compete very well in their market. If CLARiiON has a fault, it's more with their PR departments than their product itself. [This URL shows how NEC recently broke some record for best performance on a specified benchmark using CLARiiON boxes for storage: Press release.]
Also, the previous AC who said CLARiiON was porting their drivers and management software to Linux, pending a port of a third party tool; I wasn't aware such an operation was officially underway. A search of CLARiiON's web page shows no reference to the word "Linux" anywhere.
-SDog
Not representing or approved by my company or anybody else.
It's always fun to concentrate on what Linux can do really well and employ that with absolute no-nonsense sweetness, but -
My view is that more discussion of what Linux CAN NOT DO is very important. All it will do is open the door for more support, more efficiency and more enjoyment for those that choose to use it. Maybe some pages should/could be (or already are) set up to note -
-supported/unsupported hardware.
-what Linux can/can't do, and its efficiency on that note.
I'm sure a central area like this could be of nothing but benefit for the ever so funky LinuxOS and the people that develop and use it.
If there is already something like this, slap me if you please and then leave a URL.
cheers
ORiON
- We seek not the answers, but to understand the question.
There are issues (at least I have read about them) with threading in the Linux TCP/IP stack. I do not know how big this issue is, so study it yourself. I am also aware that Ext2 (or the Linux kernel in general) has a maximum file size of only 2GB on a 32-bit computer. The only way I know around this is to use a 64-bit computer. With you handling multimedia, I would think you could run into the 2GB limit quickly.
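How quickly depends on the bitrate, so here is the calculation as a short Python sketch. It assumes the limit comes from file offsets being held in a signed 32-bit integer, and that "Mbps" means 10^6 bits per second; the 6 Mbps figure is a hypothetical higher-quality stream, not something from the original question.

```python
MAX_FILE_BYTES = 2**31 - 1  # assumed cause: signed 32-bit file offsets


def max_hours(bitrate_bps, file_bytes=MAX_FILE_BYTES):
    """Hours of constant-bitrate video that fit in one file."""
    return file_bytes * 8 / bitrate_bps / 3600


hours_at_1_2_mbps = max_hours(1_200_000)   # the 1.2 Mbps feeds from the question
hours_at_6_mbps = max_hours(6_000_000)     # hypothetical higher-bitrate stream
```

At 1.2 Mbps a single file holds just under four hours of video, so the limit only bites for long recordings; at 6 Mbps it is reached in well under an hour.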
While I think it would be really cool if you could pull it off, the potential of disaster here is huge. Your primary objective here should be to get this system to work, and work well. Look carefully at the technologies you have available. If NT or Sun is the best way to go then, swallow your pride, and go do it that way. Give Linux another few years to evolve and be 100% ready for this kind of task. By then they will want a new system anyway.
But if Linux is not ready for the challenge, and you use it anyway, you will make the OS _AND_ the project look bad, and many of the not-so-technical tech people will be wary of trying either again very soon. And, needless to say, this would give Microsoft more ammunition.
Sorry I can't give you much more than this.
FYI: I use Linux as my primary (only) OS at home and (before I graduated) was the administrator of my High School's Linux web server. I love the OS but we need to be careful not to say it can do things it can't.
END
You have a machine here that is way into six figures and you want to try Linux to save money? Guy, the cost of NT is negligible in your case. By all means investigate your options, Linux, NT, Solaris, etc, but don't be an idiot and try to save an extra 10 or 20 grand. Choose the best OS (whatever that is) and you'll end up with the cheapest solution.
This is not the greatest sig in the world, this is just a tribute.
Bullshit. NT crashes often and needs much more handholding than _any_ Linux server. Crawl back into your Redmond reality. Linux beats NT when it comes to reliability and cost-effectiveness.
I did some work for Clariion once, and I now work for EMC (because they bought the company I wanted to work for), and I have to say that I wouldn't consider them to be in the same market. Each has some features the other doesn't. Some of those features are useful, some aren't. For example, Clariion likes to make noise about "end-to-end FC" when it really doesn't matter (to a host) whether it's FC, SCSI, or duct tape on the far side of the array controller.
So what are the significant differences? EMC arrays scale to more disk capacity, more cache, more ports. EMC arrays support load balancing between concurrently-active array controllers better, though still not as well as I believe they could/should. (BTW, be very careful before you challenge that claim, since I wrote some of the original software to do this on Clariion, and within the organization that did so on EMC) EMC support is very highly regarded, to the extent that we're wary of even someone as good as HP "diluting our trademark". Think about that. Of course, EMC equipment also costs a bundle. As for performance, that's mostly a trackless swamp that I won't get into except to note that you can't talk performance without talking scale. In my experience, talking now about both storage systems and hosts, optimal low- or mid-range performance and high-range scalability are rarely found in combination. More often, some companies focus on winning the benchmark contests at the low or middle ranges - and they succeed - but don't even have products in the high range. That's not a criticism; it's a perfectly good and moral business strategy. In the end, the only valid benchmark is your workload.
I'm not going to make any recommendations for either EMC, Clariion, or any of our numerous competitors in either the disk-array or NAS spaces, though. Just offering some food for thought.
Slashdot - News for Herds. Stuff that Splatters.
Sorry to be wasting our time with this kind of foolish flamebait, but I must take issue with this. How much was D.A paid to post this? I suspect that MSFT will give you lots of money to be a jerk like that. Aside from FUD to people outside of the hacker community, this could be a tactic to demoralize hackers. It sure spoils my motivation to get out there and code. I wouldn't put it past MSFT. I guess he just wants to make a buck...
Aside from conspiracy theories, why do people post pure flamebait like this? D.A, do you actually feel this strongly that Linux sucks? Why? And, if you do, why do you bother to come here and post comments like this? Why do you hate people that don't know as much as you (assuming you have any idea what you're talking about)? The unlearned are to be educated, not hated.
Also, an attitude D.A expresses is:
"I'd rather him run NT and not risk more people seeing how much "linux sucks"..."
Why? If Linux sucks, why do you not want people to see that, and share your attitude? Any attempt to shield an audience from the reality of a product is really missing a major point of many developers: to give people choice over their software. Choice is not really choice, if you do not have real data to base your decision on! If Linux sucks, let them see it. If not, let them see it. There is no reason not to.
One more thing: you don't have to post flamebait. printf("rtfm: %s\n", location_of_document);
Thomas
-- rm -rf / tells you if you have root or not
>While I agree with you about not covering up, it
>is also not a good idea to promote Linux for
>something it is not good at.
True... I hadn't construed his comment that way. I suppose consumers might see that Linux is not good for streaming video, and decide that it must be good for nothing, and that would be bad...
Interesting issue. I thought (and I bet he did too!) that it was just flamebait!
feh.
Thomas
-- rm -rf / tells you if you have root or not
You are not going to beat an SGI Origin 200 at price/performance on an application like this. This system was designed for exactly this type of continuous, high throughput data serving. SGI already uses Clariion in their Fibre Channel solutions, and has a very solid Gigabit ethernet adapter available. SGI's XFS filesystem has a 64-bit journaled architecture which can address filesystems in the thousands of TBs.
The O200 starts at $10-12K, so you could use FailSafe to have redundant services available.
Hmm, using that logic, everyone should still be buying just IBM solutions. IBM is still a much larger company and spends much more than Microsoft. :)
Everything is relative and specific to each case. For us, I looked into Exchange to support our 25,000 user mailboxes. The client access licenses alone -- even with our huge educational microsoft select pricing -- is over $100,000. The hardware required to run an Exchange server that size is out of this world. Hell, you're not even supposed to have 25,000 users in a single NT domain no matter how big the box is.
Contrast that with a simple unix/linux/*ix box running sendmail for peanuts.
All of us should try and keep bigotry and prejudices behind us. There is no ideal solution that will fit every need.
The annual Siggraph show is starting next week (or is it this week) in Los Angeles. In addition to all the 3D stuff, there are lots of booths with video editing and serving in mind. I think the cost for the 3-day show is $89.00 US. Go to www.siggraph.org to find out more.
I was there last year at the show in Orlando. There were over 12 vendors demonstrating media serving technologies.
We have a .5TB netapp here. It is used for logs, home dirs, databases, ....
I'd bet that with the dedicated hardware, a netapp with a GigE interface would spool data to the net as fast or faster than a {Sun|NT|Linux} box would, regardless of caching. The dedicated hardware, and an OS that fits on a floppy would just plain do a better job.
However, you are still running RAID[45]. These have adequate read performance, but cannot exceed the speed of a single mirror. Since the server is to be primarily read dominated, RAID0+1 (striping and mirroring) may be a better choice. It still provides data integrity, and much better speed in this application. Although you need more disks to achieve a certain size volume, the controller hardware is cheaper, and generally more reliable.
With any decent stripe/mirror implementation, each mirror can satisfy read requests independently. Having 4 mirrors is like having 4 independent copies of the data. Need more bandwidth to the files, add another mirror....
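A toy model of that claim in Python: every mirror holds a full copy, so reads can be handed out in rotation and aggregate read bandwidth grows with the mirror count. The 20 MB/s per-mirror figure is a made-up number for illustration, the linear scaling is an idealisation, and the class is hypothetical rather than any real RAID driver's interface.

```python
from itertools import cycle


class MirroredVolume:
    """Toy model: every mirror holds a full copy, reads rotate across mirrors."""

    def __init__(self, n_mirrors, mb_per_sec_per_mirror):
        self.n_mirrors = n_mirrors
        self.rate = mb_per_sec_per_mirror
        self.reads = [0] * n_mirrors          # reads served by each mirror
        self._next = cycle(range(n_mirrors))

    def read(self):
        # Any mirror can serve any read, so pick them in round-robin order.
        mirror = next(self._next)
        self.reads[mirror] += 1
        return mirror

    def aggregate_read_rate(self):
        # Idealised: read bandwidth grows linearly with the mirror count.
        return self.n_mirrors * self.rate


vol = MirroredVolume(n_mirrors=4, mb_per_sec_per_mirror=20)
served_by = [vol.read() for _ in range(8)]
```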
Bottom line: If you want RAID4/5, and NFS or CIFS, the netapp is the way to go. It will be faster than any general purpose box with comparable equipment. If you want RAID0+1 or any other protocols, go somewhere else.
However, the original note specified GigE as the network connection with ~144Mbit active load. Just about anything can supply that load without a problem. Unless you are planning to up the network bandwidth to 500+mbit, this discussion is headed in the wrong direction.
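The ~144Mbit figure is just the feed count times the per-feed rate from the original question. A quick Python check, using whole kilobits to avoid floating-point noise and assuming 1 Mbps = 1000 kbps:

```python
FEEDS = 120
KBPS_PER_FEED = 1_200          # 1.2 Mbps per MPEG feed
GIGE_KBPS = 1_000_000          # nominal Gigabit Ethernet rate
FAST_ETHERNET_KBPS = 100_000   # nominal 100baseT rate

load_kbps = FEEDS * KBPS_PER_FEED
gige_utilisation = load_kbps / GIGE_KBPS

# Upper bounds at 100% saturation, which no real link will reach.
max_feeds_gige = GIGE_KBPS // KBPS_PER_FEED
max_feeds_100bt = FAST_ETHERNET_KBPS // KBPS_PER_FEED
```

So the server-side GigE link sits at roughly 14% of nominal capacity, while a single 100baseT segment tops out at 83 feeds even in theory.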
Eric Brandwine
An engineer is a person who solves a problem you did not know you had in a way that you do not u
I haven't used it, but Intel sells their AC450NX "Server Platform", a quad Xeon board sporting five 64bit PCI slots and two 32bit buses for six 32bit slots. It also supports 8gig of ECC (isn't that architecturally impossible?)
Yours for the low, low cost of an internal organ of your choice.
http://www.intel.com/design/servers/ac450nx/index.htm
--- Bigger bits, softer blocks, tighter ASCII.
First, get a quote from Hitachi Data Systems for their 7700e open systems platform. You will be amazed at its price, redundancy, and scalability. It supports up to 16GB of cache, and scales to 6.5 TB. I saw a quote for one of these that included support and the management software for about $250K (Not including much disk). It has a large initial footprint, but scales by adding disk cabinets, so it isn't THAT big for the top end of scaling.
Second, NT and Linux are both bad ideas for this kind of large file, massive network connectivity environment. (I love Linux, but 31bits of addressing is very limiting on large servers.)
NT is bad because:
If you find that your hands are bound with using NT, then look for alternative FileSystems. Veritas might be a good place to start looking. I don't know if they have NT FS, but one of their account executives could point you to someone who does.
You could use HP/SGI/SUN/DG. I know that HP is expensive, but its management tools are nicer than SUN's. SUN has nice/easy hardware, but if you don't have in depth UNIX SA experience, you will have problems making the box admin friendly. I haven't worked with SGI and DG enough to comment, but I know that they would meet the requirements.
Lastly, you might be much better off looking into the client server communication protocols involved and seeing if it might be possible to use many hosts behind a CISCO Director or a BIG-IP box, (load balancing, redundancy through the IP layer).
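As a sketch of what "many hosts behind a load balancer" means for the streaming case, here is a least-loaded dispatcher in Python. It is not how any particular commercial box works; the host names are hypothetical, and the cap of 75 feeds per host borrows the ~75 feeds per 100Mbps NIC estimate made earlier in this thread.

```python
class StreamBalancer:
    """Assign each new feed to the least-loaded host that still has room."""

    def __init__(self, hosts, max_feeds_per_host):
        self.max_feeds = max_feeds_per_host
        self.load = {host: 0 for host in hosts}   # host -> active feeds

    def assign(self):
        host = min(self.load, key=self.load.get)
        if self.load[host] >= self.max_feeds:
            return None  # every host is full; the feed must be refused
        self.load[host] += 1
        return host

    def fail(self, host):
        # Dropping a host takes it out of rotation; its feeds must reconnect.
        return self.load.pop(host)


balancer = StreamBalancer(["stream1", "stream2"], max_feeds_per_host=75)
assigned = [balancer.assign() for _ in range(120)]
overflow = [balancer.assign() for _ in range(31)]
```

With two hosts the 120 target feeds split 60/60; the pair can take 30 more before the 151st request is refused, which is the point where you add a third host.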
Just thought you might want to make sure your company doesn't care about maxing out ram.
Windows NT currently supports 4 GIGs of RAM.
Linux only supports 1 GIG... but I have heard some people say that it supports 2.
For a high end video server, RAM is just as important as HD space or CPU power. NT may just be the better choice in this situation.
BTW, using NT for this solution won't increase your overall costs as much as you may think.
Often, the upkeep for Linux can be so much more expensive than NT (if you build your machine correctly) that it more than covers the initial cost of the OS.
Lemme see... you're getting this huge system, Quad Xeons and all that... and you're worried about the cost of NT as opposed to Linux?
...Student, Artist, Techie - Geek *
Hmm...
Mong.
* Paul Madley
*...Slacker, Artist, Techie - Geek *
Remember: Nothing is Cool.
>>Software crashes usually aren't OS related. Windows terminates applications that start to do bad things.
Um, I don't think so. From my experience, this mechanism has a success rate of about 25% in NT. This is one thing that I love about Linux: I really can kill a rogue application ( Netscape comes to mind ) without rebooting the machine. There are several times when an NT server just froze cold: no BSOD, no error messages, nothing. Now in some cases it was a hardware problem like bad RAM, but at least with Linux, it gave me some idea of what went bad. With Linux, I have had 2 panics in the past year on about 3 machines. In one case, I accidentally disconnected the IDE cable from the disk that held the swap partition; in the second, I had a bad root disk on a Slackware install....
Draw up a design document with all your requirements and then get talking to solution providers (be it IBM, SGI, Sun, etc etc etc or 3rd parties) and get them to present you with proposals and costs of their solutions... If what you are trying to do is something relatively new or is of a much bigger scale than done elsewhere, many of the big players will probably come in with a nice offer in return for being able to say... 'you want solution X, we know how to do it on a very large scale as demonstrated on site Y'.
The chances of getting that sort of help sound reasonable from the scale to which you are looking, and they will also probably give you a nice support package and good response as their pilot.
If you want to put in a system that lasts, go pro, otherwise it'll be you that won't last from spending too much time fixing/tweaking/everything else on the system.
just my 2p worth...
-~ Given a choice between two theories, take the one which is funnier. ~-
Buy ten small boxes and you've just increased the chance that your system will go down by a factor of ten. I wouldn't look forward to doing any sort of driver, BIOS, or software update on the 10 small boxes let alone take each one apart and upgrade memory or processor.
:]
I also think it will be difficult to come up with a simple Redundant Array of PCs (RAP?) solution where any single system can fail and the others will take over seamlessly. However, if you achieve this in a few lines of perl please post the source.
My 2 cents - buy that big fancy server. Pay extra for features like redundant power supplies, fans, and RAID 5. Get a rack mount. Add capacity by adding RAID enclosures.
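The "factor of ten" estimate above can be checked with a quick Python sketch. It assumes each box fails independently with the same probability over some period; the 1% figure and the function name are purely illustrative, not measured.

```python
def p_any_failure(p_single: float, boxes: int) -> float:
    """Probability that at least one of `boxes` independent machines fails.

    Assumes every box has the same failure probability `p_single`
    over the period of interest (an illustrative simplification).
    """
    return 1.0 - (1.0 - p_single) ** boxes


if __name__ == "__main__":
    p = 0.01  # assumed per-box failure probability, for illustration only
    one_box = p_any_failure(p, 1)
    ten_boxes = p_any_failure(p, 10)
    print(f"one box:   {one_box:.4f}")
    print(f"ten boxes: {ten_boxes:.4f}")
    print(f"ratio:     {ten_boxes / one_box:.2f}x")
```

For small per-box probabilities the ratio comes out a little under ten, so the rule of thumb holds as long as any single failure counts as the system being down. With working failover the picture changes, which is exactly the part that is hard to build.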
At Enterprise Rent-A-Car we have 128 x 5.3Tb storage systems and they are powered by Sun Solaris and NT. NT does just fine. The data never gets lost for any reason at all. If a disk goes down the system automatically picks up a spare and rebuilds the data with no problems. Our storage systems are made by SUN ( or that's what it says on the massive cabs ). Our data warehouse is one of the largest in the USA and it runs on Solaris and NT just fine.
Clariion will be porting all their stuff to linux as soon as we get a linux port from RogueWave.
I used to be worried that I was apathetic, but I just don't care anymore.
I have been designing systems like the one you are attempting for a while now and must admit that making Linux do what you want it to is a beautiful thing. Unfortunately, this will depend on how much work you are willing to put into tweaking it into shape. I often get tempted time and again to cut my costs with PC technology, but in the end... I wuss out and go with midrange (E450 - E4500 ish) Sun boxes and Veritas Cluster Server. It's gawdawful expensive, but when the customer doesn't care about $, it goes. If you have time to tweak though... and limited cash... then PC may be the way, and Linux has the reliability. Just remember, a PCI bus and a fast processor alone do not a bandwidth-slinging server make, no matter what Intel tries to tell you..!
I'm not sure if the intended output is for a desktop PC or a television. If television is the destination, you may want to look at a box made for serving video. Yes, some solutions are on SGI. Avid uses them as well as some others. Quantel, Sony, Pinnacle, Tektronix all make servers. A lot of these boxes are able to mix and match different formats. Most offer uncompressed, MPEG2 and DV formats.
How about some web sites?
http://www.tvbroadcast.com
http://www.videography.com
http://www.tvtechnology.com
Scott
It's mandatory to wash your hands before returning to the land of Dairy Queen.
I'm surprised no one else has mentioned going toward a Mac platform for what you want to do. You mention that you want to stream video, but don't specify the format that you want to do it in.
.ASF - Uh, yeah right...
.RM - Real may be the standard but it is a server hog and the cost for the server software... Yikes!
.AVI - We're talking streaming here...
For streaming seriously look at the Quicktime format. The server software is only $500 or so and runs like a dream. Everyone can use it and there are plenty of encoding solutions out there to make QT files to stream out.
Our current solution where I work, for doing what you propose, is to film everything in DV and encode it in a render farm consisting of 4 G3 computers with BlueICE cards, all fiber channeled to 360GB of drive space.
If you're looking for quality streaming video at a great price, seriously look into getting a couple of G3s. You don't need to spend thousands of dollars on multi-processor NT/Linux solutions when an inexpensive non-MS platform already exists doing exactly what you need.
This discussion has been pretty informative on how to build a high BW server, but I'm not completely convinced that this brute force method is the best solution for what is essentially a TV broadcast ( but with interactive designs ).
Why not use Multi-cast video streaming with channels that repeat every 10 minutes or so ( kind of like pay per view with 4 channels that break into half hour blocks ). This way you have a fixed number of connections that scales with the number of titles, not the number of users.
Thus you have clients request a title, and on the next 10 minute block, all users that have requested that title ( or more specifically a 10 minute section of a title ) will be multi-casted together. The clients can be set up with varying sized caches ( based on their free disk space ), so if they need to pause or replay a section, BW doesn't have to be eaten ( and while they are pausing, they are still downloading future portions of the movie along with anyone else. ) I'm not sure how well a mainstream player such as RealPlayer, media player or quicktime would work ( not familiar enough with them ), but if this is an educational project, a custom built system shouldn't be too hard to tackle. ( though a mainstream ( no pun intended ) product would be easier to maintain ).
There are many variations that could be played with. The big problem that I see with brute force is that if you have 5 guys in a hall downloading the same 1.2Meg/s file, the BW is going to be too saturated for the rest of the hall downloading porn ( in jest ). Perhaps there is a switching configuration which would allow this media to be completely independent; I'm not experienced enough in that area, but it definitely doesn't work on a generic shared network.
The cool thing about this sort of system is that you can easily divide and conquer. If your 150 connects are all independent titles, you can cluster them ( one, two or three titles per server ). Here you could easily take advantage of cheap, stable Linux and x86 and build 50 or so $3-400 boxes for $15,000 - $20,000.
You could stick with simple UDMA IDE drives ( $150 a piece ). You could even load directly off a DVD player ( $90 a piece ). Down time would be minimized and localized. ( tape restore to a new drive if one fails, and you only lose the title(s) local to that disk. Or not at all if using DVD ).
Titles that are more popular could have enough RAM installed to cache most of the system ( aside from the mentioned 1-2G limit on Linux ).
Because broadcasts are in blocks, they can be loaded as contiguous chunks ( 1.2Mbps * 600s = 720Mbit, or 90MBytes per feed ), and can easily be cached in memory. This removes the need for _any_ high performance disk hardware ( including SCSI ), since it would only take 9 - 36 seconds to load a chunk into memory.
A 2 hour video clip would only have 12 segments, and thus your 150 titles would require a maximum of 1,800 channels. If properly switched and distributed, each machine would only need 12 - 36 channels or 14.4Mbps - 43.2Mbps, which could be handled by a 100BaseT NIC rather comfortably. Thus all generic parts seem to suffice.
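For anyone who wants to check the segment arithmetic, here is a rough Python sketch. The feed rate, block length, title length and title count come from the post; the disk rates of 10 and 2.5 MB/s are only an assumption about what would give a 9 to 36 second load time, and all the function names are made up for illustration.

```python
# Back-of-the-envelope numbers for the 10-minute multicast block scheme.

FEED_MBPS = 1.2       # MPEG feed rate in megabits per second (from the post)
BLOCK_SECONDS = 600   # one 10-minute block (from the post)
TITLE_MINUTES = 120   # a 2-hour title (from the post)
TITLES = 150          # number of titles (from the post)


def segment_megabytes(feed_mbps=FEED_MBPS, seconds=BLOCK_SECONDS):
    """Size of one block in megabytes, at 8 bits per byte."""
    return feed_mbps * seconds / 8


def segments_per_title(title_minutes=TITLE_MINUTES, block_seconds=BLOCK_SECONDS):
    """Number of blocks needed to cover one title, rounding up."""
    total_seconds = title_minutes * 60
    return -(-total_seconds // block_seconds)  # ceiling division on ints


def total_channels(titles=TITLES):
    """Worst case: every segment of every title is being multicast at once."""
    return titles * segments_per_title()


def machine_mbps(channels, feed_mbps=FEED_MBPS):
    """Outbound bandwidth one machine needs for a given number of channels."""
    return channels * feed_mbps


def load_seconds(disk_mb_per_s):
    """Time to read one block from disk at an ASSUMED sustained rate."""
    return segment_megabytes() / disk_mb_per_s


if __name__ == "__main__":
    print("MB per block:       ", round(segment_megabytes(), 1))
    print("blocks per title:   ", segments_per_title())
    print("channels, worst case:", total_channels())
    for ch in (12, 36):
        print(f"{ch} channels need {machine_mbps(ch):.1f} Mbps")
    for rate in (10, 2.5):  # assumed disk rates in MB/s
        print(f"load at {rate} MB/s: {load_seconds(rate):.0f} s")
```

With those inputs a block works out to 90 MBytes, a 2-hour title to 12 blocks, and 150 titles to 1,800 channels in the worst case.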
The only penalty I can see is that a user would have to wait a maximum of 9.9 minutes before their feed begins. I think this is a fair trade-off, given it's one step closer to interactive TV.
-Michael
Uh "ok, sure, you're smart".
Think about it. Who makes Microsoft software? Microsoft the company or Microsoft engineers? All Microsoft employees are college grads, many of Microsoft's researchers are PhDs.
You think a 20something sitting in a basement could do and think up things better? I've NEVER NEVER seen anything original come out of Linux. Eventually, with a lot of wasted effort, Linux hackers will reinvent Unix - a bloated Unix.
When MS want to make something, they spend a lot of time planning how the APIs etc should be arranged, rather than just hacking it so it works. If you're a programmer, you'll understand how you can just DO IT to make it work for you, or DO IT the longer way and make it easy to maintain.
With the release of Windows 2000 (RC1 is out, and Beta3 is public) you'll realise that Windows NT is ahead of Linux in so many areas. Microsoft does what it does for profit, but also cause Gates has a fascination that many of us share for "cool" technology stuff - why do you think he's got a house built on computers?
Windows 2000 already supports 32 processors, 64GB memory, has a solid and proven distributed application foundation, and is very fast compared to Linux on mid-high end computers. Sure, Linux can run on old 386 machines, but as long as you give Windows that 'initial' helping of resources, it goes up from there.
Remember, until Linux came along, MS were basically the cheapest solution compared to "the others (sun etc)" who were charging an arm and a leg. Now Microsoft comes along with something with the potential to match Unix, and everyone bashes MS for price??
I'm sorry, but you have to be more specific. What exactly about the win32 api is "shit" to you?
Poorly documented? Hrm, get a copy of the MSDN CDs, go out and buy a book on the Win32 APIs or go to msdn.microsoft.com and read, or ask questions.
It's the most complete API I've used.
I'm going to be pretty quick cause I don't think you understood my intention.
:P
An NT solution *could* and most likely will get into 6 figures with your arrangement, but this is opposed to the 7 figures of traditional Unix solutions. If you get it from someone like IBM you can get 99.99% type assurance etc etc.
I don't think that everyone should run around (and they have been for years) yelling Linux is better! Linux is faster! Linux is better designed! Linux is based on 30 years of good computing (unlike NT which didn't learn from Unix) - oh but Linux isn't Unix - it's better.
NT is good for many solutions, as is Unix. Linux however is still immature, and I don't like the programming model. Too unset.
And I've lost more than a *wee* bit of data with Linux - which is why I wouldn't use Linux for a data or file server. Try letting linux play with the harddisk and then turning the machine off and on a couple of times. Eventually, Linux won't be able to boot, and you'll have to do a manual fsck, which sometimes doesn't work.
Windows NT/2000 doesn't mind it at all.
And Microsoft has BILLIONS, but they don't spend ALL of that on Windows. Most of it goes into theory and experimental research. I'm glad someone is spending that kind of money on that sort of stuff these days.
Microsoft has billions. Microsoft spends all those billions developing Windows. That doesn't make sense.
As for that problem with the PPC, bad luck, but it looks like a problem with Pocket Streets rather than WinCE. Software crashes usually aren't OS related. Windows terminates applications that start to do bad things.
Ok, so i wasn't that brief
NT has proven to be faster than Linux with SMP, and it also works very well with TB databases thank you :P
From what I've heard, 100baseT has only been able to sustain transfer rates of 45mbit / second max. I have not seen any systems where the 100baseT does not start to max out at this rate. It's worth keeping note of because you will likely not be able to stream more than 35-40 1.2mbit streams simultaneously over a single fast ethernet connection without multicasting.
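A quick way to sanity-check that stream count, as a Python sketch. The 45 Mbit/s figure is the hearsay number from this post, not a measurement, and the function name is just for illustration. Rates are kept in whole kilobits to avoid floating-point rounding.

```python
def max_streams(usable_kbps: int, feed_kbps: int = 1200) -> int:
    """Whole number of constant-rate unicast streams that fit in the usable bandwidth.

    Ignores protocol overhead and burstiness, so treat the result as an upper bound.
    """
    return usable_kbps // feed_kbps


if __name__ == "__main__":
    # 45 Mbit/s sustained, as claimed above (an unverified figure)
    print("at 45 Mbit/s: ", max_streams(45_000))
    # theoretical 100 Mbit/s wire rate, for comparison
    print("at 100 Mbit/s:", max_streams(100_000))
```

At 45 Mbit/s that gives 37 streams per NIC, against 83 at the full 100 Mbit/s wire rate, so the 35-40 range is consistent with the claimed ceiling.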