Large-Scale Video Archiving?
BondHeadGuy asks: "Ok, say you have 1000+ cameras emitting 30 frames/second worth of 640x480 grayscale video...and you have to store it indefinitely. What do you do? This is a real question, believe it or not. 30 frames/s * 300 KB/frame = 9 MB/s per camera. 100:1 video compression brings that down to ~90 KB/s. But 90 KB/s * 1000 cameras = 90 MB/s, or ~8 terabytes/day. Retrieval, though, can be essentially arbitrarily slow. Reliability should be good enough to not be annoying long term. Is there a solution that: has 8 TB/day storage capacity, can handle the 90 MB/s write speed, and lets you save some bucks on the (slow) read side?"
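For anyone who wants to check the arithmetic in the question, here is a quick sketch. It assumes 8 bits per pixel for the grayscale frames and decimal units (1 KB = 1000 bytes); neither is stated in the question, so treat both as assumptions.

```python
# Back-of-the-envelope check of the data rates in the question.
# Assumptions (not stated in the question): 8-bit grayscale, decimal units.

WIDTH, HEIGHT = 640, 480
BYTES_PER_PIXEL = 1           # assumed 8-bit grayscale
FPS = 30
CAMERAS = 1000
COMPRESSION_RATIO = 100       # the 100:1 figure from the question
SECONDS_PER_DAY = 24 * 60 * 60

frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL               # bytes per frame
raw_per_camera = frame_bytes * FPS                           # bytes/s, uncompressed
compressed_per_camera = raw_per_camera / COMPRESSION_RATIO   # bytes/s after 100:1
aggregate = compressed_per_camera * CAMERAS                  # bytes/s, all cameras
per_day = aggregate * SECONDS_PER_DAY                        # bytes/day, all cameras

print(f"frame size:            {frame_bytes / 1e3:.0f} KB")
print(f"raw per camera:        {raw_per_camera / 1e6:.1f} MB/s")
print(f"compressed per camera: {compressed_per_camera / 1e3:.0f} KB/s")
print(f"aggregate write rate:  {aggregate / 1e6:.0f} MB/s")
print(f"daily volume:          {per_day / 1e12:.2f} TB")
```

The results land within a few percent of the rounded figures in the question (300 KB, 9 MB/s, 90 KB/s, 90 MB/s, ~8 TB/day).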
Although you are probably looking for a digital solution, don't overlook the solutions that already exist. Security camera VCRs (available at RadioShack et al.) can put 24 hours (or more) of video on a single VHS tape. Get a few VCRs (at $200 each) and a pallet of VHS tapes at Sam's Club, and you could record all the video you want!
Like a security camera in a stairwell or something? In that case you can use motion detection to start/stop recording and save well over 100:1. The choice of video codec is going to be important if it's for security (so faces, etc. can be recognised), but if not, you can crank the compression ratio up quite high on most codecs, especially the video codecs that do frame-by-frame motion differencing (i.e. not MJPEG).
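A minimal sketch of the motion-triggered recording idea, assuming frames arrive as equal-sized `bytes` objects of 8-bit grayscale pixels. The function names and the `threshold` value are made up for illustration; a real system would tune the threshold and probably compare against a slowly updated background rather than just the previous frame.

```python
def mean_abs_diff(prev: bytes, cur: bytes) -> float:
    """Average per-pixel absolute difference between two same-sized frames."""
    if len(prev) != len(cur):
        raise ValueError("frames must be the same size")
    return sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)


def frames_to_record(frames, threshold=2.0):
    """Yield the indices of frames that differ enough from the previous frame.

    The first frame is always kept so there is a reference image.
    `threshold` is a hypothetical tuning knob in gray levels per pixel.
    """
    prev = None
    for i, frame in enumerate(frames):
        if prev is None or mean_abs_diff(prev, frame) > threshold:
            yield i
        prev = frame


# Tiny synthetic example: a static scene, something appears, then it leaves.
empty = bytes(16)            # 16 black pixels
occupied = bytes([50] * 16)  # the same pixels, brighter
frames = [empty, empty, occupied, occupied, empty]
recorded = list(frames_to_record(frames))
print(recorded)
```

Only the frames where the scene changes get kept; the idle frames in between are dropped, which is where the extra savings over plain codec compression would come from.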
Be pragmatic and only archive 15fps. This cuts your archive media costs by ~50% no matter what solution you choose. 15fps should be adequate, although who knows your exact parameters.
A suitably large DLT library with a fairly large number of drives would probably do this. Couple it with some HSM (Hierarchical Storage Management software) and you're probably all set.
In terms of sizing, assuming you get 6MB/s per DLT drive, you'll need at least 15 drives. Go for 20. This gives you room to do cutovers, and the like. I'd recommend fronting this with a LARGE disk for scratch space (preferably solid state, but if that's not in the budget, a big old SCSI disk'll do.) You'll need a pretty hefty server to handle all this (at least a pair of Sun E450s for redundancy). You'll also chew through at least 200 tapes a day at a native capacity of 40G/tape.
HOWEVER, this is by no means cheap. The very fact that you're talking about 8 terabytes a day should be a clue to that. The sort of tape archive, tape supply, and tape library you'll need is... vast. You're talking very high-end hardware here. You'll need a good cataloging system, and some serious software to maintain all this. You'll need to keep about 75% of your drives streaming all day, every day. Tape costs alone will run to about $10k/day, let alone electricity, storage, maintenance and initial outlay. I'd venture a project like this is probably a $15 million outlay to do it right, with at least 2 full-time support staff and a budget on the order of $40k/day. But if you've got the money, go for it.
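The sizing above can be reproduced in a few lines. All inputs are the figures from this post and the original question; the per-tape price is not stated anywhere and is only what the $10k/day tape budget would imply.

```python
import math

AGGREGATE_MB_S = 90         # write rate from the question
DRIVE_MB_S = 6              # assumed sustained rate per DLT drive (from the post)
TAPE_GB = 40                # native capacity per tape (from the post)
DAILY_TB = 8                # daily volume from the question
DRIVES_INSTALLED = 20       # the recommended drive count
TAPE_COST_PER_DAY = 10_000  # the post's daily tape budget, in dollars

min_drives = math.ceil(AGGREGATE_MB_S / DRIVE_MB_S)
tapes_per_day = math.ceil(DAILY_TB * 1000 / TAPE_GB)
utilization = AGGREGATE_MB_S / (DRIVES_INSTALLED * DRIVE_MB_S)
implied_tape_price = TAPE_COST_PER_DAY / tapes_per_day  # derived, not quoted

print(f"minimum drives:      {min_drives}")
print(f"tapes per day:       {tapes_per_day}")
print(f"drive utilization:   {utilization:.0%} of {DRIVES_INSTALLED} drives")
print(f"implied tape price:  ${implied_tape_price:.0f}")
```

With 20 drives installed, the 90 MB/s feed keeps 75% of them busy, which matches the "75% of your drives streaming" figure.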
100:1 video compression brings that down to ~90 KB/s.
Very interesting problem, with one more very interesting challenge that hasn't been raised yet:
Because the video is streaming in 24/7, you'd have to build a real-time compression system that could handle the 9MB/s and produce a 100:1 ratio. You could perhaps distribute that across multiple machines/CPUs, or build a custom parallel hardware setup to handle the encoding, but at this scale, the overhead of everything might prevent you from reaching the essential criteria of real-time.
Does anyone know what the hardware requirements are for real-time encoding one 640x480 stream? Now, multiply by 1000.
Since you did not state a retrieval time or storage/retention needs, I am going to offer two scenarios: one for long-term, fast-access storage, and one for short-term and/or slow-access storage.
Storing 8TB/day for a long time with quick access would probably require a tape silo, which is essentially a tape library the size of a small house. StorageTek is one of the leaders in silos (and might be the only vendor making them these days), and they make some pretty nice stuff. Their PowderHorn 9310 is a nice model for bulk storage and quick recovery. A downside to the silos is that they do not often handle DLT tapes, which can make it hard to use tapes outside of the library.
If you do not need fast access to the data, and have time to root through tapes for restores, just get a smaller tape library. Anything in the 50-100 tape range from ATL/Quantum, ADIC, or Qualstar running SuperDLT drives controlled by Veritas NetBackup would give you an easy way to handle all the data. NetBackup has excellent archiving capabilities (i.e., record data, wipe data from disk), works on just about any platform out there, scales well, and keeps files in GNU tar format for easy access. As for storing the tapes themselves, if you have a short retention time, just keep a few hundred tapes around to cycle through. If you need to store the data for a long time, get a few thousand tapes and a set of nice shelves to keep them on. If you do not have somewhere to store them, Iron Mountain does a great job storing data; I have worked with them before and toured one of their facilities, and I can vouch for them.
I've looked into almost this exact problem (we had about 100 hours of full color video/day - broadcast quality)
You're going to have to get VERY friendly with your local "Storage Area Network" vendors. What we came up with as the best SHORT-term solution was this: store the video on videotape or DVD (depending on quality requirements - DVD is NOT broadcast quality), and then use multiple players - things like DVD jukeboxes/tape changers. They can be loaded either manually or by a robot. You then use a cache to store the video on a last in/last out basis if you need fast playback (assumption here: the most recently used tapes are most likely to be used again).
Encoding isn't that bad a problem - you just use multiple encoding stations. You say you have 1000 cameras, so you're probably going to need more than 1000 encoding stations (don't forget spares). You batch up 1/2 hour (for example) files and write those out to the SAN when you're done. While one station is encoding, the next is recording, and you batch the encoded files up into near-line storage, so you don't NEED real time.
Storage is going to take space and money, BIG MONEY - you're talking about 30 DVDs worth of data/day depending on your robots. Figure 1000s/day.
Charlie
-- 73 de KG2V For the Children - RKBA! "You are what you do when it counts" - the Masso
Obviously this is some sort of security system that watches a large amount of space. So we are talking either a Casino or park of some kind. If not, then these are the people to ask.
:)
Also, is keeping all of the footage forever a requirement? Or just some of the footage? I would think you may want to keep the footage for a couple of days or weeks at most. If something requires footage to be kept long term, then you would move that from the hard drive to CD-ROM or DVD.
This is a job for a cluster of iMacs if I ever saw one
This is not the sig you are looking for...
This is actually a pretty easy question to answer:
Don't Do It.
This is someone either playing a theoretical game (in which case, the answer is "outsource it") or it's someone who has no idea what they really want. You have, ultimately, many conflicting specs here.
You may as well ask for a space shuttle that can fly to Pluto in two minutes with no fuel.
Any system that is recording a thousand video inputs is unlikely to need 30 fps for 24/7 (I can't think of anything short of national security installations that would even desire to record 30 fps 24/7, and you'd still have trouble justifying 1000 cameras to cover every building in Washington, DC). Not to mention the logistical implications of DELIVERING 1000 full-frame video feeds to a central location -- you could saturate the entire radio spectrum for the eastern seaboard or have to build the largest gigabit LAN ever deployed.
If you have a real question, please ask it, but this is as bad as a pointy-headed boss spouting off insane specs as the "requirements" for a project because he wants to be on the cutting edge.
And BTW, you won't need 300k per frame for a grayscale 640x480 video image (except that you desire insane specs, which point we've already covered). A fine quality image could be stored in 25-50k, even less depending on the real needs (of which this project seems to lack).
Recursive: Adj. See Recursive.
The Casino industry is probably the most advanced in the business of surveillance... the average Vegas casino probably approaches the scale you're talking about already; however, they probably don't archive indefinitely.
However, any information I've seen shows them to still be mostly analog capture for any storage, or at least digital-to-analog conversion for storage.
Although they probably won't talk about their security systems, they'd be a great resource.
MadCow.
I used to have a sig, but I set it free and it never came back.
The guy who wrote the post is working on face-recognition technology. (thanks for researching him, cwhittenburg)
This is exactly the type of thing that could be done, but shouldn't. Stay away from this; we can only hope that any competent people will stay away from it and they will never get it to work.
Sleep with one eye open.......
Why 640x480? That's higher resolution than broadcast TV. Do you need that? Broadcast TV is 460x360. Capturing at that resolution will lose you detail, of course, but if it's detail you can lose, your storage requirement just dropped by about 46%.
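The saving from a lower capture resolution can be checked from the pixel counts alone. This sketch assumes storage scales linearly with the number of pixels, which holds for raw frames and only roughly for compressed video; the 460x360 figure is the one quoted in this post.

```python
ORIGINAL = (640, 480)
REDUCED = (460, 360)   # the resolution quoted in the post


def pixels(size):
    """Number of pixels in a (width, height) frame."""
    width, height = size
    return width * height


saving = 1 - pixels(REDUCED) / pixels(ORIGINAL)
print(f"{pixels(ORIGINAL)} -> {pixels(REDUCED)} pixels per frame")
print(f"{saving:.0%} fewer pixels per frame")
```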
And since you said retrieval can be "arbitrarily" slow, I'd look into using VHS videotapes, even if you store compressed digital on them, as a storage medium. They're slow as hell for retrieval, but the media might be cheap, especially compared to the likes of AIT and such.
You're thinking too centrally. Each camera (in a setup such as this) is likely distributed. The cameras can likely be grouped into clusters with a local digitizing system for each. These can transmit back to a central server for archival via a standard switched 100Mb/s network (the total aggregate bandwidth is 90MB/s, but no single local node comes near that). Assuming 10 clusters distributed homogeneously, you get 9MB/s (72Mb/s) from each client to the server. You still have to deal with the aggregate bandwidth at the server, but the network is no longer clogged (most switches have backbones around 12Gbps+). I'd go with fiber Gig-E to the server; channel-bonded Fast Ethernet would need at least 8 channels, since 4 channels yield only 400Mb/s (50MB/s) of local bandwidth at the server.
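Since Ethernet speeds are quoted in bits per second and the video figures are in bytes per second, a quick unit check is worth doing. This sketch uses nominal link rates (100 megabits per second for Fast Ethernet, 1000 for Gig-E) and ignores protocol overhead, so real headroom would be somewhat smaller.

```python
import math

AGGREGATE_MB_S = 90        # total compressed video, megabytes per second
CLUSTERS = 10              # cluster count assumed in the post
FAST_ETHERNET_MBIT = 100   # nominal link rate, megabits per second
GIG_E_MBIT = 1000          # nominal link rate, megabits per second


def mbyte_to_mbit(mbytes_per_s):
    """Convert megabytes per second to megabits per second."""
    return mbytes_per_s * 8


per_cluster_mbit = mbyte_to_mbit(AGGREGATE_MB_S / CLUSTERS)
server_mbit = mbyte_to_mbit(AGGREGATE_MB_S)
bonded_links_needed = math.ceil(server_mbit / FAST_ETHERNET_MBIT)

print(f"per cluster: {per_cluster_mbit:.0f} Mb/s "
      f"(fits Fast Ethernet: {per_cluster_mbit <= FAST_ETHERNET_MBIT})")
print(f"at server:   {server_mbit:.0f} Mb/s "
      f"(fits one Gig-E link: {server_mbit <= GIG_E_MBIT})")
print(f"bonded Fast Ethernet links needed at server: {bonded_links_needed}")
```

Each cluster fits on a single Fast Ethernet link, while the server side needs either one Gig-E link or a sizeable bonded group.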
Not really. Rooms dedicated to tape archival maybe. But for the most part it could be done with several large tape cabinets and off-site storage for tapes older than about 5 days.
That's an awful lot of data. Why exactly are you doing this? What is the application? Who are you working for? If you are working for/in the United States, does this application meet the requirements of the 4th and 1st Amendments to the Constitution?
Just a few minor questions.
sPh
Still, compression might be the way to go: not compression of the single frames, but some way to store only every xth frame in full and, between one such stored frame and the next, store only the differences, then compress the entire stream (stored frames + differences).
The efficiency of the difference storage should be improved by a preprocessing step (before compression) that reduces small variations in color, i.e. makes two surfaces match in color in the digitized picture if they are equal in color in the real scene.
If, while designing the difference-detection algorithm, one is able to distinguish between background and foreground, one could try to increase compression further while maintaining quality by using lossy compression for the background only, keeping the foreground (e.g. faces, important with security cams) sharp.
Since some security cameras send home a nearly static image, or a set of static images recurring after a certain time (moving cams), this should increase compression.
I can't imagine that something like this is not already available, e.g. as a by-product of all the research that went into the DVD/DivX/MPEG standards.
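A toy version of the "full frame every so often, differences in between" scheme, to make the idea concrete. This is a lossless sketch using Python's `zlib`, not how MPEG or any real codec works; the function names, the `keyframe_interval` parameter, and the frame format (equal-sized `bytes` of 8-bit grayscale pixels) are all assumptions for illustration.

```python
import zlib


def encode(frames, keyframe_interval=30):
    """Encode frames as compressed keyframes ("K") and compressed deltas ("D")."""
    encoded = []
    prev = None
    for i, frame in enumerate(frames):
        if prev is None or i % keyframe_interval == 0:
            kind, payload = "K", frame
        else:
            # Per-pixel difference modulo 256, so each value fits in one byte.
            # A static scene turns into long runs of zeros, which deflate well.
            kind, payload = "D", bytes((c - p) % 256 for c, p in zip(frame, prev))
        encoded.append((kind, zlib.compress(payload)))
        prev = frame
    return encoded


def decode(encoded):
    """Rebuild the original frames from keyframes and deltas."""
    frames = []
    prev = None
    for kind, blob in encoded:
        payload = zlib.decompress(blob)
        if kind == "K":
            frame = payload
        else:
            frame = bytes((d + p) % 256 for d, p in zip(payload, prev))
        frames.append(frame)
        prev = frame
    return frames


# Synthetic example: a fixed background with one pixel changing per frame.
background = bytes(range(256)) * 4
frames = []
for n in range(5):
    pixels = bytearray(background)
    pixels[n] = 255
    frames.append(bytes(pixels))

encoded = encode(frames)
kinds = [kind for kind, _ in encoded]
print(kinds)
print(decode(encoded) == frames)
```

The background/foreground split and the color-smoothing step described above would sit in front of `encode`; they are left out here because they depend on the actual scene and camera.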
I'm now afraid of Canadians. Maybe their beaver is out to get me.
Anyways, outing the guy was poor form. Especially if it's the other Rob McCready in the USA working at, I think, Caltech. He gave a great lecture on the GPL and its ins and outs last year. He's also big on analysing large amounts of data theory through mathematics and computers.
Only 1000 cameras?
I mean, 1000 cameras is only enough to put one camera into each home in a fairly small community. Most of the solutions I'm seeing posted so far don't scale up very well. What if you need multiple cameras per home? And what to do about large cities? Maybe this should be a separate Ask Slashdot question?
I'll see your senator, and I'll raise you two judges.
With time delay you don't get full frames.
Here's a solution:
3 Breece Hill (or SpectraLogic OEMed) 2.75TB AIT-2 autoloaders with 2 Sony AIT-2 drives each (66MB/s throughput per autoloader) - $30,000 each x 3 = $90,000
3 x Sun Ultra 2s @ $2000 each = $6,000 (yep, they can do 66MB/s and max out the Breece Hills, and they can handle the compression/video... total throughput will be 66MB/s x 3 = 198MB/s, which should be more than enough to handle the video feed.)
AMANDA tape backup software (free download) and Sun Solaris + a shell script.
30 AIT-2 50GB cartridges (about the size of a cigarette pack) x 3 = 90 cartridges (should fit in a small briefcase) per day for 2.75 x 3 = 8.25TB/day @ $3000/day
So... it can be done for about $100,000 with running costs of $3000/day (plus 1 guy for 10 minutes to swap the cartridges in the autoloaders and cycle them at the end of the day), which is fairly cheap for this setup. A small standalone Sony AIT-2 drive in a PC can read back the tar files of the video if necessary. One large room should be able to hold all the tapes for a couple of months or so. Still, at $90,000 it's not exactly cheap.
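Tallying the figures in this proposal, taking the quoted throughput and capacity numbers at face value (they are the post's claims, not verified drive specs):

```python
AUTOLOADERS = 3
AUTOLOADER_PRICE = 30_000        # dollars each, from the post
SERVERS = 3
SERVER_PRICE = 2_000             # dollars each, from the post
AUTOLOADER_MB_S = 66             # claimed throughput per autoloader
AUTOLOADER_TB = 2.75             # claimed capacity per autoloader load
DAILY_MEDIA_COST = 3_000         # dollars per day, from the post
REQUIRED_MB_S = 90               # write rate from the question
REQUIRED_TB_PER_DAY = 8          # daily volume from the question

capital = AUTOLOADERS * AUTOLOADER_PRICE + SERVERS * SERVER_PRICE
throughput = AUTOLOADERS * AUTOLOADER_MB_S
daily_capacity_tb = AUTOLOADERS * AUTOLOADER_TB
yearly_media_cost = DAILY_MEDIA_COST * 365

print(f"hardware outlay:   ${capital:,}")
print(f"total throughput:  {throughput} MB/s (need {REQUIRED_MB_S})")
print(f"capacity per load: {daily_capacity_tb} TB (need {REQUIRED_TB_PER_DAY})")
print(f"media per year:    ${yearly_media_cost:,}")
```

The hardware is the small part; at these numbers the cartridges cost more than ten times the initial outlay within the first year.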
The original question is basically meaningless without a description of the real project requirements.
You've put some big numbers up there, which all the hardware-heads have been happy to answer for you. But without real information, this is all just big-clock-speed hardware masturbation.
The real question is: what kind of single system/application would produce 24,000 hours of unedited high-quality video per day and store it until the end of time?
Most respondents seem to assume that you're running a network of security cameras. If so, other posters have indicated that your quality or recording time requirements are probably 1-2 orders of magnitude too high.
If you're producing something where this video is actually going to be watched by people who care about beautiful full-color full-frame-rate production quality, where is your 200 million dollar production staff that will be watching and cataloging this data? And if you can afford that, surely you can afford at least one knowledgeable systems engineer who knows how to design a storage system!
My absolute favorite bit of your post: "Reliability should be good enough to not be annoying long term". So how much lossage is "annoying"? Will you save space by randomly culling videos from the previous 7 days? If the whole system breaks down once a month, will that be annoying? If you lose one minute out of each hour, will that be annoying?
This is the sort of problem that can be designed and solved if you've got the need and budget for it. But it's not a turnkey solution. That's why you hire engineers.
That'd be a storage nightmare.
I don't think so.
Let's assume one camera per VCR, full 30 fps. That's 3 8-hour tapes per day per camera, 3000 tapes a day from 1000 VCRs. 1000 VCRs should cost you $100,000 and take up one medium-sized room (power and AC will need to be enhanced). 3000 tapes per day shouldn't cost more than $3000, or $1 million per year.
You'll only need a few tape monkeys at any given instant, because there'll be around one tape needing changing every 28 seconds. A day's worth of VCR tapes (assuming we pack them in boxes with NO room to spare and stack the boxes in blocks) will take up about 1.5 cubic meters or 50 cubic feet (based on 1x4x8 inches per tape, my rough estimate). That means for a year's worth of tape you need 550 cubic meters or 20,000 cubic feet, which is 3300 square feet if piled six feet high. 3300 square feet is about the floor size of one big house.
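The tape logistics can be worked through the same way. The inputs are the estimates from this post (3 tapes per camera per day, 1x4x8 inches per tape, stacks six feet high); the unit conversion constants are the only additions.

```python
CAMERAS = 1000
TAPES_PER_CAMERA_PER_DAY = 3      # 8-hour tapes
TAPE_VOLUME_IN3 = 1 * 4 * 8       # the post's rough estimate, cubic inches
STACK_HEIGHT_FT = 6
SECONDS_PER_DAY = 24 * 60 * 60
IN3_PER_FT3 = 12 ** 3             # cubic inches in a cubic foot
FT3_PER_M3 = 35.3147              # cubic feet in a cubic meter

tapes_per_day = CAMERAS * TAPES_PER_CAMERA_PER_DAY
seconds_between_changes = SECONDS_PER_DAY / tapes_per_day
daily_ft3 = tapes_per_day * TAPE_VOLUME_IN3 / IN3_PER_FT3
daily_m3 = daily_ft3 / FT3_PER_M3
yearly_ft3 = daily_ft3 * 365
yearly_m3 = daily_m3 * 365
floor_area_ft2 = yearly_ft3 / STACK_HEIGHT_FT

print(f"tapes per day:           {tapes_per_day}")
print(f"one tape change every:   {seconds_between_changes:.1f} s")
print(f"daily volume:            {daily_ft3:.1f} cubic feet ({daily_m3:.2f} cubic meters)")
print(f"yearly volume:           {yearly_ft3:.0f} cubic feet ({yearly_m3:.0f} cubic meters)")
print(f"floor area at {STACK_HEIGHT_FT} ft high: {floor_area_ft2:.0f} square feet")
```

The exact values come out slightly above the rounded figures in the post, but the conclusion is the same: a year of tapes fills roughly one big house.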
Question to original: Are you still sure you want to do this? If so, you might be best off "spreading the load around", i.e., don't do it all in one place. There are a million convenience store cameras and VCR systems in the world, but they're not all in one place.
Off-hand I can only think of one thing that would handle 3,000 terabytes per year, and that's if the half million people using Morpheus each donated 6 gigabytes of space per year to your cause.
so the motion/change detecting algorithms may be of no use. Face recognition in a crowded walkway in an airport would require 30fps or close to it. The compression can be done at the camera, so that is not an issue.
There seem to be two issues at hand:
1) How to store this data given his original parameters (I would want to work with worst-case numbers myself anyway)
2) SHOULD we be helping without knowing what we are helping with (we already have in some ways)
My personal view is that if we are helping someone bid on the recently beta-tested face-recognition systems (in Miami I think it was) that will be part of the new anti-terrorism work we should know that up-front before we help..
So then, what IS the system being used for?
Sometimes boldness is in fashion. Sometimes only the brave will be bold.
Instead of trying to store 8TB a day of video, why not look at ways to reduce the video? If the video is for security reasons (that is my assumption), then why do you need 30fps? Why do you even need one frame every second?
:)
Another solution may be to take an initial image and then simply record the changes, similar to what CVS does. It's much more efficient to store just a few changes than an entire image*30 every second. This solution would probably require a lot more computing power, but it's easier to add computing power than infinite storage space.
MPEG-4 may do some of these things; if so, you already have a solution. If not, get cracking on a "fun" algorithm.
Sometimes I feel like a nut... Ok so it's most of the time