Large-Scale Video Archiving?
BondHeadGuy asks: "Ok, say you have 1000+ cameras emitting 30 frames/second worth of 640x480 grayscale video...and you have to store it indefinitely. What do you do? This is a real question, believe it or not. 30 frames/s * 300 KB/frame = 9 MB/s per camera. 100:1 video compression brings that down to ~90 KB/s. But 90 KB/s * 1000 cameras = 90 MB/s, or ~8 terabytes/day. Retrieval, though, can be essentially arbitrarily slow. Reliability should be good enough to not be annoying long term. Is there a solution that: has 8 TB/day storage capacity, can handle the 90 MB/s write speed, and lets you save some bucks on the (slow) read side?"
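For anyone who wants to check the numbers in the question, here is a minimal sketch of the arithmetic. It assumes decimal units (1 KB = 1000 bytes) and the 100:1 compression ratio stated by the poster; the constant and function names are made up for illustration.

```python
# Back-of-the-envelope check of the figures in the question.
# Decimal units throughout: 1 KB = 10**3, 1 MB = 10**6, 1 TB = 10**12 bytes.

CAMERAS = 1000
FPS = 30
RAW_FRAME_BYTES = 300_000   # ~640 x 480 x 1 byte, rounded as in the question
COMPRESSION_RATIO = 100     # the 100:1 ratio is the poster's assumption
SECONDS_PER_DAY = 24 * 60 * 60


def per_camera_bytes_per_sec(fps=FPS, frame_bytes=RAW_FRAME_BYTES,
                             ratio=COMPRESSION_RATIO):
    """Compressed stream rate of one camera, in bytes per second."""
    return fps * frame_bytes / ratio


def aggregate_bytes_per_sec(cameras=CAMERAS):
    """Combined write rate of all cameras, in bytes per second."""
    return cameras * per_camera_bytes_per_sec()


def bytes_per_day(cameras=CAMERAS):
    """Volume that has to be archived every 24 hours, in bytes."""
    return aggregate_bytes_per_sec(cameras) * SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"per camera : {per_camera_bytes_per_sec() / 1e3:.0f} KB/s")
    print(f"aggregate  : {aggregate_bytes_per_sec() / 1e6:.0f} MB/s")
    print(f"per day    : {bytes_per_day() / 1e12:.2f} TB")
    print(f"per year   : {bytes_per_day() * 365 / 1e15:.2f} PB")
```

The daily figure comes out a little under the "~8 terabytes/day" in the question, and a year of "indefinitely" lands in the petabyte range.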
300k/sec seems very excessive. You could try converting it all to mpeg4 with a DivX encoder (http://www.divx.com) and that should compress it right down. If you've got sound in there too, strip it out or at least convert to MP3.
You can do all this with a great program called Virtual Dub (http://www186.pair.com/vdub/)
You've got mail. Pattern baldness. - Crow
What you're really looking at is some kind of Hierarchical Storage Management (HSM) solution. Once you have determined how much data will be saved from the cameras each night, you can get some kind of disk array to store it on. That disk array will also be attached to an HSM product such as StorageTek's SAMFs. The HSM software will automatically back up the data stored on your disk and then remove it from disk, so new data can be stored on the costly disks. From then on, your OS and applications think that the data is on disk, but in reality it's on tape. When the data is requested, the software automatically fetches it from tape and places it back on the disk. This can be rather costly, however.
Go with a FC solution - stay away from EMC, as they will try to sell you a massive Symmetrix for your needs. Sounds like you need a building block approach, one block a day. Doesn't need to be TOO fancy, eh?
Here are some options for FC disk storage:
- Sun T3
- EMC Clariion
- Compaq Storageworks
- HP VA7400 -- my fav
Just to warn you, you're looking at something on the order of 20k/day to operate this setup... now, I'm sure the price would go down QUITE a bit if you're purchasing 8-10TB a day, but even still, it's a huge cost.
I looked at a 10TB solution from the above vendors, and the cheapest I got it was $0.0425/MB!
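To put the quoted per-megabyte price in context, here is a rough sketch of what it implies at larger volumes. It assumes decimal units and a flat price with no volume discount, which is almost certainly pessimistic; the function name is invented for the example.

```python
PRICE_PER_MB = 0.0425        # dollars, the quote mentioned above
MB_PER_TB = 1_000_000        # decimal units


def disk_cost_dollars(terabytes, price_per_mb=PRICE_PER_MB):
    """Purchase cost of the given capacity at a flat per-MB price."""
    return terabytes * MB_PER_TB * price_per_mb


if __name__ == "__main__":
    for tb in (10, 7.776):   # the quoted array, and one day of video
        print(f"{tb:>6} TB -> ${disk_cost_dollars(tb):,.0f}")
```

At that flat rate a 10 TB array is roughly $425,000, which is why keeping everything on FC disk gets expensive quickly.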
Hitachi has several very large storage arrays that were very competitive with EMC last I checked. Again, that is if you need it to be in digital format and need it to be online.
alex
Plus it can store 12TB clustered.
Since I thought the problem originally specified usage of 8 TB/day, stored indefinitely, I can't really see how this solution could work, as you would quickly overrun capacity, and I suppose buying new machines every couple of days is not an option.
So what HSM means is Hierarchical Storage Management. Basically, when a file hits a threshold of time, space or whatever, it will take that file and put it to tape. Then, it will replace the original file with a stub of a file that says 'when this file is needed, it's located here!'
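The stub mechanism described above can be sketched in a few lines. This is only a toy illustration under stated assumptions: the "tape" tier is just another directory, the stub is a sidecar file with a made-up `.hsmstub` suffix, and none of the names correspond to a real HSM product's API. A real HSM does this inside the filesystem so that applications never see the stub.

```python
import os
import shutil
import time

STUB_SUFFIX = ".hsmstub"   # hypothetical marker, not any vendor's format


def migrate(path, archive_dir):
    """Move a file to the archive tier and leave a small stub behind."""
    os.makedirs(archive_dir, exist_ok=True)
    target = os.path.join(archive_dir, os.path.basename(path))
    shutil.move(path, target)
    with open(path + STUB_SUFFIX, "w") as stub:
        stub.write(target)             # the stub records where the data went
    return target


def recall(path):
    """Bring a migrated file back to its original location on disk."""
    stub_path = path + STUB_SUFFIX
    with open(stub_path) as stub:
        target = stub.read()
    shutil.move(target, path)
    os.remove(stub_path)


def migrate_older_than(disk_dir, archive_dir, max_age_seconds, now=None):
    """Age threshold: migrate every regular file older than the limit."""
    now = time.time() if now is None else now
    moved = []
    for name in sorted(os.listdir(disk_dir)):
        path = os.path.join(disk_dir, name)
        if name.endswith(STUB_SUFFIX) or not os.path.isfile(path):
            continue                   # skip stubs and subdirectories
        if now - os.path.getmtime(path) > max_age_seconds:
            migrate(path, archive_dir)
            moved.append(name)
    return moved
```

Calling `migrate_older_than("/video/disk", "/video/tape", 24 * 3600)` (both paths are placeholders) would push anything older than a day to the archive tier, and `recall()` on the original path brings a file back.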
Now, for tape storage, I highly recommend going with LTO as a tape format. You might consider doing SCSI LTO tape drives with a Crossroads 4450 connected to Brocade switches to make a SAN as well. By putting it on a SAN, you'll have the ability to spread around your clusters that you'd be putting in. LTO can spool data at about 10-20 MB/sec. Hence, if you get an STK or IBM storage library with LTO, you can fit around 20 tapes in there, and do 200 MB/sec. Plus, LTO has variable speed when writing to it, so it's better than DLT in that regard. Not to mention LTO's 100 Gig native capacity and a better compression ratio than DLT. (2.0 vs 2.2) Then, it's just a matter of cycling tapes through. If you're honestly talking that high an amount of data to keep INDEFINITELY, then you might want to look at STK's Powderhorns, which hold around 2000 tapes. Plus, you can always add another wall of tapes if you're not getting the throughput you're expecting. Or you could look at some of the larger scale robots out there, but they don't support the LTO tape format yet.
By doing the EMC SAN solution to an STK Powderhorn, you're looking at an enterprise-level solution that will support you for years to come. Of course, this comes from someone who's a vendor-neutral consultant with experience with similar technology, so YMMV. `8r)
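As a rough sizing exercise using only the figures quoted in this thread (90 MB/s ingest, 10 MB/s per LTO drive at the low end, 100 GB native per tape, about 2000 slots in a library), the arithmetic looks like this. It deliberately ignores tape compression, on the assumption that already-compressed video will not shrink much further; the names are invented for the sketch.

```python
import math

INGEST_MB_PER_SEC = 90     # from the original question
DRIVE_MB_PER_SEC = 10      # low end of the 10-20 MB/s quoted above
TAPE_NATIVE_GB = 100       # LTO native capacity quoted above
LIBRARY_SLOTS = 2000       # rough library size quoted above
SECONDS_PER_DAY = 86_400


def drives_needed(ingest=INGEST_MB_PER_SEC, per_drive=DRIVE_MB_PER_SEC):
    """Drives that must stream in parallel to keep up with the cameras."""
    return math.ceil(ingest / per_drive)


def tapes_per_day(ingest=INGEST_MB_PER_SEC, tape_gb=TAPE_NATIVE_GB):
    """Cartridges filled per day at native (uncompressed) capacity."""
    gb_per_day = ingest * SECONDS_PER_DAY / 1000
    return math.ceil(gb_per_day / tape_gb)


def days_until_full(slots=LIBRARY_SLOTS):
    """Whole days of video one library holds before tapes must be shelved."""
    return slots // tapes_per_day()


if __name__ == "__main__":
    print("drives needed  :", drives_needed())
    print("tapes per day  :", tapes_per_day())
    print("days until full:", days_until_full())
```

Under these assumptions it takes 9 drives to keep up, about 78 tapes a day, and a 2000-slot library fills in roughly 25 days, so the "cycling tapes through" step is a daily operational job rather than an occasional one.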
Let us know how it goes!
Gonzo Granzeau
"Nothing the god of biomechanics wouldn't let you into heaven for.." -Roy Batty
Not a problem. Compression is in-camera. So the data stream out of the back of the camera is already compressed. Rob
All of the various requirements were compiled into the post :). No, we can't expect to have largely static images. The most we'd want to store along with the video is timestamps. Any other data that results from processing the video stream will be dealt with separately...this is just for archiving the raw video. I'm not sure what the break-even point is on speeding up retrieval. Assume for the moment that we don't want to spend more than 5% of the total cost on speeding up retrieval.
Thanks - Rob
Apparently you didn't actually read my post. I said I was assuming 100:1 compression on the video stream. That brings each frame down to 3K, which is much better than 25-50K.
30 fps for 24/7 is what our customer wants. End of discussion.
Gigabit ethernet is becoming common, and you only need the serious bandwidth for the last few links before the destination. Everything before that can run on 100 Mbit.
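A quick sketch of the network side, using the 90 KB/s per-camera figure from the question. The 50% utilization ceiling is an arbitrary safety margin chosen for the example, not something from the thread, and protocol overhead is ignored.

```python
CAMERA_KB_PER_SEC = 90     # compressed stream per camera, from the question


def camera_mbit_per_sec(kb_per_sec=CAMERA_KB_PER_SEC):
    """One camera's stream in Mbit/s (decimal units)."""
    return kb_per_sec * 8 / 1000


def cameras_per_link(link_mbit, utilization=0.5, kb_per_sec=CAMERA_KB_PER_SEC):
    """How many camera streams fit on a link at the given utilization."""
    return int(link_mbit * utilization * 1000 / (kb_per_sec * 8))


def aggregate_mbit_per_sec(cameras=1000, kb_per_sec=CAMERA_KB_PER_SEC):
    """Total traffic arriving at the storage end, in Mbit/s."""
    return cameras * kb_per_sec * 8 / 1000


if __name__ == "__main__":
    print("per camera          :", camera_mbit_per_sec(), "Mbit/s")
    print("cameras per 100 Mbit:", cameras_per_link(100))
    print("aggregate           :", aggregate_mbit_per_sec(), "Mbit/s")
```

Each camera needs well under 1 Mbit/s, so dozens fit on a 100 Mbit segment, while the aggregate of about 720 Mbit/s is what needs gigabit links near the destination.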
I hope this clears some things up.
NASA also thought about this, all the way up to Petabytes.
Actually, I developed (past tense) a hardware-based object detection system. Faces are interesting objects to try and learn to detect, because they have interesting and considerable variation. They also make for a fun demo and get more press. But I could have been detecting anything. Recognition, on the other hand, is different; it is the detection of individual faces rather than faces in general, and it would be nearly impossible to put entirely into a hardware system because of the large face database required.
That system rocked. But it has nothing to do with this post.
Rob
Any video compression algorithms worth using for this kind of application do comparisons from one frame to the next, and only compress the differences (except for occasional reference frames.) Some of them do substantial motion compensation to model the differences, others don't. Many of them let you tweak the frequency of reference frames - is it every 10? Every 100? Do you need the ability to go backwards, or is smooth forward and clunky backwards good enough?
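The reference-frame idea can be illustrated with a toy encoder. This is a lossless pixel-difference sketch, not a real codec (no motion compensation, no entropy coding), and `keyframe_interval` is a made-up parameter name standing in for the "every 10? every 100?" knob mentioned above. Frames are plain lists of pixel values.

```python
def encode(frames, keyframe_interval=10):
    """Encode equal-length frames as reference frames plus differences.

    Each output item is ("key", full_frame) or ("delta", {index: new_value}).
    """
    encoded = []
    previous = None
    for i, frame in enumerate(frames):
        if previous is None or i % keyframe_interval == 0:
            encoded.append(("key", list(frame)))       # full reference frame
        else:
            changes = {j: v
                       for j, (v, old) in enumerate(zip(frame, previous))
                       if v != old}
            encoded.append(("delta", changes))         # only changed pixels
        previous = frame
    return encoded


def decode(encoded):
    """Rebuild every frame; the input must start with a reference frame."""
    frames = []
    current = None
    for kind, payload in encoded:
        if kind == "key":
            current = list(payload)
        else:
            current = list(current)
            for j, v in payload.items():
                current[j] = v
        frames.append(current)
    return frames


def seek(encoded, n):
    """Decode frame n by walking forward from the nearest earlier keyframe."""
    start = n
    while encoded[start][0] != "key":
        start -= 1
    return decode(encoded[start:n + 1])[-1]
```

A static scene produces nearly empty deltas, which is where the savings come from. The cost shows up in `seek()`: the sparser the reference frames, the more deltas have to be replayed to reach an arbitrary frame, which is the "smooth forward, clunky backwards" trade-off.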
Very few locations actually generate much motion on a 24-hour basis, except for road traffic cameras, and I'd be extremely surprised to see an application need to store those on a long-term basis (as opposed to storing for a week or so in case there are traffic accidents - anything you need longer than that should probably be handled by license-plate recognizers.)
Bill Stewart