Large-Scale Video Archiving?
BondHeadGuy asks: "Ok, say you have 1000+ cameras emitting 30 frames/second worth of 640x480 grayscale video...and you have to store it indefinitely. What do you do? This is a real question, believe it or not. 30 frames/s * 300 KB/frame = 9 MB/s per camera. 100:1 video compression brings that down to ~90 KB/s. But 90 KB/s * 1000 cameras = 90 MB/s, or ~8 terabytes/day. Retrieval, though, can be essentially arbitrarily slow. Reliability should be good enough to not be annoying long term. Is there a solution that: has 8 TB/day storage capacity, can handle the 90 MB/s write speed, and lets you save some bucks on the (slow) read side?"
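The arithmetic in the question can be checked with a few lines of Python. This is only a sketch: it assumes one byte per grayscale pixel, decimal units (1 KB = 1000 bytes), and a flat 100:1 compression ratio as stated in the post. The function and key names are made up for illustration.

```python
# Assumptions (not from any vendor spec): 1 byte per grayscale pixel,
# decimal units, and a constant 100:1 compression ratio.
WIDTH, HEIGHT = 640, 480
SECONDS_PER_DAY = 24 * 60 * 60

def archive_rates(cameras=1000, fps=30, ratio=100):
    """Return the per-camera and aggregate data rates for the camera farm."""
    frame_bytes = WIDTH * HEIGHT                 # raw bytes per frame
    raw_per_camera = frame_bytes * fps           # bytes/s before compression
    compressed_per_camera = raw_per_camera / ratio
    aggregate = compressed_per_camera * cameras  # bytes/s into the archive
    return {
        "frame_bytes": frame_bytes,
        "raw_MB_s_per_camera": raw_per_camera / 1e6,
        "compressed_KB_s_per_camera": compressed_per_camera / 1e3,
        "aggregate_MB_s": aggregate / 1e6,
        "per_day_TB": aggregate * SECONDS_PER_DAY / 1e12,
    }

rates = archive_rates()
for name, value in rates.items():
    print(f"{name}: {value:,.2f}")
```

Without rounding, each frame is 307,200 bytes, so the aggregate comes out slightly above the 90 MB/s quoted and the daily total lands just under 8 TB.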
Check out www.emc.com. If you need MASS storage, they're who you need to talk to.
Yep, I never spell check.
More incorrect spellings can be found he
Not at 30 frames per second. They get lots of time on a tape by recording only a few (or maybe only 1) frame per second. The problem specified 30 frames per second.
300k/sec seems very excessive. You could try converting it all to MPEG-4 with a DivX encoder (http://www.divx.com) and that should compress it right down. If you've got sound in there too, strip it out or at least convert it to MP3.
You can do all this with a great program called VirtualDub (http://www186.pair.com/vdub/).
You've got mail. Pattern baldness. - Crow
What you're really looking at is some kind of Hierarchical Storage Solution. Once you have determined how much data will be saved from the cameras each night, you can get some kind of disk array to store it on. That disk array is also attached to an HSM product such as StorageTek's SamFS. The HSM software automatically backs up the data stored on your disk to tape and then removes it from disk, so new data can be stored on the costly disks. From then on your OS and applications think the data is on disk, but in reality it's on tape. When the data is requested, the software automatically fetches it from tape and places it back on disk. This can be rather costly, however.
Go with a FC solution - stay away from EMC, as they will try to sell you a massive Symmetrix for your needs. Sounds like you need a building block approach, one block a day. Doesn't need to be TOO fancy, eh?
Here are some options for FC disk storage:
- Sun T3
- EMC Clariion
- Compaq StorageWorks
- HP VA7400 -- my fav
Just to warn you, you're looking at something on the order of 20k/day to operate this setup... now, I'm sure the price would go down QUITE a bit if you're purchasing 8-10TB a day, but even still, it's a huge cost.
I looked at a 10TB solution from the above vendors, and the cheapest I got it was $0.0425/MB!
The Teradata Database is an expensive but workable solution. This is a data warehouse engine tuned for large-scale data storage. It supports online retrieval in an OLTP fashion, so retrieval does not have to be a typical warehouse-style query. This is a solution from NCR that currently hosts databases in excess of 100 TB.
No I don't work for NCR but I have worked with their database.
Hitachi has several very large storage arrays that are very competitive with EMC, last I checked. Again, that is if you need it to be in digital format and need it to be online.
alex
So what HSM means is Hierarchical Storage Management. Basically, when a file hits a threshold of time, space or whatever, the HSM software takes that file and puts it on tape. Then it replaces the original file with a stub that says 'when this file is needed, it's located here!'
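The stub idea can be sketched in a few lines of Python. This is a toy under stated assumptions, not how any real HSM product works: the "tape" is just another directory, the stub is a separate `.stub` file holding the archived path (a real HSM replaces the file transparently at the filesystem level), and `migrate_old_files`, `recall` and `STUB_SUFFIX` are hypothetical names.

```python
import os
import shutil
import time

STUB_SUFFIX = ".stub"  # hypothetical marker, not a real HSM on-disk format

def migrate_old_files(disk_dir, tape_dir, max_age_seconds, now=None):
    """Move files older than max_age_seconds to tape_dir, leaving a stub."""
    now = time.time() if now is None else now
    migrated = []
    for name in sorted(os.listdir(disk_dir)):
        path = os.path.join(disk_dir, name)
        if name.endswith(STUB_SUFFIX) or not os.path.isfile(path):
            continue  # skip existing stubs and subdirectories
        if now - os.path.getmtime(path) < max_age_seconds:
            continue  # still young enough to stay on disk
        target = os.path.join(tape_dir, name)
        shutil.move(path, target)
        # The stub records where the real data now lives.
        with open(path + STUB_SUFFIX, "w") as stub:
            stub.write(target)
        migrated.append(name)
    return migrated

def recall(disk_dir, name):
    """Bring a migrated file back to disk using the path stored in its stub."""
    path = os.path.join(disk_dir, name)
    if os.path.exists(path):
        return path  # never migrated, or already recalled
    stub_path = path + STUB_SUFFIX
    with open(stub_path) as stub:
        target = stub.read()
    shutil.move(target, path)
    os.remove(stub_path)
    return path
```

The threshold here is file age only; a policy based on free disk space would work the same way, with a different test deciding which files to move.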
Now, for tape storage, I highly recommend going with LTO as a tape format. You might consider doing SCSI LTO tape drives with a Crossroads 4450 connected to Brocade switches to make a SAN as well. By putting it on a SAN, you'll have the ability to spread the load around whatever clusters you put in. LTO can spool data at about 10-20 MB/sec. Hence, if you get an STK or IBM storage library with LTO, you can fit around 20 drives in there and do 200 MB/sec. Plus, LTO has variable speed when writing to it, so it's better than DLT in that regard. Not to mention LTO's 100 Gig native capacity and a better compression ratio than DLT. (2.0 vs 2.2) Then, it's just a matter of cycling tapes through. If you're honestly talking about that much data to keep INDEFINITELY, then you might want to look at STK's Powderhorns, which hold around 2000 tapes. Plus, you can always add another wall of tapes if you're not getting the throughput you're expecting. Or you could look at some of the larger scale robots out there, but they don't support the LTO tape format yet.
By doing the EMC SAN solution to an STK Powderhorn, you're looking at an enterprise-level solution that will support you for years to come. Course, this comes from someone who's a vendor-neutral consultant with experience with similar technology, so YMMV. `8r)
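To put rough numbers on the tape side, here is a back-of-the-envelope sketch. It assumes the ~90 MB/s and ~8 TB/day from the original question, the 10-20 MB/sec per-drive rate and 100 GB native capacity quoted in this comment, and no extra gain from tape compression since the video is already compressed. The function names are made up for illustration.

```python
import math

INGEST_MB_S = 90            # aggregate write rate from the question
DAILY_BYTES = 8e12          # ~8 TB/day from the question
DRIVE_MB_S = (10, 20)       # per-drive LTO streaming range quoted above
TAPE_NATIVE_BYTES = 100e9   # 100 GB native; tape compression ignored (assumption)
LIBRARY_SLOTS = 2000        # rough Powderhorn slot count quoted above

def drives_needed(ingest_mb_s, drive_mb_s):
    """Minimum number of drives streaming in parallel to keep up with ingest."""
    return math.ceil(ingest_mb_s / drive_mb_s)

def tapes_per_day(daily_bytes, tape_bytes):
    """Cartridges filled per day at native capacity."""
    return math.ceil(daily_bytes / tape_bytes)

def days_until_full(slots, tapes_daily):
    """How long one library lasts before tapes must be cycled out."""
    return slots / tapes_daily

slow, fast = DRIVE_MB_S
daily = tapes_per_day(DAILY_BYTES, TAPE_NATIVE_BYTES)
print("drives at 10 MB/s:", drives_needed(INGEST_MB_S, slow))
print("drives at 20 MB/s:", drives_needed(INGEST_MB_S, fast))
print("tapes per day:", daily)
print("days until a 2000-slot library is full:", days_until_full(LIBRARY_SLOTS, daily))
```

Under these assumptions a single 2000-slot library fills in under a month, so "cycling tapes through" means shelving on the order of 80 cartridges every day.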
Let us know how it goes!
Gonzo Granzeau
"Nothing the god of biomechanics wouldn't let you into heaven for.." -Roy Batty
And to make it even better, drop the requirement that the video has to be 30 fps. For retrieved video, 5 fps plays great if you are looking at movement of people. 15 fps is great if you are looking at movement of cars and such.
Trust me, I work on a security system that digitally records between 16-32 cameras in a retail environment (though we do have customers with 60+ cameras). We normally record at 2 fps during activity, and a much lower rate when not. Customers choose image sizes of approximately 10k per image (with 720x243x2 source images). We don't require that the user has tons of storage, so they typically get about a week's worth of video. Backup is very simple, using DAT tapes.
Greg
Easy, compress the stream AT the camera, or at least, off the server. That way the data transmission is already down to its lowest point. And yes, for a scalable solution the system should be broken up across multiple compression units/servers.
Why does this topic seem so familiar? Oh, it's what I do for a living! (though on a much smaller scale)
Greg
Careful...I've used NetApp boxen, and (I'm assuming you're talking about NAS) they just don't have the sand to handle that kind of load reliably. Granted those that I worked with are now ~18 months old, but we had an Oracle update stream hitting one with a bandwidth of less than 15 MB/s and it started dropping NFS connections. If you try this, be careful to use high speed connections and to multiplex onto as many interfaces as possible!
A hero is someone who knows when to run away. I am a hero. -Trent the Uncatchable
Not a problem. Compression is in-camera. So the data stream out of the back of the camera is already compressed. Rob
All of the various requirements were compiled into the post :). No, we can't expect to have largely static images. The most we'd want to store along with the video is timestamps. Any other data that results from processing the video stream will be dealt with separately...this is just for archiving the raw video. I'm not sure what the break-even point is on speeding up retrieval. Assume for the moment that we don't want to spend more than 5% of the total cost on speeding up retrieval.
Thanks - Rob
Apparently you didn't actually read my post. I said I was assuming 100:1 compression on the video stream. That brings each frame down to 3K, which is much better than 25-50K.
30 fps for 24/7 is what our customer wants. End of discussion.
Gigabit ethernet is becoming common, and you only need the serious bandwidth for the last few links before the destination. Everything before that can run on 100 Mbit.
I hope this clears some things up.
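A quick sketch of the network side, assuming ~90 KB/s per camera after compression and ignoring protocol overhead unless a utilisation factor is given. The helper names are hypothetical.

```python
CAMERA_BYTES_S = 90_000   # ~90 KB/s per camera after 100:1 compression
CAMERAS = 1000

def link_capacity_bytes_s(megabits, utilisation=1.0):
    """Usable bytes/s on a link of the given nominal speed in Mbit/s."""
    return megabits * 1_000_000 / 8 * utilisation

def cameras_per_link(megabits, utilisation=1.0):
    """How many camera streams fit on one link."""
    return int(link_capacity_bytes_s(megabits, utilisation) // CAMERA_BYTES_S)

# Aggregate load arriving at the archive, in Mbit/s.
aggregate_mbit_s = CAMERA_BYTES_S * CAMERAS * 8 / 1_000_000

print("cameras per 100 Mbit link (ideal):", cameras_per_link(100))
print("cameras per 100 Mbit link (50% utilisation):", cameras_per_link(100, 0.5))
print("aggregate Mbit/s at the archive:", aggregate_mbit_s)
```

So a 100 Mbit edge link carries on the order of a hundred cameras, and only the final hops toward storage see the ~720 Mbit/s aggregate that calls for gigabit.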
Casinos monitor every gaming table using 1/2 speed VHS tapes, manned by operators who swap tapes on a regular basis.
Actually, they are continuous loops, constantly erased unless something funky shows up. They only save the stuff they may need for later lawsuits. Like idiots tripping on spilled water claiming it's the casino's fault instead of their own dumb ass. Sorry, I'm all for tort reform.
Actually, I developed (past tense) a hardware-based object detection system. Faces are interesting objects to try and learn to detect, because they have interesting and considerable variation. They also make for a fun demo and get more press. But I could have been detecting anything. Recognition, on the other hand, is different; it is the detection of individual faces rather than faces in general, and it would be nearly impossible to put entirely into a hardware system because of the large face database required.
That system rocked. But it has nothing to do with this post.
Rob
I am an Engineer for a company that does only storage, so I might be able to offer some suggestions. The best solution would probably be SamFS, which is a Hierarchical Storage Management product developed by LSC software, now part of Sun. SamFS runs only on Solaris Sparc, so that means a Sun box. Your reqs. would max out an E450, so you should look at a 4500 or 4800 at the minimum. For disk, avoid Sun T3's like the plague. They suck. For your needs, a Clariion FC4700 running RAID 3 is perfect. So perfect, that Sony just signed an OEM agreement to sell Clariions with their video editing solutions. For tape, I would suggest LTO drives in a StorageTek L700 library. SDLT is too new to be trusted. Also look at AIT-3 in SpectraLogic Gator 64000 libraries. If you have the cash, the ultimate tape solution would be STK T9940 or 9840B drives in a StorageTek 9310 powderhorn (as seen in the movie Eraser). Unfortunately, a powderhorn with no drives is about $200k, T9940 drives are $35k each, and 9840B drives are about $30k. Good luck.
I am honestly (and, I guess, naively) surprised by the suspicion. I thought it was just an interesting little problem that slashdotters would probably like to kick around. The details are at best marginally interesting...except to competitors, which is why I left them out.
I understand that people are legitimately worried about surveillance these days. I am more worried than most given my research on face detection (see some other post here). The academic literature overlaps significantly with recognition, so I have a pretty good idea of just how unreliable it is.
Even so, the contrast between people's assumptions and the boring reality in this case is really quite staggering. It is not a government project. It does not involve public surveillance on some vast scale. We are not misleading our customers. Despite what you assert, there are legitimate private facilities that need this sort of thing.
Rob
Any video compression algorithms worth using for this kind of application do comparisons from one frame to the next, and only compress the differences (except for occasional reference frames.) Some of them do substantial motion compensation to model the differences, others don't. Many of them let you tweak the frequency of reference frames - is it every 10? Every 100? Do you need the ability to go backwards, or is smooth forward and clunky backwards good enough?
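The reference-frame idea can be illustrated with a toy lossless delta coder. This is only a sketch of the principle: it is not MPEG or any real codec, it does no motion compensation, and it treats a frame as a flat list of pixel values of fixed length. The names `encode`, `decode` and `keyframe_interval` are made up for illustration.

```python
def encode(frames, keyframe_interval=10):
    """Store a full reference frame every keyframe_interval frames and
    only the changed pixels (index, new_value) for the frames in between."""
    encoded = []
    previous = None
    for i, frame in enumerate(frames):
        if previous is None or i % keyframe_interval == 0:
            encoded.append(("key", list(frame)))
        else:
            changes = [(j, new) for j, (old, new) in enumerate(zip(previous, frame))
                       if old != new]
            encoded.append(("delta", changes))
        previous = frame
    return encoded

def decode(encoded):
    """Rebuild every frame by applying deltas on top of the last key frame."""
    frames = []
    current = None
    for kind, payload in encoded:
        if kind == "key":
            current = list(payload)
        else:
            current = list(current)  # copy so earlier frames stay untouched
            for j, value in payload:
                current[j] = value
        frames.append(current)
    return frames

frames = [[0, 0, 0, 0], [0, 5, 0, 0], [0, 5, 0, 7], [1, 1, 1, 1]]
encoded = encode(frames, keyframe_interval=3)
assert decode(encoded) == frames
```

A static scene produces empty deltas, which is where the savings come from. The reference-frame interval sets the trade-off: to show frame N the decoder has to start at the nearest earlier key frame, so a long interval saves space but makes seeking and backwards playback clunkier.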
Very few locations actually generate much motion on a 24-hour basis, except for road traffic cameras, and I'd be extremely surprised to see an application need to store those on a long-term basis (as opposed to storing for a week or so in case there are traffic accidents - anything you need longer than that should probably be handled by license-plate recognizers.)
Bill Stewart