Ask Slashdot: Cloud Service On a Budget?
First time accepted submitter MadC0der writes "We just signed a project with a very large company. We are a computer vision based company, and our project gathers images from a facility in PA. Our company is located in TN. The company we're gathering images from is on a very high speed fiber optic network. However, being a small company of 11 developers and 1 systems engineer, we're on a business class 100mb cable connection, which works well for us but not in this situation. The information gathered from the client in PA is a 1½mb .bmp image, along with a 3mb depth map file, making each snapshot a little under 5 megs. This may sound small, but images are taken every 3-5 seconds. This can lead to a very large amount of data captured and transferred each day. Our facility is incapable of handling such large transfers without affecting internal network performance. We've come to the conclusion that a cloud service would be the best solution for our problem. We're now thinking the customer's workstation will sync the data with the cloud, and we can automate pulling the data for analysis during off hours so we won't encounter congestion. Can anyone help suggest a stable, fairly priced cloud solution that will sync large amounts of offsite data for retrieval at our convenience (a nightly rsync script should handle this process)?"
Bring your own server. Depending on the time frame/duration of the project, it might be more cost effective to rent a quarter or half rack in a datacenter and build/buy your own servers. High initial up front cost, but does save money in the long run.
...WHY are you using BMP in the first place? Does whatever you're generating these on not have the processing capability to compress to PNG before transferring? I mean it SOUNDS like it'd save 10-20% off the total transfer...Anyways, what I'd do is I'd simply plop a server rack at the source that takes all the images for a given hour or whatever, tar.gz.bz2.whatevers them & send them over. Otherwise, I mean, Amazon wouldn't be TERRIBLE?
The sales guy oversold your capabilities. Instead of asking about cloud options, why don't you just pick a server host with a good reputation (Amazon and Rackspace come to mind) and pass the costs on to the client?
Assuming you don't need real-time analysis (it doesn't look like it from the problem description), send them a couple of 500GB hard drives and have someone mail you each day's load of images with overnight shipping.
Assuming 5MB of data every 5 seconds, you're dealing with ~90GB of data a day. So, looking at Amazon's pricing model (http://aws.amazon.com/s3/pricing/), assuming you delete the data after you pull it, the storage total should be in the range of $0.095 * 90GB = $8.55/mo. Transfers into S3 are free. You'll be transferring ~2.7TB/mo out (90GB*30), at $0.120/GB, that's $324.00/mo in transfer fees.
Now, if that data isn't being accumulated 24/7 (i.e. if it's only 8/5, for example), that lowers your monthly fees to the $80 range. Sure, you can shop around for someone who will charge you less for transfers (though if they're not charging at all, they may start complaining about the volume of data you're transferring), but $350/mo in fees to help keep a project that's making you money from killing your network? Would sound doable to me.
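For anyone who wants to rerun those numbers with their own assumptions, here is a rough sketch in Python. The per-GB prices are the ones quoted in the comment above, not current list prices, and the function names are made up for illustration. Note that 5MB every 5 seconds works out to 86.4GB/day before rounding up to 90, so the transfer figure comes out at about $311/mo rather than $324.

```python
# Rough S3 cost estimate using the figures quoted in the comment above.
# All constants are assumptions; check current pricing before relying on them.
SNAPSHOT_MB = 5              # one .bmp plus one depth map, rounded up
INTERVAL_S = 5               # one snapshot every 5 seconds
STORAGE_USD_PER_GB = 0.095   # per GB-month, as quoted above
EGRESS_USD_PER_GB = 0.120    # per GB transferred out, as quoted above


def daily_gb(hours_per_day=24.0):
    """Data captured per day, in decimal gigabytes."""
    snapshots = hours_per_day * 3600 / INTERVAL_S
    return snapshots * SNAPSHOT_MB / 1000


def monthly_cost(hours_per_day=24.0, days_per_month=30.0):
    """Return (storage_usd, egress_usd) for one month.

    Storage assumes the data is deleted after each nightly pull, so only
    about one day's worth is held at any time (same assumption as above).
    """
    per_day = daily_gb(hours_per_day)
    storage = per_day * STORAGE_USD_PER_GB
    egress = per_day * days_per_month * EGRESS_USD_PER_GB
    return storage, egress


if __name__ == "__main__":
    for label, hours, days in [("24/7", 24.0, 30.0), ("8/5", 8.0, 30.0 * 5 / 7)]:
        storage, egress = monthly_cost(hours, days)
        print(f"{label}: storage ${storage:.2f}/mo, transfer out ${egress:.2f}/mo")
```

Swapping in a 3 second interval or a different provider's per-GB rate only means changing the constants at the top.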
That huge bandwidth is a major load requirement of the project, and it is going to cost you or your client too much money. I think you should simply look into separating the functionality so you can do the analysis on the customer's site and only "get" the analysis results (pulling from a db, a web service, or an RSS feed), while the rest of your application sits where it is now. From the sounds of it, the images are first saved somewhere on the customer's network, so perhaps it is not much of a stretch to install your analysis app right there?
If you're going to sync nightly anyway, why bother with a cloud service? Just sync at night.
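A minimal sketch of what that nightly pull could look like, wrapping rsync from Python. The host name and paths are hypothetical placeholders, and scheduling (cron or similar) is left out. The rsync options used (`--archive`, `--compress`, `--partial`, `--bwlimit`) are standard ones; `--bwlimit` is in KiB/s by default.

```python
import subprocess


def build_rsync_command(source, dest, bwlimit_kbps=None):
    """Build an rsync command line for pulling images from the client.

    --partial keeps partially transferred files so an interrupted night
    can resume; --bwlimit optionally caps the rate.
    """
    cmd = ["rsync", "--archive", "--compress", "--partial"]
    if bwlimit_kbps is not None:
        cmd.append(f"--bwlimit={bwlimit_kbps}")
    cmd += [source, dest]
    return cmd


def nightly_pull(source, dest, bwlimit_kbps=None):
    """Run the transfer; raises CalledProcessError if rsync fails."""
    subprocess.run(build_rsync_command(source, dest, bwlimit_kbps), check=True)


if __name__ == "__main__":
    # Hypothetical host and paths, for illustration only.
    print(" ".join(build_rsync_command("client-pa:/data/snapshots/",
                                       "/srv/images/", bwlimit_kbps=8000)))
```

Whether this beats a cloud drop box depends on whether the client will let you reach a machine on their network directly, which the submission doesn't say.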
You're probably going to pay less for a second cable modem line than you will to store that much data in the cloud. Cloud processing is fairly cheap - cloud storage is expensive.
And then you won't have to re-tool anything else in your processes, except maybe adding another route or two. If you're doing that much data processing, the $200/mo for the line shouldn't really be a huge expense on the contract.
If you're looking to scale out this service to lots of companies, then the calculus might be different.
It is more expensive than a cloud unless you are really big. Many startups that used to use Amazon's service decided that, with virtualization, it was cheaper to run their own servers once they needed fiber connections and other infrastructure to carry massive bandwidth for all their boxen in the cloud.
With 1/2 down your speed will be adversely affected. vSphere is about $7,000 including a CentOS or Windows Server license, and Windows Server 2012 with Hyper-V is the same price. You can host VMs and have data backed up elsewhere for redundancy. Yes, this will eat up data and raise costs with your T3, but it will consume less data than clouding everything.
Repeat: the cloud does not save you money once you add up all the hidden costs.
we're on a business class 100mb cable connection
100mbps = 12.5mbyte/s (give up 15-20% for the packet overhead and you have about 10megabytes/sec).
Distilling that summary into the data that mattered:
1.5mb image, 3mb file each under 5 megs.
and
images every 3-5 seconds
The files are 5megabytes total.
In a perfect world, they'd transfer in 0.5 seconds.
Leaving 2.5 - 4.5 seconds for the porn.
Let's assume they are the bigger size, 5megabytes, and they transfer in the more frequent number, every 3 seconds.
5MBytes/3s = 1.66667 Mbytes/s = 13.33333 mbits/s.
Why is a facility with a 100mb/s line incapable of handling this?
How did a problem where a 100mb/s line can't handle 13.3333mb/s come to a conclusion of "Fix it with the cloud?"
In any case, if you want to do a cloud setup, just about all of them will handle small 13.3mb/s constant rates and you'll pay for it more than if you figured out why your line isn't keeping up.
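The arithmetic above is easy to check with a few lines of Python. This is just a sketch: the function names are made up, and the 20% overhead is the rough figure from this comment, not a measurement.

```python
def required_mbps(snapshot_mbytes, interval_s):
    """Sustained rate needed to keep up, in megabits per second."""
    return snapshot_mbytes * 8 / interval_s


def usable_mbytes_per_s(link_mbps, overhead=0.20):
    """Usable throughput in megabytes/sec after an assumed overhead."""
    return link_mbps / 8 * (1 - overhead)


def link_utilization(snapshot_mbytes, interval_s, link_mbps):
    """Fraction of the nominal link rate the image stream would use."""
    return required_mbps(snapshot_mbytes, interval_s) / link_mbps


if __name__ == "__main__":
    # Worst case from the summary: 5 megabytes every 3 seconds on a 100mbps line.
    print(f"needed: {required_mbps(5, 3):.2f} mbit/s")
    print(f"usable: {usable_mbytes_per_s(100):.1f} mbyte/s")
    print(f"share of link: {link_utilization(5, 3, 100):.1%}")
```

Even in the worst case the stream needs roughly 13% of the nominal line rate, which is the point being made above.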
I don't understand what the issue is here. What the OP seems to be really asking is how to move the bandwidth requirement to overnight, when no one is using their connection for other business purposes.
If time-shifting the syncing to off-hours is acceptable, why do you not install a server with a beefy hard drive at the client location to do just that?
Have you explored the idea of compressing the data at the client side before sending it your way? Bitmaps often compress very well, especially if you can batch very similar ones together. A script to make a gzipped tar file every 5 minutes might do wonders for your data requirements.
If you're ready to shell out the money for a cloud provider, why not instead shell out the money for a second connection to dedicate to this client?
What does moving the data through a third party in "the cloud" offer over any of (or a combination of) these three approaches?
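As a sketch of the "gzipped tar every 5 minutes" idea from this comment: the function below sweeps a spool directory into one compressed archive and removes the originals once the archive is written. The directory layout and the `*.depth` pattern are assumptions (the submission doesn't give the depth map's extension), and how much you actually save depends on the images.

```python
import tarfile
import time
from pathlib import Path


def batch_and_compress(spool_dir, out_dir, patterns=("*.bmp", "*.depth")):
    """Pack every matching file in spool_dir into one .tar.gz in out_dir.

    Returns the archive path, or None if there was nothing to pack.
    The file patterns are hypothetical; adjust them to the real names.
    """
    spool, out = Path(spool_dir), Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    files = sorted(p for pattern in patterns for p in spool.glob(pattern))
    if not files:
        return None
    archive = out / f"batch-{time.strftime('%Y%m%dT%H%M%S')}.tar.gz"
    # "x:gz" refuses to overwrite an existing archive instead of clobbering it.
    with tarfile.open(archive, "x:gz") as tar:
        for path in files:
            tar.add(path, arcname=path.name)
    # Delete the originals only after the archive has been closed cleanly.
    for path in files:
        path.unlink()
    return archive
```

Run it from cron or a loop every five minutes on the client side, then point the nightly transfer at the output directory instead of the raw images.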