Slashdot Mirror


Ask Slashdot: Linux Mountable Storage Pool For All the Cloud Systems?

An anonymous reader writes "Many cloud systems are available on the market like: dropbox, google, sugar sync, or your local internet provider, that offer some free gigabytes of storage. Is there anything out there which can combine the storage into one usable folder (preferably linux mountable) and encrypt the data stored in the cloud? The basic idea would be to create one file per cloud used as a block device. Then combine all of them using a software raid (redundancy etc) with cryptFS on top. Have you heard of anything which can do that or what can be used to build upon?"

117 of 165 comments

  1. There are several options here by Omnifarious · · Score: 5, Informative

    The first, and most interesting, is Tahoe LAFS. It does come with a FUSE driver, so it can be mounted like a regular filesystem. It is cloud-based and redundant to a degree you choose yourself. All copies stored are encrypted, so the only person who can read them is you. I'm not sure though if fetching from more nodes than you strictly need to reconstruct your original file actually buys you anything with that system, but I think it does.

    You could also use something like a mountable version of Google Drive and then layer fuse-encfs on top of it. That's not quite as secure as encrypting at the block layer. The overall shape of your directory hierarchy is available, even if the individual file names and their contents are obscured. That should probably be good enough for most purposes.

    1. Re:There are several options here by Omnifarious · · Score: 4, Interesting

      BTW, doing this at a block device level is likely a very poor idea. Block devices are very difficult to get right in a distributed fashion from a synchronization standpoint. They also are likely to cause a lot of excess network traffic since the units the system deals with are poorly matched to the logical units that are actually modified. A good distributed solution to this problem will at least have to know something about the fact that you have individual files to be at all reasonable to use.

    2. Re:There are several options here by ultrasawblade · · Score: 4, Interesting

      If you can mount a cloud service as a folder in Linux somehow, then Tahoe-LAFS can work. I know Dropbox lets you do this but am unsure about the other systems. If the cloud service allows upload/download via HTTPS, this could be worked around nontrivially by writing something using FUSE to translate filesystem requests to HTTPS requests recognized by that service.

      You would have to have a "client" running for each cloud service. Each client has a storage directory which needs to be configured to be the same as the local sync directory for the cloud service. While Tahoe-LAFS is intended to have each client in a "grid" run on separate machines, there's no reason why multiple clients on the same grid could not be running locally. You'd just have to edit configs manually, setting the IP address to 127.0.0.1 and choosing a different port for each "client", and also making sure the introducer.furl is set accordingly.

      Tahoe-LAFS's capability system is pretty neat. Clients never see unencrypted data and you can configure the redundancy and "spread-outness" of the data however you like. Tahoe-LAFS's propensity to disallow quick "deleting" of shares also works well with possibly slowly updating cloud backends - Tahoe is designed to prefer to "age out" shares containing old files periodically rather than support direct deleting.

      And Tahoe works as well on Windows as it does on Linux (it's a python script) so if your cloud service is Windows only that is no disadvantage.

    3. Re:There are several options here by Omnifarious · · Score: 1

      Oh, yeah. *sheepish grin* Of course, most popular existing cloud services do not support LAFS out of the box. :-/ So yeah, you'll have to layer it on something in the manner suggested above.

    4. Re:There are several options here by fuzzyfuzzyfungus · · Score: 4, Insightful

      I get the impression that, while Tahoe LAFS is the good option, the submitter of TFS is looking for the super-cheap option. He wants some sort of terrifying 'RAID-0-over-a-handful-of-different-interfaces-to-a-half-dozen-free-services-so-I-can-scrape-together-a-couple-gigs-here-and-a-couple-there' amalgamation. Unless he's planning some redundancy, that sounds like a recipe for data loss even if it were simple to set up, and you'd still be looking at a relatively paltry amount of storage space.

      It sounds to me like the submitter needs to decide whether he wants to step up and pay for some actual hosts(for which Tahoe LAFS would probably be a good option), or one of the more paranoid dropbox-clones, or whether this is simply an exercise in cobbling scrap together because that can be amusing sometimes...

    5. Re:There are several options here by Omnifarious · · Score: 5, Interesting

      Tahoe sort of achieves this in an odd way. Directories contain hashes of the file they reference instead of an inode number. This means that a Tahoe node often doesn't even know who a file really belongs to, even though it knows its length.

      The main issue with block storage is this...

      Suppose you modify a data section of a file in a btrfs filesystem mounted on some kind of weird encrypted block device. There will be a whole tree of blocks that get modified, all the way up to the root node. All of these blocks have to be written before the root block is, and for a small file there will be several more blocks that need updating than there are data blocks on the file.

      These two issues create a big synchronization problem and a lot of extra traffic.

      In contrast, a good distributed filesystem protocol that's aware of individual files can send a single message that contains some kind of identifier for the file, and the new data it should contain. This message will often be smaller than a single filesystem block, and it will also usually be compressed before it gets on the wire. Much more efficient and while there are synchronization issues between updates to individual files, within a file there aren't any.
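
      A rough back-of-the-envelope sketch of the traffic difference described above. The model is deliberately simplified and every number in it is an assumption for illustration (4 KiB blocks, a metadata tree three levels deep, a 64-byte message header); it is not a measurement of btrfs or of any real protocol.

```python
# Simplified model: in a copy-on-write tree, changing one data block also
# rewrites one metadata block per tree level, up to and including the root.
BLOCK_SIZE = 4096      # assumed block size in bytes
TREE_DEPTH = 3         # assumed number of metadata levels above the data
HEADER_BYTES = 64      # assumed per-message overhead for a file-aware protocol


def block_level_bytes(data_blocks_changed, tree_depth=TREE_DEPTH,
                      block_size=BLOCK_SIZE):
    """Bytes sent when whole blocks are shipped to a remote block device."""
    blocks_written = data_blocks_changed + tree_depth
    return blocks_written * block_size


def file_level_bytes(payload_bytes, header_bytes=HEADER_BYTES):
    """Bytes sent when only a file identifier and the new data are shipped."""
    return payload_bytes + header_bytes


# Changing 100 bytes inside a single data block:
block_cost = block_level_bytes(1)      # 4 blocks of 4096 bytes
file_cost = file_level_bytes(100)      # 100 bytes plus the header
print(block_cost, file_cost)
```

      Under these assumptions the block-level path moves 16384 bytes for a 100-byte change, and the writes must also land in order (root last), which is the synchronization half of the problem.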

    6. Re:There are several options here by Threni · · Score: 1

      > You could also use something like a mountable version of Google Drive

      How do you mount Google Drive on Linux? It seems simple to the designers of Dropbox but it's eluded those at Google. Basically makes it unusable. It's doubly frustrating that you can't download files onto Android devices using the Google Drive app, even though the app already lets you play, for example, mp3s you've pushed up there from Android, so they're actually preventing you from using functionality the app already possesses. It's like they want you to use third party software or something...

    7. Re:There are several options here by Venotar · · Score: 2

      How do you mount Google Drive on Linux? It seems simple to the designers of Dropbox but it's eluded those at Google.

      Insync: https://www.insynchq.com/#112472431252847033039/settings An official agent would be better - preferably something that provides a FUSE driver. Something OpenSource would be even better; but Insync works fine. Behaves very dropbox like and even supports multiple google accounts. You're correct, though - without Insync or a better option Google drive might as well not exist, as far as Unix users go.

    8. Re:There are several options here by Threni · · Score: 1

      Does InSync support Google authentication tokens? It's surprising how many Android apps seriously expect me to trust them with my Google account credentials because they can't be bothered to use tokens.

    9. Re:There are several options here by CdBee · · Score: 1

      No - AND it's based in Malaysia. Which is why I'm awaiting an official Google app to access Drive. Hell will freeze over before I trust a company based in neither the EU nor the USA with something as important as my passwords...

      --
      I have been a user for about 10 years. This ends Feb 2014. The site's been ruined. I'm off. Dice, FU
    10. Re:There are several options here by tlambert · · Score: 2

      Suppose you modify a data section of a file in a btrfs filesystem mounted on some kind of weird encrypted block device. There will be a whole tree of blocks that get modified, all the way up to the root node. All of these blocks have to be written before the root block is, and for a small file there will be several more blocks that need updating than there are data blocks on the file.

      These two issues create a big synchronization problem and a lot of extra traffic.

      A btrfs style filesystem already has this problem with local storage, it just doesn't become immensely evident unless you are using media where the burst transfer rate gets swamped by the amount of data in a set of consecutive data transfers. As soon as you overwhelm the steady state average rate, the effective burst transfer rate drops to the sustained transfer.

      You can see this relatively easily on Samsung and Sony ARM devices with eMMC mass storage instead of SSD, and you can see it on SSD mass storage for small values of 'mass', where the transactions can be split over many chips to effectively get a parallel bus for transfers, or on bigger SSD devices where the controller isn't clever enough to use that trick.

      At which point, I'd say that btrfs is the wrong tool for the job in this environment.

      But in reality, using cloud storage, which frequently means WebDAV, isn't appropriate for this use case anyway, given the number of transactions required for a read/modify/write operation to do a partial block update of an index, even if it wasn't a btr index, and was clever enough to use an infix tree (and as far as I know, no one is using infix trees in filesystems, for fear of the Sun-now-Oracle over-broad patent on the things).

      Even for straight WebDAV, without all the additional complications proposed by the OP, you are going to be better off doing local caching and replication of selected data and metadata, ala CODA, and only doing periodic syncing. Of course, then you'd be using an FS like CODA, which has more or less solved this problem already, rather than your favorite FS flavor of the week, which is bound to piss some people off.

      In closing, let me point out that none of the cloud storage providers to date are willingly hosting the necessary APIs to implement something like CODA; they've internally solved the distribution and replication problems with their own one-off solution which they are not sharing, since that is the strategic value of what they are trying to sell in the first place.

    11. Re:There are several options here by Omnifarious · · Score: 1

      I wish Google had a nice capability system that would allow you to give a revocable authorization to the app so it didn't have to know your password. They do for some things, but I don't think they do for Google Drive specifically.

    12. Re:There are several options here by PetiePooo · · Score: 1

      A btrfs style filesystem already has this problem with local storage, it just doesn't become immensely evident unless you are using media where the burst transfer rate gets swamped by the amount of data in a set of consecutive data transfers. As soon as you overwhelm the steady state average rate, the effective burst transfer rate drops to the sustained transfer.

      You can see this relatively easily on Samsung and Sony ARM devices with eMMC mass storage instead of SSD, and you can see it on SSD mass storage for small values of 'mass', where the transactions can be split over many chips to effectively get a parallel bus for transfers, or on bigger SSD devices where the controller isn't clever enough to use that trick.

      At which point, I'd say that btrfs is the wrong tool for the job in this environment.

      What filesystem would you suggest? YAFFS?

      It would seem to me that there are parallels between the write limitations of flash memory and the need to avoid latency in distributed writes to cloud providers. I'm envisioning a YAFFS filesystem built on same sized blocks that are individually encrypted (and compressed) and written to the various cloud providers. Each block could be replicated (raid1) or broken into stripes of blocks used to calculate parity (raid5).
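
      A minimal sketch of the "stripes of blocks used to calculate parity" idea, assuming one block per provider and a single XOR parity block. The function names and the three-data-block layout are hypothetical, chosen only for illustration. Compression uses zlib from the standard library; encryption is left out on purpose because it should come from a vetted crypto library rather than a sketch.

```python
import zlib
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(payload, n_data):
    """Split payload into n_data equal-sized blocks and append one parity block."""
    size = -(-len(payload) // n_data)  # ceiling division
    blocks = [payload[i * size:(i + 1) * size].ljust(size, b"\0")
              for i in range(n_data)]
    return blocks + [xor_blocks(blocks)]


def recover(stripe, missing_index):
    """Rebuild the block at missing_index by XORing all surviving blocks."""
    survivors = [b for i, b in enumerate(stripe) if i != missing_index]
    return xor_blocks(survivors)


original = b"holiday photos " * 20
payload = zlib.compress(original)          # compress before striping
stripe = make_stripe(payload, n_data=3)    # 3 data blocks + 1 parity block

# Pretend the provider holding block 1 is offline and rebuild its block.
rebuilt = recover(stripe, 1)
assert rebuilt == stripe[1]

# Reassemble the data blocks, drop the padding, and decompress.
restored = zlib.decompress(b"".join(stripe[:3])[:len(payload)])
assert restored == original
```

      Each element of `stripe` would go to a different provider. Losing any single provider is recoverable; losing two is not, which is the usual raid5 trade-off.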

      I think the technology exists, and all the components are proven, they've just never been assembled in exactly that way before.

      I wish the OP the best of luck. Even if it isn't practical in the end, it sure sounds fun..

  2. GlusterFS by Anonymous Coward · · Score: 3, Informative

    It has optional encrypted transport if you use the native (fuse) mount. Encryption on the back end is on the road map for a future release. It's available for Linux, there's a NetBSD port, and has had working Solaris and OS X support in the past, it probably wouldn't be too hard to make those work again.

    1. Re:GlusterFS by KingRobot · · Score: 1

      Yea, Gluster has a really nice API for writing what they call "translators". I would imagine it wouldn't be too difficult to write a translator that presents your cloud service of choice as a storage brick, and from there you just tie it all together as desired.

    2. Re:GlusterFS by Anonymous Coward · · Score: 2, Funny

      But then you could just run glusterfsck to un-GlusterFuck your glusterfucked filesystem....

  3. Why do you want to combine them? by egcagrac0 · · Score: 4, Interesting

    If you don't trust the provider to keep your data intact, don't use that provider.

    If you need more storage, pay for it. The cost is not prohibitive - 100GB or so for under US$10/mo is pretty easy to find.

    If $10/month prices you out of the market, there are better things to worry about than encrypting files and storing them in the cloud.

    1. Re:Why do you want to combine them? by SpaceCracker · · Score: 1

      I believe the asker didn't mention a price issue.
      Availability is one reason to redundantly "split your eggs into more than one basket". Cloud outages happen from time to time. If one vendor is unavailable (temporarily or closed down indefinitely), you want your files to be available from another vendor.

      --
      sigo ergo sum
    2. Re:Why do you want to combine them? by egcagrac0 · · Score: 1

      If availability is the goal, duplication (mirroring) would be the way to do it.

      While technically mirroring is a mode of "RAID" (RAID 1), typically I hear "RAID" used to mean some form of spanning - combining more than one storage resource into one larger logical storage resource.

      As for permanently closing down, if you're a paying customer, you have a reasonable expectation to receive notice that they're terminating the service offering. If you're getting it free, enjoy what you get while you can, and don't expect that complaints will get you more free lunches.

    3. Re:Why do you want to combine them? by icebike · · Score: 1

      I agree that mirroring is the way to go, as long as all the cloud servers support some form of user-side encryption.

      But I can see being worried about permanently closing down as well.
      Does the FBI give notice when they seize the server farm?

      Also many of these services, especially the smaller ones are just resellers of Amazon if I'm not mistaken so in some
      cases even mirroring might not help, and any sort of raid 5 could leave you with nothing if more than one of
      your chosen mirror vendors was ultimately stored on the same upstream provider.

      --
      Sig Battery depleted. Reverting to safe mode.
    4. Re:Why do you want to combine them? by Xtifr · · Score: 2

      If you don't trust the provider to keep your data intact, don't use that provider.

      That's either a ridiculous statement, or completely off-topic. When it comes to reliability, trust isn't an absolute yes/no thing--it's measured in percentages. And redundancy multiplies reliability, so it's a big win.

      There's a trade-off for complexity here, and it's possible to question whether all the extra effort is really worth the potential gains in reliability. (Is it really that important to have eight nines instead of four, or ten instead of five?) But there's nothing wrong with investigating the possibility. And he didn't say anything about price. Or amount of storage. Perhaps he's perfectly willing to pay $10/mo three times over, just for the satisfaction of knowing his data is super available.
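
      The "eight nines instead of four" arithmetic can be sketched as follows. It assumes the mirrors fail independently, which is a strong assumption and will not hold if two providers sit on the same upstream host. The availability figures are illustrative, not any vendor's published numbers.

```python
import math


def combined_unavailability(availabilities):
    """Probability that every mirror is down at once, assuming independence."""
    return math.prod(1.0 - a for a in availabilities)


def nines(unavailability):
    """Express an unavailability as a count of nines (0.0001 -> 4)."""
    return -math.log10(unavailability)


one_provider = nines(combined_unavailability([0.9999]))
two_mirrors = nines(combined_unavailability([0.9999, 0.9999]))
print(round(one_provider, 3), round(two_mirrors, 3))
```

      With two independent mirrors the failure probabilities multiply, so the number of nines doubles: four becomes eight, five becomes ten.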

      For that matter, nobody, no matter how reliable, can guarantee you absolute security. Security is also something you have to measure in percentages (though it's a lot harder to estimate accurately). Encrypting your data gives you an extra layer of protection, even if you think your provider's security is good.

    5. Re:Why do you want to combine them? by Xtifr · · Score: 1

      I think it was pretty obvious that he had mirroring in mind, since what he actually said was "using a software raid (redundancy etc)". (Emphasis mine.) That's the only place he mentioned RAID, so I seriously doubt his goal is striping! :)

    6. Re:Why do you want to combine them? by theNetImp · · Score: 1

      OK, say you're a hobby photographer. At $10 a month per 100GB, backing up 1TB of images is VERY cost prohibitive. $100/month for cloud storage is terribly high.

    7. Re:Why do you want to combine them? by Gaygirlie · · Score: 4, Informative

      OK, just fess up - it's your pr0n collection, right? 1TB of images at a gargantuan 20MB apiece is over 50000 images; at a more reasonable 5MB that increases to 200k+. "Hobby photographer" my foot.

      You've clearly never heard of RAW-images. 20MB RAW-image is actually still on the smaller end of the scale.

    8. Re:Why do you want to combine them? by egcagrac0 · · Score: 1

      Then perhaps offline backups are a better choice in your application.

      A USB hard drive or two and a safe deposit box should be substantially more affordable.

    9. Re:Why do you want to combine them? by egcagrac0 · · Score: 1

      If you don't trust the provider to keep your data intact, don't use that provider.

      That's either a ridiculous statement, or completely off-topic.

      Neither, actually.

      In a design like this, I assume that a storage resource - in this case, a cloud provider - will be either online, or offline. If they're offline, I need to work with a different copy of the data. Using a striping arrangement (or striping with parity) rather than a mirrored arrangement means there may not be another copy available.

    10. Re:Why do you want to combine them? by DarwinSurvivor · · Score: 1

      Not to mention he'd go through about $300/month in overage charges when he hits his cap every month.

    11. Re:Why do you want to combine them? by blueg3 · · Score: 3, Insightful

      Most RAW file formats (RAW isn't a file format itself, but a designation covering a number of different formats) already include lossless compression.

      Even if they didn't, compressing would create a lot of unnecessary work, because they're mostly valuable in the form that can be manipulated by image editing and management software. Anything other than the actual RAW format is dramatically reduced in value. Again, though, that's a moot point, because RAW formats generally include lossless compression already.

    12. Re:Why do you want to combine them? by blueg3 · · Score: 1

      It would take about 13 weeks with 1 Mbit up, which is on the low end of readily-available broadband data.
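
      The estimate can be checked with a few lines of arithmetic, assuming it refers to the 1TB collection mentioned earlier in the thread, a decimal terabyte (10^12 bytes), a fully sustained 1 Mbit/s uplink, and no protocol overhead.

```python
def upload_weeks(size_bytes, uplink_bits_per_second):
    """Weeks needed to upload size_bytes at a constant uplink rate."""
    seconds = size_bytes * 8 / uplink_bits_per_second
    return seconds / (7 * 24 * 3600)


weeks = upload_weeks(10**12, 10**6)   # 1 TB over 1 Mbit/s
print(round(weeks, 1))
```

      That works out to a little over 13 weeks of continuous uploading; real-world overhead and interruptions would only make it longer.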

    13. Re:Why do you want to combine them? by WalrusSlayer · · Score: 5, Informative

      Uh, methinks you haven't really used tool chains designed to maximize the value of RAW files. The camera's built-in processor does way the hell more stuff than just compress raw pixels into JPEG. White balance is a huge one, along with level curves, sharpening, and a bunch of other stuff. Much of it either one-way or very hard to unwind. And as others have pointed out, most RAW *is* compressed, just lossless.

      So yeah, you can fix white-balance in a JPEG, but it's way simpler and more accurate to set the white balance if the pixels haven't already been misbalanced in the first place. Ditto for exposure. Most tools that deal with processed JPEG's don't even have an exposure adjustment---quite often the same tool that does both file types will have an exposure slider if it's RAW but not if it's JPEG. Sure, you can futz with brightness, contrast, levels, gamma, etc to correct an under-exposed shot. But sliding over to +2/3 for a slight underexposure is one click and you're done.

      As a guy who has deep-drilled many a software engineering discipline in his 25 year career, and shot tens of thousands of frames as an amateur enthusiast, you can pull me out of the "photographers who don't understand the tools" pool thank you very much.

      I have gone back and forth between JPEG and RAW over the years. There have been periods where, with two small children, I simply didn't have time to invest in RAW processing. And I was pleased with the neutrality of the DSLR's processing anyway. Other times I knew I was shooting in challenging conditions, and set the camera to RAW+JPEG as a safety net. I've rescued many a shot that way. Recently I've been putting mileage on Lightroom and can extract an immense improvement out of the RAW's that would take me 4x the time to do if they were JPEG, and probably not end up with the same result. I now have more time to invest and the payoff is real and significant.

    14. Re:Why do you want to combine them? by flyingfsck · · Score: 1

      "keep a copy for themselves" If you do this distributed storage software right, then it will be fault tolerant and encrypted.

      --
      Excuse me, but please get off my Pennisetum Clandestinum, eh!
    15. Re:Why do you want to combine them? by donaldm · · Score: 1

      If you don't trust the provider to keep your data intact, don't use that provider.

      Yes, that goes without saying; however, you still must say it, especially if you are consulting with a customer who is considering cloud storage. It's a bit like talking about backups: it is amazing that this critical service is sometimes a low priority with some companies.

      If you need more storage, pay for it. The cost is not prohibitive - 100GB or so for under US$10/mo is pretty easy to find.

      Yes, $10 a month is not that expensive for 100GB. However, if you consider TBs of data (not that difficult if you consider movies etc.), then the cost starts to climb, and for a home user that $100/month is starting to get expensive.

      If $10/month prices you out of the market, there are better things to worry about than encrypting files and storing them in the cloud.

      For companies a professional backup system is much more practical than "Cloud Storage", although this could have a starting cost of a few thousand dollars, going up into the millions of dollars depending on the backup requirements of the company. For a home user it is actually cheaper to use portable disk drives as your backup service; however, one problem with that is that your data will normally reside in your home unless you have an arrangement to off-site them with a trusted friend or neighbour.

      It must be noted that the backups I am talking about in my reply are not really what can be considered "backups". They are actually mirrored copies of the appropriate data, made for example by using the rsync command from the file-system or directory structure you want to duplicate and maintain to the target file-system where you wish to have your data mirrored. The target file-system can be on the so-called cloud or a storage device.

      --
      There ain't no such thing as proprietary standards only proprietary formats. Standards are by definition open.
    16. Re:Why do you want to combine them? by semi-extrinsic · · Score: 2

      Since you say you've alternated between JPEG and RAW shooting, I have two questions (out of genuine interest):

      1) For a reasonably well-exposed photo where the white balance is roughly correct in the camera, are you able to produce a significantly better end result from RAW than from JPEG? (I definitely agree on using RAW+JPEG when you know exposure could be a problem)

      2) Do you have any rough idea about the bit depth the RAW photos need to be at before you get a significant advantage over JPEG? My old camera produced 10 bit RAWs, and at that time I was almost never able to out-perform the JPEG. My new camera has 12 bit RAW, and I haven't really had much time recently (small children here as well) to play around with RAW. But maybe it would be worth it?

      --
      for i in `facebook friends "=bday" 2>/dev/null | cut -d " " -f 3-`; do facebook wallpost $i "Happy birthday!"; done
    17. Re:Why do you want to combine them? by semi-extrinsic · · Score: 1

      The best solution is to have a friend who also runs a Linux server at home. Or hell, even give your friend an old Linux box and set up a Samba mount on it that he can access from Windows. You then each buy two harddrives, and mirror each other. If you don't trust your friend not to snoop on your photos, or vice versa, use encryption.

      --
      for i in `facebook friends "=bday" 2>/dev/null | cut -d " " -f 3-`; do facebook wallpost $i "Happy birthday!"; done
    18. Re:Why do you want to combine them? by beachcoder · · Score: 1

      Quite. You can get unlimited data for an absurdly cheap price.

    19. Re:Why do you want to combine them? by BlackPignouf · · Score: 4, Informative

      1) For a reasonably well-exposed photo where the white balance is roughly correct in the camera, are you able to produce a significantly better end result from RAW than from JPEG? (I definitely agree on using RAW+JPEG when you know exposure could be a problem)

      Short answer : No
      Longer answer : It depends on the light, the sensor, the image processor in camera and your RAW workflow.
      From personal experience, I'd say that Canon JPGs are pretty good out of camera, Nikon JPGs lack a bit of sharpening, and Fuji X sensors have very good JPGs that are still impossible to match with RAW+Lightroom.
      I use RAW as a safety net during events or weddings, so that if I get a picture with good expression, focus and composition but wrong exposure or WB, I can still save it and print it instead of having to delete it.
      RAW is also interesting for scenes with high dynamic range, such as landscapes or concerts.

      Do you have any rough idea about the bit depth the RAW photos need to be at before you get a significant advantage over JPEG? My old camera produced 10 bit RAWs, and at that time I was almost never able to out-perform the JPEG. My new camera has 12 bit RAW, and I haven't really had much time recently (small children here as well) to play around with RAW. But maybe it would be worth it?

      I think it has more to do with dynamic range than with bit-depth. Just find a contrasty scene, take a RAW picture and try to retain details in both shadows and highlights with your RAW conversion software.
      http://www.dpreview.com/learn/?/Glossary/Digital_Imaging/dynamic_range_01.htm
      http://www.dpreview.com/learn/?/Glossary/Digital_Imaging/tonal_range_01.htm

    20. Re:Why do you want to combine them? by aug24 · · Score: 1

      I used to do this with a friend: peered ftp servers. Everything under /home and /etc was tarred, zipped, encrypted and ftp'd onto his server at midnight, while his did the same to mine. Can't remember where I put the decrypt key, probably on a floppy that is still in my bureau.

      Of course, data volumes were lower then, and the sun was warmer, and girls prettier.

      Just.

      --
      You're only jealous cos the little penguins are talking to me.
    21. Re:Why do you want to combine them? by BlackPignouf · · Score: 1

      I mostly agree with what you said, but the question was specifically for a well-exposed picture with correct WB.
      In this case, you don't need much processing (if at all), and RAW doesn't give you much advantage.

    22. Re:Why do you want to combine them? by silentcoder · · Score: 1

      >>OK, just fess up - it's your pr0n collection, right? 1TB of images at a gargantuan 20MB apiece is over 50000 images; at a more reasonable 5MB that increases to 200k+. "Hobby photographer" my foot.

      >You've clearly never heard of RAW-images. 20MB RAW-image is actually still on the smaller end of the scale.

      And 50000 is the approximate useful actuator operations on most higher-end DSLRs - most photographers who use them are on their third or fourth one by now.

      --
      Unicode killed the ASCII-art *
    23. Re:Why do you want to combine them? by silentcoder · · Score: 3, Interesting

      I'll try to answer as well. My previous DSLR a Canon 400D couldn't do Raw+Jpeg so I used ONLY raw, for things like "holiday snaps" style shooting I'd just mass-export to jpeg, but for real work I'd always use the RAW.
      With my new 40D I use Raw+Jpeg for shooting but I'm tempted to go to pure RAW as I've yet to use the jpeg, I figured it may be useful for a reference (what the camera thought was there) but otherwise, no thanks.

      1) For a reasonably well-exposed photo where the white balance is roughly correct in the camera, are you able to produce a significantly better end result from RAW than from JPEG?

      For me the first part of post-processing is playing with the RAW - for example sometimes I will deliberately switch it to a different white balance or even do manual white balance to achieve some or other artistic effect. Raw is also very powerful for adjusting things like the global saturation and contrast levels very finely (while you'll want a tool like photoshop or gimp to adjust individual elements).

      >2) Do you have any rough idea about the bit depth the RAW photos need to be at before you get a significant advantage over JPEG? My old camera produced 10 bit RAWs, and at that time I was almost never able to out-perform the JPEG. My new camera has 12 bit RAW, and I haven't really had much time recently (small children here as well) to play around with RAW. But maybe it would be worth it?

      It doesn't much matter. If you are taking snapshots then just use jpeg. RAW comes into its own if you're doing real photography - product shoots, studio work, landscape work, art photography etc. - where the post-production is as important a part of the process as the taking of the shot. RAW is stage one of producing the perfect image, gimp/photoshop is stage 2. Even those photographers who eschew editing of pictures will usually do RAW adjustment - which doesn't change what's there, only how it's 'presented' in terms of light.
      Personally I point out to those types that there is nothing I can do in gimp/photoshop that the old boys didn't use to do in the dark-room, it's just faster, easier and a LOT cheaper.

      For the most part, a human cannot tell the difference on a computer screen between a 6MP camera (the smallest DSLR I know of) and an 18MP one: no common desktop/laptop screen could show such a picture full-size anyway, so you're seeing a distorted/shrunken version to begin with. Where it DOES matter is prints. I do prints of my best work, and some have also been printed in magazines like Marie Claire, and when you're doing prints you need to provide the images at the right level. Generally you will want to ensure they are scaled to page size (e.g. A3) yourself - and that means including white-space bordering to prevent stretching - and you'll need to ensure they are high print-resolution (professional printing should be 300 DPI). Format wise uncompressed jpeg is usually used.
      Simple reality is that to get an uncompressed jpeg at 300DPI that is A3 in size you need a high MP shot to begin with or your picture simply won't look good at that resolution.
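
      To put rough numbers on that, here is a small sketch of the arithmetic. It assumes the ISO A3 sheet size of 297 mm x 420 mm and the 300 DPI figure mentioned above; the function names are made up for illustration.

```python
# Pixels needed to print at a given physical size and print resolution.
MM_PER_INCH = 25.4


def pixels_needed(width_mm, height_mm, dpi=300):
    """Return (width_px, height_px, megapixels) for a print of the given size."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px, height_px, width_px * height_px / 1e6


def effective_dpi(long_side_px, long_side_mm):
    """Print resolution you actually get when stretching an image to a given length."""
    return long_side_px / (long_side_mm / MM_PER_INCH)


# A3 is 297 mm x 420 mm.
w, h, mp = pixels_needed(297, 420)
print(f"A3 at 300 DPI needs {w} x {h} px, about {mp:.1f} MP")

# A hypothetical 6 MP frame of 3000 x 2000 px stretched over the A3 long side.
print(f"3000 px over 420 mm gives about {effective_dpi(3000, 420):.0f} DPI")
```

      The same function works for any other print size, as long as you convert the dimensions to millimetres first.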
      RAW is invaluable here as it lets you handle such things as exposure levels much better. You cannot just yank up the exposure of a picture - if you do that you create lots of digital noise (which shows up as red-speckle) which no amount of editing can ever REALLY cover up properly - but in RAW you can subtly adjust lighting to make a useful picture from a slightly underexposed shot sometimes anyway. On 800x600 web-quality jpegs you'll never even NOTICE the noise being created in typical "push up the exposure" steps - but if you print that as an A3 poster for framing every one of those red dots is a glaring monstrosity.

      The first and finest art of ALL photography is lighting, don't think you can fix bad lighting in post, at best you can maybe make a useful website picture. If you are trying to do anything that's printable - you need to get your light right. The purpose of editing (both RAW and gimp) is to modify a

      --
      Unicode killed the ASCII-art *
    24. Re:Why do you want to combine them? by silentcoder · · Score: 1

      >with correct WB.

      Define "correct".
      If you define it as "closest as possible to the actual colour of the subject" then you have something of a point (not much) - but in fact you will be surprised what lovely and subtle effects you can sometimes obtain by deliberately using the wrong one.
      Take a picture with studio-flash white-balance, and open it in RAW and then see what it looks like with Tungsten WB - 99% of the time you will hate the result - but 1% of the time you'll find a work of art you would never have had if you hadn't used RAW.
      That 1% is enough to be worth it.

      This exact example happened to me after a studio shoot just a week ago:
      http://silentcoder.co.za/photography/art/Caryn-Fetish/IMG_1858.jpg/

      --
      Unicode killed the ASCII-art *
    25. Re:Why do you want to combine them? by semi-extrinsic · · Score: 1

      I think it has more to do with dynamic range than with bit-depth. Just find a contrasty scene, take a RAW picture and try to retain details in both shadows and highlights with your RAW conversion software. http://www.dpreview.com/learn/?/Glossary/Digital_Imaging/dynamic_range_01.htm http://www.dpreview.com/learn/?/Glossary/Digital_Imaging/tonal_range_01.htm

      Thanks for the links. But if I understand correctly, the bit depth more or less precisely corresponds to the highest dynamic range the sensor is able to capture, right?

      --
      for i in `facebook friends "=bday" 2>/dev/null | cut -d " " -f 3-`; do facebook wallpost $i "Happy birthday!"; done
    26. Re:Why do you want to combine them? by semi-extrinsic · · Score: 1

      Thanks for an informative post. I already have three battery-powered flashes, two mains-powered studio flashes and an RF trigger set, so I'm covered there, and I must say proper flash photography is a lot of fun (Strobist and cheap ebay equipment got to me...)

      So just to make sure I understand what you're saying: let's say I shoot a portrait of my daughter, using two or three flashes. The background is plain ( e.g. white paper table cloth), and almost all light comes from the flashes. I set custom white balance with a gray card. The intended target is a 20"x30" print for the living room wall. Would the purpose of shooting this in RAW instead of JPEG be that the e.g. 1/3 stop I'll have to adjust it in gimp will be better based on the RAW than the JPEG? Or would it be more that the RAW lets me get more creative in post-processing, playing around with curves and colors to turn the picture into something significantly different from what my camera captured?

      --
      for i in `facebook friends "=bday" 2>/dev/null | cut -d " " -f 3-`; do facebook wallpost $i "Happy birthday!"; done
    27. Re:Why do you want to combine them? by silentcoder · · Score: 1

      >Thanks for an informative post. I already have three battery-powered flashes, two mains-powered studio flashes and an RF trigger set, so I'm covered there, and I must say proper flash photography is a lot of fun (Strobist and cheap ebay equipment got to me...)

      That's a decent setup - my own kit is mostly from second-hand purchases as well so I can appreciate that.

      >The background is plain ( e.g. white paper table cloth)
      In general I would recommend avoiding white backgrounds - dark blue, gray or black is usually better unless you compensate for the reflection in your flash. More importantly the high amount of white in the picture will throw off the camera's sensors so you cannot even use them as a guide for exposure (in practise you almost HAVE to over-expose a white background to get the subject correctly lit).

      >Would the purpose of shooting this in RAW instead of JPEG be that the e.g. 1/3 stop I'll have to adjust it in gimp will be better based on the RAW than the JPEG? Or would it be more that the RAW lets me get more creative in post-processing, playing around with curves and colors to turn the picture into something significantly different from what my camera captured?

      Both. The adjustment in RAW will produce a better quality outcome. Now the amount of modification you want to do will vary. Many of my pictures contain nothing but RAW editing, while others are significantly adjusted (for example taking a portrait and trying to reproduce the kind of post production that Diana Holga invented) - it's nice to have the option even if you rarely use it.
      Even more subtly there are things like pimples. Even for a picture I will barely be editing I would generally get rid of those, pimples are temporary blemishes and they are not representative of the person and have no place in their memories as far as I'm concerned.
      I mostly do glamour and erotic-art photography, often with models who pay me for portfolio work and that also comes with expectations to show them just a little prettier than they really are. In those cases I would often create a layer from the background, blur the hell out of it (gaussian blur with 50 pixel radius in both directions) and then slowly adjust its opacity down till I find that perfect balance between soft skin and impossible perfection (though one of my all-time favourite shots I deliberately pushed it up until the girl's face looked as plastic as a Barbie-doll and named the picture "perfectly flawed" to make a point).

      So the degree of value you may expect from RAW shots really depends on what you do with your pictures. For the kind of scenario you describe - I would say the biggest advantage is that you get the power to make subtle adjustments to lighting. Studio lights give you picture-perfect setups but they don't really adapt well to the nuances of a specific shot. Depending on how your daughter was posed, what you want to stand out and remember in one picture may be subtly different from another - RAW gives you the power to make those subtle changes with little or no loss in quality.

      --
      Unicode killed the ASCII-art *
    28. Re:Why do you want to combine them? by BrokenHalo · · Score: 1

      The submission didn't actually specify backups, but in this case I would agree. From my point of view, if you want something done properly, you should be prepared to do it yourself. You can never get a 100% guarantee that your data is safe and secure, but you can get close by taking the process into your own hands.

      So-called "cloud" services, while convenient and trendy, are subject to too many conditions that are entirely out of your control (even if you run those services yourself), so in my paranoid opinion they are nearly worthless as a real backup. But bear in mind that I am an ancient programmer and sysadmin, and I have been accustomed over my 35+ years in the industry to offline storage of media in fireproof safes in off-site locations. Sure it costs money, but loss of data could cost the existence of the business.

    29. Re:Why do you want to combine them? by Subjective · · Score: 1

      If you don't trust the provider to keep your data intact, don't use that provider.

      That's either a ridiculous statement, or completely off-topic.

      Neither, actually.

      In a design like this, I assume that a storage resource - in this case, a cloud provider - will be either online, or offline. If they're offline, I need to work with a different copy of the data. Using a striping arrangement (or striping with parity) rather than a mirrored arrangement means there may not be another copy available.

      So you agree that you planned for an outage - you planned for them to not keep your data intact - you didn't trust the provider. But you used it.

      I think the statement is both
      Nobody trusts silicon or spindles, but we use them.

      It's also offtopic because the question was not which provider to use. A question that many people here seem to be trying to answer for some reason

      --
      My other .sig is also this bad
    30. Re:Why do you want to combine them? by robsku · · Score: 1

      If you don't trust the provider to keep your data intact, don't use that provider.

      If you need more storage, pay for it. The cost is not prohibitive - 100GB or so for under US$10/mo is pretty easy to find.

      If $10/month prices you out of the market, there are better things to worry about than encrypting files and storing them in the cloud.

      $10/month? I pay 35€/year for an account on a Finnish "community", though it's for non-commercial use only and it doesn't aim for big profits; the payment is rather to cover their costs. They don't have a fancy "cloud file storage" interface (just ssh/sftp to access your files), and the service is mainly provided for having an "ssh shell" to run stuff like an irc client on, plus email account, web pages & database (httpd, php, perl, mysql, etc...) and so on, but they also provide 50GB of storage for ssh/http, which is backed up, and additionally 500GB for your backups (which they don't keep backup versions of).

      fuse-encfs + sshfs should be easily set up to use that... for now I just upload encrypted and compressed tarballs via ssh to keep my data backed up.

      --
      In capitalist USA corporations control the government.
    31. Re:Why do you want to combine them? by Xtifr · · Score: 1

      Using a striping arrangement (or striping with parity)

      Ah, I see your mistake. You saw "raid", and assumed he was talking about striping, rather than the more obvious (since he then mentioned "redundancy") mirroring (e.g. RAID1). I'm not sure if that's a case of you having too little knowledge or you assuming too much on his part. I would have said mirroring if I meant mirroring, to avoid the potential for confusion, but I certainly figured he meant RAID1, even though the term "raid" is more commonly associated with striping.

    32. Re:Why do you want to combine them? by BlackPignouf · · Score: 1

      White balance isn't subjective : A gray card should be gray. :D
      But it's totally true that there are a lot of possible variations, and that playing with the orange/blue slider gives you a lot of creative flexibility.
      It also comes from the fact that your eyes and brain do a very good job at understanding that a gray card is gray even if it appears orange, green or blue.

    33. Re:Why do you want to combine them? by BlackPignouf · · Score: 1

      [*] Create an image all orange in GIMP: RGB=(255, 128, 0). Start the curve tool (with preview active), select a point about halfway the curve, and move it around. If you move it up the color shifts towards yellow, if you move it down it shifts towards red. Explanation: you actually only change the green channel, and yellow=(255, 255, 0), red=(255, 0, 0). The contrast curve is applied to each channel separately and red and blue are on the endpoints that aren't changed. This is an extreme example, but the effect on photographs is that many colors tend to be shifted towards black, white, red, green, blue, cyan, yellow and magenta when a contrast curve is applied, including the curve cameras apply when producing JPGs. Many people like those candy colors, and as they are 'louder' than the color distortions many monitors have (I still see lots of laptop screens that are far too blue) for many people they will look better than images with more realistic colors, but I have always disliked the effect and avoid it.

      Yes. Lightroom shows you in real time when you begin to lose information due to clipping in any of the 3 channels.
      Other programs (or your in camera histogram) might only complain when all 3 channels are clipping.
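
      The orange example quoted above is easy to reproduce in a few lines. This is only a sketch: it uses a piecewise-linear curve through a single midpoint rather than the smooth spline GIMP draws, and the function names are made up. The point is that red (255) and blue (0) sit on the curve's fixed endpoints, so only green moves and the hue slides between red (0 degrees) and yellow (60 degrees).

```python
import colorsys


def curve(value, mid_out):
    """Piecewise-linear tone curve through (0, 0), (128, mid_out), (255, 255)."""
    if value <= 128:
        return round(value * mid_out / 128)
    return round(mid_out + (value - 128) * (255 - mid_out) / 127)


def apply_curve(rgb, mid_out):
    """Apply the same curve to each channel separately."""
    return tuple(curve(channel, mid_out) for channel in rgb)


def hue_degrees(rgb):
    """Hue of an 8-bit RGB colour in degrees (0 = red, 60 = yellow)."""
    r, g, b = (channel / 255 for channel in rgb)
    return colorsys.rgb_to_hsv(r, g, b)[0] * 360


orange = (255, 128, 0)
for mid_out in (64, 128, 192):  # midpoint dragged down, untouched, dragged up
    shifted = apply_curve(orange, mid_out)
    print(mid_out, shifted, round(hue_degrees(shifted), 1))
```

      Dragging the midpoint down pulls the hue towards red, dragging it up pushes it towards yellow, while the red and blue channels never change.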

    34. Re:Why do you want to combine them? by semi-extrinsic · · Score: 1

      Thank you again :) I'll be giving RAW a second shot (pun intended) in the coming weeks.

      --
      for i in `facebook friends "=bday" 2>/dev/null | cut -d " " -f 3-`; do facebook wallpost $i "Happy birthday!"; done
    35. Re:Why do you want to combine them? by tlambert · · Score: 1

      I believe the asker didn't mention a price issue.

      An obvious underlying motivation for what the OP asked for is to take advantage of all the loss-leader offers out there for cloud storage providers; generally they will give you some storage free up to a storage cap, and if you want more, you have to pay. Being able to eke storage up to the cap from a lot of providers would let you effectively negate the cap on any single provider. So a primary motivation could be "I want free cloud storage".

      Availability is one reason to redundantly "split your eggs into more than one basket". Cloud outages happen from time to time. If one vendor is unavailable (temporarily or closed down indefinitely), you want your files to be available from another vendor.

      Unavailability means that your vendor's cloud isn't working. One of the primary arguments in favor of cloud storage is that it doesn't matter where it's located, so if a data center gets destroyed by a disaster, natural or otherwise, your data stays available. If your cloud storage vendor can't guarantee six nines of availability, then they are not actually selling cloud storage.

    36. Re:Why do you want to combine them? by silentcoder · · Score: 1

      Like any artform you should know the rules for doing it right - then when you break them, you do it on purpose and it means something.

      --
      Unicode killed the ASCII-art *
    37. Re:Why do you want to combine them? by egcagrac0 · · Score: 1

      There is a difference between "not intact" and "not available".

      I don't see a value in storing recovery information ("parity" in RAID parlance) on storage service B for the data on storage service A.

      I do see value in having a complete second copy ready to use on service B, for times when service A has a planned (or unplanned) outage.

      This is robustness and availability through failover.

    38. Re:Why do you want to combine them? by Subjective · · Score: 1

      I don't see a value in storing recovery information ("parity" in RAID parlance) on storage service B for the data on storage service A.

      Agreed. Put like that, it seems stupid.

      But what happens with 3 providers?
      A stores half the data, B stores the other half, and C stores the parity.
      If a single one is down for maintenance, the data is readable.

      So on. 7 providers. 5 hold data pieces and 2 hold parity
      Or it's still better to just mirror them? cut the data into 7 pieces and write each piece on two providers
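
      For the three-provider case, the parity idea can be sketched with plain XOR, which is the usual single-parity scheme. This is an illustration only: the function names are hypothetical, the "providers" are just byte strings in memory, and it copes with exactly one missing piece.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_with_parity(data):
    """Split data into two halves plus an XOR parity block.

    The second half is zero-padded so all three pieces have equal length.
    """
    half = (len(data) + 1) // 2
    a = data[:half]
    b = data[half:].ljust(half, b"\0")
    return a, b, xor_bytes(a, b)


def recover(a, b, parity, length):
    """Rebuild the original data when at most one of the three pieces is None."""
    if a is None:
        a = xor_bytes(b, parity)
    elif b is None:
        b = xor_bytes(a, parity)
    return (a + b)[:length]


data = b"files spread over three providers"
piece_a, piece_b, piece_c = split_with_parity(data)

# Provider A is down for maintenance: rebuild its half from B and the parity on C.
assert recover(None, piece_b, piece_c, len(data)) == data
```

      Each provider holds roughly half the data size, so the three together store about 1.5x the original, compared with 2x for a plain two-way mirror.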

      --
      My other .sig is also this bad
    39. Re:Why do you want to combine them? by BlackPignouf · · Score: 1

      The way I understand it, yes.
      12 stops dynamic range on a 12-bit RAW would mean that the darkest stop would get only 1 level while the brightest stop would get 2**11 levels.
      http://www.luminous-landscape.com/tutorials/expose-right.shtml

      Don't ask me why the Nikon D90 gets 12.5 stops on DXOMark even though it shoots 12-bit RAW.
      Either they're high, they have a different definition of DR or it involves weird non-linear conversions.
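
      The levels-per-stop arithmetic is quick to check. This sketch assumes a purely linear encoding, where each stop down covers half the remaining code values; real sensors and RAW formats may not be that simple, which could be part of the DXOMark puzzle.

```python
def levels_per_stop(bits):
    """Code values available in each stop, brightest first, for a linear encoding."""
    return [2 ** (bits - 1 - stop) for stop in range(bits)]


levels = levels_per_stop(12)
for stop, count in enumerate(levels, start=1):
    print(f"stop {stop:2d} from the top: {count} levels")
```

      For 12 bits the brightest stop gets 2**11 levels and the twelfth stop down gets a single level, matching the figures above.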

    40. Re:Why do you want to combine them? by egcagrac0 · · Score: 1

      A stores half the data, B stores the other half, and C stores the parity.
      If a single one is down for maintenance, the data is readable.

      That's an awful lot like storing parity on B for data on A.
      Since we've been comparing to RAID terms, dedicated parity storage is part of RAID 3 and RAID 4 - two levels which haven't been common in a long time (replaced with RAID 5 or RAID 6 - both featuring distributed parity).

      In a mirrored arrangement of n mirrors, so long as 1 mirror is up, you can read the data. (n-1 mirrors can be down simultaneously.) "Obviously" there will be a resync after an outage.

      So on. 7 providers. 5 hold data pieces and 2 hold parity
      Or it's still better to just mirror them? cut the data into 7 pieces and write each piece on two providers

      I see substantially diminishing returns after 2 providers, but your mileage may vary. I probably don't have to work within your justification framework.

    41. Re:Why do you want to combine them? by Subjective · · Score: 1

      A stores half the data, B stores the other half, and C stores the parity.
      If a single one is down for maintenance, the data is readable.

      That's an awful lot like storing parity on B for data on A.
      Since we've been comparing to RAID terms, dedicated parity storage is part of RAID 3 and RAID 4 - two levels which haven't been common in a long time (replaced with RAID 5 or RAID 6 - both featuring distributed parity).

      Yes, RAID-5 or 6 would be much better - distributed parity - even when one of the providers is down, only some of the pieces will require recovery

      In a mirrored arrangement of n mirrors, so long as 1 mirror is up, you can read the data. (n-1 mirrors can be down simultaneously.) "Obviously" there will be a resync after an outage.

      So on. 7 providers. 5 hold data pieces and 2 hold parity

      Or it's still better to just mirror them? cut the data into 7 pieces and write each piece on two providers

      I see substantially diminishing returns after 2 providers, but your mileage may vary.

      I intuitively see some truth in this, but why?
      * You don't gain any more speed
          You already maxed out your downstream
      * You don't really gain more reliability
          Reliability does not go higher and higher with more providers, because other components are still unreliable - i.e. you'll never reach 5 nines anyway.
      * You are "gaining" more complexity
          Slower write performance?
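
      The reliability point can be made with a back-of-the-envelope calculation. It rests on loud assumptions: every provider has the same availability, outages are independent, and the 99% and 99.9% figures are invented for illustration.

```python
def mirrored_availability(provider_availability, n_providers, client_availability=1.0):
    """Chance that at least one mirror is reachable, scaled by your own uplink.

    Assumes independent outages and identical providers.
    """
    all_down = (1 - provider_availability) ** n_providers
    return client_availability * (1 - all_down)


# Hypothetical numbers: each provider up 99% of the time, home connection 99.9%.
for n in range(1, 6):
    total = mirrored_availability(0.99, n, client_availability=0.999)
    print(f"{n} provider(s): {total:.6f}")
```

      After the second mirror the total is already pinned just under the availability of your own connection, so extra providers buy almost nothing.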

      I probably don't have to work within your justification framework.

      No, I'm pretty sure you don't. I think your own seems to be just fine :)

      --
      My other .sig is also this bad
    42. Re:Why do you want to combine them? by egcagrac0 · · Score: 1

      Yes, RAID-5 or 6 would be much better - distributed parity - even when one of the providers is down, only some of the pieces will require recovery

      I see absolutely no advantage to that (RAID5 or RAID6 across several providers). Mirroring provides redundancy; if it's an unacceptable risk level ("ohnoes! what if both providers go out of business in the same week?"), that's when you add more into the mix. (Of course, if that's an unacceptable risk level, you should be doing it in-house instead of outsourcing. There's a good chance if your own organization ceases operations and can't provide you the storage service, your internal customer will have simultaneously stopped needing the service - win-win!)

      Mirroring doesn't require "recovery" like a rebuild from partial data + parity. It just works.

      Yes, this has wandered far from OP's original question. How would I do it?
      1. I wouldn't, if possible
      2. iSCSI, if possible (not really, if you want to use "free cloud storage" as a backend)
      3. Find something that can mount the storage (SSHFS, or some other FUSE method), and copy the already encrypted files to/from it, repeat as needed for more copies on other providers, if I had to
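
      Option 3 could look roughly like the sketch below. It assumes the file has already been encrypted by some other tool and that each provider is already mounted as an ordinary directory (via SSHFS or another FUSE method); the function names and the mount paths in the usage note are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mirror_file(source, mount_points):
    """Copy an already-encrypted file to every mount point.

    Returns the mount points where the copy was written and read back
    with a matching checksum. Unreachable mounts are skipped.
    """
    source = Path(source)
    expected = sha256_of(source)
    verified = []
    for mount in mount_points:
        target = Path(mount) / source.name
        try:
            shutil.copyfile(source, target)
            if sha256_of(target) == expected:
                verified.append(str(mount))
        except OSError:
            # Provider offline or mount missing: skip it and carry on.
            continue
    return verified
```

      Usage would be something like mirror_file("backup.tar.gpg", ["/mnt/provider_a", "/mnt/provider_b"]). The read-back check may be served from a local cache on some FUSE mounts, so treat it as a sanity check rather than proof the remote copy is intact.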

  4. Don't trust the cloud by Anonymous Coward · · Score: 5, Interesting

    My residential internet connection via Comcast is fast enough today that I can pull files off of my server at home, "cloud" style.

    I have two 2TB drives in RAID1, encrypted with whatever magic `cryptsetup' performs, with port 22 of my firewall forwarded to the server. SSH only accepts logins from me. I consider my data to be more secure and easier to access (it's literally seconds away from availability on any real operating system anywhere with internet access. Windows need not apply) than anything I could get from ZOMG TEH CLOUD. Only disadvantage is speed. I'm not gonna be shunting gigabyte plus files around like this.

    Added bonus: easy to add users, easy to throw up a web interface, can do whatever you want with it, since you own the hardware (!!)

    Pfft, cloud. I remember when it was called 'the internet'.

    Now get the fuck off my lawn.

    1. Re:Don't trust the cloud by gripped · · Score: 4, Funny

      SSH only accepts logins from me.

      You hope

    2. Re:Don't trust the cloud by icebike · · Score: 1

      I consider my data to be more secure and easier to access (it's literally seconds away from availability on any real operating system anywhere with internet access.

      The thing about a cloud is that there are (if you choose the correct provider) multiple widely separated storage locations with redundant copies.
      Your setup, with both of your drives (and I wager also your backup copies) all sit in the same house.

      One match. One thief. One flood. One thunder storm.

      --
      Sig Battery depleted. Reverting to safe mode.
    3. Re:Don't trust the cloud by Blaskowicz · · Score: 1

      Easy to throw a web interface? I had installed Apache and looked at the kilometer long configuration file and was horrified. I installed a Webdav but thought it looked pretty useless. Fucked around to try to find a usable "web file manager" but I didn't find anything great and don't really know how to install them. Maybe on Windows you could get a setup.exe that sets up everything. Sorry, I don't know how throwing a web interface is "easy", I know a fuck ton about computers and some administration but I have no web dev experience.

      At least anyone can use Filezilla (Windows does apply)

    4. Re:Don't trust the cloud by Blaskowicz · · Score: 3, Interesting

      btw there's sshfs on Windows, I thought it would be pedantic to mention it but it exists albeit a bit slow.

    5. Re:Don't trust the cloud by jedidiah · · Score: 1

      > Easy to throw a web interface? I had installed Apache and looked at the kilometer long configuration file and was horrified. I

      That's much like whining about the size of a Windows application's registry hive.

      You must also be frightened by any fully featured modern video transcoder.

      --
      A Pirate and a Puritan look the same on a balance sheet.
    6. Re:Don't trust the cloud by Drinking+Bleach · · Score: 2

      You picked the wrong web server; Apache is great but its configuration is indeed difficult especially if you're not familiar with the concepts. Try out lighttpd, it's pretty dead simple.

    7. Re:Don't trust the cloud by Anonymous Coward · · Score: 1

      You may want to change your ftp and ssh ports to be non-standard ones if you have an internet-facing home server. Automated port scans usually pick up such services and will brute force your connection 24/7, lowering your available bandwidth. Check your logs if in doubt.

    8. Re:Don't trust the cloud by icebike · · Score: 1

      I don't know of any good way of accomplishing this, short of keeping everything in a big encrypted container (created with `dd' for example) and maybe chopping it up for transit. More trouble than it's worth. Suggestions welcomed.

      Yup, rolling your own is kind of painful. You can get it all working, then turn your back and it's gone to hell on you for no obvious reason.

      For that reason, I keep critical records and codebase in SpiderOak.
      Encryption happens in your machine, they do not have the key, and couldn't decrypt your data even if served with a warrant.
      Might not be suitable for large collections, if for no other reason than the time and bandwidth involved.

      They have free accounts, but I pay them some pittance each year for 100 gig.

      --
      Sig Battery depleted. Reverting to safe mode.
    9. Re:Don't trust the cloud by DarwinSurvivor · · Score: 2

      If he's using them. AND if he remembered to also turn OFF password authentication.

    10. Re:Don't trust the cloud by dskoll · · Score: 2

      Turning off password auth is Basic SSH 101.
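
      For anyone unsure whether they actually did that, here is a rough check. It relies on two documented OpenSSH behaviours: the first value found for a keyword wins, and PasswordAuthentication defaults to yes. It is only a sketch, though: it stops at the first Match block, ignores Include files, and the function name is made up.

```python
def password_auth_enabled(config_text):
    """Return True if password logins look enabled in the given sshd_config text."""
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        keyword = parts[0].lower()  # keywords are case-insensitive
        if keyword == "match":
            # Conditional blocks are out of scope for this sketch.
            break
        if keyword == "passwordauthentication" and len(parts) == 2:
            # sshd uses the first value it finds for a keyword.
            value = parts[1].split()[0].strip('"').lower()
            return value != "no"
    # Not set anywhere: the default is "yes".
    return True


sample = """
# PasswordAuthentication yes
PubkeyAuthentication yes
PasswordAuthentication no
"""
print("password logins enabled:", password_auth_enabled(sample))
```

      Feed it the contents of your own sshd_config to use it for real. A commented-out line does not count as a setting, which is a common way to end up with passwords still enabled.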

    11. Re:Don't trust the cloud by Voyager529 · · Score: 3, Informative

      > Easy to throw a web interface? I had installed Apache and looked at the kilometer long configuration file and was horrified. I

      That's much like whining about the size of a Windows application's registry hive.

      You must also be frightened by any fully featured modern video transcoder.

      No, there's a smidge of difference.

      The overwhelming majority of Windows applications can be configured using a series of dialog boxes, typically either in the "tools->options" or "edit->preferences" menu. These applications may incidentally store the results of those dialog boxes in a registry hive (or in an ini file in the %appdata% folder or similar), but it's infrequently the only way to make such changes. With Apache, they don't give you a tabbed, categorized dialog box in which to manipulate the options. Similarly, someone who "installed Apache...and was horrified" is probably not well-versed in working with HTTP server software, and thus, editing the Apache config file is going to be a mountain of guesswork as to what you'd really want in the first place. On top of that, there's the "you can usually fidget around to get Apache to do what you want it to do, but be really really careful because the easiest way to get it to work is also usually the most hackable, so if it works right with your instinct, you'll probably have to go back and change it later once you do end up getting it to work".

      As for video transcoding, unless you're a masochist who prefers using FFMpeg on a command line instead of the myriad GUI options, video transcoding CAN be as easy as "choose your source video, pick the general format you want it to end up in or the type of device you want it to go on, and click 'transcode'". In those cases, most of the advanced options are optional, and the defaults are generally close to what you want unless you know specifically that you need a particular non-default option somewhere. This is different than trying to get a web server up and running, especially since there's no security consideration to the video transcode.

    12. Re:Don't trust the cloud by Voyager529 · · Score: 2

      Making your own web interface for file management? somewhat challenging. Finding a canned one that doesn't utterly suck? Well, that's what Sourceforge is for =)

      Ajaxplorer:
      http://sourceforge.net/projects/ajaxplorer/?source=directory
      Simple to use browser app, and there are iOS and Android apps that do a great job.

      Extplorer:
      http://sourceforge.net/projects/extplorer/?source=recommended
      Better support for larger quantities of files and browsing using a traditional tree/file pane, but slightly more complicated UI due to the smaller, more nebulous buttons.

      As far as getting it to run on something, your best bet is to either try XAMPP, or better yet (if you've got the RAM for it and enough hard disk space), grab a copy of VirtualBox and head over to TurnKeyLinux.org, where they've got pre-configured LAMP stacks with plenty of browser based applications, including Ajaxplorer, which you can have up and running, perfectly configured, in twenty minutes or less =).

    13. Re:Don't trust the cloud by DarwinSurvivor · · Score: 1

      And yet still missed by WAAAAAY too many Linux users. Otherwise those bot-net brute-force attacks would've stopped years ago.

    14. Re:Don't trust the cloud by jabuzz · · Score: 2

      SSH keys get stuck on USB keys for convenience, which get lost, and you have zero control over the quality of the password used to secure the SSH key.

      Non-obvious usernames and enforced password quality are in my opinion more secure.

    15. Re:Don't trust the cloud by AmiMoJo · · Score: 1

      You'd be surprised how many people skipped that class, probably figuring it was too basic for their l33t skillz...

      --
      const int one = 65536; (Silvermoon, Texture.cs)
      SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
    16. Re:Don't trust the cloud by Hatta · · Score: 3, Interesting

      The overwhelming majority of Windows applications can be configured using a series of dialog boxes, typically either in the "tools->options" or "edit->preferences" menu. These applications may incidentally store the results of those dialog boxes in a registry hive (or in an ini file in the %appdata% folder or similar), but it's infrequently the only way to make such changes. With Apache, they don't give you a tabbed, categorized dialog box in which to manipulate the options

      No, they give you a nice organized text file to edit, with descriptive comments. You can search it and you can back it up easily. That's even BETTER than a tree full of checkboxes.

      --
      Give me Classic Slashdot or give me death!
    17. Re:Don't trust the cloud by Anonymous Coward · · Score: 1

      sorry, but over a network, even a REALLY fast network, you will never be able to perform brute force attacks successfully against a STRONG 8+ character password.

      brute force is only plausible against a known hash, something a brute force does not have.
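
      Some rough arithmetic behind that claim. The assumptions are mine: the password is 8 characters drawn at random from the 94 printable ASCII symbols, and the attacker manages 1000 online guesses per second, which is generous for a login service over a network. None of this applies to dictionary words or reused passwords.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def average_bruteforce_years(alphabet_size, length, guesses_per_second):
    """Expected years to hit a random password, searching half the keyspace on average."""
    keyspace = alphabet_size ** length
    return keyspace / 2 / guesses_per_second / SECONDS_PER_YEAR


years = average_bruteforce_years(94, 8, 1000)
print(f"about {years:,.0f} years on average")
```

      Even at a thousand guesses a second the expected time runs to tens of thousands of years, which is why online brute force mostly catches weak or default passwords.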

    18. Re:Don't trust the cloud by Common+Joe · · Score: 1

      I'll say this about hope: According to his 1040 forms, I'd say he's underpaid. He better hope his girlfriend doesn't find out. Speaking of his girlfriend, there are lots of interesting pics of her. I didn't know the human body could pretzel that way.

    19. Re:Don't trust the cloud by Rysc · · Score: 1

      Two words: port knocking.

      You may laugh but it hides you from casual attackers pretty definitively. IOW, bot net brute forcers no longer clog up my auth log.

      --
      I want my Cowboyneal
    20. Re:Don't trust the cloud by DarwinSurvivor · · Score: 1

      Passwords get stuck on post-its for convenience, which get lost, and the length of a known password is immaterial to its strength.

      You were saying?

    21. Re:Don't trust the cloud by xOneca · · Score: 1
      Maybe you could try Cherokee and its web interface...

      BTW, nginx has a very simple config file...

  5. spideroak by characterZer0 · · Score: 3, Informative

    Spideroak (http://www.spideroak.com) does what you want. It encrypts data on your machine before sending it to the cloud.

    --
    Go green: turn off your refrigerator.
    1. Re:spideroak by Anonymous Coward · · Score: 1

      So does Wuala: https://www.wuala.com/
      I wonder why the poster didn't try searching the web for a phrase like "encrypted Dropbox alternative".

    2. Re:spideroak by icebike · · Score: 1

      True, and I really like the fact that you can set it up to keep file changes so that you can step back in time to retrieve last week's code base, or deleted files. It has lots of flexibility.

      But it does not do the other half of the OP's requirements, of mirroring or raiding the data to multiple physical locations.

      --
      Sig Battery depleted. Reverting to safe mode.
    3. Re:spideroak by characterZer0 · · Score: 1

      If you keep a second machine up with the Spideroak program running, it will mirror your data. It would be nice if there was an option to run the program and pull the encrypted data but not decrypt it, so you would have the backup if Spideroak's systems go down, but your data would not be compromised if that machine was.

      --
      Go green: turn off your refrigerator.
    4. Re:spideroak by icebike · · Score: 1

      If you keep a second machine up with the Spideroak program running, it will mirror your data.

      That is an option. But it's not a requirement.
      SpiderOak operates in three distinct modes:
      Backup
      Sync
      Share
      Each is individually selectable.

      --
      Sig Battery depleted. Reverting to safe mode.
    5. Re:spideroak by man_of_mr_e · · Score: 1

      Wuala is nice, but not widely supported by third party apps (particularly in the mobile space where you don't typically have control over where files are stored).

    6. Re:spideroak by Subjective · · Score: 1

      Yeah, I also wonder why he didn't search for things he didn't want, like using some single service for data storage, as opposed to a layer built over several providers.
      I wonder why he didn't search for "encrypted flash disk changing robot", "encrypted gnomes", "dropbox alternative for people who don't want something like dropbox" or "latex bondage".

      --
      My other .sig is also this bad
  6. really... by CdBee · · Score: 1

    ... being a free software user doesn't mean you need to be a free service user: If you aren't paying, you aren't the customer.

    I use both Google Drive & Dropbox (for different use cases and purposes) but my really important backups - including everything from both the other two - go into Amazon S3, as I have a contract there with the supplier, and knowing I'm a paying customer of a profitable service means I'm much less likely to have to rethink my backup strategy due to a withdrawal of a free offer. The time spent doing an initial backup of all the files I want to protect means I don't want to have to do that often; incremental backups are much easier to live with.

    --
    I have been a user for about 10 years. This ends Feb 2014. The site's been ruined. I'm off. Dice, FU
  7. preferably linux mountable by vlm · · Score: 3, Informative

    preferably linux mountable

    You'll find a userspace script solution to be infinitely simpler. A script that clones such and such directory onto such and such other directory while encrypting is simple, another script to clone that encrypted directory into some other directory (basically just rsync). Run it periodically outta crontab, etc.

    90% of your effort will be expended on error detection / correction / reporting, 9% of your effort on key management for the encryption and keeping the individual services up and running, and probably about 1% on the actual nuts and bolts of copying stuff around while possibly encrypting.

    There are more failure modes than you'd think... consider giant files, for example, which don't fit. Or running it outta crontab and somehow having two copies running simultaneously. Or your scratch directory is on a device that suddenly got remounted RO instead of RW due to developing hardware issues.

    Bidirectional sync is ambitious but possible. You'll burn a seemingly infinite amount of bandwidth trying it (think about the next quote for a second)

    The basic idea would be to create one file per cloud used as a block device

    Thankfully you're just mirroring instead of requesting some kind of raid-5 like technology. Also you're just dumping "a big ole backup file" rather than individual files.
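
Two of the failure modes listed above (overlapping cron runs and a scratch directory that has gone read-only) can be guarded against with a few lines. This is only a sketch assuming a Linux host; the lock file name and the helper names are made up for illustration, and the actual encrypt-and-rsync step is left out.

```python
import fcntl
import os
import tempfile


def try_lock(lock_path: str):
    """Return an open, exclusively locked file, or None if another run holds the lock."""
    handle = open(lock_path, "w")
    try:
        # LOCK_NB makes flock fail immediately instead of waiting for the other run.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


def is_writable(directory: str) -> bool:
    """Check that a file can really be created, e.g. the mount is not read-only."""
    try:
        with tempfile.NamedTemporaryFile(dir=directory):
            return True
    except OSError:
        return False


# Demonstration in a throwaway directory; "cloud-mirror.lock" is a hypothetical name.
scratch = tempfile.mkdtemp()
lock_path = os.path.join(scratch, "cloud-mirror.lock")
first = try_lock(lock_path)    # this run gets the lock
second = try_lock(lock_path)   # a simulated overlapping run is turned away
print(first is not None, second is None, is_writable(scratch))
```

The lock is released automatically when the process exits, so a crashed run does not leave a stale lock behind the way a plain PID file can.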

    --
    "Science flies us to the moon. Religion flies us into buildings." - Victor Stenger
    1. Re:preferably linux mountable by Subjective · · Score: 1

      Thank you. At least you are going in the direction of what he wants

      I agree. This looks like a job for good old rsync! Or an rsync-like device.

      --
      My other .sig is also this bad
  8. Encrypt it before you store it by vlm · · Score: 1

    encrypt the data stored in the cloud

    Oh, and another thing: it's infinitely more secure to encrypt the data before "putting it up on your homemade mirror network" rather than as a process.

    For example, 99.99999999% of the data I "control" does not need to be encrypted. It just simply doesn't matter, even to a paranoid, although those know no rational limit....

    Another example: let's say you were backing up a sql database of usernames/passwords for some site. The wrong way to do it is to store the passwords in plain text and then encrypt the backup. Wrong for about a zillion (obvious?) reasons. If you have a decent system to hash and/or encrypt the data in the DB itself, that's much better, and no one can do anything with the encrypted data anyway. Or at least your database-level-backup script (as distinct from this project) can encrypt it for you (even if it's just piping mysqldump thru mcrypt and then into a file)
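
To make the "hash it in the DB itself" point concrete, here is a minimal sketch of salted password hashing using only the Python standard library. The iteration count and function names are assumptions for illustration; it is not a description of any particular site's scheme.

```python
import hashlib
import hmac
import os
from typing import Tuple

ITERATIONS = 200_000  # assumed work factor; tune for your own hardware


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest); store both in the database instead of the password."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(digest, expected)


salt, digest = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, digest))
print(verify_password("wrong guess", salt, digest))
```

With rows stored this way, a leaked backup exposes salts and digests rather than plaintext, although encrypting the dump as well still costs very little.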

    --
    "Science flies us to the moon. Religion flies us into buildings." - Victor Stenger
    1. Re:Encrypt it before you store it by Subjective · · Score: 1

      Oh, and another thing: it's infinitely more secure to encrypt the data before "putting it up on your homemade mirror network" rather than as a process.

      I'm not sure I understand - 'rather than as a process'. you mean rather than as part of the storage process? It's more secure to have the data already encrypted, before storing it.

      For example, 99.99999999% of the data I "control" does not need to be encrypted. It just simply doesn't matter, even to a paranoid, although those know no rational limit....

      OK, now you're just attacking him for wanting encryption.

      Another example: let's say you were backing up a sql database of usernames/passwords for some site. The wrong way to do it is to store the passwords in plain text and then encrypt the backup. Wrong for about a zillion (obvious?) reasons. If you have a decent system to hash and/or encrypt the data in the DB itself, that's much better, and no one can do anything with the encrypted data anyway. Or at least your database-level-backup script (as distinct from this project) can encrypt it for you (even if it's just piping mysqldump thru mcrypt and then into a file)

      I agree, but still - you don't want the hashes to leak, either. (no matter the hash or salt, username+hash is way better than a web login interface, and if you have some knowledge about the user you might break it)

      Why not encrypt everything as you store it, as well as (of course) keep salted hashes of passwords and not plaintext.

      What about shared secrets and private keys? Should they be encrypted twice (before backup and at backup)?

      --
      My other .sig is also this bad
  9. I just use emacs by Billly+Gates · · Score: 1

    You may need another text editor though

  10. Can be done with a FTPfs, raid and encfs by devitto · · Score: 4, Interesting

    Someone's already done & blogged about this, using multiple free FTP accounts, with a FTPfs bringing them local, then mounting a RAID (mirrored & parity) partition over it, and encfs over the top of that.

    It was VERY SLOW, but did work, even when he blocked access to some of the FTP accounts - it was just seen as a failed drive read, and the parity reconstruction still permitted access.
    I think the key problem was that the FTP servers he used (or the FTPfs driver) didn't allow for partial writes to files, so every time you changed something, large amounts of data were re-uploaded. So there were possibilities for optimization.....

    Enjoy & share if you get anywhere !

    Dom
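
The "failed drive read plus parity reconstruction" behaviour described above can be sketched with plain XOR parity, the same idea RAID-5 uses. This is a toy model assuming equal-sized blocks and at most one unreachable provider; the function names are hypothetical and no FTP or md code is involved.

```python
from functools import reduce
from typing import List, Optional


def xor_blocks(blocks: List[bytes]) -> bytes:
    """XOR equal-length blocks together byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks: List[bytes]) -> List[bytes]:
    """Append one parity block so that any single block can be rebuilt."""
    return data_blocks + [xor_blocks(data_blocks)]


def recover(stripe: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild the single missing block (marked None) from the survivors."""
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) > 1:
        raise ValueError("single parity can only repair one missing block")
    repaired = list(stripe)
    if missing:
        survivors = [block for block in stripe if block is not None]
        repaired[missing[0]] = xor_blocks(survivors)
    return repaired


# Three "providers" hold data, a fourth holds parity; then one goes offline.
stripe = make_stripe([b"abcd", b"efgh", b"ijkl"])
damaged = [stripe[0], None, stripe[2], stripe[3]]
print(recover(damaged) == stripe)
```

Losing a second provider at the same time is unrecoverable with single parity, which is one reason the blogged setup combined mirroring with parity.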

  11. Truecrypt and Dropbox by Billly+Gates · · Score: 1

    I use both and there are instructions here including a script where you run l.bat to set it up and sync.

    However, it seems your use case is a little different than a personal backup.

  12. Cloud Striping by lucm · · Score: 4, Funny

    Forget redundancy, just go with "RAIC-0": unleashing the true power of the Cloud by striping providers!

    --
    lucm, indeed.
  13. EXCITED KIDS by petur · · Score: 2

    Just pay for it FFS, why try to combine different free services, and go through the trouble of running your own linux server in order to save $10 a month oh my god, "!#$ excited kids.

  14. CEPH/RADOS/RBD by Heebie · · Score: 2

    You could use CEPH to do the distribution, then RADOS to create an RBD (Rados Block Device) and when you mount the RBD as an iSCSI device, you could then build a cryptfs device on top of it, so the provider of the RBD couldn't read/write the data without the keys stored on your server (or wherever you keep them.) The difficulty is getting something like this that is product-ized, so that a provider can give enough economy-of-scale to make it really worthwhile.

  15. Bitcasa by Anonymous Coward · · Score: 2, Interesting

    Bitcasa is an encrypted block based filesystem which mounts via FUSE and streams to the cloud behind the scenes. Has really intelligent caching built in and works with all major platforms (Lin, Win, Mac).

    Linux client hasn't been updated as much as the other platforms but should catch up soon.

    Full disclosure- I'm the CEO of Bitcasa.

    1. Re:Bitcasa by Subjective · · Score: 1

      Fuller disclosure:
      * Storage is Amazon S3. No mention of other clouds. So it's just a worse version of S3
      * Client is closed binary
      * Horrible 'acceptable use' policy and terms of use
          oh yeah, it is closed source:
      "You must not reverse engineer or decompile the Software, nor attempt to do so, nor assist anyone else to do so"

      So, um, yeah, they're data pirates waiting to kidnap your data. Have fun.

      --
      My other .sig is also this bad
    2. Re:Bitcasa by Subjective · · Score: 1

      I'm still looking for the legal part where it says you're not allowed to connect to the Service with any kind of modified client.

      --
      My other .sig is also this bad
  16. Why do this ? by Alain+Williams · · Score: 5, Insightful

    He has not said why he wants to do this, ie what problem he is trying to solve. Depending on the question the answer may be different. Does he want a cloud because:

    * data must be available from many places - ie over the Internet ?

    * data is to be safe from one place (ie home/office machine) blowing up and losing everything ?

    * fast access is needed from many places at once ?

    Please first answer these questions so that we may provide you with what you need rather than random solutions that may not be what you need.

    1. Re:Why do this ? by bazorg · · Score: 4, Insightful

      I'd wager that OP is more interested in using 5 free accounts supplying 10GB each than to pay a monthly rent for 50GB.

  17. The Cloud? LOL! by Mister+Liberty · · Score: 1

    Avoid it.
    Punt

  18. Wouldn't a(n) LVM accomplish this? Set up a bunch of logical devices, put them into an LVM, and let that take care of itself?

  19. OwnCloud? by RanceJustice · · Score: 4, Informative

    I too have been looking for a solution for "deniable-they-don't-have-the-encryption-key" secure, remote storage, backups and the like. Platform independence and standards compliance are important; I don't want to get locked into a proprietary ecosystem. It's even better if there's a nice GUI and usability that doesn't require guru-level knowledge to access, and pricing isn't insane. Thus far I've found a handful of tools that seem to be the best of their breeds - CrashPlan for instance allows encrypted, secure multi-site backups (your own PCs, friends' PCs, their servers), unlimited bandwidth/storage space etc... but it is only meant for backups, not sharing or accessing the data frequently. SpiderOak is a fantastic Dropbox alternative, Linux-friendly (both GUI and CLI for those interested) and seems to be amongst the best of the "Cloud (tm)/ Dropbox" type file-hosting/sharing services. However, as the OP specifically notes that they are looking for a unified solution to bring most or all of those remote hosted/"Cloud" stuff under a single mantle, there seems to be one project that has that goal in mind - OwnCloud.

    I've been watching OwnCloud (www.owncloud.org) since I heard of it, happy to see an open-source, standards-compliant, "installable on your own hardware as well as rented hosting etc.." universal, modular data storage/sync operation that can be totally under your own control. It has a ton of features, but most notable in this case is exactly what the OP wants: the ability to mount your Google Drive or Dropbox share and have your OwnCloud install interact with them. It looks to be a really promising project and I really hope that a lot of coding gurus join and take notice; if my skill was sufficient, I'd be looking to contribute. It is a relatively new platform and I am sure it will have some growing pains (ie. I do not know if it supports ALL "cloud drive" shares, for instance SpiderOak...), but it supports everything from a built in media player, Card/CalDAV, backups, LDAP, and seems to have amazing potential. I am told that Version 5.0 will be the next big leap forward in terms of polish. Check it out and those that can contribute, please do so. It seems the best option to have user-friendly, open source, secure "cloud" services without bolstering hegemony aspirations by Google, Microsoft, and many others.

    1. Re:OwnCloud? by RanceJustice · · Score: 1

      My understanding is that the "commercial" editions utilize the same codebase and exact same features of the community edition, but are simply based upon charging for support, branding and the like. It doesn't seem like they're trying to get everyone into pay-only options, especially considering that the commercial editions are nigh-exclusively big ticket items. Since the project is often sustained by web-hosts who are utilizing the community edition and reselling OwnCloud instances, many of them certified and/or recommended by the OwnCloud project itself (ie OwnCube.com), it seems to me the emphasis will continue to be upon the community edition. I don't see anything proprietary, with the possible exception of 3rd party custom-created modules and extensions discussed - but I am unsure if the license requires that these be open source as well.

      It seems everything in OwnCloud is based on proven, free and open source technologies. I too am eager to see the next version, but I hope all those that can make use of it start purchasing hosting, contributing, and otherwise backing OwnCloud in its current incarnation.

  20. FreeNAS + OpenVPN by ternarybit · · Score: 2

    FreeNAS + OpenVPN is my "cloud" storage. Decent Comcast upstream at home means I have direct access to all my files anywhere, via a single UDP socket secured with certificate-based authentication and encryption. I take special solace knowing I own the hardware my data touches, and FDE on all endpoints ensures another layer of protection.

  21. Openstack swift by GPLHost-Thomas · · Score: 1

    Probably, that's not what the OP is searching for, but Openstack swift is a very interesting cloud storage solution which has redundancy, so I thought it was a good idea to raise the topic in this thread.

  22. S3QL by mrvanes · · Score: 1

    Have you looked at S3QL http://code.google.com/p/s3ql/? Mountable infinite Amazon S3 storage via fuse (no limited blockdevice setup).

  23. Interfering with business models by vs · · Score: 2

    Bruce Schneier's friendly reminder that distributed/encrypted cloud storage interferes with the cloud providers' business models. It'd be terribly useful, but I'm afraid they will keep on throwing sticks between our legs there for quite a while.

  24. Re:OwnCloud by LVSlushdat · · Score: 1

    THIS!!
    I wanted to have my own dropbox-like file sync, and since I'd heard good things about OwnCloud, I signed up for a Xen virtual server which came with 40GB of diskspace, put Debian on it, and installed Owncloud.. Was a piece of cake to get working.. Works like dropbox, and with the sync client on my systems and my wife's systems, we get a real-time sync of our critical files. I still back up music/videos & less critical stuff locally only as there is just too much to try and "cloud backup" all of that..... I'm considering getting another vps from another vendor and rsync'ing across the Owncloud datastore to the second vps..

    --
    THANK YOU, Edward Snowden!! Americans owe you a debt of gratitude (whether they know it or not..)
  25. Storage Made Easy by Splat · · Score: 1

    I think you just described SME:

    http://storagemadeeasy.com/

  26. nCryptedCloud by satansovenmitt · · Score: 1

    Check out this startup: https://www.ncryptedcloud.com/ Basically their primary piece is encryption into the public cloud while maintaining the ability to share your data (encryption/decryption is all client side). They are also combining the backend storage, so your view would be across Dropbox, Google Drive, etc. They are in "early access" now but I hear they are going GA soon.

    1. Re:nCryptedCloud by OneMiddleMan · · Score: 1

      Simple UI, Mac OS X, Windows and iOS support. Good starting point.

  27. 'implying' by atari2600a · · Score: 1

    You're assuming every single cloud-storage-as-a-service...service uploads differences as opposed to wiping & rewriting the whole thing. If you're gonna treat each service as a volume at least have multiple blocks to hack your way around that. But yeah, I remember as far back as right after GMail launched, some hackers RAIDed multiple GMail accounts together for unlimited storage. I wouldn't know if you'd find public info on this as I've personally never had a need for more than modest storage.
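
A small sketch of the "multiple blocks" workaround suggested above: split the volume into fixed-size chunk files and re-upload only the chunks whose hash changed. The chunk size and function names are assumptions for illustration, and the upload step itself is not shown.

```python
import hashlib
from typing import List


def chunk_digests(data: bytes, chunk_size: int) -> List[str]:
    """SHA-256 digest of each fixed-size chunk of the volume image."""
    return [
        hashlib.sha256(data[offset:offset + chunk_size]).hexdigest()
        for offset in range(0, len(data), chunk_size)
    ]


def changed_chunks(old: bytes, new: bytes, chunk_size: int) -> List[int]:
    """Indexes of chunks in `new` that differ from `old` or did not exist before."""
    old_digests = chunk_digests(old, chunk_size)
    new_digests = chunk_digests(new, chunk_size)
    return [
        index
        for index, digest in enumerate(new_digests)
        if index >= len(old_digests) or old_digests[index] != digest
    ]


# A tiny 4-byte chunk size keeps the demo readable; real chunks would be far larger.
before = b"aaaabbbbcccc"
after = b"aaaaXbbbcccc"
to_upload = changed_chunks(before, after, 4)
print(to_upload)
```

Each chunk stored as its own file means a provider that rewrites whole files on every change only ever rewrites one small chunk, at the cost of many more files to track.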