Slashdot Mirror


Ask Slashdot: How Do You Store a Half-Petabyte of Data? (And Back It Up?)

An anonymous reader writes: My workplace has recently had two internal groups step forward with a request for almost a half-petabyte of disk to store data. The first is a research project that will computationally analyze a quarter petabyte of data in 100-200MB blobs. The second is looking to archive an ever increasing amount of mixed media. Buying a SAN large enough for these tasks is easy, but how do you present it back to the clients? And how do you back it up? Both projects have expressed a preference for a single human-navigable directory tree. The solution should involve clustered servers providing the connectivity between storage and client so that there is no system downtime. Many SAN solutions have a maximum volume limit of only 16TB, which means some sort of volume concatenation or spanning would be required, but is that recommended? Is anyone out there managing gigantic storage needs like this? How did you do it? What worked, what failed, and what would you do differently?
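To put the numbers in the question in perspective, here is a rough sizing sketch. It assumes decimal units (1 TB = 10^12 bytes) and treats the 16TB figure as a hard per-volume cap; the function names are illustrative and not taken from any product.

```python
TB = 10**12  # decimal terabyte (an assumption; some vendors quote binary units)
MB = 10**6


def ceil_div(a, b):
    """Integer ceiling division, avoiding floating-point rounding."""
    return (a + b - 1) // b


def blob_count(total_bytes, blob_bytes):
    """How many blobs of blob_bytes each are needed to hold total_bytes."""
    return ceil_div(total_bytes, blob_bytes)


def volumes_needed(total_bytes, volume_limit_bytes):
    """How many volumes must be concatenated to reach total_bytes."""
    return ceil_div(total_bytes, volume_limit_bytes)


if __name__ == "__main__":
    research = 250 * TB  # the quarter-petabyte research data set
    print("blobs at 200MB each:", blob_count(research, 200 * MB))
    print("blobs at 100MB each:", blob_count(research, 100 * MB))
    print("16TB volumes for 500TB:", volumes_needed(500 * TB, 16 * TB))
```

Under these assumptions the research set is on the order of one to a few million files, and a half-petabyte namespace would span dozens of 16TB volumes, which is why a single directory tree is the hard part of the question.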

129 of 219 comments

  1. Just put "bomb" and "assassinate" in every line. by Anonymous Coward · · Score: 1

    It's all going to get backed up.

  2. ceph by drew8523 · · Score: 3, Informative

    We use Ceph. It's fast, redundant, and crazy scalable. Oh, did I mention free (paid support available)? ceph.com

    1. Re:ceph by u-235-sentinel · · Score: 2

      We use Ceph. It's fast, redundant, and crazy scalable. Oh, did I mention free (paid support available)? ceph.com

      Personally, I've been using Ceph for the last few years myself. It has to be one of the best DFSs I've ever used. It offers security and speed, and it's easy to expand by adding nodes. The free part was great. I found it looking through the repos one day. You can even tie it into other projects such as Hadoop (at least I recall reading it had a plug-in a couple of years ago).

      Great product!

      --
      Has Comcast disconnected your Internet account? Same here. You can read about it at http://comcastissue.blogspot.com
  3. Ambiguous by smittyoneeach · · Score: 4, Insightful

    Do you mean:
    (a) "Don't store it. Employ Amazon (or some other cloud) storage."? or
    (b) "Do not use Amazon."
    Clarity: it's like that one thing that is not the other thing, except for when it is.

    --
    Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear
    1. Re:Ambiguous by FatdogHaiku · · Score: 1

      Clarity: it's like that one thing that is not the other thing, except for when it is.

      Good Lord! You've hit on the exact motto needed for my new startup!
      Random Eyeglasses Hut

      This is going to be so much better than what we had:
      "Somebody's prescription, in about an hour..."

      --
      You have the right to remain sentient. If you give up the right to remain sentient, you will be elected to public office
    2. Re:Ambiguous by allo · · Score: 1

      Don't do the shit.
      Go shopping on amazon

      FTFY

    3. Re: Ambiguous by cthulhu11 · · Score: 1

      Ceph. Or if you can get past the traditional filesystem concept, Swift. Don't cheap out on journals. Trust me on this.

  4. Talk to Vendors by Old+VMS+Junkie · · Score: 3

    Honestly, you should talk to the pros. I would call a couple of storage vendors, give them the basic outline of what you want to do, and let them tell you how they would do it. You can even get more formal and issue a Request for Information (RFI) or even a Request for Quote (RFQ). If you're a biggish company, your purchasing people probably have an SOP and standard forms for how to issue an RFI/RFQ. For the big boy storage vendors, half a petabyte is commonplace. The bigger question may very well be what this is going to look like at a software level. Managing the data might be a bigger challenge than storing it. Is this going to be organized in some sort of big data solution like Hadoop? Is it just a whole bunch of files, and people are going to write R or SAS jobs to query against it? Sometimes the tool set that you want to use will drive your choices in how to build the infrastructure under it.

    1. Re:Talk to Vendors by Anonymous Coward · · Score: 5, Informative

      Honestly, that's the WORST thing to do. When you talk to the pros, they will try to sell you some outrageously overpriced Fibre Channel system that's total overkill for what you are doing. I've worked with 'big data' storage companies like EMC and Netapp. We needed 300TB of 'nearline' storage, and EMC came up with a $3,000,000.00 TOTAL overkill Fibre Channel solution. Netapp wasn't much better, coming in at close to $2,000,000.00. Total ripoff. The ONLY reason you would ever choose Fibre Channel over iSCSI is if you are running a HUGE transactional database with millions of accesses per minute. If you just need STORAGE: I went with Synology and got 300TB of RAID-10 storage for about 100K. I DUPLICATED it (200K total) and still paid only about 10% of what the 'vendors' tried to sell me. I was VERY clear that I did not need Fibre Channel, and I refused to spend tons of money on something that would have zero bearing on performance. I found it's much better to research and provide your own solution at 10% of the cost of the big vendors. Why do you think EMC has almost 3 billion in revenue? Because they convince pointy-haired bosses that their solution is the best. Trust me, going with a second-tier vendor for 'nearline' storage is a much better idea than asking the 'big 5' for a solution.

    2. Re:Talk to Vendors by NatasRevol · · Score: 1

      LOL at FC only for transactional DBs.

      Also, VM environments. Large media servers, etc.

      My solution:
      Infortrend. It has iSCSI for you and your slow environment, and FC for me and my fast environment. And cheap enough for both.

      Also, 300TB of RAID 10 at $100k is most likely 7k rpm. I much prefer 15k as it's performant for VMs even when full of running VMs. 7k drives never will be. Well, maybe if you put a nice fat SSD cache in front of it.

      --
      There are two types of people in the world: Those who crave closure
    3. Re:Talk to Vendors by jbolden · · Score: 1

      Netapp provides performance storage. If you don't want performance and only want part of their solution they can virtualize the software and run on anyone's hardware. You can be down around $12k / mo for 300TB duplicated 1x with their software. Nowhere near $3m.

    4. Re:Talk to Vendors by laurencetux · · Score: 1

      How is a function of what you can spend on it. If you can drop a couple of megabucks on it, then you will get a solution that delivers your data before it's requested, and if a storage module THINKS of going bad, it's going to get swapped out.

      But yeah, send feelers out to as many vendors as possible (and don't forget The Other Tower does not count as offsite backup).

    5. Re:Talk to Vendors by ArcadeMan · · Score: 1

      I put my data inside XML files, split the fields with CSV and store all of it on 4200RPM laptop drives that automatically go to sleep after a few minutes of inactivity.

      Oh, and I backup all of that data on punched tape once per year.

    6. Re:Talk to Vendors by markus_baertschi · · Score: 1

      A well written RFI sent to some vendors should give you an overview of what is available and at what cost.

      As you need file level access you should talk to NAS vendors, like Netapp or EMC Isilon. They will certainly have storage boxes for you. You'll have to fit a backup solution to your storage box too; this is work and adds cost.

      If you think this may grow, then look at scalability. Not all solutions scale. Also, you may end up with millions of files, which may be problematic for some backup solutions.

      I have experience with IBM's Spectrum Scale (GPFS) kit. The cluster filesystem scales nicely and handles lots of files and data with excellent performance. With the recent Elastic Storage components the price per TB is very competitive (5-10x lower than traditional enterprise storage). Half a PB of storage should cost in the $150k range; you'll have to add more for backup and maybe implementation.

      For the backup you should look into tape robots. Handling a TB-size backup set manually is not fun and will require considerable manpower, whereas a robot does it mostly unattended. It may also make sense to combine your backup with other backups on site; you may end up being ten times bigger than everything else combined...

    7. Re:Talk to Vendors by Anonymous Coward · · Score: 1

      Talking to companies that try to oversell you is the worst thing you can do.

      Talk to Oracle and spend 20x more.

      Talk to Cisco and spend 5x more.

      Derp

    8. Re:Talk to Vendors by mlts · · Score: 2

      Oracle has a SAN (well, SAN/NAS) offering which does similar with a rack of ports/HBAs that were configurable, assuming the right SFP was present. Want FC? Got it. iSCSI? Yep. FCoE? Yep. Want to just share an NFS backing store on a LAG for VMware? Easy doing.

      The price wasn't that shocking either. It wasn't dirt cheap like a Backblaze storage pod, but it was reasonable, especially with SSD available and autotiering.

    9. Re:Talk to Vendors by Anonymous Coward · · Score: 1

      Talking to professional SALES people is the worst thing you can do. They will sell you what they have, and what they think you can afford... they WON'T sell you the affordable solution that you actually NEED.

      If you don't know the difference between sales professionals and IT professionals... you are part of the problem.

    10. Re: Talk to Vendors by mlts · · Score: 1

      Unless I'm completely hallucinating, I have set up MPIO on ESXi for iSCSI, as well as a LAG (link aggregate) for a NFS based backing store.

      iSCSI has its place in the enterprise, and it can be used in production. If the NIC supports it, it can even be used for booting. How does it fare against 8Gb FC? In reality, there are a few tasks which will saturate a 10Gb iSCSI link or an 8Gb FC link, but not that many.

      All of these are just tools in the toolbox. iSCSI is easier to get going ad-hoc (but still be useful with MPIO), FC is well known and well used, and FCoE seems to be popping up because it works well with Cisco Nexus architecture.

    11. Re:Talk to Vendors by FrozenGeek · · Score: 1

      Talking to the pros is only the worst thing to do if you know as much, if not more, than they do. The fact that the OP is asking slashdot indicates he does not know a lot about setting up storage in the PB range. Are the major vendors overpriced? In terms of the hardware you get, probably. In terms of the knowledge they bring to the table, probably NOT in the case of the OP. If you have someone who can select COTS components and effectively couple them with some good OS/SW, great. Otherwise, get someone who knows what they are doing and buy their solution. Doing it on your own when you don't know what you are doing will only end in tears.

      --
      linquendum tondere
    12. Re:Talk to Vendors by AK+Marc · · Score: 2

      He wasn't very clear about his complaint, but talking to professional sales people about what you need will never get you an optimal solution.

    13. Re:Talk to Vendors by AK+Marc · · Score: 1

      I know you were making a joke, but 4200 RPM laptop drives are great. You'll have trouble finding a lower power usage spinner, and the read speed will be roughly interface speed for most practical implementations of multi-drive arrays.

    14. Re:Talk to Vendors by drsmithy · · Score: 1

      RAID10 for nearline storage ?

      More research required, methinks.

    15. Re:Talk to Vendors by Oceanplexian · · Score: 1

      We have actually purchased a NetApp cluster, replicated in two sites, and while I can't divulge what we paid (plus I'm just the guy who set it up), there's a good chance the parent is off by almost an order of magnitude. Now, I'm not saying you couldn't build your own storage cheaper, or that I don't have my own issues with NetApp, or that some sort of cloud solution, such as Amazon S3 or Glacier, might not be an even better answer. I will say that a SAN is not at all a bad idea and, depending on how important your data is, absolutely worth it. Synology makes great gear, but they're in a completely different league compared to something like a NetApp and especially an EMC, just in terms of redundancy (redundant PSUs, redundant shelves, redundant controllers), support, and performance. It's the same reason banks spend millions to run mainframes even though a new smartphone is probably faster.

    16. Re:Talk to Vendors by ihtoit · · Score: 1

      Seconded. I use laptop EIDE drives for my network scratch - it's great, the array runs at saturation for my Gigabit network. And at 2TB, that volume isn't too shoddy on usable space either.

      For archival storage (for some measure of permanent to not include removable tape) I use huge drives in quick-release caddies and set to JBOD and simply diff the data daily. Once the drive's full, out it comes and in goes the next empty. Full drive goes offsite. Working volume is around 14TB right now, that's a RAID6. All commodity x86/x64 gear. My network volumes are all running in a wooden footlocker on an Athlon64 3400+ clocking at 800MHz.

      --
      Political debates have me rolling my eyes so much I think I got optical whiplash. I should sue. - Foamy The Squirrel
    17. Re:Talk to Vendors by Thumper_SVX · · Score: 1

      At least go with Dell. Dell will sell you an MD3860i with 60 6TB hard drives for not much more than what you paid for your Synology. Performance is just as good as the Synology, you'll get next day on site support from a Dell tech, and a smaller rack/power/cooling footprint as well.

      Seconded... though having recently seen a lot of quotes you could do worse than the Dell SCv2000 which is the newer replacement for the MD3860i using the Compellent code. It's faster and cheaper than the MD, mostly because Dell no longer has to pay the Netapp tax for every MD (the MD's are based on an LSI chipset that's owned by Netapp)

    18. Re:Talk to Vendors by Bengie · · Score: 1

      I saw a ZFS benchmark comparing random read, write, read+write, and sequential read, write, and read+write of a 15k RPM RAID and 5400 RPM with 10x as much storage but just as many spindles for a fraction the price, and the 5400 RPM setup was faster once the 64GB of SSDs got warmed up.

    19. Re:Talk to Vendors by KGIII · · Score: 1

      Dell will sell you an MD3860i with 60 6TB hard drives ...

      How odd? I was drooling over that exact appliance the other day and wishing I could find something similar for home use. I do not want/need fiber. I do have a rack in my data room in the basement. Something rackable, CAT5/6, PB (or close) support, low power, easy management, enterprise level support - can be toned down a bit, expandable, and offering built-in redundancy... There was a YellowBox (I think that was its name, it has been long since discarded) appliance that was nice and met some of those needs; I feel it should have been expanded on. I currently have a home-grown solution based on simple white boxes. They are not rackable and they are power hungry even though they are minimally used. Maybe something based on the above ideas with four Atom CPUs running a *NIX variation with a front end or ability to mount slices of space. Money is not the objective, I will pay handsomely, but finding something that really fits my desires is difficult.

      I am sure such an appliance is out there and meets my needs almost exactly. I have not yet found it. I would even pay enterprise level pricing (though I expect enterprise level hardware) and would also want the ability to upgrade to SSDs (without needing to add them all at once) when those become a bit more mature for long-term use and the price becomes more reliable.

      One of the things I miss most about still owning my company is I am no longer able to lug home equipment that has been replaced. (I always just gave depreciated equipment to myself, employees, or donated the hardware to local schools. Being a tech-heavy business meant stuff was replaced fairly often and still had a great deal of use left in it.) I kept a lot of that stuff and still use a bunch of it today though it is, more often than not, to play around and much of that is now 10+ years old so upgrading/adding new toys is an option. I do get occasional hand me downs as I still go in and do some work for the company once in a while, I also have stock in the parent company, but they are fewer than was in the past.

      Anyhow, the silly mindless drivel above is mostly unimportant. I too, however, would like to be able to have a large storage array with backup capability. I already have off-site backups (not at the enterprise level) and a disaster recovery plan in place as well as cold-storage in a safe deposit box as well as a friend's garage. I would love to have a decent, easily managed, appliance for it that had great support and easy upgrading to 'future proof' things for a while.

      --
      "So long and thanks for all the fish."
    20. Re:Talk to Vendors by KGIII · · Score: 1

      Someone needs to cluster a bunch of unbranded cell phones and build an HPC out of them! A custom rack could hold countless phones and each could contain a 128 GB card. When one goes down they can chuck it into the trash and toss a new one into the cradle. Using a wireless mesh network would be a bottleneck but I suspect it would crunch a lot of numbers but the power consumption may be an issue. I am sure I am missing some snags, I have not actually given this any real thought, but those could be ironed out.

      I think this is a thing that needs to be done simply because of General Principle and his army of ants. We need to give it a good cause and get a kickstarter going. I would throw a few dollars (if it looked like they may actually make a serious attempt) at it just to have some laughs. We can build it and sell the compute cycles at cost to people sequencing genomes of rain forest flora and fauna. (It might actually be okay at that. If not, throw some more hardware at it - my favorite solution for everything.) The environmentalist ideal would potentially garner support. We could even make it based on used (read "RECYCLED") cell phones. Register it as a NPO and people can write off their old phones as a donation. It would employ smart people and help the environment! What's not to love?

      That, folks, is my shitty idea of the day.

      --
      "So long and thanks for all the fish."
    21. Re:Talk to Vendors by Cramer · · Score: 1

      Obviously, you've never used LTO technology. They cannot repair tracking errors -- the bits written when the tape was low-level formatted, something NO commercial drive can do! "Bit Rot" will destroy LTO tapes in a matter of months if they are not kept at a nearly constant temperature. Conversely, I have DLT, DAT (4mm and 8mm), QIC, Exabyte (8200?) etc. tapes that are still readable after decades. (One of those 8200 tapes sat in a kitchen drawer for 11 years!) Yet, I have a trash bin full of LTO-2 tapes that are 100% unusable after one cycle through Iron Mountain's archive. The SDLT-I's have lasted 8+ years of continuous use (~1wk in the library, then 2-3mo on a table in the DC @ a constant 68F); the LTO-2's (Fuji and Sony) begin to fail after ~2yr in the same environment.

      (In fact, the SDLT DRIVES are failing more often than the media these days. The laser tracking servo fails. The drives are 10+ years old, the tapes 8+)

    22. Re:Talk to Vendors by HappyPsycho · · Score: 1

      If you don't know the difference between sales professionals and IT professionals... you are part of the problem.

      How do you get to the latter without at least making contact with the former?

      Something of the OP's scale isn't exactly the normal thing that your average IT professional has any experience with so the normal channels probably won't work.

  5. Depends who you ask... by snowgirl · · Score: 4, Interesting

    At Facebook, it's memcached, with an HDD backup, eventually put onto tape...

    At Google, it's a ramdisk, backed up to SSD/HDD, eventually put onto tape...

    For anyone who can't afford half a petabyte of RAM with the commensurate number of computers? I have no good ideas... except maybe RAM cache of SSD, cache of HDD, backed up on tape...

    Using something like HDFS to store your data in a Hadoop cluster of file requests, is likely the best F/OSS solution you're going to get for that...

    --
    WARNING! This girl exceeds the MAXIMUM SAFE standards established by the FDA for BRATTINESS
    1. Re:Depends who you ask... by tsetem · · Score: 2

      Thumbs up on HDFS. The next question to ask your groups is how they will be analyzing it. HDFS (and Hadoop/Spark/Whatever) will hopefully fit in nicely there. Not only will your data be redundantly copied across multiple systems, but as your data needs (and cluster) grow, so does your computational power.

      Getting data in & out can be done via Java API, REST API, FUSE or NFS mounts. The only issue is that HDFS doesn't play well with small files, but hopefully your groups will be using large files instead.

      Now administration is another story, but then there's Cloudera Manager, which is supposed to greatly simplify management. I'm currently using it to store about .25 PB right now for random analysis, but growing its capacity is a straightforward task.

      As far as backing up, HDFS provides snapshots, 3x replication (or more) across nodes in the cluster. Of course there's always the big hammer of just getting a second cluster. As an old HW sage once told me, "If you can't afford to buy two, don't buy one"
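A quick way to see what 3x replication means for purchasing is to compute the raw disk behind a given amount of logical data. The sketch below is plain arithmetic, not an HDFS API; the 25% free-space headroom is an assumed planning margin, not a figure from the comment above.

```python
def raw_capacity_needed(logical_tb, replication=3, headroom=0.25):
    """Raw disk (TB) needed to store logical_tb of data.

    replication: number of full copies kept (3 is the figure cited above).
    headroom:    fraction of raw disk left free (assumed planning margin).
    """
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    return logical_tb * replication / (1 - headroom)


if __name__ == "__main__":
    # Half a petabyte of logical data, as in the original question.
    print("no headroom:", raw_capacity_needed(500, headroom=0.0), "TB raw")
    print("25% headroom:", raw_capacity_needed(500), "TB raw")
```

The point of the sketch is only that the raw footprint is a multiple of the logical data size, so the per-TB cost of the nodes matters more than it first appears.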

  6. Enterprise Storage by NFN_NLN · · Score: 2

    This project must have an unrealistically low budget, otherwise there are quite a few Enterprise solutions that will do all OR a combination of these tasks.

    > how do you present it back to the clients?
    Look at a NAS, not a SAN. ie NetApp or 3Par C series.

    > And how do you back it up?
    Disaster Recovery replication to another system or hosted services. NetApp, EMC, 3Par, etc, etc

    > Many SAN solutions have a maximum volume limit of only 16TB
    NetApp Infinite volumes limit is 20PB

    You can contact a sales person from any of those companies to answer any of these questions.

    1. Re:Enterprise Storage by NatasRevol · · Score: 2

      Yeah, the 16TB limit says OP is looking at VERY low end solutions. As in not feasible for petabyte range projects.

      --
      There are two types of people in the world: Those who crave closure
    2. Re:Enterprise Storage by Anonymous Coward · · Score: 1

      My favorite part of that 16TB limit is that it can be reached with two hard drives.

    3. Re:Enterprise Storage by Drewdad · · Score: 1

      Replication is not backup. I cannot stress this enough.

      I know of major companies that depended on replication and ignored backup, and then the original copy gets corrupted and the corruption gets replicated to the recovery sites.

      Now if you're doing SAN snapshots, and replicating those, then you might be covered, but mounting one of those snaps, and recovering some portion of your data, can be a real pain in the behind.

    4. Re: Enterprise Storage by afidel · · Score: 1

      Not necessarily. HP 3Par 20850 scales to 4 PB of SSD (raw, 15+ PB with dedupe), 3.2 million sub-1ms IOPS, and 75GB/s of throughput, but one LUN is still limited to 16TB because not enough customers need more than that in one logical disk to justify changing the underlying code.

      --
      There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
    5. Re: Enterprise Storage by ihtoit · · Score: 1

      I just had an underpants spooge.

      --
      Political debates have me rolling my eyes so much I think I got optical whiplash. I should sue. - Foamy The Squirrel
    6. Re: Enterprise Storage by KGIII · · Score: 1

      NewEgg has them at $260 with a 2/customer limit and 12 in stock. That number will be smaller in a minute when they update the page. Free shipping too.

      --
      "So long and thanks for all the fish."
    7. Re: Enterprise Storage by ihtoit · · Score: 1

      good price. My last HD purchase was a WD Elements pocket 2TB for £69.99 from DSG. For some reason it doesn't suffer the problem my WDE 1TB has in that that one powers down and I have to hardcycle it for the system to pick it up again. It wouldn't be that annoying except that I use that one for music.

      --
      Political debates have me rolling my eyes so much I think I got optical whiplash. I should sue. - Foamy The Squirrel
    8. Re:Enterprise Storage by Cramer · · Score: 1

      Replication is not Archival. Corruption can be copied to a "backup" as well. If you aren't paying attention to what is being duplicated and to where, then "stupid is going to catch up to you eventually." For the record, I've seen the exact same mistake happen to people doing "backups" (RDX and tape) -- the error wasn't caught within a media cycle. (which was "weeks" for them)
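The retention point made in this sub-thread can be sketched as a small check: given the ages of the copies you still hold and how long ago the corruption happened, is there any copy that predates it? The schedules below are hypothetical examples, not anyone's actual policy.

```python
def newest_clean_copy(copy_ages_days, corruption_age_days):
    """Return the age (in days) of the newest retained copy that was taken
    before the corruption occurred, or None if every retained copy is
    already corrupt."""
    clean = [age for age in copy_ages_days if age > corruption_age_days]
    return min(clean) if clean else None


if __name__ == "__main__":
    corruption_age = 21  # corruption happened 21 days ago and went unnoticed

    # Replication only: one copy, always current (age 0).
    print(newest_clean_copy([0], corruption_age))

    # Two weeks of daily backups: the media cycle is shorter than the
    # detection delay, so every copy is corrupt.
    print(newest_clean_copy(list(range(1, 15)), corruption_age))

    # Dailies plus monthly copies kept for a year (hypothetical schedule).
    schedule = list(range(1, 15)) + list(range(30, 366, 30))
    print(newest_clean_copy(schedule, corruption_age))
```

Only the third schedule yields a usable restore point, which is the same failure mode described for both replication and short media cycles.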

    9. Re:Enterprise Storage by Drewdad · · Score: 1

      Yup. Inadequate retention can screw you the same was as just depending on replication.

    10. Re:Enterprise Storage by Drewdad · · Score: 1

      "same way as" not "same was as"

  7. Call ixsysyems, use ZFS by darkpixel2k · · Score: 1

    Seriously. Call ixsysyems. They specialize in this stuff and they use ZFS.

    --
    There's no place like ::1 (I've completed my transition to IPv6)
    1. Re:Call ixsysyems, use ZFS by NatasRevol · · Score: 2

      ZFS is a great raid system. That's now owned by Oracle. Goodbye ZFS.

      --
      There are two types of people in the world: Those who crave closure
    2. Re:Call ixsysyems, use ZFS by darkpixel2k · · Score: 4, Informative

      Nope. Not 'owned'. It's covered under the CDDL and developed by a group that isn't associated with Sun: OpenZFS.

      --
      There's no place like ::1 (I've completed my transition to IPv6)
    3. Re:Call ixsysyems, use ZFS by Bengie · · Score: 2

      My cousin used ZFS+gluster for this multi-petabyte system.

    4. Re:Call ixsysyems, use ZFS by donaldm · · Score: 1

      Seriously. Call ixsysyems. They specialize in this stuff and they use ZFS.

      Since when is a file-system a backup and recovery solution?

      --
      There ain't no such thing as proprietary standards only proprietary formats. Standards are by definition open.
    5. Re: Call ixsysyems, use ZFS by darkpixel2k · · Score: 1

      It's *part* of a good disaster recovery solution. Care to back your stuff up with ReFS? (Or whatever MS calls it today?)

      --
      There's no place like ::1 (I've completed my transition to IPv6)
    6. Re:Call ixsysyems, use ZFS by Bengie · · Score: 1

      For most people ZFS could be considered a "backup" solution to a certain extent, given certain assumptions. For large enterprise systems, never rely on a single file system. The data files should be replicated to multiple systems that use multiple file systems.

  8. Tape by kthreadd · · Score: 1

    The research projects I've seen using that amount of storage have usually used a tape solution with dCache in front of it. You use a number of tape robots filled with tape, put them in different locations and have them back up everything between them.

    1. Re:Tape by kthreadd · · Score: 1

      Just realized I was a few digits off, saw that you said 0.5 PB. Somehow got it to 500 PB. Not that dCache isn't going to handle it, it will. But for as little data as just 0.5 PB a couple of disk arrays connected to a single server will usually be fine. Tape is still good for backup though.

    2. Re:Tape by kthreadd · · Score: 1

      Yep. It's not that much. We just installed a new storage system for fast temporary data, not long term storage. 1 PB. It easily fits in a single rack.

  9. Use storage level services. by hamster_nz · · Score: 1

    If you want to keep your data on-site, then unless you already have a lot of infrastructure that you can leverage, the path of least resistance is to use something like a NetApp Filer.

    For backups it can create snapshots on a schedule (hourly/daily/weekly), then either replicate them to a second physical storage unit (hopefully at a different site) or present them to your backup solution.

    Using the file services on the NetApp will also provide a solution to your "how do I present it to the storage consumers" question - iSCSI, CIFS with domain integration, NFS, Fibre Channel... You also get storage level de-duplication and compression, if that works for your data.

    Of course you will pay what seems like a lot for it, but it does solve a lot of your problems in one unit. How much will it save in servers, backup capacity, a multi-drive tape library, daily visits to the server room to reload tapes and so on?

    But if your data center isn't up to providing the level of availability you want then any hardware solution is going to be problematic - large storage systems do not like having the power pulled out from under them. Minimum is dual-redundant UPS power and fault tolerant cooling, or you will most likely have problems.

  10. Storage Pod by Anonymous Coward · · Score: 1

    Something like storage pods? https://www.backblaze.com/blog/storage-pod/

  11. use slashdotFS by goombah99 · · Score: 3, Funny

    I use slashdotFS, which is a Markovian random comment generator that effectively embeds data in a steganographic comment. The FS handles the details of creating and saving these, so it's all transparent and mounts on your desktop like a regular drive. It's slow, but its capacity seems unlimited and it frequently gets modded insightful.

    --
    Some drink at the fountain of knowledge. Others just gargle.
    1. Re:use slashdotFS by goombah99 · · Score: 2

      another way is to convert it to jpeg and store it in facebook.

      --
      Some drink at the fountain of knowledge. Others just gargle.
    2. Re:use slashdotFS by KGIII · · Score: 1

      And by sheer coincidence the encrypted header's plain text output is MOO! Compressed meta-data is goatse.

      --
      "So long and thanks for all the fish."
    3. Re:use slashdotFS by Big+Hairy+Ian · · Score: 1

      I was just going to suggest embedding it in Piers Morgan's DNA, as he obviously has the redundancy and it's about time he did something useful.

      --

      Build a Man a Fire, and He'll Be Warm for a Day. Set a Man on Fire, and He'll Be Warm for the Rest of His Life.

  12. Lots of options by JWW · · Score: 1

    You could look into Lustre, although it would change your hardware configuration a bit (it's not a SAN). Depending on your configuration and desired redundancy, this will affect costs a bit (i.e. more Lustre nodes).

    You could buy a traditional SAN and tie it all together with fibre, though you'd need a clustered file system like StorNext, or another commercial CFS, or even GFS if you prefer open source. This would help solve your issue of traversing the system as a regular directory structure.

    Best bet for backup would be a robotic tape library of some sort. There is some work being done on dynamic backup of data in Lustre systems in the HPC space, but it's not very mature. CFS systems like StorNext have methods in place for automatically backing up data on the filesystem.

  13. SanDisk sells a 512TB 3U shelf... by AcquaCow · · Score: 2

    SanDisk's Infiniflash is 512TB in a 3U chassis that is SAS-connected. You can front this with something like DataCore's SANsymphony to turn it into a NAS/SAN appliance.

    The pricing looks to be around $1/GB, which is a ton cheaper than building a SAN of that capacity, plus it's much smaller in power/space/cooling.

    --

    up 12 days, 22:30, 2 users, load averages: 993.20, 994.21, 994.56
    *makes note to limit user processes...
    1. Re:SanDisk sells a 512TB 3U shelf... by Lost+Race · · Score: 1

      $1/GB, which is a ton cheaper than building a SAN of that capacity,

      The marginal price of HDD storage is about $0.05/GB. Maybe double that for higher density, maybe double it again for redundancy. That's a maximum of $0.2/GB for the disks. There's some fixed overhead for a large disk farm plus some more per-byte overhead for the controllers and interconnects. Hard to believe that really adds up to much more than $1/GB. We're talking half a million dollars for 500TB.

      Daydream on. Big cluster of mid-tower PCs. Six 4TB drives per tower, for a total of 20TB with 1:6 redundancy. 25 of those towers would give you 500TB. 150 drives at $150 each = $23K. 25 server-grade PCs at about $1000 each = $25K. Networking? No idea, maybe another $2K? So we're looking at about $50K for 500TB. Obviously there will be some overhead for a managed commercial "enterprise" level system from a big vendor. But more than 10x the price? Really? Seems like there's room for a little more competition in that business.
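
    A rough sketch of the back-of-envelope arithmetic above, using only the figures quoted in the comment (the function name and defaults are illustrative, and the 20TB usable per tower is taken as stated):

```python
# Back-of-envelope cost of the DIY tower cluster described above.
# All prices are the commenter's estimates, not current market data.
def diy_cluster_cost(towers=25, drives_per_tower=6, drive_price=150,
                     tower_price=1000, network_cost=2000,
                     usable_tb_per_tower=20):
    drives = towers * drives_per_tower
    total = drives * drive_price + towers * tower_price + network_cost
    usable_tb = towers * usable_tb_per_tower
    # Decimal units: 1 TB = 1000 GB
    return drives, usable_tb, total, total / (usable_tb * 1000)

drives, usable_tb, total, per_gb = diy_cluster_cost()
print(f"{drives} drives, {usable_tb} TB usable, ${total:,} total, ${per_gb:.3f}/GB")
```

    That lands at roughly $0.10/GB for hardware alone, before power, cooling, rack space and staff time, which is the point the reply below raises.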

    2. Re:SanDisk sells a 512TB 3U shelf... by hjf · · Score: 1

      now factor in the cost of maintaining spinning disks, powering them, cooling them, and datacenter space....

  14. Time for the next step by fustakrakich · · Score: 1

    Let's start growing brains in jars.

    --
    “He’s not deformed, he’s just drunk!”
    1. Re:Time for the next step by KGIII · · Score: 1

      I am not sure if it is my file system or my OS but I am definitely suffering from bit rot. Maybe it is Windows and I need a defrag utility?

      --
      "So long and thanks for all the fish."
  15. How are you using the data? by MetricT · · Score: 2

    What clients will you be exporting it to? Linux, OS X, Windows? All three?

    What kind of throughput do you need? Is 10 MB/sec enough? 100 MB/sec? 10 GB/sec?

    What kind of IO are you doing? Random or sequential? Are you doing mostly reads, mostly writes, or an even mix?

    Is it mission critical? If something goes wrong, do you fix it the next day, or do you need access to a tier 3 help desk at 3 am?

    We have a couple of petabytes of CMS-HI data stored on a homegrown object filesystem we developed and exported to the compute nodes via FUSE. Reed-Solomon 6+3 for redundancy. No SAN, no fancy hardware, just a bunch of Linux boxes with lots of hard drives.

    There is no "one shoe fits all" filesystem, which is part of the reason we use our own. If you have the ability to run it, I'd suggest looking at Ceph. It only supports Linux, but has Reed-Solomon for redundancy (consider it a higher tier of RAID) and good performance if you need it. If you have to add Windows or OS X clients into the mix, you may need to consider NFS, Samba, WebDAV, or (ugh) OpenAFS.
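
    To put numbers on what a Reed-Solomon 6+3 layout buys compared with plain replication, here is a minimal sketch. It assumes an ideal k+m erasure code in which any k of the k+m shards can rebuild the data; the helper name is made up for illustration and it ignores metadata overhead.

```python
# Compare storage overhead and fault tolerance of a k+m erasure code
# with simple replication. Idealised: no metadata or padding overhead.
def erasure_profile(data_shards, parity_shards):
    total = data_shards + parity_shards
    return {
        "tolerated_failures": parity_shards,      # shards that may be lost
        "efficiency": data_shards / total,        # usable / raw
        "raw_per_usable": total / data_shards,    # raw bytes per stored byte
    }

rs_6_3 = erasure_profile(6, 3)
triple_replication = erasure_profile(1, 2)  # 3 copies = 1 data + 2 extra

print(rs_6_3)
print(triple_replication)
```

    Under these assumptions 6+3 survives three lost shards while storing 1.5 raw bytes per usable byte, where three-way replication survives two lost copies at 3 raw bytes per usable byte.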

    1. Re:How are you using the data? by rev0lt · · Score: 1

      It is funny, I've read many comments since the top of the page, and finally someone is actually asking for requirements. At this point, it's buried in the middle of the scrollbar. And yet, someone blames Slashdot moderation. I blame the users.

  16. You're asking like you will be implementing it... by tlambert · · Score: 4, Interesting

    You're asking like you will be implementing it... don't.

    Gather all their requirements, gather your requirements on top of it (I'm pretty confident that some of those requirements were your additions for "you'd be an idiot to have that, but not also have this...", possibly including the backup).

    Then put out a preliminary RFP to the major storage vendors, including asking them what they'd say you'd missed in the preliminary.

    Then take the recommendations they make on top of the preliminary with a grain of salt, since most of them will be intended to ensure vendor lock-in to their solution set, revise the preliminary, and put out a final RFP.

    Then accept the bid that you like which management is willing to approve.

    Problem solved.

    P.S.: You don't have to grow everything yourself from seed you genetically modify yourself, you know...

  17. You don't. by Anonymous Coward · · Score: 1

    Unless you REALLY want to pay for it.

    As someone who works in a Hospital system, Imaging Informatics specifically, we have roughly that much data spread across 2 locations. Backups aren't what you think they are. We back up the infrastructure config: databases, VM cluster config and VMs, which, compressed, probably equates to 5-10 terabytes. That's it. That's the stuff which, if the worst possible event happened, means we wouldn't be exactly back to 0 when we rebuilt.

    As for the 400-500 terabytes of data, they're in what we call Archive state. There isn't a backup of them, but they are in proper data centers with fire suppression. So there's that... Still, if 1 site went up, we'd be down that data. Thems the breaks... Goes back to money! But what we do have is everything in RAID with Hot Spare. I think... I know 2 drives can fail in a block, and have recently, and we can recover the block. As 75% of this data is pretty much read-only transfer, the only stuff being written to permanent storage is new data. I think we're seeing 120-150 terabytes of growth a year, and we're looking at new storage since current gear is at the 'EOL'. Life cycle wise, not warranty or operation.

    Point is, will we see a PetaByte storage system bought? Maybe, but it will be the same setup. Archive system, with backup for the 'guts', what I like to call it. Simply put, CXX's don't want to throw the $$ down for Petabyte Data store site duplication. If money was far more flowing to use, we'd at least start there and implement a 100-150 Terabyte SSD Caching block with 10GB Fiber, in and out. Not happening, but a man can dream...

    1. Re: You don't. by TheMeuge · · Score: 1

      Are you telling me you have a petabyte of clinical data with no backups? Good luck with that lawsuit my friend...

    2. Re: You don't. by ihtoit · · Score: 1

      that's all right because we have public officials who leave backups on public transport...

      --
      Political debates have me rolling my eyes so much I think I got optical whiplash. I should sue. - Foamy The Squirrel
  18. look at how backblaze does it by Anonymous Coward · · Score: 1

    Backblaze blog has a rundown of their storage pod https://www.backblaze.com/blog/storage-pod-4-5-tweaking-a-proven-design/

    This with something like Gluster, Lustre, Ceph or even just NFS.

  19. Ask the people who are currently storing 150 PB by Anonymous Coward · · Score: 1

    Backblaze is an online backup provider. They have open sourced some of their software and hardware designs.

    They are currently storing over 150 Petabytes of user data. https://www.backblaze.com/blog/150-petabytes-of-cloud-storage/
    They are working on scalability into the Zettabyte range https://www.backblaze.com/blog/vault-cloud-storage-architecture/
    They have open sourced their hardware design for anyone to use. https://www.backblaze.com/blog/storage-pod-4-5-tweaking-a-proven-design/

    They also looked into using 3rd party vendors but decided that they could build a better solution for at least 1/8 the price. https://www.backblaze.com/blog/petabytes-on-a-budget-how-to-build-cheap-cloud-storage/

    I know that it is not a plug and play solution but if you are willing to build off of their work you can save a ton of money and have a solution that truly fits your needs.

  20. Easy by ArcadeMan · · Score: 5, Funny

    How Do You Store a Half-Petabyte of Data? (And Back It Up?)

    That's the easiest question I've ever seen.

    1. Wait about a decade or so.
    2. Buy two half-petabyte flash drives.
    3. Alternate your copies on the two flash drives, the previous one becomes your backup.

    NEXT!

  21. easy by YoungManKlaus · · Score: 1

    Step 1: buy a metric shitton of storage space (virtual or physical)
    Step 2: put your data on it
    Step 3: ???
    Step 4: profit

  22. What are your budget and reliability requirements? by fishnuts · · Score: 2

    If you have a small budget and moderate reliability requirements, I'd suggest looking into building a couple Backblaze-style storage pods for block store (5x 180TB storage systems, apx $9000 each), each exporting 145TB RAID5 volumes via iSCSI to a pair of front-end NAS boxes. NAS boxes could be FreeBSD or Solaris systems offering ZFS filestores (putting multiples of 5 volumes, one from each blockstore, together in RAIDZ sets), which then export these volumes via CIFS or NFS to the clients. Total cost for storage, front-ends, 10GbE NICs and a pair of 10GbE switches: $60K, plus a few weeks to build, provision, and test.
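
    A quick sanity check of the usable space in that layout, as a sketch only. It assumes each of the 5 pods exports 145TB in total and that every RAIDZ set is single-parity (raidz1) across one volume from each pod; ZFS metadata and reserved space are ignored, and the function name is illustrative.

```python
# Usable capacity when RAIDZ1 sets each span one volume from every pod.
# With N pods, each set loses one volume's worth of space to parity.
def raidz1_across_pods(pods=5, exported_tb_per_pod=145):
    raw_exported = pods * exported_tb_per_pod
    usable = (pods - 1) * exported_tb_per_pod
    return raw_exported, usable

raw_exported, usable = raidz1_across_pods()
print(f"{raw_exported} TB exported, about {usable} TB usable after RAIDZ1")
```

    On those assumptions the pool has about 580TB usable and can lose one whole pod without losing data, since each RAIDZ set has only one member on any given pod.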

    If you have a bigger budget, switch to FibreChannel SANs. I'd suggest a couple HP StoreServ 7450s, connected via 8 or 16Gb FC across two fabrics, to your front ends, which aggregate the block storage into ZFS-based NAS systems as above, implementing raidz for redundancy. This would limit storage volumes to 16TB each, but if they're all exposed to the front ends as a giant pool of volumes, then ZFS can centrally manage how they're used. A 7450 filled with 96 4TB drives will provide 260TB of usable volume space (thin or thick provisioned), and cost around $200K-$250K each. Going this route would cost $500-$550K (SANs, plus 8 or 16Gb FC switches, plus fibre interconnects, plus HBAs) but give you extremely reliable and fast block storage.

    A couple advantages of using ZFS for the file storage is its ability to migrate data between backing stores when maintenance on underlying storage is required, and its ability to compress its data. For mostly-textual datasets, you can see a 2x to 3x space reduction, with slight cost in speed, depending on your front-ends' CPUs and memory speed. ZFS is also relatively easy to manage on the commandline by someone with intermediate knowledge of SAN/NAS storage management.

    Whatever you decide to use for block storage, you're going to want to ensure the front-end filers (managing filestores and exporting as network shares) are set up in an identical active/standby pair. There's lots of free software on linux and freebsd that accomplish this. These front-ends would otherwise be your single-point-of-failure, and can render your data completely unusable and possibly permanently lost if you don't have redundancy in this department.

  23. Re:Don't by jellomizer · · Score: 1

    But he used vague requirements so not to give enough information for an actual informed decision.

    But in general it sounds like it is going to be expensive and a lot of work, with a lot more details to work out than just storing and backing up data.
    Then there's the question of how you present it back to the clients. That is a different can of worms.

    The real question should be.
    Which consulting company should I work with on a big data project?
    Have you worked with some that seems to be able to give you a clear goal and time lines, and meet the budget specified.

    --
    If something is so important that you feel the need to post it on the internet... It probably isn't that important.
  24. We paid ~$30k for a 24TB array...call a vendor. by InfiniteBlaze · · Score: 1

    They'll be happy to talk to you for free, for the prospect of getting their hands on that kind of cash. You're easily looking at $.5M-$1M between storage, processing, and redundancy.

  25. What's your budget? by Karmashock · · Score: 1

    Sounds like you need the storage onsite at least for the research project.

    The mixed media thing sounds like something to throw at the cloud unless there's a reason not to do that.

    As to spanning volumes etc... I don't really understand the file structure of this research project. Having a petabyte of data in a single directory is typically the opposite of good ideas.

    I'd like more information.

    As to back ups... it depends on how frequently the information changes. Backup tapes are probably the cheapest way to go for backups of archives. 3 TB at 20 dollars a tape.... not bad. And you can do incremental back ups if there are little changes.
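
    For scale, a minimal sketch of the tape math using the figures above (3TB and $20 per tape). Media cost only: drives, a library and any second copy are extra, and whether 3TB is reached depends on how compressible the data is.

```python
import math

# Media count and cost for one full backup, using the comment's figures.
def tape_media(dataset_tb=500, tb_per_tape=3, price_per_tape=20):
    tapes = math.ceil(dataset_tb / tb_per_tape)
    return tapes, tapes * price_per_tape

tapes, media_cost = tape_media()
print(f"{tapes} tapes, ${media_cost:,} in media for one full copy")
```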

    The tapes are supposed to last about 10 years. So that's something.

    If we're talking about high frequency changes... you almost need to replicate the primary storage... and the number of times you need to do that is variable on how badly you need to not lose the data.

    If we're talking about data that if lost orphans are going to get ground up into hamburger and fed to the dogs... you're going to want multiple back ups. If it would merely be annoying... maybe one back up is fine.

    --
    I've decided to stop wasting my time responding to AC trolls/sockpuppets... so if you want a response from me... login.
  26. NAS by sega_sai · · Score: 1

    We recently bought for our group a NAS server with ~200TB of raw storage (175TB after RAID6 with a good card). And this is NFS mounted to other servers. It is pretty easy to use and configure and quite cheap (20k UK pounds). Regarding the backup, I would probably just buy a second server (maybe with a cheaper configuration, worse RAID card, etc.).

  27. Ask the guys at CERN by prefec2 · · Score: 1

    You will not get a good answer here, because even if there would be one it will be hard to find between all the nonsense.

    BTW your scenario is incomplete and therefore it is unlikely to get a good answer. It looks a little bit like you want /. to do your homework.

  28. Wrong questions. More details needed. by d3vi1 · · Score: 5, Informative

    You're not asking the right questions:

    The first correct question is why on earth would someone need to access half a petabyte? In most cases the commonly accessed data is less than 1%. That's the amount of data that realistically needs to reside on disk. It never is more than 10% on such a large dataset. Everything else would be better placed on tape. Tiered storage is the answer to the first question. You have RAM, solid/flash storage (PCI based), fast disks, slow high capacity disks and tape. Choose your tiering wisely.

    The second question you need to ask is how the customer needs to access that large datastore. In most cases you need serious metadata in parallel with that data. For Petabytes of data you cannot in most cases just use an intelligent tree structure. You need a web-site or an app to search that data and get the required "blob". For such an app you need a large database since you have 5M objects with searchable metadata (at 200MB/blob).
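
    The object count drives the size of that metadata database, so it is worth working out for the actual dataset. A small sketch in decimal units (the helper is illustrative): the 5M figure matches about 500TB of 100MB blobs or 1PB of 200MB blobs, while the quarter petabyte in the question comes out lower.

```python
# Number of blobs for a given dataset size and average blob size.
# Decimal units: 1 TB = 1,000,000 MB.
def object_count(total_tb, blob_mb):
    return total_tb * 1_000_000 // blob_mb

for total_tb, blob_mb in [(250, 200), (250, 100), (500, 100), (1000, 200)]:
    print(f"{total_tb} TB at {blob_mb} MB/blob: {object_count(total_tb, blob_mb):,} objects")
```

    Either way it is millions of rows, which is small for a database but large for a human browsing a directory tree.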

    The third question is why do you have SAN as a premise? Do you want to put a clustered filesystem with 5-10 nodes? Probably Isilon or Oracle ZS3-2/ZS4-4 are your answer.

    Fourth question: what are the requirements? (How many simultaneous clients? IOPS? Bandwidth? ACL support? Auditing? AD integration? Performance tuning?)

    Fifth question: There is no such thing as 100% availability. The term disaster in Disaster Recovery is correctly placed. Set reasonable SLA expectations. If you go for five-nine availability it will triple the cost of the project. Keep in mind that synchronous replication is distance limited. Typically, for a small performance cost, the radius is 150 miles and everything above impacts a lot.
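
    To make the SLA conversation concrete, here is a short sketch of how much downtime per year each "number of nines" allows. It assumes a 365-day year and counts every outage, planned or not.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

# Allowed downtime per year for an availability of N nines
# (2 nines = 99%, 3 nines = 99.9%, 5 nines = 99.999%).
def downtime_minutes_per_year(nines):
    return MINUTES_PER_YEAR / 10 ** nines

for nines in (2, 3, 4, 5):
    print(f"{nines} nines: {downtime_minutes_per_year(nines):.2f} minutes/year")
```

    Five nines leaves a little over five minutes a year, which is less than a single unplanned failover on many systems.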

    Even if you solve the problems above, if you want to share it via NFS/CIFS or something else you're going to run into trouble. Since CIFS was not realistically designed for clustered operation regardless of the distributed FS underneath the CIFS server, you get locking issues. Windows Explorer is a good example since it creates Thumbs.db files, leaves them open and when you want to delete the folder you cannot unless you magically ask the same node that was serving you when it created the Thumbs.db file. Apparently, the POSIX lock is transferred to the other server and stops you from deleting, but when Windows Explorer asks the other node who has the lock on the file you get screwed since the other server doesn't know. POSIX locks are different from Windows locks. It affects all Likewise based products from EMC (VNX filer, Isilon, etc.) and it also affects the CIFS product from NetApp. I'm not sure about Samba CTDB though.
    I would design the storage on ZFS for the main tiers, exported via NFSv4 to the front-end nodes, and have QFS on top of the whole thing in order to push rarely accessed data to tape. The front-end nodes would be accessed via WebDAV by a portal in which you can also query the metadata with a serious DB behind it.

    I've installed Isilon storage for 6000 XenDesktop clients that all log on at 9AM, I've worked on an SL8500, Exadata, various NetApp and Sun storages and I can tell you that you need to do a study. Run simulations with commodity hardware on smaller datasets to figure out the performance requirements and optimal access method (NAS, Web, etc.). Extrapolate the numbers, double them and ask for POC and demos from vendors, be it IBM, EMC, Oracle, NetApp or HP. Make sure that in the future, when you need 2PB, you can expand in an affordable manner. Take care since vendors like IBM tend to use the least upgradable solution. They will do a demo with something that can hold 0.6PB in its max configuration and if you need to go larger you'll need a brand new solution from another vendor.

    It's not worth doing it yourself since it will be time-consuming (at least 500 man-hours until production) and will need at least 1 full-time employee for the storage. But if you must, look at Nexenta and the hardware that they recommend.

    And remember to test DR failover scenarios.

    Good luck!

    --
    UNIX was not designed to stop you from doing stupid things, because that would also stop you from doing clever ones.
    1. Re:Wrong questions. More details needed. by radish · · Score: 1

      The first correct question is why on earth would someone need to access half a petabyte? In most cases the commonly accessed data is less than 1%. That's the amount of data that realistically needs to reside on disk. It never is more than 10% on such a large dataset.

      Never say never. We have data sets several times larger than that which are 100% always online due to client access patterns. Not only online, but extremely latency critical. And I personally could name a dozen other companies with similar requirements.

      --

      ---- Den ene knappen er powerknapp, den andre er Bender voice knapp "Bite My Shiny Metal Ass"

    2. Re:Wrong questions. More details needed. by Professor+Paradox · · Score: 1

      This is definitely the best post so far. Sending out requirements to different vendors will just get you a vendor-specific answer. If you ask a DBA how to store that much data they will give you an answer that explains how MSSQL could handle that, and then they would talk about backup snapshots, and you would be stuck with SQL as the client access.

      I want to reject the premise of your request: are you really responsible for managing the data of these two other groups? It seems like in the past you have owned the storage for other internal teams, but now the time has come for them to start doing this themselves. Option 1, you own the service that does this, you don't pay attention to limits or anything like that, and you provide an SLA to groups that want to use your service. This is probably what you are currently doing. Some teams may be unhappy with that service because it doesn't quite fit their needs. Option 2, each team that wants something different manages it themselves. Where a filesystem may be what one team needs, perhaps a different team wants MongoDB shards.

      Monoliths are evil, and trying to maintain petabytes of data in one place is not a good solution. It's easier for two teams to maintain and own their own terabyte storage solutions that solve their own problems than having you try to mediate and come up with the solution yourself.

  29. SAN is out. by TheHawke · · Score: 1

    Library storage sounds like that may be your best choice. Several high end vendors sell such systems and may need to have RFS and RFQ's submitted, not to mention seeing the systems in action. This is not going to be cheap, but it's best on the long term investment. Ensure that it is scalable and can handle any future expansions without investing in whole new kit or that will simply put your department back to square one.

    --
    First rule of holes; When in one, stop digging.
  30. SAN, etc... by jbolden · · Score: 1

    On a SAN the 16tb limit comes generally from 32 bit SANs the 64 bit SANs wouldn't have it. Plenty of SAN solutions can handle 500tb or 10x that much. So just upgrade. If you only want backup there are plenty of hardware backup devices that handle this. For example exagrid scales to I believe 300tb / hr much less 500tb total. This isn't gigantic in today's world. You just need to have a conversation with your vendor, or an agent. You aren't asking for anything abnormal or challenging.

  31. But restore ... by Ungrounded+Lightning · · Score: 2

    Just put "bomb" and "assassinate" in every line. ... It's all going to get backed up.

    But getting them to restore it after it's gotten lost or corrupted is difficult.

    --
    Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
    1. Re:But restore ... by Hognoxious · · Score: 1

      They still apply that bit in about disclosing the nature and cause of the accusation, did they?

      That's cute.

      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
  32. What are your IOPS and throughput requirements? by DamnStupidElf · · Score: 2

    For high throughput/IOPS requirements build a Lustre/Ceph/etc. cluster and mount the cluster filesystems directly on as many clients as possible. You'll have to set up gateway machines for CIFS/NFS clients that can't directly talk to the cluster, so figure out how much throughput those clients will need and build appropriate gateway boxes and hook them to the cluster. Sizing for performance depends on the type of workload, so start getting disk activity profiles and stats from any existing storage NOW to figure out what typical workloads look like. Data analysis before purchasing is your best friend.

    If the IOPS and throughput requirements are especially low (guaranteed < 50 random IOPS [for RAID/background process/degraded-or-rebuilding-array overhead] per spindle and what a couple 10gbps ethernet ports can handle, over the entire lifetime of the system) then you can probably get away with just some SAS cards attached to SAS hotplug drive shelves and building one big FreeBSD ZFS box. Use two mirrored vdevs per pool (RAID10-alike) for the higher-IOPS processing group and RAIDZ2 or RAIDZ3 with ~15 disk vdevs for the archiving group to save on disk costs.
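
    The disk-cost trade-off between those two layouts can be sketched as a usable-to-raw ratio per vdev. This is a simplification that ignores ZFS metadata, padding and reserved space, and the function name is illustrative.

```python
# Approximate usable fraction of raw capacity for common ZFS vdev layouts.
# Ignores metadata, allocation padding and the pool's reserved slop space.
PARITY_DISKS = {"raidz1": 1, "raidz2": 2, "raidz3": 3}

def usable_fraction(layout, width):
    if layout == "mirror":
        return 1 / width  # every disk holds a full copy
    return (width - PARITY_DISKS[layout]) / width

for layout, width in [("mirror", 2), ("raidz2", 15), ("raidz3", 15)]:
    print(f"{layout} x{width}: {usable_fraction(layout, width):.0%} usable")
```

    Mirrors give up half the raw space in exchange for more vdevs (and so more random IOPS) per disk, while wide RAIDZ2/RAIDZ3 vdevs keep 80% or more of the raw space at the cost of IOPS and longer rebuilds.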

    Plan for 100% more growth in the first year than anyone says they need (shiny new storage always attracts new usage). Buy server hardware capable of 3 to 5 years of growth; be sure your SAS cards and arrays will scale that high if you go with one big storage box.

  33. Buy a Storage Pod by Areyoukiddingme · · Score: 3, Informative

    Buy Storage Pods, designed by BackBlaze. You can get 270TB of raw storage in 4U of rackspace for $0.051 per gigabyte. Total cost for half a petabyte of raw storage: $27,686. To back it all up cheaply but relatively effectively, buy a second set to use as a mirror. $55,372. For use with off-the-shelf software (FreeNAS running ZFS or Linux running md RAID) to present a unified filesystem that won't self-destruct when a single drive fails, you'll need to over-provision enough to store parity data. Go big or go home. Just buy another pod for each of the primary and the backup sets. Total of 6 pods with 1620TB of raw storage: $83,058. Some assembly required. And 24U of rackspace required, with power and cooling and 10GbE and UPSs (another 4-8U of rackspace).
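
    The totals above can be reproduced with a few lines. The per-pod price of $13,843 is an assumption derived from the $27,686 quoted for two pods; everything else comes from the comment.

```python
POD_PRICE_USD = 13_843  # assumed: half of the $27,686 quoted for two pods
POD_RAW_TB = 270
POD_RACK_UNITS = 4

# Cost, raw capacity and rack space for a given number of pods.
def pod_totals(pods):
    return pods * POD_PRICE_USD, pods * POD_RAW_TB, pods * POD_RACK_UNITS

for pods in (2, 4, 6):
    cost, raw_tb, rack_u = pod_totals(pods)
    print(f"{pods} pods: ${cost:,}, {raw_tb} TB raw, {rack_u}U")

price_per_gb = POD_PRICE_USD / (POD_RAW_TB * 1000)
print(f"${price_per_gb:.3f} per raw GB")
```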

    Expect a ballpark price of something a little under $100,000 that will meet your storage requirements with sufficient availability and redundancy to keep people happy. It will require 2 racks of space, and regular care and feeding. Do the care and feeding in house. A support contract where you pay some asshole tens of thousands of dollars a year to show up and swap drives for you is a waste of money. Bearing that in mind, as other posters have said, talk to storage vendors selling turnkey solutions. Come armed with these numbers. When they bid $1 million, laugh in their faces. But there's an outside chance you'll find a vendor with a price that is something less than hyperinflated. Stranger things have happened.

    If you don't generate data very quickly, you can ease into it. For around $35,000, you can start with just 2 pods and the surrounding infrastructure, and add pods in pairs as necessary to accommodate data growth. Add $27,000 in 2 chassis next year to double your space. Add $26,000 of space again in 2017 and increase your raw capacity another 50%. (Total storage cost using BackBlaze-inspired pods is dominated by hard drive prices, which trend downwards.) When you find out your users underestimated growth, another $25,000 of space in 2018 takes you to somewhere in the neighborhood of 2 petabytes of raw storage, that you're using with double parity and 100% mirrored backup for a total effective useable space of approximately 918TB. You'll be replacing 2-3 drives per year, starting out, and 0-1 after infant mortality has run its course. Keep extras in a drawer and do it yourself in half an hour each on a Friday night. If you configured ZFS with reasonably sized vdevs, (3-5 devices) the array rebuild should be done by Monday morning. By 2020, you'll be back up to replacing 2-3 drives per year again as you climb the far side of the bathtub curve. While you're at it, you can seriously consider replacing whole vdevs with larger capacity drives, so your total useable space can start to creep up over time, without buying new chassis. By 2025, you will have 8 chassis in two racks hosting 2.88PB of raw storage space that's young and vital and low maintenance, having spent roughly $200,000.

    A bargain, really.

  34. Roll it yourself but take responsibility by maraist · · Score: 1

    Super-Micro has 36 and 72 drive racks that aren't horrible human effort wise (you can get 90 drive racks, but I wouldn't recommend it). You COULD get 8TB drives for like 9.5 cent / GB (including the $10k 4U chassis overhead). 4TB drives will be more practical for rebuilds (and performance), but will push you to near 11c / GB. You can go with 1TB or even 1/2TB drives for performance (and faster rebuilds), but now you're up to 35c / GB.

    That's roughly 288TB of RAW for say $30k 4U. If you need 1/2 PB, I'd say spec out 1.5PB - thus you're at $175K .. $200k.. But you can grow into it.
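
    As a sketch of that sizing, assuming the $30k and 288TB per 4U chassis quoted above and whole chassis only (the function name is illustrative):

```python
import math

# Chassis count and cost to reach a raw capacity target.
def chassis_plan(target_raw_tb=1500, chassis_tb=288, chassis_price=30_000):
    chassis = math.ceil(target_raw_tb / chassis_tb)
    return chassis, chassis * chassis_tb, chassis * chassis_price

chassis, raw_tb, cost = chassis_plan()
print(f"{chassis} chassis, {raw_tb} TB raw, ${cost:,}")
```

    Six chassis comes out inside the $175K-$200K range given above, with switches and spares accounting for the rest.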

    Note this is for ARCHIVE, as you're not going to get any real performance out of it.. Not enough CPU to disk ratio.. Not even sure if the MB can saturate a 40Gbps QSFP links and $30k switch. That's kind of why hadoop with cheap 1CPU + 4 direct-attached HDs are so popular.

    At that size, I wouldn't recommend just RAID-1ing, LVMing, ext4ing (or btrfsing) then n-way foldering, then nfs mounting... Since you have problems when hosts go down and keeping any of the network from stalling / timing out.

    Note, you don't want to 'back-up' this kind of system.. You need point-in-time snapshots.. And MAYBE periodic write-to-tape.. Copying is out of the question, so you just need a file-system that doesn't let you corrupt your data. DEFINITELY data has to replicate across multiple machines - you MUST assume hardware failure.

    The problem is going to be partial network down-time, crashes, or stalls, and regularly replacing failed drives.. This kind of network is defined by how well it performs when 1/3 of your disks are in 1-week-long rebuild periods. Some systems (like HDFS) don't care about hardware failure.. There's no rebuild, just a constant sea of scheduled migration-of-data.

    If you only ever schedule temporary bursts of 80% capacity (probably even too high), and have a system that only consumes 50% of disk-IO to rebuild, then a 4TB disk would take 12 hours to re-replicate. If you have an intelligent system (EMC, netapp, ddn, hdf, etc), you could get that down to 2 hours per disk (due to cross rebuilding).
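
    Those rebuild times imply a sustained copy rate that is worth checking against what your disks and network can actually deliver. A minimal sketch in decimal units (helper name is illustrative):

```python
# Sustained throughput needed to re-replicate one disk in a given time.
# Decimal units: 1 TB = 1,000,000 MB.
def required_rate_mb_s(disk_tb, hours):
    return disk_tb * 1_000_000 / (hours * 3600)

print(f"4 TB in 12 h: {required_rate_mb_s(4, 12):.1f} MB/s")
print(f"4 TB in  2 h: {required_rate_mb_s(4, 2):.1f} MB/s")
```

    The 12-hour figure is about what one spinning disk can sustain with half its bandwidth reserved; the 2-hour figure is more than a single drive can write, so it only works when the rebuild is spread across many disks, which is the cross-rebuilding mentioned above.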

    I'm a big fan of object-file-systems (generally HTTP based).. That'll work well with the 3-way redundancy. You can typically fake out a POSIX-like file-system with fusefs.. You could even emulate CIFS or NFS. It's not going to be as responsive (high latency). Think S3.

    There are also "experimental" POSIX systems like Ceph, GPFS, Lustre. Very easy to screw up if you don't know what you're doing. And really painful to re-format after you've learned it's not tuned for your use-case.

    HDFS will work - but it's mostly for running jobs on the data.

    There's also AFS.

    If you can afford it, there are commercial systems to do exactly what you want, but you'll need to triple the cost again. Just don't expect a fault-tolerant multi-host storage solution to be as fast as even a dedicated laptop drive. Remember when testing.. You're not going to be the only one using the system... Benchmarks perform very differently when under disk-recovery or random-scatter-shot load by random elements of the system - including copying-in all that data.

    --
    -Michael
  35. Anything is possible with the right budget... by emag · · Score: 3, Informative

    Lucky (?) for you, I just went through purchasing a storage refresh for a cluster, as we're planning to move to a new building and no one trusts the current 5 year old solution to survive the move (besides which, we can only get 2nd hand replacements now). The current system is 8 shelves of Panasas ActiveStor 12, mostly 4 TB blades, but the original 2-3 shelves are 2 TB blades, giving about 270 TB raw storage, or about 235ish TB in real use. The current largest volume is about 100 TB in size, the next-largest is about 65 TB, with the remainder spread among 5-6 additional volumes including a cluster-wide scratch space. Most of the data is genomic sequences and references, either downloaded from public sources or generated in labs and sent to us for analysis.

    As for the replacement...

    I tried to get a quote from EMC. Aside from being contacted by someone *not* in the sector we're in, they also managed to misread their own online form and assumed that we wanted something at the opposite end of the spectrum from what I requested info on. After a bit of back and forth, and a promise to receive a call that never materialized, I never did get a quote. My assumption is they knew from our budget that we'd never be able to afford the capacities we were looking for. At a prior job, a multi-million dollar new data center and quasi-DR site went with EMC Isilon and some VPX stuff for VM storage/migration/replication between old/new DCs, and while I wasn't directly involved with it there, I had no complaints. If you can afford it, it's probably worth it.

    The same prior job had briefly, before my time there, used some NetApp appliances. The reactions of the storage admins weren't all that great, and throughout the 6 years I was there, we never could get NetApp to come in to talk to us whenever we were looking for expansion of our storage. I've had colleagues swear by NetApp though, so YMMV.

    I briefly looked at the offerings from Overland Storage (where we got our current tape libraries), on the recommendation of the VAR we use for tapes & library upgrades. It looked promising, but in the end, we'd made a decision before we got most of those materials...

    What we ended up going with was Panasas, again. Part of it was familiarity. Part of it was their incredible tech support even when the AS12 didn't have a support contract (we have a 1 shelf AS14 at our other location for a highly specialized cluster, so we had *some* support, and my boss has a golden tongue, talking them into a 1-time support case for the 8 shelf AS12). We also have a good relationship with the sales rep for our sector, the prior one actually hooked us up with another customer to acquire shelves 6-8 (and 3 spares), as this customer was upgrading to a newer model. Based on that, we felt comfortable going with the same vendor. We knew our budget, and got quotes for three configurations of their current models, ActiveStor 14 & 16. We ended up with the AS16, with 8 shelves of 6 TB disk (x2) and 240 GB SSD per blade (10 per, plus a "Director Blade" per). Approximate raw storage is just a bit under 1 PB (roughly 970-980 TB raw for the system).
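    The "just under 1 PB" figure can be checked from the configuration given above. This is only a back-of-the-envelope sketch: it assumes the 10 storage blades per shelf each carry two 6 TB disks plus one 240 GB SSD, and that the Director Blade adds no capacity.

```python
# Raw capacity estimate for the AS16 configuration described in the post.
SHELVES = 8
STORAGE_BLADES_PER_SHELF = 10   # assumption: the Director Blade adds no capacity
HDD_TB_PER_BLADE = 2 * 6        # two 6 TB disks per blade
SSD_TB_PER_BLADE = 0.24         # one 240 GB SSD per blade


def raw_capacity_tb(shelves=SHELVES):
    """Total raw TB across all storage blades (disks plus SSDs)."""
    blades = shelves * STORAGE_BLADES_PER_SHELF
    return blades * (HDD_TB_PER_BLADE + SSD_TB_PER_BLADE)


print(f"{raw_capacity_tb():.1f} TB raw")
```

    Under those assumptions the total lands inside the quoted 970-980 TB range.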

    In terms of physical specs, each shelf is 4U, has dual 10 GbE connections, and adding additional shelves is as easy as racking them and joining them to the existing array (I literally had no idea what I was doing when we added shelves on the current AS12, it just worked as they powered on). Depending on your environment, they'll support NFS, CIFS, and their own PanFS (basically pNFS) through a driver (or Linux kernel module, in our case). We're snowflakes, so we can't take advantage of their "phone home" system to report issues proactively and download updates (pretty much all vendors have this feature now). Updating manually is a little more time-consuming, but still possible.

    As for backups, I honestly have no idea what I'm going to do. Most data, once written, is static in our environment, so I can probably get away with infrequent longer retention period backups for every

    --
    "The urge to save humanity is almost always a false front for the urge to rule." --H.L. Mencken
  36. Tape for backup by Crashmarik · · Score: 1

    One of these will do you well
    https://en.wikipedia.org/wiki/...

    For storage that's trickier. You probably need to characterize your usage before you talk to a vendor otherwise they will oversell you into oblivion.

  37. A large cluster... by quonsar · · Score: 1

    ...of Windows10 boxes!

  38. EMC Isilon by dave562 · · Score: 1

    Where I work, we are running EMC's Isilon platform. We have ~4PB of data replicated between two data centers.

    The platform supports the traditional CIFS/SMB and NFS for client connectivity.

    It also has Hadoop support (HDFS). The great thing about the HDFS support is that you do not have to spin a separate file system for it. The same files that your clients access via CIFS or NFS can be accessed via HDFS. Isilon was built with Hadoop in mind and the Isilon nodes act as Hadoop "compute nodes".

    The OneFS file system presents a single file system of practically unlimited size. There are some interesting tuning options that can be leveraged depending on your data type and IO patterns. If you need to get REALLY crazy, the system has support for tiering data based on a whole slew of different factors (last accessed date, file date, file size... basically any file metadata attribute you can think of can be used for tiering purposes).

    This probably does not matter for you, but the system also supports AES256 at-rest encryption. We deal with a lot of financial and other highly sensitive data for clients that demand at-rest encryption, so that was a must have for us.

    The only downside is that since it is from EMC, you can plan on paying through the nose for it. (But never pay full retail for EMC, ever. Threaten them with NetApp if you have to. ;) )

    We still leverage a SpectraLogic tape library to archive data off of the system. With a moderately specced NetBackup system we get a consistent ~35000kb/s restore rate off of a single drive. That lets us provide reasonable RTOs back to the business.
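    To see what a per-drive restore rate means for an RTO, here is a rough calculation. It assumes the "~35000kb/s" above means about 35,000 kilobytes per second (the unit is ambiguous in the post) and that several drives scale linearly, which may not hold in practice.

```python
def restore_hours(data_tb, rate_kb_per_s=35_000, drives=1):
    """Hours to restore data_tb terabytes at a sustained per-drive rate."""
    total_bytes = data_tb * 1e12
    bytes_per_second = rate_kb_per_s * 1e3 * drives  # assumes linear scaling
    return total_bytes / bytes_per_second / 3600


for tb in (1, 10, 100):
    print(f"{tb:>3} TB on one drive: {restore_hours(tb):.1f} h")
```

    At that rate a single drive needs roughly eight hours per terabyte, so large restores depend on running many drives in parallel.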

    On the subject of backup, another great thing about Isilon is that you can dedicate certain nodes to specific tasks. In the Isilon architecture, the NL nodes are the slowest nodes that they have. We leverage those for backup to keep the network IO off of the faster X and S-nodes.

  39. That's it? by guruevi · · Score: 4, Informative

    500TB is nothing these days. You can easily buy any system and it will support it. Look at FreeBSD/FreeNAS with ZFS (or their commercial counterpart by iXSystems). If you want to have an extremely comfortable, commercial setup, go Nexenta or with a bit of elbow grease, use the open/free counterpart OpenIndiana (Solaris based).

    You can build 2 systems (I personally have 3, 1 with SAS in Striped-Mirrors, 1 with Enterprise-SATA in RAIDZ2 and 1 with Desktop-SATA in RAIDZ2) and have ZFS snapshots every minute/hour/day replicated across the network for backups, both Nexenta and FreeNAS have that right in the GUI. The primary system also has a mirrored head node which can take over in less than 10s. As far as sharing out the data: AFP/SMB/NFS/iSCSI/WebDAV etc. whatever you need to build up on it.

    My system is continuously snapshotted to its primary backup so that in case of extreme failure (which has not happened in the 7 years since I've built this system) I can run from the primary backup until the primary has been restored with perhaps a few seconds of data loss (don't know if that's acceptable to you but in my case it's not a problem in case we do have a full meltdown)
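    For readers who have not used ZFS replication, one cycle boils down to "take a snapshot, then send the delta since the previous snapshot to the backup box". The sketch below only builds the command strings and does not run them. The dataset, snapshot, and host names are made up for illustration, and the GUIs mentioned above may well do things differently under the hood.

```python
import shlex


def replication_commands(dataset, prev_snap, new_snap, backup_host, backup_dataset):
    """Return the shell commands for one incremental snapshot replication cycle."""
    old = shlex.quote(f"{dataset}@{prev_snap}")
    new = shlex.quote(f"{dataset}@{new_snap}")
    snapshot = f"zfs snapshot {new}"
    # 'zfs send -i' emits only the blocks that changed between the two snapshots.
    send = (
        f"zfs send -i {old} {new} | "
        f"ssh {shlex.quote(backup_host)} zfs receive -F {shlex.quote(backup_dataset)}"
    )
    return [snapshot, send]


# Hypothetical names, for illustration only.
for cmd in replication_commands("tank/data", "hourly-01", "hourly-02",
                                "backup1", "backup/data"):
    print(cmd)
```

    Run from cron every minute, hour, or day, this gives the kind of rolling replica described in the post; the amount of data you can lose is bounded by the snapshot interval.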

    Where are those systems limited to 16TB? I wouldn't touch them with a 10-foot pole because they're running behind (within a few years a single hard drive will surpass that limit).

    --
    Custom electronics and digital signage for your business: www.evcircuits.com
  40. Backblaze Storage Pod? by im_thatoneguy · · Score: 2

    What are your performance requirements? If you just need a giant dump of semi-offline storage then look into building a Backblaze Storage Pod.
    https://www.backblaze.com/blog...

    For about $30,000 you could build four storage pods. Speed would not be terrific. Backups are handled through RAID. If you want faster, more redundant or fully serviced your next step up in price is probably a $300,000 NAS solution. Which might serve you better anyway.

  41. Use Amazon S3 storage with glacier archival by xavierpayne · · Score: 1

    Use Amazon S3 storage (gives you cloud storage with a directory tree).

    Accessible via desktop apps or even web browser if you want.

    For stuff they want to archive but will rarely ever use, have those S3 folders archive to Glacier.

    Nothing to back up, and you can store petabytes in Glacier cheaper than any other option on the planet. :)

    1. Re:Use Amazon S3 storage with glacier archival by Anonymous Coward · · Score: 1

      Are you kidding? Amazon S3 is ~$0.03 per gigabyte PER MONTH (even up to half a PB they're like ~$0.028 per gig PER MONTH). It only takes a quick scroll through some of the solutions on this thread to find ones that get you to ~5-15 cents per gig FOREVER (and cheaper in the future as prices fall).

      In other words, Amazon S3 is cheaper to start with, but that "cheaper" only lasts like 2 months, then it keeps costing you more than your entire solution would've cost you on premises. After a year, there's no question that Amazon is WAY more expensive. And you can't process that data, unless you pay Amazon for the computing resources. An in-house Hadoop cluster would provide both storage AND compute.

      True, there's a lot less headache with amazon, but it's definitely not cheap (not to mention you'll be paying by gigabyte to get the data out one day).
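      The break-even claim can be sanity-checked with the numbers quoted in this comment. This is a simplified sketch: it treats the on-premises figure as a one-time per-gigabyte cost and ignores power, staff, hardware refresh, and cloud egress fees, none of which are quantified in the thread.

```python
def breakeven_months(capacity_gb, cloud_usd_per_gb_month, onprem_usd_per_gb):
    """Months after which cumulative cloud fees exceed a one-time on-prem spend."""
    monthly_cloud = capacity_gb * cloud_usd_per_gb_month
    onprem_once = capacity_gb * onprem_usd_per_gb
    return onprem_once / monthly_cloud


HALF_PB_GB = 500_000
for onprem in (0.05, 0.15):
    months = breakeven_months(HALF_PB_GB, 0.03, onprem)
    print(f"on-prem at ${onprem:.2f}/GB: break-even after {months:.1f} months")
```

      With those inputs the crossover falls somewhere between about two and five months, depending on which end of the 5-15 cent range you land on.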

    2. Re:Use Amazon S3 storage with glacier archival by im_thatoneguy · · Score: 1

      Not to mention bandwidth. How are you going to move 500TB to the cloud and back in a reasonable time frame? You're looking at several months even over a gigabit connection.
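      A quick estimate of the transfer time, as a sketch. The `efficiency` parameter is an assumption added here to model protocol overhead and contention; the 0.7 value is purely illustrative.

```python
def transfer_days(data_tb, link_gbps, efficiency=1.0):
    """Days to move data_tb terabytes one way over a link of link_gbps gigabits/s."""
    bits = data_tb * 1e12 * 8
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 86_400


one_way = transfer_days(500, 1)
print(f"one way at full line rate: {one_way:.0f} days")
print(f"there and back:            {2 * one_way:.0f} days")
print(f"one way at 70% efficiency: {transfer_days(500, 1, 0.7):.0f} days")
```

      Even at full line rate, 500 TB takes about a month and a half in each direction, so a round trip is on the order of three months.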

  42. Depends on what you need to do with it by radish · · Score: 1

    Where I work we deal with data sets of a similar order. However, different data sets are stored differently depending on need. For online relational data where performance is critical, it's in master/slave/backup DB clusters running with 4.8TB PCIe SSDs. The backups are taken from a slave node and stored locally, plus they're pushed offsite. No tape, if we need a restore we can't really wait that long.

    For data we can afford to access more slowly we use large HDFS clusters with regular SATA discs. There's a level of redundancy built in there, and where data is important enough to need a real backup (much of it is not) it is also pushed offsite. The HDFS approach has the advantage of presenting as a very large filesystem, and obviously if you're running hadoop against it there's an automatic advantage.

    --

    ---- One button is the power button, the other is the Bender voice button "Bite My Shiny Metal Ass"

  43. From someone who's bought this much storage... by rockmuelle · · Score: 1

    While I agree with most commenters that you need to supply many more details before even beginning to narrow the options, if you do look at the storage vendors, DDN (Data Direct Networks) is really hard to beat.

    I see the EMC Isilon guys posting here and need to counter. :) They are overpriced and underpowered for almost every application. Their strength is typical enterprise environments - lots of small files accessed via NFS and "enterprise" SLAs. That's almost always the wrong solution for big data applications (NFS is terrible for big data). EMC Isilon sold a lot of storage into my space (gene sequencing) and very few customers are happy, especially when they find out what the other vendors could do.

    I've organized bake-offs between DDN, Isilon, and a number of other vendors. DDN always came out ahead on price and performance (every time they were half the price and twice the speed of Isilon). DDN is the most represented of the vendors on the Top 500 Supercomputing list and also powers a certain streaming movie/TV service we all know and love. DDN is also pretty ethical - if they're a bad match for your application, they'll let you know and provide recommendations.

    Whatever you do, don't build it yourself. As tempting and fun as it is, given that you're asking the question, you've already self-identified as someone who won't be able to support it. I've seen many smart people go the SuperMicro JBOD route only to create support nightmares for themselves.

    Also, for that much space, avoid Amazon at all costs. It's way too expensive compared to dedicated hardware.

    For cost, budget around $150-250k to get started. It might seem pricey, but you'll spend more than that on manpower building it yourself (or your first few months on Amazon).

    In addition to DDN, IBM, Dell, and HP all have solutions in this range that aren't terribly expensive.

    -Chris

  44. Gluster or Ceph by Anonymous Coward · · Score: 1

    Gluster or Ceph, depending on requirements.

    Both are Open Source, call Red Hat if you want support.

  45. In a hidden directory by aquabat · · Score: 1

    I keep it all in a separate drive, and only mount it when I want to look at the data. Also, I mount it under .porn, so it isn't visible in a casual listing.

    --
    A republic cannot succeed till it contains a certain body of men imbued with the principles of justice and honour.
  46. Your use case is likely unique by davidwr · · Score: 1

    Given how few use cases there are like the one you describe, there are probably a lot of important considerations that didn't make it into your question that make your use case unique.

    This is one of those cases where you really need to sit down and decide what works best for your situation, NOT what works best for other situations that require this amount of data storage.

    --
    Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
  47. In a petafile, obviously by raymorris · · Score: 1

    To store files close to a petabyte, you need a petafile, obviously.

  48. Backups by manu0601 · · Score: 1

    Storing the data is the easy part, GlusterFS should do it just fine. The point I am curious about is backups: how do you back up such a volume?

  49. Server Based Storage by kaustik · · Score: 1

    Disclaimer: I work for a storage vendor. Also a long time Slashdot reader though, so this isn't meant as a sales pitch.

    Half of a petabyte is not really a lot of data in today's world. I talk to people every day that are trying to find ways to manage many PBs (into the hundreds) and are having challenges doing this with traditional storage. The trend that was started by the big Internet companies is to get rid of the fibre-channel SANs and instead solve the problem of storage using standard x86 servers. They use Linux as an abstraction layer from the hardware, and applications acting as storage systems to pool many servers together.

    One of the challenges you need to get over is stretching a namespace that big without filesystem limitations like maximum inode counts. This is generally accomplished using some type of key/value store (object) under the hood. Single flat namespaces with no practical size barrier.
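    To make the "key/value store under the hood" idea concrete, here is a toy in-memory sketch. `FlatStore` and its methods are hypothetical names invented for this illustration, not any vendor's API. Every object lives in one flat key space, and a "directory" is just a shared key prefix.

```python
class FlatStore:
    """Toy object store: one flat dict of keys, directories emulated by prefix."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = data

    def get(self, key):
        return self._objects[key]

    def list_dir(self, prefix):
        """Return the immediate children under a prefix, like 'ls' on a directory."""
        if prefix and not prefix.endswith("/"):
            prefix += "/"
        names = set()
        for key in self._objects:
            if key.startswith(prefix):
                head, sep, _ = key[len(prefix):].partition("/")
                names.add(head + sep)  # a trailing "/" marks a pseudo-directory
        return sorted(names)


store = FlatStore()
store.put("genomes/hg19/chr1.fa", b"...")
store.put("genomes/hg19/chr2.fa", b"...")
store.put("genomes/README", b"...")
print(store.list_dir("genomes"))
```

    There is no inode table to exhaust here: adding capacity means adding more key space, which a real system would shard across servers rather than scanning a single dict.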

    Some options that are available today are Swift from OpenStack and Ceph from Red Hat if you want to go the open source route. These can be good choices if you have the engineering staff on hand to piece it all together and the talent to keep it running. GPFS is also making a comeback in this area, and there are a ton of startups looking at this space now.

    My company has a commercial solution for this stuff. Pretty cool - it's a Linux app and runs on the server of your choice. I'll save you the sales pitch, and if you want you can try it for free on your own here: http://scality.com/trial

    Whatever you choose, best of luck to you!

  50. Start Off Right... by BDMcGrew · · Score: 1

    I am a professional and manage several hundred petabytes globally. From experience I can tell you, they may be asking for half petabyte right now but tomorrow that will double and again next year and so on. Plan big to start with and you'll save your future self a lot of grief! If you PM me I can give you more details but in short I can suggest:

    1) Look at a scalable filesystem like GPFS or StorNext. Yes there is a price tag associated with big iron filesystems (and no I don't work for any of them) but you get what you pay for, and scalability is everything. As an example - pairing GPFS with TSM and the right hardware, I can create an infinitely scalable filesystem that'll scale to yottabytes.

    2) Tier the storage system. Think SSD for the cache (here and now) I/O, Winchester disk for the short term and tape for the long term. Yes, tape: compute the cost per TB of tapes in the vault versus square footage in the data center.

    3) Separate your networks. Keep the client access separated from the disk i/o. Doing this will save massive congestion problems from day one!
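    The tiering in point 2 comes down to a placement policy. A minimal sketch, assuming placement is driven only by time since last access; the 30-day and 365-day thresholds are invented for illustration, and real products expose far richer rules.

```python
def choose_tier(days_since_access, hot_days=30, warm_days=365):
    """Pick a storage tier from how recently a file was last accessed."""
    if days_since_access <= hot_days:
        return "ssd"    # here-and-now I/O
    if days_since_access <= warm_days:
        return "disk"   # short term
    return "tape"       # long term


for age in (1, 90, 800):
    print(f"{age:>3} days since access -> {choose_tier(age)}")
```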

    There are lots of other things to consider, but by today's standards a half petabyte isn't the insurmountable amount of data that a terabyte was twenty years ago.

  51. Mega by pestilence669 · · Score: 1

    It may sound "funny," but I once priced Mega (KimDotCom) for offsite backup & storage. They turned out to be less expensive than Amazon Glacier by a bit AND instantly available. We didn't go with them. Instead, we replicated across data centers with multi-terabyte storage nodes.

  52. Re:EMC Isilon by mlts · · Score: 1

    Isilons are a cool technology. Take FreeBSD, add a custom filesystem (OneFS), link individual nodes via Infiniband, and let the custom code automatically select which nodes/drives to fetch data from. If a hard drive blows, it shrinks the array in order to maintain redundancy.

    Of course, Isilons support deduplication, iSCSI (you create a disk image and mount that), and your NAS protocols of choice. If you set a hard quota, the presented directory can be configured to show the quota as the disk space present. Very nifty, and not that expensive for an enterprise array. Need more space? Add drives or more nodes.

    For long term backups, Isilons support NDMP [1].

    [1]: Of course, you can always connect a tape silo to a UNIX machine, write a script that SSHes into an Isilon node and pulls off /ifs/data.

  53. Store it in the cloud by ljw1004 · · Score: 1

    Store it in the cloud. 1/2 petabyte isn't even the "highest tier" requirement.

    On Azure it will cost $168k/year to store this much data instantly accessible. Whatever other solution you come up with, if it takes more than 1 full time person to support, then it's already more expensive (and that's not even including the up-front capital costs, installation and setup costs, training costs, depreciation, maintenance, ...)
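    For comparison with other per-gigabyte figures in this thread, the annual quote can be converted to a monthly rate. A small sketch, assuming "this much data" means 500 TB and decimal units (1 TB = 1000 GB):

```python
def implied_rate_per_gb_month(annual_cost_usd, capacity_tb):
    """Convert an annual storage bill into USD per GB per month."""
    return annual_cost_usd / 12 / (capacity_tb * 1000)


rate = implied_rate_per_gb_month(168_000, 500)
print(f"${rate:.3f} per GB per month")
```

    That works out to roughly 2.8 cents per GB per month, in line with the S3 pricing quoted elsewhere in the discussion.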

  54. Hadoop by RabidMonkey · · Score: 1

    Sounds like a fairly simple case for a Hadoop cluster - a smallish one at that. We're currently deploying to clusters at 1PB/rack density, which means you could deploy a rack or two easily enough. You'd get compute, you get a single flat filesystem, you get redundancy, all built in. Our biggest cluster is now up to 16PB, all one big compute/storage beast, chugging away all day.
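    The "rack or two" estimate is consistent with HDFS's default replication factor of 3. The sketch below assumes the 1 PB/rack density is a raw figure, which the post does not state.

```python
import math


def racks_needed(usable_pb, replication=3, raw_pb_per_rack=1.0):
    """Racks required once every block is stored 'replication' times."""
    raw_pb = usable_pb * replication
    return math.ceil(raw_pb / raw_pb_per_rack)


print(racks_needed(0.5))   # half a petabyte of usable data
```

    Half a petabyte of data becomes 1.5 PB raw at three copies, hence two racks under these assumptions.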

    I'd suggest starting with the Hortonworks Sandbox VM - grab it, fire it up, play with it. Add some files, poke around, see if it meets your needs. Learn about MapReduce, or maybe your data can be put into Hive for analysis.

    The nice thing is that you can use hardware you may already have to get things going. Hortonworks is pretty much at the point of a 'next next finish' installer, so you really only need to dedicate a few hours to getting something up to test. Then, there's a lot of tuning and craziness to running a bigger cluster, but a POC is simple.

    Anyhow, I'm blind, because all I do is Hadoop clusters all day, but this seems like an easy win for ya.

    GL;HF!

    --
    We emerge from our mother's womb an unformatted diskette; our culture formats us. - Douglas Coupland
  55. You're out of your league by Loconut1389 · · Score: 1

    Not only are you out of your league, but you're barking up the wrong tree.

    1) You should hire someone to figure it out for you- as either on-site consultancy or use something like amazon.
    2) You should use a different site that has more than 5 legitimate comments on a thread.

  56. Re:Don't by CaptQuark · · Score: 1

    Another example of posters trying to be cute and split their reply between the Subject and Comment blocks. It causes confusion when the comments don't stand alone and then you realize the subject line needs to preface the comment.

    Just "Don't" do it.

    --

  57. EMC SANs by AnythingButMicrosoft · · Score: 1

    If costs are not a priority look into using multiple EMC SANs striped in a RAID array. I've installed a few with the largest encompassing 14 physical units for ~100 VMs, they work great.

    1. Re:EMC SANs by swb · · Score: 1

      Are there vendors that actually support RAID across otherwise independent SANs?

      Like if you had SANs A through F, each with a 10 TB volume and you used SAN controller Z (which has no disks of its own) to take those 10 TB volumes and turn them into a single (say RAID-6) volume.

      I've done this for laughs with a NAS4Free implementation, using its iSCSI client to mount LUNs from 3-4 different storage devices and then combining those mounts into a RAID LUN which I then exported via iSCSI and used on a client.

      It seems like an interesting idea, and put together right seems like it might offer some relatively interesting redundancy versus some of the replication and mirroring options I've seen vendors advertise.

  58. Not a do it yourself project by cmurf · · Score: 1

    Get quotes from Netapp, EMC, and Red Hat.

  59. Re:Don't by davester666 · · Score: 1

    Budget? I suppose we do a round of layoffs...

    --
    Sleep your way to a whiter smile...date a dentist!
  60. Re: Don't by Anonymous Coward · · Score: 3, Insightful

    I think that the intention was to stimulate a discussion amongst a community of geeks who have a genuine interest in this type of technology and enjoy discussing solutions that they have built. Sure, you could just outsource the service and pay consultants to do it for you but I don't think that is the general ethos of the traditional Slashdot reader. Also, if you feel that you should be paid for commenting here then this is probably not the forum for you. Twat.

  61. MooseFS, Exablox or Scality Ring by ACorvus · · Score: 1

    How about MooseFS (http://moosefs.org) for an OSS solution, or if you want appliances off the shelf that won't cost you a limb or three, Exablox (http://exablox.com). Or if you need more than the 700TB that can give you, how about http://www.scality.com/ - which is software defined and you can use your own iron.

    --
    -- Sig Sig Sputnik
  62. Re:Don't by Hognoxious · · Score: 1

    But he used vague requirements so not to give enough information for an actual informed decision.

    Which is the perfect situation to employ a consultant. Outcome 1: he'll ask the right questions, get accurate answers because management know the requirements, and it'll be a success. Outcome 2..N: it'll be a disaster but it won't be your fault.

    --
    Confucius say, "Find worm in apple - bad. Find half a worm - worse."
  63. Re: Don't by Arnold+Reinhold · · Score: 1

    Would I make a major enterprise purchase based on a Slashdot discussion? Absolutely not. Would I want to read a Slashdot discussion and maybe follow suggested links and look up all the buzzwords BEFORE talking to vendors or consultants? Absolutely.

  64. Use Gluster or Ceph by terry.bowling · · Score: 1

    Both are free, hardware agnostic and the future of software defined storage. And Red Hat can provide enterprise support if you need.

  65. Re:Don't by donaldm · · Score: 1

    But he used vague requirements so not to give enough information for an actual informed decision.

    Which is the perfect situation to employ a consultant. Outcome 1: he'll ask the right questions, get accurate answers because management know the requirements, and it'll be a success. Outcome 2..N: it'll be a disaster but it won't be your fault.

    Excellent answer and the bit about covering your rear is priceless.

    I have consulted on issues like this and there are multiple solutions some relatively simple and others complex, but a backup solution for half a petabyte of data is not going to come cheap so obviously any professional consultant will also want to cover themselves as well.

    If a project has not been raised with all input being documented, milestones set and sign-off for all steps no professional consultant would want to touch this. Sure you can jury rig a solution and it may work but if anything goes wrong then whoever is perceived to be guiding this is effectively going to be looking for a new job.

    Here are some very basic questions a consultant is going to ask, and don't think these can be answered in a simple sentence:

    1. Do you have a disaster recovery plan?
    2. What amount and type of data do you really want to backup?
    3. Do you want daily, weekly, monthly, yearly or other types of backups? What type of backups do you want them to be?
    4. What do you want your backup window to be?
    5. What do you want your recovery window to be?

    The above is just the start; there will be many more questions that require detailed answers before any recommendation is reached with regard to equipment, installation, and maintenance, as well as backup, storage, and recovery strategies. This takes time and everyone wants to cover their rear, so sign-off for important steps (i.e. milestones) is essential.
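    Questions 4 and 5 translate directly into throughput requirements. As a rough sketch, the sustained rate needed to move a given amount of data inside a window; the 48-hour weekend window is an example value, not something from the original question.

```python
def required_gb_per_s(data_tb, window_hours):
    """Sustained GB/s needed to copy data_tb terabytes within window_hours."""
    return data_tb * 1e12 / (window_hours * 3600) / 1e9


print(f"full 500 TB in 48 h:   {required_gb_per_s(500, 48):.2f} GB/s")
print(f"5 TB nightly delta, 8 h: {required_gb_per_s(5, 8):.2f} GB/s")
```

    A full pass over half a petabyte in a weekend needs close to 3 GB/s sustained, which is why the answers about data type and change rate matter so much to the final design.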

    --
    There ain't no such thing as proprietary standards only proprietary formats. Standards are by definition open.
  66. Re:Don't by KGIII · · Score: 1

    I did some work to help ease the traffic flow around Atlanta, GA. (There is a giant highway that runs around it in a circle, access was fairly easy but egress was not as good as it should have been. The idea was brilliant when they designed it. Importantly, population growth was around the outside of the circle and there was congestion at peak hours and the load was not where it was anticipated and designed for.)

    Anyhow, after bidding and getting the contract (a consulting contract - we would recommend design changes, for example, but not specify how the changes were made only what needed to be changed and where and traffic engineers would take care of the rest - traffic engineering was not a part of this contract and we did not bid on that project due to the mess that it was, it has only been marginally improved but it is great in off-peak hours except it is not really needed in off-peak hours) we learned something. They had effectively bid out to hire a consultant to see if they had needed to hire a consultant. Our internal name was, "The Georgia Recursive Loop." The City of Atlanta has its own traffic engineers, not as many as needed really, so we were unable to recommend consulting a consultant to keep the chain going.

    That was one of the projects (surprisingly few) that made me feel a little bad for the tax payers. They were not the only ones that hired a consultant to consult on hiring consultants. Sometimes they hire a lawyer, a specialist who is not on the city budget, to determine if they should hire a consultant to determine if they should employ the services of a consultant. (I am looking at YOU District of Columbia. I am looking at you...) Buffalo, NY hired a lawyer who recommended a specialist lawyer to vet our proposals. The original lawyer remained on the books and handled communication between the specialist lawyer (who had ended up being our main contact) and the city council. The council, of course, reported to the manager of the local transportation department. It was a lot like the "Chinese Telephone" game we played as kids where you say one thing in one person's ear and they repeat it and so on and so on until it is munged silliness at the end.

    It is quite lucrative, really. If you are not insane when you start then you will be by the time you get familiar with all of the silliness. Sorry for the novella but there simply is no easy way to share the experiences. Hopefully it is reasonably clear. My only justification, for being a part of the system, is that it paid well, provided great jobs, and the tech/educational aspects of it were originally mind blowing and fun.

    --
    "So long and thanks for all the fish."
  67. Re:CoW and Replication on Resilient Storage by Bengie · · Score: 1

    Windows is limited to 512 total shadow copies. Shadow copies could accidentally be lost for a number of reasons, they are not guaranteed. Microsoft has a list of things to be careful about that can influence your chance of losing a shadow copy, including block size and defragmentation, which could cause older shadow copies to get destroyed.

    LVM has performance issues. Many people complain of over 10x reduction in performance after only a few snapshots. It also only works at the block level and not the FS level, which highly limits its usefulness.

  68. one bit at a time by bingoUV · · Score: 1

    'nuff said

    --
    Bingo Dictionary - Pragmatist, n. A myopic idealist.
  69. We use a combination of tools by Gumbercules!! · · Score: 1

    We store and back up about this much data (a little more), although spread across a variety of machines. All in all, though, the data is primarily virtual hard drives (we run a private cloud environment).

    Storing it on disk is easy enough - and cheap enough, that it's little concern. Amazon, Azure, etc. are *insanely* expensive for this task, month by month, compared to self owned disks.

    As our hypervisors are all Microsoft (Hyper-V - and yes, I know this is Slashdot and I just said I use a Microsoft product but it's easily the most economical approach, when 99% of your clients need Windows licensing), we use Windows Server 2012 R2 native tiered storage pools on a mix of SATA HDD and SSD to achieve the storage, generally spread across a group of Supermicro servers with large numbers of disk bays - effectively software defined storage.

    For backup, we use the highly dense 1RU servers, with 12 bays (Supermicro again), with commodity 6 or 8TB SATA disks. Each RU can get near to 100TB of storage (raw) and they don't use much kW - and they cost hardly anything. Backups are performed using Microsoft DPM 2012 R2, as well, because, again, cheapest option and so far, 0 problems.
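    The density figure follows from the bay count. A short sketch of the arithmetic, with the further assumption of a single raw copy of 500 TB and no RAID or parity overhead:

```python
import math

BAYS_PER_1RU = 12


def raw_tb_per_server(disk_tb):
    """Raw capacity of one 12-bay 1RU server."""
    return BAYS_PER_1RU * disk_tb


def servers_for(data_tb, disk_tb):
    """1RU servers needed for one raw copy (no redundancy overhead assumed)."""
    return math.ceil(data_tb / raw_tb_per_server(disk_tb))


for disk in (6, 8):
    print(f"{disk} TB disks: {raw_tb_per_server(disk)} TB/RU, "
          f"{servers_for(500, disk)} servers for 500 TB")
```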

    The biggest issue I have is airwalled backups - those are hard to manage, for low dollars, for this kind of setup. So I've resorted to having a few more backup machines and manually swapping the network cable from one group, to the next, as the equivalent of swapping tapes.