Slashdot Mirror


How the LHC Is Reviving Magnetic Tape

sandbagger writes "The Large Hadron Collider is the world's biggest science experiment. When spinning, it reportedly generates up to six gigs of data per second. Today's six-terabyte tape cartridges fill rapidly when you're creating that amount of material. The Economist reports that despite the advances in SSDs and hard drives, tape still seems to be the way to go when you need to store massive amounts of digital assets."

37 of 267 comments

  1. Never underestimate the bandwidth by Gothmolly · · Score: 4, Funny

    Of a station wagon loaded with tapes.

    Also, -1, Duh, because this is an obvious, stupid article.

    --
    I want to delete my account but Slashdot doesn't allow it.
    1. Re:Never underestimate the bandwidth by Isca · · Score: 4, Interesting

      Actually I found the article informative. I knew tapes were the cheapest and most cost effective backup solution but I didn't realize that they were so fast once the tape has been loaded.

      It's also interesting to see the advances in tape reading technology that they are striving for - it sounds as if it will keep pace with HD and SSD technology to keep staying relevant.

    2. Re:Never underestimate the bandwidth by SuricouRaven · · Score: 5, Interesting

      Cheapest, sort of.

      The price of storage roughly follows the y=mx+c linear graph: m is the cost of the media, while c is the cost of the equipment needed to access it.

      For hard drives, it's easy: c=0. A drive is self-contained.

      For tape, c is large (Up to several thousand pounds for one tape drive), but m is smaller (Tape, purchased in bulk, is cheap).

      So if you're storing a small amount of data, a rack full of hard drives is cheaper. For larger amounts, tape is cheaper.

      This ignores issues of ease of access and management software.
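
      A minimal sketch of the break-even point in that y=mx+c model, in Python. The prices are hypothetical placeholders chosen only to make the arithmetic visible; the function names are illustrative, and access and management costs are ignored, as in the comment above.

```python
# Linear storage cost model: cost = m * x + c
#   m = media cost per terabyte, c = fixed cost of the access equipment.
# All prices below are hypothetical placeholders, not real quotes.

def total_cost(terabytes, media_cost_per_tb, fixed_cost):
    """Cost of storing `terabytes` of data: m * x + c."""
    return media_cost_per_tb * terabytes + fixed_cost

def break_even_tb(m_disk, c_disk, m_tape, c_tape):
    """Capacity at which the two cost lines cross.

    Above this capacity the option with the smaller m is cheaper.
    """
    if m_disk == m_tape:
        raise ValueError("parallel cost lines never cross")
    return (c_tape - c_disk) / (m_disk - m_tape)

# Hypothetical figures: disk has no fixed cost, tape needs a drive up front.
M_DISK, C_DISK = 50.0, 0.0
M_TAPE, C_TAPE = 10.0, 2000.0

crossover = break_even_tb(M_DISK, C_DISK, M_TAPE, C_TAPE)
print(f"break-even at {crossover:.0f} TB")
```

      Below the crossover the disk line is lower; above it the tape line is lower.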

    3. Re:Never underestimate the bandwidth by NatasRevol · · Score: 3, Funny

      Ping times.

      --
      There are two types of people in the world: Those who crave closure
    4. Re:Never underestimate the bandwidth by Dachannien · · Score: 5, Funny

      Why use a Station Wagon? Why not a 747?

      When's the last time you saw a 747 with that totally swank wood trim on the outside?

    5. Re:Never underestimate the bandwidth by dshk · · Score: 4, Informative

      Yes, they are surprisingly fast. The maximum speed of a current Tandberg LTO-6 drive is 160 megabytes/s if the data is incompressible. With the usual compressible data it can be about 320 megabytes/s (officially 400).

      These drives can even be too fast. The drives do speed matching, but they have a minimum speed, below that they start shoe-shining. That is one reason I chose an older-generation LTO-3 tape drive instead of the current generation: I cannot easily feed an LTO-6 with at least 60 MB/s, which is the minimum speed of the drive. Considering compression, that is about 120 MB/s, which saturates a 1Gb network.
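
      The arithmetic behind that last sentence, as a rough Python sketch. The 60 MB/s minimum and the 2:1 compression ratio are taken from the comment; decimal units and zero protocol overhead are simplifying assumptions, and the function names are illustrative.

```python
# Can a network link keep a tape drive above its minimum streaming speed?
# Assumptions: 1 Gb/s = 1000 Mb/s, 8 bits per byte, no protocol overhead.

def link_mb_per_s(gigabits_per_s):
    """Raw link capacity in megabytes per second."""
    return gigabits_per_s * 1000 / 8

def required_source_rate(min_tape_mb_s, compression_ratio):
    """Uncompressed rate the host must supply so that, after compression,
    the drive still receives its minimum native rate."""
    return min_tape_mb_s * compression_ratio

needed = required_source_rate(60, 2.0)   # 120 MB/s of source data
available = link_mb_per_s(1)             # 125 MB/s raw on gigabit Ethernet
headroom = available - needed

print(f"need {needed:.0f} MB/s, link gives {available:.0f} MB/s raw")
```

      With only a few MB/s of headroom before any real-world overhead, a single gigabit link is effectively saturated.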

    6. Re:Never underestimate the bandwidth by eek_the_kat · · Score: 2

      Interesting graph, but I think your explanation of c is a little muddy. I would just say c is the initial cost before any storage medium is acquired.

    7. Re:Never underestimate the bandwidth by SuricouRaven · · Score: 2

      Only if you need to access them all at once.

      My library for a long time was in the form of a row of drives sitting on the shelf, and a hot-swap bay.

    8. Re:Never underestimate the bandwidth by TooMuchToDo · · Score: 4, Interesting

      I used to work on data taking for the CMS detector at the LHC. We were using Storagetek tape silos [http://computing.fnal.gov/cdtracks/2009/january/images/robot.jpg] for long-term storage of data at Tier1.

      Tape allows for cheaper storage and large capacities, but you're then fighting contention issues (there are only so many robotic arms and tape drives for your tape library) as well as having data on tapes go bad without knowing it. When data is on disk, I can at least verify it immediately. Bit rot is definitely alive and well on tape.

    9. Re:Never underestimate the bandwidth by Doc+Hopper · · Score: 4, Informative

      The drives do speed matching, but they have a minimum speed, below that they start shoe-shining.

      Agreed. At my work we do parallel streams to multiple Sun T10000 T2 tapes (T10K "C" drives) at 250Mbyte/sec uncompressed (500 megabytes per second compressed, more or less, usually quite a bit more). If for some reason we push less than about 120mbytes/sec, the tape rewind times cause all kinds of issues.

      We make the same kind of decision when choosing Sun T10000 "B" drives instead of "C" or the new "D" drives if the source cannot push data fast enough.

      I've long laughed at articles saying tape is dead. For large-scale* backup, retention, transport, and legal hold problems, there simply is no other solution that scales reasonably well.

      *My definition of "large-scale" for this specific context: hundreds of terabytes or more, much of it transported thousands of miles regularly. If you don't work with hundreds of terabytes and at least dozens of petabytes on a daily basis, you may suffer from optimistic delusions regarding disk storage capabilities, ones which disk storage vendors are all too glad to reinforce, to the detriment of customers faced with half-baked solutions that cannot hope to meet their throughput requirements. Given "large-scale" data, there's no replacement for tape at present; everything else is a low-throughput also-ran, typically harboring enormous and unplanned complications. We're also heavy users of VTL, replication, cloning, S3-workalikes, and various disk technologies. Tape remains vital to large enterprise operations, and those predicting its imminent death have been the butt of jokes about marketing wonks for a decade and a half.

    10. Re:Never underestimate the bandwidth by dshk · · Score: 5, Informative

      Sequential access speed is only relevant if you backup huge non-fragmented files or entire raw partitions, and nothing else.

    11. Re:Never underestimate the bandwidth by Chris+Pimlott · · Score: 3, Interesting

      These drives can even be too fast. The drives do speed matching, but they have a minimum speed, below that they start shoe-shining.

      Er, "shoe-shining"? What do you mean?

    12. Re:Never underestimate the bandwidth by NeoMorphy · · Score: 2

      You can use "logical block protection" and multiple copies so that you can save archive copies in protected vaults, which will increase your data integrity by having multiple copies in different locations as well as increased protection from bit rot (from cosmic rays at least). Multiple copies can be created simultaneously, at the cost of tape drives.

      You still have to read the entire disk copy to verify, which could take a while if it's several TB in size. Though you still have several options to protect it from "bit rot", i.e. filesystems like ZFS and/or not using RAID-0. If the data is important, you still have to back it up. Also, while it's on tape you don't have to worry about someone accidentally overwriting it with a single command. Especially the archived copies.

    13. Re:Never underestimate the bandwidth by Anonymous Coward · · Score: 3, Informative

      >"shoe-shining"

      When the tape drive repeatedly and quickly does forward and reverse operations over the same piece of tape due to data fault on the tape or some buffer problem (or other reasons, too). The analogy is to the quick back-and-forth of a shoe-shine rag that runs the same piece of rag over a shoe many times.

      Ah, memories...

    14. Re:Never underestimate the bandwidth by evilviper · · Score: 2

      Er, "shoe-shining"? What do you mean?

      Would have been quicker and easier to look it up, yourself, than asking here, and waiting for a response:

      https://en.wikipedia.org/wiki/Tape_drive#shoe-shining

      --
      Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
    15. Re:Never underestimate the bandwidth by afidel · · Score: 2

      That's where disk to disk to tape comes in, feed the disk drives from your primary source and then spool off to tape as fast as the tape will go. Generally this involves something like an incremental forever strategy which just backs up the changes each day and then makes new synthetic full backups on a regular basis so the number of tapes required for a restore is reasonable. Basically you use tape for archive/retention and disk for your primary backup and restore. Tape also gives you offsite and offline backups.

      --
      There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
    16. Re:Never underestimate the bandwidth by mjwx · · Score: 2

      Why use a Station Wagon? Why not a 747?

      When's the last time you saw a 747 with that totally swank wood trim on the outside?

      1944,

      Except we called it a Mosquito.

      --
      Calling someone a "hater" only means you can not rationally rebut their argument.
    17. Re:Never underestimate the bandwidth by EETech1 · · Score: 2

      Asking the question here is kinda like saying "tell me a story about the olden days grandpa" sure you could go to the library and read a book, but hopefully a few folks here will get off their lawn long enough to tell us an amusing shoe-shining story, or reminisce about their experiences with some bad-ass (or slow-ass) hardware to add some depth and sense of community to the discussion you just don't get from a Google search.

      I'd rather hear it here, and have it inline with the discussion for everyone to enjoy (before Dice squeezes the last drop of life from /. and we all HAVE to go read Wikipedia)

      Cheers

  2. Re:but what about cheap disk? by jones_supa · · Score: 2

    I also sometimes get the "mental patient" stamp for saying that I still use optical discs.

    I just cringe at the idea of storing long term archived data using an electric charge or magnetization (flash, HDD, tape). Optical disc has also the benefit of being truly read-only so that you or a piece of malware cannot destroy the data afterwards by software.

  3. Re:but what about cheap disk? by MightyMartian · · Score: 2

    I've always insisted on a tape backup system. Hard drive backups certainly have their place, but tape cannot be beat for long-term archival storage. One of our weekly offsite backups goes into a safety deposit box, where sits a duplicate tape drive. I don't want to be searching around for a replacement while my organization is down and out due to some cataclysmic failure.

    --
    The world's burning. Moped Jesus spotted on I50. Details at 11.
  4. Re:maybe by MightyMartian · · Score: 2

    I don't think "cheapness" is the problem being solved. More important for an organization like the LHC is archival reliability. Tapes can last a long time while retaining their data integrity. I honestly doubt even high end hard drives can make that claim.

    --
    The world's burning. Moped Jesus spotted on I50. Details at 11.
  5. Re:but what about cheap disk? by MightyMartian · · Score: 3, Interesting

    It depends on the optical disc. If you fork out the money for an archival media like gold CDs or DVDs, then you can probably expect something like 20 to 40 years. All in all, from what I've read, tape still is king in long term storage.

    --
    The world's burning. Moped Jesus spotted on I50. Details at 11.
  6. No shit Sherlock by morcego · · Score: 3, Informative

    No one in the data retention business ever stopped using tapes. See the numbers on LTO units being sold, if you need proof.

    This is a shitty article.

    --
    morcego
    1. Re:No shit Sherlock by TheSync · · Score: 2

      Sony now has a new Optical Disk Archive (ODA) at up to 1.5 TB per disk and they claim they are good for 50 years.

      I still lean LTO myself due to it being a more widely-used format with far more vendors participating.

    2. Re:No shit Sherlock by Doc+Hopper · · Score: 2

      But doing backups of company data on tape is a bad idea. A bad reader can ruin tape. It's also susceptible to strong EM events. Optical is the way to go.

      Have you ever tried restoring data from a DVD silo that's been in continuous use for ten years? I have, from multiple silos using different media. The rate of corruption of optical media is TERRIBLE.

      Optical media is useful for certain types of storage, but historical reliability rates are awful. Meanwhile, tape tends to find the errors at write time, with far fewer incidents of "write once, read never". Everybody thinks optical is great until they work with the damned stuff. Failures are rampant, and unlike tape they tend to happen silently and undetectably until you try to read the stuff some time hence.

      For tape, EM worries are obviated by the use of decent modern tape containers; any EMP source sufficiently strong to get through the shielding on a modern, shielded tape shipping container is likely to destroy the container itself as well (read: low-level nuclear explosion). Also, if you store your data with a good off-site storage company (Iron Mountain is a fine choice), they will store your data in shielded cages that won't let a cell phone signal leak out, so there is little reason to worry about EM destruction of your data.

      If you value your data, use tape. Or replicate the data like crazy globally and distribute the electrical & maintenance cost to your customers. If you don't really care about the data, optical media is just fine.

  7. Re:but what about cheap disk? by MightyYar · · Score: 4, Informative

    Just be careful - optical disks degrade, too. Years ago before hard drives became so incredibly dirt cheap, I would do my little video editing thing and then back up the project files to DVD. And not just any DVD - I did my homework and found the best-rated archival DVDs (sorry, don't remember the brand - only that they came from Japan). Anyway, I just sucked them back onto my NAS, and some of them had developed a teeny bit of unreadable data. Fortunately, I had made PAR2 files for everything. Between par2repair and ddrescue, I was able to recover the data. But the moral of the story is don't rely on optical disks to be magical storage that does not degrade.
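
    To illustrate why keeping recovery data next to the files pays off, here is a toy Python sketch using plain XOR parity. This is not the PAR2 algorithm (PAR2 is based on Reed-Solomon codes and can repair several damaged blocks); XOR parity can rebuild only one missing block whose position is known. The block contents and function names are made up for the example.

```python
# Toy single-parity recovery. NOT PAR2: plain XOR parity rebuilds exactly
# one missing block, and only if we know which block is missing.

def xor_blocks(blocks):
    """XOR equal-length byte blocks together."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

def recover(blocks, parity):
    """Rebuild the single block marked None from the survivors plus parity."""
    missing = [i for i, b in enumerate(blocks) if b is None]
    if len(missing) != 1:
        raise ValueError("XOR parity repairs exactly one missing block")
    survivors = [b for b in blocks if b is not None]
    repaired = list(blocks)
    repaired[missing[0]] = xor_blocks(survivors + [parity])
    return repaired

data = [b"proj", b"ect-", b"file"]   # hypothetical 4-byte data blocks
parity = xor_blocks(data)            # stored alongside the data
damaged = [data[0], None, data[2]]   # block 1 became unreadable
restored = recover(damaged, parity)
print(restored)
```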

    --
    W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
  8. The death of tape by cyberchondriac · · Score: 2

    ..has been greatly exaggerated lately by trade journals. There are some backup scenarios for which hard disc backup just isn't viable.
    Viva la tape.

    --

    Look back up at my post, now look back down, you're on the Internet. Now look back up. I'm a signature.
  9. Gmail is backed up to tape by Albanach · · Score: 3, Interesting

    A couple of years ago, Google restored lost gmail from tape. I'd expect that even with deduplication they must use a phenomenal amount of tape.

  10. Re:but what about cheap disk? by rickb928 · · Score: 4, Informative

    The bottom line in managing long-term archiving (5+ years) is that you need to both refresh and verify your storage, at several different levels.

    1. Shoot the initial copy.
    2. Copy this asap. "Copy1"
    3. Stash both in disparate locations.
    4. Go back to the 'original' on a 6-9 month schedule and verify it.
    5. Go back to the 'copy1' on a schedule and verify it on a different schedule.
    6. Go back to the 'original' on a different 9-12 month schedule and refresh(copy) it, stored to the other site.
    7. Go back to the 'copy1' on a different schedule and refresh (copy) it, stored to the other site.
    8. Repeat 4&5 on a year schedule. Do you need to re-write the data in 'current' formats and retain both original and new? Are you moving to new media?
    9. Repeat 6&7 on a year schedule. Ditto the rest of step 8.
    10. We should be at year 2 or 2.5. Repeat steps 1-9 once for a 6+/- year retention, again for 10+ year retention.
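
    The "verify" steps in the list above could be sketched in Python as a checksum manifest that is recorded when a copy is made and re-checked on each scheduled pass. This is only an illustration under assumptions: SHA-256 and the file-based layout are choices made for the example, and `build_manifest` and `verify` are hypothetical helpers, not part of any media management product.

```python
import hashlib
import os
import tempfile

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(paths):
    """Record a checksum per file at the time the copy is made."""
    return {path: sha256_of(path) for path in paths}

def verify(manifest):
    """Return the paths whose current checksum differs from the manifest."""
    return [p for p, expected in manifest.items() if sha256_of(p) != expected]

# Demonstration with a throwaway file standing in for an archive copy.
with tempfile.TemporaryDirectory() as workdir:
    sample = os.path.join(workdir, "copy1.bin")
    with open(sample, "wb") as handle:
        handle.write(b"archived data")
    manifest = build_manifest([sample])
    clean_result = verify(manifest)       # nothing has changed yet
    with open(sample, "wb") as handle:
        handle.write(b"archived dat4")    # simulate silent corruption
    corrupt_result = verify(manifest)
    corrupted_name = os.path.basename(corrupt_result[0])

print(clean_result, corrupted_name)
```

    A copy that fails verification is the one to replace during the next refresh, using the other site's copy as the source.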

    Are you changing data formats, and is it possible to ensure integrity by copying and archiving in new formats?
    As you change media, do you need to retain old media systems, or will you move to the new media?
    At what point is the data no longer valid, determined by the owners?
    Are the 'owners' the only stakeholders? If not, expand the set.

    In all of this, you have a dedicated media management system including media drives, copy/verify capabilities, and stand-in for restoration.

    This is all very interesting to me. Medical records in particular seem to be assumed to have a lifetime retention, but other than the date and nature of the event, how important are the details of your appendectomy performed at age 5 when you are 60? Is that benign tumor removed at age 12 important at age 45? How much LHC data collected in 2013 will be useful in 2023? Different criteria. Different processes.

    --
    deleting the extra space after periods so i can stay relevant, yeah.
  11. Spinning? by rossdee · · Score: 2

    " When spinning it reportedly generates up to six gigs of data per second."

    The LHC itself doesn't spin, even though there are protons moving around the circular track at very near lightspeed. /pedant

  12. Reviving Magnetic Tape? by bravecanadian · · Score: 2

    For *reliable* backup and archive purposes tape never went out of style.

  13. Consumer vs enterprise tape technology by BenJeremy · · Score: 3, Interesting

    I've worked as a tape monkey in a large facility (Camp Foster RASC, Okinawa, circa 1989-90), so I know tapes do work well in the enterprise, but my experience with tapes in the consumer space in the 90s was anything but good. 90% of the tape backups made (using several different formats) using consumer-grade systems were corrupt and worthless.

    We took great care with the tapes, but when we checked them (thankfully never needed them, except one occasion), they were mostly all bad.

    Optical isn't much more reassuring as a backup media, given that optical discs tend to degrade over time.

    If somebody has a tape system that can store terabytes on a cartridge, reliably, for say... $10/TB or less, and the system costs less than $200, I'd look at it, though. Otherwise, it is still more worthwhile just to use hard drives to back up data (even at their inflated prices)

  14. Re:but what about cheap disk? by Doc+Hopper · · Score: 3, Informative

    long-life optical discs fail... [store] tape in a cool, dark place...

    This, this, one-thousand times this. I've worked in data centers for a decade and a half, and seen innumerable optical media go bad within just a few years (typically about 3 years) even in DVD jukeboxes in climate-controlled environments. Meanwhile, we restore from fairly ancient tapes on a regular basis.

    In reality, most companies don't store tapes longer than 7 years anyway; that's the upper limit of typical audit liability. The data on the tapes may be older than that, kept indefinitely on-disk, but most large companies have a fairly aggressive destruction/over-write schedule for data on tape older than 7 years.

    It's very unlikely we'll need data off a tape 20 years from now, but kept in the right conditions -- like the bat-cave of a tape silo room housing tens of thousands of 10TB tapes a few feet away from me right now -- there's a really good chance the data will be readable. While we do have plenty of tape failures (hundreds per year), they are almost always caught at write-time by the verification head.

    On a modern tape drive, you usually have several dozen "heads" on any given tape drive, and there will be two sets of them each with its own mechanism to align it with a precision of just a few microns. Pretty amazing, really; if you drop by the Denver, CO area some time, the Oracle/Sun building engineers there can often arrange a tour of our tape testing facilities if you sign an NDA and represent a potential sale. Anyway, the second mechanism will be engaged on the tape in order to read what the first just wrote and verify it before it passes the "successful write" confirmation back up the fibre channel chain. This way you can guarantee you don't get "write once, read never" media.

  15. Re:but what about cheap disk? by Doc+Hopper · · Score: 4, Informative

    ...you need to both refresh and verify your storage...

    You came pretty close with the process, but for most businesses you're not quite there. Here are a few clarifications on the process.

    1. Typically large companies (including those, like us, with stringent HIPAA requirements) take two simultaneous copies from the original source. We don't copy a copy if it can be avoided, and we have enough tape drives to do this.
    2. We contract out with a local storage company to grab the tapes within a few days and store for the given retention period off-site. One copy usually remains on-site as well for long-term retention and rapid restoration. With plenty of capacity in the silo (tens of thousands of tapes in an Oracle/Sun SL8500), we are not terribly concerned about retention policies. If we get tight on space, we'll just expand the silo again.
    3. The same data usually still exists as on-disk media marked read-only, available for the legal folks who insisted we archive it in the first place. Often it also exists at a second geographical location thousands of miles (at minimum) from the first, with its own backup tapes. Plus it exists on two tapes at each site, one near-line and one off-site. Given tape reliability, three layers of data protection is typically sufficient. If "legal hold" is involved, we also insist that the disk array be kept on a valid support contract to reduce the risk of failed disks in the storage appliance.
    4. Retention policies dictate we keep around at least a few tape drives of every generation we've ever used which has tapes archived with our off-site storage facility. Even if they are not in the silo, they're in a storage closet waiting for us to bring them to life if needed up to twenty years later.

    I do this kind of thing all the time. Feel free to ping me at my easily-figured-out email address (firstname@lastname.org) if I can answer additional questions for you.

  16. Re:Tape is bullshit. by Doc+Hopper · · Score: 3, Informative

    Tape is slow, expensive, proprietary and unreliable.

    The only people who still use it are those who have to, or idiots with money to burn.

    Fact check on the troll.

    "Tape is slow". Absolutely false for throughput; true only for IOPS. A modern tape is much faster than a modern hard drive. That's the point of the article, and my personal experience as well. Random I/O to/from tape drives is incredibly slow, but no hard drive can touch a modern tape drive's throughput. It's the reason LHC uses it.

    "Tape is expensive": True only in a non-ROI sense, therefore mostly false. You'll find a modern, large tape silo of equivalent capacity to a modern, large storage appliance usually works out much cheaper both in initial cost and cost over time if you intend to use the hardware for at least three to five years. That said, the cost of admission to the world of enterprise tape is pretty high; it's the ongoing costs that are much lower than hard drives.

    "Tape is Proprietary": Both true and false. LTO is an open (licensable) standard, but the fastest/largest tape drives on the planet are typically proprietary right now, because being the fastest/largest causes more sales, and therefore funds innovation in faster/larger tape technology.

    "The only people who still use it are those who have to...": False. There are many, many use cases for tape where it is not a requirement, but is just more convenient, reliable, faster, and less expensive than a hard-disk solution. I could list them, but, well, you're a troll and I don't want to type much more.

    "The only people who still use it are... [those] with money to burn.": False. ROI is what drives most of our tape purchases, and we save an enormous amount of money by using tape in appropriate scenarios. Hard disks are appropriate for some use cases, tapes are mandatory or just a smart purchase in others.

  17. Re: maybe by Doc+Hopper · · Score: 3, Informative

    Late 2013 pricing.

    4TB hard drive: around $400
    5TB tape: around $160
    8.5TB tape (same media as 5TB, newer drive): still about $160

    Cost per terabyte of disk: about $100.
    Cost per terabyte of tape: about $19

    I'm ignoring the cost of the tape drive, just like I'm ignoring the cost of the head(s) involved in NAS/SAN storage.

    To fix your quote to be in line with reality:

    Glacier is cold storage; the drives are only spinning when they are filled, when retrieving, and when scrubbing / consolidating. Just like tape but at least five times more expensive.
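
    The per-terabyte figures above can be rechecked with a few lines of Python. The prices are the late-2013 numbers quoted in this comment; drive and controller costs are left out, as stated. Note that the $19 figure corresponds to the 8.5TB cartridge; the 5TB cartridge works out to $32 per terabyte.

```python
# (price in dollars, capacity in terabytes), as quoted in the comment
prices = {
    "4TB hard drive": (400, 4.0),
    "5TB tape": (160, 5.0),
    "8.5TB tape": (160, 8.5),
}

per_tb = {name: dollars / tb for name, (dollars, tb) in prices.items()}
ratio = per_tb["4TB hard drive"] / per_tb["8.5TB tape"]

for name, cost in per_tb.items():
    print(f"{name}: ${cost:.2f} per TB")
print(f"disk costs {ratio:.1f}x as much per TB as 8.5TB tape")
```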

  18. Re:but what about cheap disk? by lgw · · Score: 2

    Tape does not rot. Bad tapes were bad when created (at least this is true of half-inch linear format - some of the old personal tape formats were crap, but they're all dead now). LTO tape archive lifetime depends on the tape, but it's up to 30 years. You only need to verify tape once after you write it - if the tape was written successfully, it's good until the backing starts to fail.

    Your upstream to them is free as well, as you're already paying for an Internet connection. "It's free because you're paying"? That's the opposite of free. You might want to check the math on how much internet connectivity you need to buy to upload 1TB in any useful timeframe - those connections aren't cheap.
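
    For anyone who wants to check that math, a small Python sketch. It assumes decimal units (1 TB = 10^12 bytes), a fully utilized uplink and no protocol overhead, so real transfers would take longer; the uplink speeds are arbitrary examples.

```python
def upload_days(terabytes, megabits_per_s):
    """Days needed to push `terabytes` through a sustained uplink."""
    bits = terabytes * 1e12 * 8
    seconds = bits / (megabits_per_s * 1e6)
    return seconds / 86400

for rate in (10, 100, 1000):  # example uplink speeds in Mb/s
    print(f"1 TB at {rate} Mb/s: {upload_days(1, rate):.2f} days")
```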

    Still, it's not all about price. If Amazon gives you a warm, fuzzy feeling of safety then more power to you!

    --
    Socialism: a lie told by totalitarians and believed by fools.