Ask Slashdot: Best Way To Store Data In Hard Copy?
First time accepted submitter bmearns writes "I have some simple plain-text files (e.g., account information) that I want to print on paper and store in my firebox as a backup to my backup. What's the best way to encode the data for print so that it can later be restored to digital form? I've considered just printing it as text and using OCR to recover it. The upsides are that it's easy and I can even access the information without a computer if necessary. Downsides are data density, no encryption, no error correction, and how well does OCR work, anyway? Another option is printing 2D barcodes. Upsides are density, error correction, I could encrypt the data before printing. Downsides are that I'll need to split it up into multiple barcodes due to maximum capacity of popular barcode formats, and I can't access the data without a computer. Did I miss any options? What do slashdotters suggest?"
It would be far easier to scan a lot of text back to digital form than read numerous bar codes. Converting the text to useful data may be the more difficult part. But why would you want to go through this hassle?
Print a human-readable copy and add a computer-readable format, like barcodes or a pen drive, a hard drive, SD card... (CDs might not survive very long if you're unlucky)
there must be some way to do QR codes
http://qrcode.kaywa.com/ can do it 160 characters at a time, but that seems really inconvenient
I am sure you googled for it and you checked the first result (http://ollydbg.de/Paperbak/) it gave?
I'm not sure why you want to use paper, so I don't know if it's compelling or not. If not, consider this: paper is inexpensive, but ink and toner are not. So ... why not back up your data to an inexpensive hard drive and put that in your firebox? Depending on the drive and the amount of data, the drive may not even take up as much space as the paper. This is just a suggestion.
The Egyptians used hand written papyrus and we still have copies to look at. The laser printed paper copies of the Book of the Dead simply didn't survive.
Flash media is probably demonstrably more durable than paper. Get some hardened flash keys and store multiple copies of your Library of Alexandria in redundant lockboxes.
Print the hashes. Then come up with a secure storage method for the passphrase.
Just to cover more alternatives. But, really, why make things unnecessarily complicated for yourself? If the papers are in your firebox anyway, why encrypt? If you insist, try encrypted RAR with parity, converted to base64 and printed as the resulting plaintext in a decently large print to make sure no smudging will cause trouble during OCR.
They contain error correction, they are scalable, and have quite a nice information density. And you can generate them with tons of free tools and several APIs are available as well.
Personally, I just keep backups and don't bother with hard copies.
You're printing text... So print text. Error correction? You, which is yer best bet when the papers are waterlogged after a basement flood or what ever else isn't going to happen to YOU.
Google for OCR-A and OCR-B as TTF. There are freely available versions. I use them for mailing labels, along with PostNet bar codes to make it as easy as possible for the Post Office.
Learning HOW to think is more important than learning WHAT to think.
QR codes. You can encrypt these. If you print them e.g. on plastic foil, they'll last close to forever. Of course, you will need to keep a piece of hardware that can read QR codes.
I would, however, take another route, although outside of the scope of your question. It is something I already do for files that are very valuable to me: I put them on magneto-optical disks. The things last forever and withstand the roughest of treatments. Writing and reading are slow, but that is a downside I just accept. I still have a database ( invaluable to me ) I acquired in the middle '90s on magneto-optical disk. It survived: a fire; spilling of liquids, including dog pee; some mild X-ray radiation; an inadvertent stay in our home's trash can; being jumped upon by a kid; and a 20-foot fall.
Religious speak to God. Insane are spoken to by God. When all shut up, one can finally hear Shostakovich in peace
First of all to answer your question, just print it. Document scanners today are awesome.
BUT:
I have some simple plain-text files (e.g., account information) that I want to print on paper and store in my firebox as a backup to my backup.
Why?
What is so important that you need so many hard copies?
The only thing I can think of is maybe your mortgage if you have one. I've seen banks screw up so bad that on rare occasions folks need to go back decades worth of records to keep their house. So that's what? 360 statements?
And there are legal requirements on some things. Statutes of limitations only go back so many years, so keeping a lot of stuff for certain things - like taxes - really isn't worth it.
So, maybe reevaluate why you need your backups.
Create compressed, password-protected files and then use PaperBack (http://www.ollydbg.de/) to print the results. The downside is that encoding and decoding need a PC-based program; the upside is that you get the source code.
Get one of those thin flash cards, save the data on it, and tape it to the printed paper.
I mean, c'mon. What's the point of having it ONLY on paper? Yes, this is the backup of the backup. So what? Add another layer and save yourself the trouble later. Or two layers. It is obviously not too much data, since you are considering backing it up on paper. So just spend a few fivers, get some low-capacity flash cards, and make lots of copies.
morcego
In terms of their ubiquity in modern marketing, QR Codes are a slightly annoying solution in search of a problem; but as an engineering approach to the sort of problem the OP described, they're fantastic. There are many free and open source QR Code generation utilities and libraries, and the QR Code spec itself was patented, but freely licensed for public use by the Toyota subsidiary that invented it.
QR codes include error correction, and can encode binary data on the order of a hundred times the density of a regular bar code.
We need more information to be able to answer your question! What kind of barcode scanner? How much information? Are you talking a few pages of account numbers or are you talking reams of source code? How do you plan to get the data once you need it? More than one data recovery project has failed over the years when the data was available but there was no means to recover the data from the media!!
Are you going to keep at least two barcode scanners in your lockbox (a decent one is about $600-$800)? What about a license for a product to read the data and its media? Do you have a preferred operating system that you have to use? Is this for legal purposes where you have to maintain the chain of custody?
Do you need the ability to recover data in a hurry, or can you take a couple days to recover data for account numbers for another country, or is this a legal recovery so that you can prove that /you/ wrote the source code to something? Why not use tried and true methods of data archival like tape backup, hard disk, or archival qualities of optical media? It almost seems like you're deliberately trying to be obtuse for the sake of being obtuse.
If you simply want privacy, go with your pick of an open source crypto program and store with a 2048-bit key or some such thing. For lack of a better way to put it, you sound like you're asking for the best wrench to hammer a nail into a board with - just get a hammer.
For printing, pick a font that has no ambiguous characters. This makes OCR easier if you have to retrieve the data back into a computer. I suggest Trebuchet, in which I (upper-case eye), l (lower-case ell), and 1 (one) are distinct. Alternatively, use either the OCR-A or OCR-B font, which are not easily read by humans. Place the hard copy in a sealed envelope and store it in a bank safe-deposit box.
Also in the same safe-deposit box, store electronic copies using at least two different media (two so that, if one becomes obsolete and unreadable, the other might still be used). You might want to change the media -- or at least review them -- annually to ensure they are still useable.
Take a look at Twibright Optar (http://ronja.twibright.com/optar/)
(A review is at: http://lwn.net/Articles/242735/)
Might be hard to find, but a nice plastic form of punch tape might do the job of both having a hard copy (technically human readable) and being machine-readable. You'd have the added advantage of being able to incorporate encryption if you so desired.
If you're really serious about having hard printouts that you can get back into digital form should a disaster occur, one idea would be to base64-encode the text and then print it in a fixed-width font to make OCR easier down the line. The downside is that if the scan isn't great or the paper has degraded, you may get decoding errors when, say, a lowercase "l" is read as an uppercase "I". I'd also take hashes of the text files and print them in the header/footer as a rudimentary way of verifying that the files are the same after scanning them back. Maybe do a few tests before committing to such a method; this is totally off the top of my head, BTW!
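A minimal sketch of that base64-plus-hash idea, assuming Python 3 and only the standard library. The function names, the 64-character line width, and the "SHA-256:" header layout are illustrative choices, not anything the poster specified.

```python
import base64
import hashlib


def to_printable(data: bytes, width: int = 64) -> str:
    """Return a page-ready block: a SHA-256 header line, then base64 lines."""
    digest = hashlib.sha256(data).hexdigest()
    encoded = base64.b64encode(data).decode("ascii")
    # Break the base64 text into short fixed-width lines for printing.
    lines = [encoded[i:i + width] for i in range(0, len(encoded), width)]
    return "SHA-256: " + digest + "\n" + "\n".join(lines) + "\n"


def from_printable(page: str) -> bytes:
    """Rebuild the bytes from OCR output and check them against the header."""
    header, _, body = page.partition("\n")
    expected = header.split(":", 1)[1].strip()
    # Drop line breaks and stray spaces before decoding.
    data = base64.b64decode("".join(body.split()))
    if hashlib.sha256(data).hexdigest() != expected:
        raise ValueError("hash mismatch: OCR probably misread a character")
    return data
```

The hash only says that something went wrong somewhere in the file, not where, so a misread "l"/"I" still has to be hunted down by eye.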
SD cards are surprisingly durable. While diving, I've recovered cameras that have been lost underwater for years and the flash cards work fine. I don't know about heat resistance, though, or how hot it might get in that firebox.
There used to be one called Bridge, but I couldn't find it. Anyway, it's popular enough so that you can learn braille if you ever lose the digital reader. Also, if you can code at all, it'd be easier to parse the count of dots than the thickness of lines from scanned-in images; perhaps make up your own "braille" system and store the algorithm in plain text along with a bunch of other algorithms. I think you'll be safe enough from most thieves, just not the government (but they can already get your account information). Really, instead I'd rather recommend a remote server (or cloud) and just use Duplicity (rsync+gpg software).
The G
> pen drive ... SD card
No, absolutely not. Flash memory is not archival storage. Flash memory is subject to charge leakage over time and current MLC / TLC flash is even more vulnerable because 2 or 3 bits are stored per cell at the cost of reduced resolution margins.
Engraved to stone. Guaranteed for centuries.
Back in the late 90s when it was difficult to export strong crypto out of the USA, the PGP project came up with a program to get around this by using some loopholes in the law that allowed the source code to be exported if it was printed in book form.
So the PGP source code was printed out, made into books, shipped overseas, and scanned and OCR'd. My memory is somewhat fuzzy, but they had a suite of utilities to do this reliably. See http://www.pgpi.org/pgpi/project/scanning for a description and links to the tools.
I would compress it with a password (7-zip, RAR etc.) and then use Google Drive, Dropbox etc. to store it.
Thus it will be future proof for many years and accessible on any computer.
If you are going to encode it in a non-human-readable format, there is little point to storing it on hardcopy over an electronic storage medium (hard disk, USB flash, floppy, etc). You will still need a computer to access it.
There are some fonts out there specifically designed for OCR, but in practice any little speck of dust or dirt can change how the computer reads it (an "O" can become a "Q", for example, and "1" is easily misread as "i" - in some fonts they are even 100% identical). So OCR is OK for text that you can spellcheck, but not for other kinds of data.
Depending on the kind of data, you could include something like a printed checksum to verify you read it right.
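One way to do a printed checksum is per line, so a misread can be narrowed down to a single line instead of the whole page. This is only a sketch under assumptions: Python 3, CRC-32 from the standard `zlib` module, and a made-up `  |xxxxxxxx` suffix format.

```python
import zlib

SEPARATOR = "  |"  # hypothetical marker between the text and its checksum


def _crc(line: str) -> str:
    """CRC-32 of one line, as 8 hex digits."""
    return format(zlib.crc32(line.encode("utf-8")), "08x")


def add_line_checksums(text: str) -> str:
    """Append a CRC-32 to every line before printing."""
    return "\n".join(line + SEPARATOR + _crc(line) for line in text.splitlines())


def find_bad_lines(scanned: str) -> list:
    """Return 1-based numbers of lines whose text no longer matches its CRC."""
    bad = []
    for number, row in enumerate(scanned.splitlines(), start=1):
        line, _, crc = row.rpartition(SEPARATOR)
        if _crc(line) != crc.strip().lower():
            bad.append(number)
    return bad
```

After OCR, `find_bad_lines` points at the lines that need a second look; a human can then compare just those against the paper.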
To conserve space, just make the fonts as small as you feel comfortable reading, use both sides of the paper, possibly reformat the data to utilize more space on the page, and use thin lightweight paper. And include an additional electronic backup so you don't have to bother OCRing until the world ends next Thursday.
Ridulian crystal paper.
I would never have thought of putting my backups on paper. I instead multiply the backup locations to ensure the redundancy I am comfortable with.
Everything I write is lies, read between the lines.
This doesn't make any sense whatsoever.
You could encode the data in barcodes, but it'd be as silly as making copies of your book collection using paper and pen.
Your best bet is to just encrypt the hell out of your data using TrueCrypt, and toss it on a dirt cheap server somewhere that isn't publicly accessible.
There is no reason why you cannot print encrypted text, but OCR of fonts is more difficult and error prone than bar codes. How about totally geeking out with paper tape or punch cards?
Excuse me, but please get off my Pennisetum Clandestinum, eh!
What information is so important that you have to deal with it in this fashion? Just curious...
Microfilm is still the best way to store large quantities of data in hard copies. Easy to store on film, easy to copy, easy to convert to digital files again if needed. Cheap. It doesn't require complicated machinery or defunct software to access
Chisel the data in stone. Then use the stone to build your house. It is known to last for thousands of years.
It's how we used to get data into the system, or store data from the system, many (ahem) decades ago!
Seriously, it's a shame these technologies are no longer used, as they would be ideal for this purpose.
You never know what is enough unless you know what is more than enough. - Blake
I don't see it mentioned, but since CDs were mentioned, why not look into M-DISC? They claim to last 1000+ years, and they offer both DVD and now Blu-ray options.
QR Codes are 2-D barcodes. Each QR square can support 4k of (capitals-only) alphanumeric text, or nearly 3k of binary data. It has built-in support for error correction and spanning data across multiple QR Code blocks. And of course binary data can be encrypted.
You can't take something off the Internet! That's like trying to take pee out of a swimming pool.
http://www.pgpi.org/pgpi/project/scanning/
Shoot title cards of the text onto BW film which is flammable nitrate stock. Mix in with scenes of people acting in early 20th century costume. The actual film may not last, and might burn your house down; but if anybody ever finds it they'll do their best to transfer it to something else.
For all intensive purposes, "whom" is no longer a word. That begs the question, "who cares"?
What is harder:
procuring and digitizing this information in some way that's "machine-recoverable"?
Or... just typing the account info back in again by hand in the odd case you absolutely need to recover from your backups?
If you seriously have enough "account information" that it would be ridiculously prohibitive for you to type them back in using your paper copy in an hour or two some Saturday afternoon, then this is a problem for your accountants and lawyers to solve.
Jesus, you people will over-engineer ANYTHING.
The Millenniata M-DISC is a writeable DVD that is supposed to last a thousand years. Accelerated ageing tests conducted by an independent tester seem to support this claim. For less than a hundred dollars you can get some disks and a compatible drive. Add a fire safe rated for 2 hours and you are all set.
Even if the media itself survives, you might be wondering if the chances of finding a drive capable of reading it in the future are better than your chances of finding a 5.25" drive today. I think they are much better.
The 12cm disk form factor has survived since 1981 and has seen the transition from pressed CD to writable media and higher densities. The only real alternatives are formats that have electronics as part of the media. I believe there will be a market for passive recording media for a long time to come and there is no real reason to leave the form factor. And as long as the drive accepts media of the same shape it will probably support legacy formats.
This might be a bit of a stretch, but if you want some "encryption" on your printed copies, have you considered using a font like Wingdings or Webdings as the print font? I was thinking about some of the previous posts regarding Egyptian glyphs, and though it's not solid "encryption" (more obfuscation), it would be a security deterrent, if anything. And if you need to "decrypt" your text, you could use an in-hand charmap to decode it, and OCR should allow you to scan it in with "read as font". Just an idea.
Question Reality, Find Your Own Truth...
http://ollydbg.de/Paperbak/
If your text is entirely in a single, simple font, OCR can work really well on that. You shouldn't have any trouble. QR codes might have been forgotten in 20 years, and are hard for humans to read.
Personally I'd just stick a USB stick in the safe, printing it out is too much work.
"First they came for the slanderers and i said nothing."
If I was doing it I'd use a combination of 3 techniques:
1 Plain text for human readability
2 QR codes for scanning and error correction
3 Redundant Gold stabilized Azo dye CDRs with ECC codes for fast machine readability
DVD videos, for example, have error correction, yada yada yada.
But, at the same time you can put a tiny tiny nick in one and the entire thing can become unreadable.
VHSes, meanwhile, will degrade gracefully.
I do not know anything about barcode encoding, but you should always consider how damaging a small amount of damage/warping is and how the data degrades when damaged.
Text degrades very gracefully, the entire page needs to be completely destroyed to lose the entire data set.
No idea about barcodes.
Troll is not a replacement for I disagree.
Are available at camera stores. I suspect we'll be able to read CD formats for quite a while longer.
This is a backup to your backup, so digital means must have failed before you'd consider using it. Text is low density, but it has an advantage that any encrypted barcode or other high tech means do not have -- it can be read by human eyes. When you're huddled in a rough lean-to roasting a feral cat over the campfire amid the wreckage of civilization, you will still be able to read your backup. That might come in handy.
Oliver's law of assumed responsibility: If you're seen fixing it, you will be blamed for breaking it.
I'm thinking a tattoo might be a good medium. Depends on your storage needs and the size of your back.
Sorry, I omitted: Over a campfire of old burning tires. It gives the cat a nice smoky isoprene taste. Try it, you'll.... well, it'll keep you alive.
You should print it on acid-free paper if you plan on scanning it back in because regular paper will be useless to you in about 150-200 years.
(-1: Post disagrees with my already-settled worldview) is not a valid mod option.
After you've done that the problem will have resolved itself to the point where most people just have a folder of "stuff".
If you still feel the urge to put printed copies in a fire-safe, take into account the type of ink you print your stuff with (you wouldn't want to come back in 5 years to discover all you have is faded sheets with no printing on them) and also just how long your safe remains fireproof for. It may not be as good as you'd hoped.
politicians are like babies' nappies: they should both be changed regularly and for the same reasons
Datamatrix (2D) barcodes provide a good method for storing data. At a maximum geometry of 144x144 'pixels' you can store a maximum of 3116 numerical digits or 2335 alpha characters. Encoding could be further optimised by pre-processing the stream, but just changing to all uppercase can improve encoding density. Furthermore, up to 16 barcodes can be 'joined' to extend the size of the record.
Although it is not easy, a datamatrix can be decoded by a human, but there are many applications and devices which will do it for you. If you are making a time capsule, include a hand-held scanner which can output RS232.
Laser or acid etching of metal sheet would produce an almost indestructible record, which could be included along with human-readable pages for maximum redundancy.
punch holes in cardboard.
Here at hobarthackerspace we have a PDP-8 that a chap is restoring, so soon we may be able to have reliable paper backups.
High density Morse Code.
* Carthago Delenda Est *
I wouldn't keep a paper copy around. Just make sure you have a redundant backup.
I have two encrypted external USB drives. One is always connected to my docking station, and one that I keep in the trunk of my car (i.e. "offsite"). I have a nightly cron job that rsync's the latest version of my files to the drive that's always connected to my docking station. Once a month I'll sync to the other drive, and put it back in the trunk of my car when it's done.
This setup doesn't cost me anything, my data isn't stored in the cloud, and I'll still have a copy of my data if anyone breaks into my house, my house burns down, my car is stolen, etc.
Why do you want to store bits on paper and not on some other medium? Chances are that paper is not the medium with the best data security.
Ultimately the only difference between storing a bit on a magnetic domain on some disk and in the color of some paper fibers is the far inferior data density of the paper, and the relative durability.
Unless you're just talking about a fairly small amount of information the economy of scale will almost always favor digital media. With barcodes you could probably store about 265k/page (with some of that taken up by ECC no doubt, and you'll probably have to roll your own software to manage it all). That is about 260MB per 20lb ream of paper if you double-side it. That is 80lbs/GB, at a cost of probably $12/GB if you get half-decent paper in bulk (what would be the point of using super-cheap paper?). Oh, and that isn't factoring in the costs to print it all, or later scan it all back. It would take hours to print 1GB of barcode and you'd spend probably $20 on toner doing it with an economical printer.
I'm sure you could buy the nicest archival DVDs for WAY less than that, and you can burn several GB in a few minutes and read it back in even less time, and the discs won't weigh 80lbs each.
Go ahead and add extra ECC, or use tape/whatever. I imagine that no matter how you slice it the paper will be the least effective storage solution against any failure mode.
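The per-gigabyte weight in the comment above can be checked with a few lines of arithmetic. This is a sketch that takes the poster's 265k/page at face value and assumes a 500-sheet ream printed on both sides; the function name and parameters are made up for illustration. One caveat: "20 lb" is a basis weight (500 sheets of the uncut 17x22 inch size), so a letter-size ream actually weighs about 5 lb, and the ream weight is left as a parameter so both readings can be tried.

```python
KB_PER_PAGE = 265        # barcode capacity per printed side, from the comment
SHEETS_PER_REAM = 500    # assumption: a standard ream
SIDES_PER_SHEET = 2      # double-sided printing


def paper_for(gigabytes: float, ream_weight_lb: float = 20.0):
    """Return (reams, pounds) of paper needed to hold the given amount of data."""
    mb_per_ream = KB_PER_PAGE * SHEETS_PER_REAM * SIDES_PER_SHEET / 1000
    reams = gigabytes * 1000 / mb_per_ream
    return reams, reams * ream_weight_lb


if __name__ == "__main__":
    for weight in (20.0, 5.0):
        reams, pounds = paper_for(1.0, ream_weight_lb=weight)
        print(f"{weight:>4} lb/ream: {reams:.2f} reams, {pounds:.1f} lb per GB")
```

With the poster's 20 lb figure this gives roughly 3.8 reams and 75 lb per GB, in line with the rounded "80lbs/GB"; with a 5 lb letter-size ream it is closer to 19 lb per GB. Either way the point about density stands.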
Punch Cards made of Refractory metals , Problem Solved !
Harddrives made of sapphire
http://news.sciencemag.org/sciencenow/2012/07/a-million-year-hard-disk.html?ref=hp
Microfilm is an option if you want high density data storage on a non-digital medium.
It's the only way to prove to future archeologists that neanderthal genes survived well into the 21st century.
...post on Dan's Data already?
He covered most options available for what you want back in 2009, and apparently he did an update in 2011.
http://www.dansdata.com/gz094.htm
Mit der Dummheit kämpfen Götter selbst vergebens
Why not encrypt it and store it within the bitcoin block chain? As long as there are at least a few lonely souls still mining it, your data is as safe as it could ever be.
http://ronja.twibright.com/optar/
It's great to have a physical copy... for lots of reasons. Like you can't trust third parties...
But really...take your text files, put them in a tarball, encrypt them with a symmetric key, and put it up in google docs.
If I was tasked with printing out text that needed to be archived in a vault I think I would encode the text as a QR code with the error correction set to level H first.
I would also consider laminating the pages or doing something to protect them from moisture and dryness.
Personally I toss a USB drive in a safe deposit box every few months.
Mod me down with all of your hatred, and your journey towards the dark side will be complete!
I use laser etching to store gigs of data in aluminum foil as binary data with ECC. The binary data is burned through the foil. Clear text is then etched upon the surface as microscopic print.
The end result is a printed medium which should last several thousand years.
LTO-5 tape in IBM LTFS format is the current tape standard in the banking industry; we also use it in the digital motion picture industry.
Put your data on DVD+Rs, multiple copies in multiple directories, two disks of different Japanese brands.
Read Frank Herbert's "Dune" for the reference.
Do not mock my vision of impractical footwear
I have an entire box of unpunched punch cards that I'm willing to sell, for the right price.
Print the hexadecimal representation on good archival paper in OCR-A. It will be practically bulletproof.
If you must encrypt, encrypt each account separately; that way, if a portion of the page is lost or destroyed, the rest is still useful.
Snowden and Manning are heroes.
This was figured out (for e-mail) decades ago. A number of mail servers only supported 7-bit data (ie. ASCII text), so you couldn't just dump binary data in an e-mail and expect it to get through, uncorrupted. Not to mention DOS/Unix text conversion destroying binaries...
First it was UUEncode. But that was soon eclipsed by Base64 encoding. Any data you have can be run through the base64 command included with most Linux systems, and come out as plain text. The output can be run through the base64 -d command to be returned to the original.
This might not sound impressive when you're printing out plain text, but something like gzip can give you 90%+ compression on a plain text file, and base64 gives you that data in a format that can be printed out on paper, and read and typed back in.
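The same gzip-then-base64 round trip can be sketched in Python with the standard library. The function names are illustrative, and the actual saving depends entirely on how repetitive the text is. Note that base64 itself adds about a third to the size (4 printed characters per 3 bytes), so it only pays off when the compression gain is larger than that.

```python
import base64
import gzip


def pack(text: str) -> str:
    """Compress text with gzip, then encode it as printable base64."""
    compressed = gzip.compress(text.encode("utf-8"))
    return base64.b64encode(compressed).decode("ascii")


def unpack(printed: str) -> str:
    """Reverse pack(): decode the base64, then decompress."""
    return gzip.decompress(base64.b64decode(printed)).decode("utf-8")


if __name__ == "__main__":
    sample = "account=1234; bank=Example; " * 200
    packed = pack(sample)
    print(len(sample), "characters in,", len(packed), "characters to print")
    assert unpack(packed) == sample
```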
Encryption can be in the loop as well... In fact PGP ALREADY DOES ALL OF THIS FOR YOU. You can tell it to compress the data with gzip or bzip2 before encrypting it with your key, and you can tell it to "ASCII armor" the output (typically for e-mail use).
As an added bonus, with PGP you can protect yourself against data loss by taking your password-protected public/private keypair and outputting it with the password removed, and ASCII armored, so it can be printed out to paper and stored (somewhere even safer than your safe). That way you can always decrypt your critical data by proving you have physical access to that key printout, even if, years later, you forgot the password you used to protect it.
And if you want extra protection against physical damage of the paper copies in your safe, PAR2 will allow you to "RAID" your data, using parity and checksums to reconstruct the bits missing in your data. You can select the exact percentage of parity data you want... I find 34% a very good arrangement, so you have 3 pages (or CDs/DVDs) of data plus one of just parity, allowing you to recover any one entirely missing page in the set of four, OR if you don't lose a page, recover the data if up to 34% of the media is damaged, no matter what the distribution across the four.
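To illustrate the "3 pages plus one parity page" arrangement, here is a sketch using plain XOR parity, the same idea as RAID 4/5. It is not how PAR2 works internally (PAR2 uses Reed-Solomon codes and can also repair scattered damage), and it only handles the case of one entirely missing page. The function names and the zero-padding scheme are assumptions made for the example.

```python
def split_pages(data: bytes, count: int = 3) -> list:
    """Split data into `count` equal-length pages, zero-padding the last one."""
    size = -(-len(data) // count)  # ceiling division
    return [data[i * size:(i + 1) * size].ljust(size, b"\0") for i in range(count)]


def xor_parity(pages: list) -> bytes:
    """XOR equal-length pages together, byte by byte, into one parity page."""
    parity = bytearray(len(pages[0]))
    for page in pages:
        for i, byte in enumerate(page):
            parity[i] ^= byte
    return bytes(parity)


def recover(pages: list, parity: bytes) -> list:
    """Rebuild the single page marked as None from the survivors and parity."""
    missing = pages.index(None)
    survivors = [p for p in pages if p is not None]
    rebuilt = list(pages)
    # XOR of all surviving pages and the parity page equals the lost page.
    rebuilt[missing] = xor_parity(survivors + [parity])
    return rebuilt
```

Usage would be: print the three data pages and the parity page, and if any one of the four is lost, XOR the remaining three to get it back. The original length has to be recorded somewhere so the zero padding can be stripped.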
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
How secure does this need to be? Burn a copy to m-discs (rated at 1000 years), put a copy on tape (rated a few decades), put a copy in the cloud (like BackBlaze). Printing this out on paper doesn't make any sense. If you do that, be sure to store it in an OCR-compatible format, and be sure to use something that has heavy amounts of error correction built in so that you can get a 100% data reproduction even with OCR errors.
Another option is to put several QR codes on a page. QR codes max out (by spec) at between 1,273 and 2,953 bytes of binary data at version 40 (the largest size), depending on how much error correction is involved. At the maximum level (H), 1,273 bytes are stored and about 30% of the QR code can be destroyed before data loss. At the minimum level (L), 2,953 bytes can be stored and about 7% can be lost. Not all apps can read such large barcodes, however.
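A rough sketch of how a file could be cut into QR-sized pieces. The capacities are the byte-mode payloads for a version 40 symbol (for example 1,273 bytes at level H, which is the 1,276 data codewords minus the mode and length header). The 4-byte index/total header on each chunk is a made-up convention so the pieces can be reassembled in any scan order; it is not part of the QR spec, and generating the actual images is left to whatever QR tool you prefer.

```python
import math

# Byte-mode capacity of a version 40 QR code at each error correction level.
QR_V40_BYTES = {"L": 2953, "M": 2331, "Q": 1663, "H": 1273}
HEADER_BYTES = 4  # hypothetical: 2-byte chunk index + 2-byte chunk count


def split_for_qr(data: bytes, level: str = "H") -> list:
    """Split data into chunks that each fit one version 40 QR code."""
    usable = QR_V40_BYTES[level] - HEADER_BYTES
    total = max(1, math.ceil(len(data) / usable))
    chunks = []
    for index in range(total):
        part = data[index * usable:(index + 1) * usable]
        header = index.to_bytes(2, "big") + total.to_bytes(2, "big")
        chunks.append(header + part)
    return chunks


def join_from_qr(chunks: list) -> bytes:
    """Reassemble scanned chunks, whatever order they were read in."""
    ordered = sorted(chunks, key=lambda c: int.from_bytes(c[:2], "big"))
    total = int.from_bytes(ordered[0][2:4], "big")
    if len(ordered) != total:
        raise ValueError("some QR codes are missing")
    return b"".join(c[HEADER_BYTES:] for c in ordered)
```

A 5 KB file would need five codes at level H or two at level L, which is the density versus robustness trade-off the comment describes.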
Use the Cauzin Softstrip...
DVDs do not have a very long shelf life, even if kept very well. Maybe they make longer lasting (more expensive?) DVDs with better dyes and materials.
http://interserver.net/
Statistically speaking, Egyptian hieroglyphics carved in stone seem to be readable over long time periods.
What information has survived the longest? Perhaps the information encoded in DNA. But, to survive in harsh environments with the constant threat of destruction, new copies are made continually within cell nuclei.
Which leads to my suggestion. Maybe we don't just want our data preserved for millennia, we want it made better! Thus we should initiate a process of artificial intelligence, seeded by our own ideas but with freedom to innovate in years to come. Then, it can adapt in order to communicate with any future would-be readers.
You can use a very small font and still be able to distinguish the three "characters" of dot, dash or space. Likewise, even when faded, the difference in size between a dot and dash should be discernible. As to encryption, it may suffice to just use a code book and keep that in a different location. But I think picking up page after page filled with Morse code would deter most wannabe thieves.
Back in the days of yore [ late 60's ] I was a physics / computer science student who worked part time as a newspaper photographer. The newspaper got a newfangled computerized typesetter that used punched paper tape for its input. The bootstrap program was about 10' of 1" wide paper tape. The machine crashed and had to be rebooted at least fifteen to twenty times a day. We didn't have a duplicator for punched tape and we were having to re-punch the entire program every five or six days. Until we found a punch that handled a thin stainless steel tape that the reader would accept. I still have several short programs that were read hundreds of times and have been sitting on a shelf for over 40 years. I can still read the data on the tape manually bit by bit and, if a machine existed, I'm sure the tape would still work fine. I'm pretty sure it'll still be both machine and human readable after another 500 years assuming it doesn't get hot enough to fuse the reel together or get exposed to enough radiation to make the stainless steel brittle. Program code was straight octal machine code and text was encoded as 7 bit ASCII with an 8th entropy bit.
Storage:
Retrieval:
I'd bet you could pick up some used microfilm or microfiche equipment from an old library, newspaper, or business. That equipment was standard during the 1970s, and I'd guess there's still a lot of it around.
You can store hundreds of pages on a single small roll of microfilm.
Canon still makes equipment to scan microfilm into digital formats.
You can laser engrave on metal, plastic, or even wood (like wood burning). They may have a laser engraver at a local "do it yourself" club (usually a membership-based workshop with saws, drills, lathes and other large machines designed for home woodwork/metalwork/et cetera).
Laser-etched metal will last far longer than you will.
It depends on what kind of information you have.
Information that is human-readable like account numbers and other details is something that especially in an emergency you might need to have ready, even if due to the emergency you lost access to your computer, scanner, etc. Heck, you might need access to it in order to get a new one.
There are special OCR-fonts that you can use to print out that information and be certain that the OCR will be painless. Use them, because keeping the data human-readable also leaves you with a backup option in case the restore doesn't work as expected (say, the paper got crumpled or smeared) - you can type it in.
For human-meaningless information, pure digital data, QR codes are probably fine because they allow for error-correction and are meant to be able to be read back even if there is noise in the input data, something that barcodes are not as good at.
But, frankly, why the fuck don't you just put your stuff on a CD, USB thumb drive or something like that and put that into the fire box ?
How do you back up your backup program? If you use a special encryption and coding (e.g. barcode) program, you will need it in 10, 20, 30 years when your disaster actually happens. How do you make sure that the program you need to recover your data is still available?
There is little doubt that OCR will still be available. However, barcodes and LUKS might be obsolete...
why would you 'read' an UUencoded version of a JPG?
unless he is going to extract all the raw text of the PDFs and office docs and OCR the scans, 'reading' the binary data (UUencoded or Base64) of those files gives no insight to their contents.
i wonder why nobody created write-once SD cards. like giant CPLDs which are burned by blowing fuse links. you could still read them with the Arduino Ultra Mega Fantastico rev.2853
There are encryption systems that you can do mentally. To stand up to a full attack by the NSA they may get a bit laborious (tbh, simply memorising the data is probably easier), but if you simply want to make it unprofitable to crack you can probably use http://www.schneier.com/solitaire.html.
Sorry to go non-digital but I can't help feeling that old tech probably meets the requirements here. This looks to be suited to storage on micro-fiche/ultra-fiche: it lasts a long time, it should be more durable than paper/ink, is relatively cheap to produce and only requires a light and a magnifier to read, so come the apocalypse even the Eloi could knock up a reader.
So why store locally? You can use bcrypt on a tarball, or create a small TrueCrypt volume, and store it at a cloud provider. Either a file locker (or multiple ones if you're paranoid), Dropbox, or a Linux instance at a cloud provider.
Use crypto and only store the key. A key is small enough to be typed in without OCR. If your data is (correctly) encrypted, there is no problem in leaving copies in the wild.
The only solution to protect data is duplication. There is no need for a safe.
For example, I would be really annoyed to lose all my digital pictures. There is a copy on my father's computer (in another town). It is stored in an encrypted (ecryptfs) directory because some pictures are personal. He does not have my password. I also have a backup of his data.
QR codes allow for pretty high density, but since I assume you mean some sort of e-ink display, I don't get why you'd have to scan it. Besides, wouldn't you still have to keep it powered?
oh they are available but why would you want write once?
world was created 5 seconds before this post as it is.
Font is one thing (covered above), layout on the page is another important factor. As someone who has had to OCR and reconstruct financial data from statements I would suggest taking real care about ...
Page breaks that don't align with the paper breaks.
Page titles with page numbers that mix in with the recovered text. E.g., use !=xyz=! for the page number/title so it can be recognised and edited back out automatically.
Look for missing end of lines for long lines.
Use CSV/delimited tables rather than positional separation as tables don't OCR well when they are sparse causing "read down the columns" and misalignments and strange grouping of items.
Practice with a good sample of the data and see how it works end-to-end.
If you have to encrypt, do it in blocks so one error won't break everything, and consider using a CRC checksum for each block.
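A rough sketch of the per-block checksum idea in Python, assuming the data is already a byte string (encrypted or not). The block size, the "index hex crc" line layout, and the function names are all made up for illustration, and zlib.crc32 is just one convenient CRC; nothing here is a standard format.

```python
import zlib

BLOCK_SIZE = 16  # bytes per printed line; an arbitrary choice for this sketch


def to_printable_lines(data: bytes) -> list:
    """Split data into blocks and format each as 'index hex crc32'."""
    lines = []
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        crc = zlib.crc32(block)
        lines.append(f"{i // BLOCK_SIZE:04d} {block.hex()} {crc:08x}")
    return lines


def check_lines(lines: list) -> tuple:
    """Return (good blocks keyed by index, indexes of blocks that failed)."""
    good = {}
    bad = []
    for line in lines:
        # Assumes the index column itself survived OCR intact.
        index_text, hex_block, crc_text = line.split()
        index = int(index_text)
        try:
            block = bytes.fromhex(hex_block)
            ok = zlib.crc32(block) == int(crc_text, 16)
        except ValueError:
            # A misread character that is not a hex digit lands here.
            ok = False
        if ok:
            good[index] = block
        else:
            bad.append(index)
    return good, bad
```

After OCR you would run check_lines on the recovered text and only retype (or rescan) the lines listed as bad, instead of hunting for a single wrong character across the whole page.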
As with all backups, one copy is not enough.
RAID - Redundant Array of Inexpensive Documents.
I don't know, I would use a shift cipher on the input, base64 encode, another round of shift cipher with a different value and then print. You could use OCR or just type in by hand and apply the reverse process to recover. It's stupid, but maybe the closest to being usable.
Encrypt the files and then print out the raw data in hex. To reverse it, you use OCR, convert the hex values to data files and then unencrypt. Down side is that it's not human readable, but it might be more data-dense than barcodes. You'd just have pages and pages of: 41 35 55 c1 8e 8a 1e 0d 88 69 0e 9d 48 5c 30 ba 0d 86 9c ca 6e 32 12 b3 e2 87 fc 51 1d d0 62 76 5e dc f6 8c c5 a0 40 fa 49 78 a9 3a c2 bc 19 d5 ef 79 c9 07 6b 94 85 2b 8b 1b a6 3f 4c 2d 05 f7 06 48 e4 6f d0 b6 05 00 e0 40 3e a0 4a 1c 12 05 9b ed 7c b4 e0 0f ee 29 d1 64 75 5c d6 21 f7 34
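The hex round trip could look something like this in Python. This is only a sketch: it assumes the encryption has already been done elsewhere and just handles bytes-to-page and page-to-bytes. The function names are hypothetical, and the table of OCR misreads (letter O for zero, lowercase L or capital I for one) is a guess at likely confusions, not something measured.

```python
def to_hex_page(data: bytes, per_line: int = 32) -> str:
    """Format bytes as space-separated hex pairs, per_line bytes per row."""
    lines = []
    for i in range(0, len(data), per_line):
        chunk = data[i:i + per_line]
        lines.append(" ".join(f"{b:02x}" for b in chunk))
    return "\n".join(lines)


# Characters OCR might plausibly confuse with hex digits (an assumption).
# None of them are valid hex digits themselves, so the mapping is unambiguous.
OCR_FIXES = str.maketrans({"O": "0", "o": "0", "l": "1", "I": "1"})


def from_hex_page(text: str) -> bytes:
    """Undo to_hex_page, ignoring whitespace and patching the misreads above."""
    cleaned = text.translate(OCR_FIXES)
    return bytes.fromhex("".join(cleaned.split()))
```

Anything the misread table doesn't cover will make bytes.fromhex raise ValueError, which is at least a loud failure; a misread that turns one valid hex digit into another (say 8 into b) goes through silently, so some kind of checksum is still worth printing alongside.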
Check out software called Twibright Optar; it fits 200kB on an A4 page and has error correction.
Use an enigma app.
Ops, I shuld have usd the prevuwe but in.
Stored in a salt mine.
You've been going on and on like a chicken with its head off for less than a meg of data? How precious is this pithy amount of data that you're in mortal terror of losing in the decades? Why not just MEMORIZE it!
Kilobytes. geesh
Do consider old yellow punched tape and print to Hollerith cards.
You are on to something important. Digital media is much more sensitive to temperature than paper, and fire safes are commonly only rated for paper. Digital media does not survive on the dash of a car in many cases, yet a paper map is fine.
What about some form of Steganography... that is, embedding the data in a picture. You'd likely have one reference picture and then embed the data in a slew of modified pictures. You'd definitely want to include some error correction, but with a little creativity and some playing around with it, it might be pretty data dense. I imagine too, that picking a good background image would help you increase data density in each picture, but I don't know enough about this to provide advice on that... Good luck!
Laser engraving your info on a piece of metal (preferably non-corroding and stable) should last for some time.
You could 'just' engrave, or do a full burn-through.
I am quite surprised I haven't seen anyone mention microfiche/microfilm yet.
It is still a currently accepted archival "technology" used to store countless hard copy records, by institutions that have to A) store tons of records, and B) store them for a very long time.
I wouldn't call it inexpensive, but it isn't all that expensive either considering alternatives. However you usually do large batch jobs, so you might only want to do a run every year or every 3, or 5 years.
The added bonus, on top of being somewhat reasonable to catalog and search, being reasonably inexpensive, and having a long lifespan, is the fact that, as my air quotes indicated earlier, the technology has been around for a very long time. The formats are mature, and the readers are mature and easy to come by. Also, if you are looking at OCR and keeping digital copies, it is much, much easier to either hire a firm or buy a scanner that takes fiche in a hopper (or, even easier, film on a spool) and can quickly and automatically go through your collection and record it in digital format. Depending on volume and document type, however, that is where you may run into some surprising costs: the storage and organization of the digital files (particularly if you want to host them someplace).
Anyway if you want a permanent hard copy, doing a microfiche/microfilm run every say 5 years and getting rid of your paper is a good idea.
you are also constipated?