256GB Geometrically Encoded Paper Storage Device
jrieth50 noted a method of using geometric shapes combined with color to store up to 256GB of data on a sheet of paper or plastic. The article says "Files such as text, images, sounds and video clips are encoded in 'rainbow format' as colored circles, triangles, squares and so on, and printed as dense graphics on paper at a density of 2.7GB per square inch. The paper can then be read through a specially developed scanner and the contents decoded into their original digital format and viewed or played."
according to this Indian blogger.
http://itsoup.blogspot.com/2006/11/scam-of-indian-student-developing.html
http://www.digg.com/tech_news/Scam_of_Indian_student_developing_technology_to_store_450_GB_on_paper
QR Codes are not as sophisticated, but can reconstruct the original data when 30% of the image is missing or distorted. Since these guys are obviously pretty clever, I can't imagine this feature would be overlooked.
I expect to see a story like this on Digg, but I thought Slashdot was better than this.
It's a scam.
I dunno who it is
but it prolly is fhqwhgads.
2.7GB per square inch would require a linear data density of 152292 dpi. Neither my scanner nor my printer comes within a hundredth of this. The main problem with the printer at such resolutions is bleeding of the inks into the paper. To form the different shapes several dots would be necessary, which would further decrease the effective resolution by an order of magnitude. For example, suppose a 3x3 grid was used to form each character: the article states that there are four different shapes used, yet that 3x3 grid could encode 512 different patterns. Realistically, at 600dpi (giving 360000 dots per square inch), with 3 ink colours (yielding 8 different colours) you would get 360000 bytes per square inch, or 33MB per A4 page - somewhat short of the 256GB promised. You'd also need to dedicate around 25% of the capacity for error correction. This is complete and utter bollocks.
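The figures in this comment can be checked with a few lines of Python. This is only a sketch: it assumes "GB" means 2^30 bytes (which reproduces the 152292 figure) and one bit per dot for the density claim. It also shows both readings of the 600 dpi estimate: one byte per dot, as the comment counts it, and 3 bits per dot, which is what 8 distinguishable colours would strictly carry.

```python
import math

GIB = 2**30  # assumption: "GB" read as 2^30 bytes

# Linear density needed for 2.7 GB per square inch at one bit per dot.
bits_per_sq_inch = 2.7 * GIB * 8
linear_dpi = math.sqrt(bits_per_sq_inch)
print(round(linear_dpi))  # 152292

# Realistic case: 600 dpi on an A4 sheet (210 mm x 297 mm).
dots_per_sq_inch = 600 ** 2
a4_sq_inches = (210 / 25.4) * (297 / 25.4)

# Reading 1: one byte per dot, as in the comment's "360000 bytes per square inch".
bytes_one_byte_per_dot = dots_per_sq_inch * a4_sq_inches

# Reading 2: 8 colours carry log2(8) = 3 bits per dot.
bits_per_dot = math.log2(8)
bytes_three_bits_per_dot = dots_per_sq_inch * bits_per_dot / 8 * a4_sq_inches

print(bytes_one_byte_per_dot / 2**20, "MiB at one byte per dot")
print(bytes_three_bits_per_dot / 2**20, "MiB at three bits per dot")
```

Either reading lands in the tens of megabytes per page, several thousand times short of 256GB.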
Like tinyurl, but one letter less! http://qurl.co.uk/
Bank's computers don't use OCR, they use MICR
Sorry, but you have combinatorial math in there where it doesn't apply. Yes, 24 bits gives you 256*256*256 colours, but it's still 24 bits.
The appropriate calculation is 4096*4096*24*8*11, so you over-estimated the capacity by a factor of 700,000 or so.
let's suppose we have a very fine color printer and a very fine color scanner that can print at say 4096 DPI in RGB with 24 bits of color. And we'll consider an 8x11 sheet of paper:
4096*4096*256*256*256*8*11
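The difference between the two formulas is easy to see numerically. The sketch below takes the hypothetical 4096 dpi, 24-bit, 8x11 inch setup from the quoted post at face value and compares counting 24 bits per pixel against multiplying by the 256*256*256 possible colour values.

```python
DPI = 4096
BITS_PER_PIXEL = 24
SHEET_SQ_INCHES = 8 * 11

pixels = DPI * DPI * SHEET_SQ_INCHES

# Correct count: each pixel stores 24 bits.
capacity_bits = pixels * BITS_PER_PIXEL   # 4096*4096*24*8*11

# Quoted count: multiplies by the number of colours instead of the bits.
overestimate = pixels * 256 ** 3          # 4096*4096*256*256*256*8*11

ratio = overestimate / capacity_bits      # 2**24 / 24
print(capacity_bits / 8 / 2**30, "GiB")   # 4.125 GiB
print(round(ratio))                       # 699051
```

So even under these very generous assumptions the sheet holds a little over 4 GiB, and the quoted figure is too large by a factor of roughly 700,000.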
Please, at least try to understand the technology involved. 4096 dpi means 4096 individual dots. Each of those dots is a single ink colour (typically one of cyan, magenta, yellow, or black), and it is the combination of those dots in dithering patterns which produces multicoloured output.
Your "4096 * 4096 * 256 * 256 * 256" is way off the mark - you are overestimating the capabilities of printers by several orders of magnitude.
FYI, the average "ZOMG 1440 dpi!!!" consumer printer is lucky to reach the equivalent of 100 pixels per inch. Even the commercial machines used in printing glossy magazines are more like 300 pixels per inch -- nowhere near 4096.
You are an idiot because: You ignored the one and only thing he /did/ say, which was that he was doing something differently.
Bzzt.
Encoding data using dots is the most efficient method possible. He has to print the image somehow, and scan it back in again. No combination of triangles and circles can circumvent the resolution limit, which is what is being calculated here.
By showing that the claim exceeds all practical limits of optical resolution (and probably the absolute physical limits), we show that what we have is just another magical compression scam.
He says that he's "doing something differently"; we've proved that what he claims to be doing is impossible. End of story.
Okay, let's look at some math. First, calculate the number of bits that must be stored to reliably archive 256GB:
256*1024*1024*1024*8*(10/8) = 2.749 * 10^12 [allowing for 25% extra - error detection/correction]
Now, the area of a sheet of paper in mm^2:
210 mm * 297 mm = 6.237 * 10^4
Let's make an assumption: it would be tough for a scanner to correctly identify more than 256 colors (blues especially are problematic). So, going by a pixel based method, we can store 8 bits per pixel, so the number of pixels needed is:
2.749 * 10^12 / 8 = 3.436 * 10^11
Pixels per mm^2 will therefore be:
(3.436 * 10^11) / (6.237 * 10^4) = 5.509 * 10^6
Taking the square root of this figure and inverting will give us the size of one side of a pixel in mm, so:
1 / (5.509 * 10^6)^.5 = 4.260 * 10^-4 mm = .426 micrometers = 426 nm
This is smaller than the wavelengths of some frequencies of visible light, therefore a large portion of the spectrum is gone in terms of colors that can be used. Eliminate these colors and you increase density yet again, requiring you to eliminate more colours. By the time you get to monochromatic (black and white), which you will, the size is smaller than the wavelength of ANY visible light.
So, for this storage density, either you are scanning in ultraviolet light (and printing using an appropriate ink) to get a small enough wavelength, or you have thrown out light altogether and you are using an electron microscope as your scanner. (Note - ever see electron microscope images in color? Can't exist unless colorized).
Fairly clever hoax though - if they had stuck with, say, 16GB then it would not have edged into the impossible.
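The pixel-size arithmetic in this comment can be reproduced as follows. The sketch uses the comment's own assumptions (25% overhead, 256 colours per pixel, an A4 sheet); the variable names are only illustrative.

```python
import math

# Bits to store: 256 GB of payload plus 25% for error detection/correction.
payload_bits = 256 * 1024**3 * 8
stored_bits = payload_bits * 10 / 8

# A4 sheet area in square millimetres.
a4_mm2 = 210 * 297

# 256 distinguishable colours means 8 bits per pixel.
pixels_needed = stored_bits / 8
pixels_per_mm2 = pixels_needed / a4_mm2

# Side length of one square pixel, converted from mm to nm.
pixel_side_nm = 1 / math.sqrt(pixels_per_mm2) * 1e6
print(round(pixel_side_nm))  # 426
```

For reference, visible light spans roughly 380 to 750 nm, so a 426 nm pixel is already shorter than the wavelength of green and red light.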
You are an idiot because: You ignored the one and only thing he /did/ say, which was that he was doing something differently.
Bzzt.
Encoding data using dots is the most efficient method possible. He has to print the image somehow, and scan it back in again. No combination of triangles and circles can circumvent the resolution limit, which is what is being calculated here.
By showing that the claim exceeds all practical limits of optical resolution (and probably the absolute physical limits), we show that what we have is just another magical compression scam.
He says that he's "doing something differently"; we've proved that what he claims to be doing is impossible. End of story.
Indeed. For those not inclined to simple mathematics, here it is in a nutshell:
Assumptions (none of them unreasonable, all of them quite generous even):
1440dpi
8 bit color
8" x 10.5" printing area
Even assuming perfect readability, this resolution yields only 1.4GB per page. Talk of "shapes" is smoke and mirrors to obfuscate one of the cold hard facts of information theory: you cannot accurately represent all permutations of 8 bits of information if you've budgeted less than 8 bits. Compression schemes allow you to compress repetitive patterns if you know they're going to be there beforehand (e.g. an almost arbitrarily large number of only 1's or only 0's can be represented with run length encoding), but X bits of random data requires X bits of allocated space.
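A quick check of these assumptions, as a sketch. Note that with 8 bits per dot the total comes out at about 1.4 billion bits, i.e. roughly 174 MB; the "1.4" figure matches gigabits rather than gigabytes. Either way it is far below 256GB.

```python
DPI = 1440
BITS_PER_DOT = 8             # "8 bit color"
PRINT_AREA_SQ_IN = 8 * 10.5  # 8" x 10.5" printing area

dots = DPI ** 2 * PRINT_AREA_SQ_IN
total_bits = dots * BITS_PER_DOT
total_bytes = total_bits / 8

print(total_bits / 1e9, "gigabits")
print(total_bytes / 1e6, "megabytes")

# How far short of the claimed 256 GB this falls.
shortfall = 256 * 2**30 / total_bytes
print(round(shortfall), "times too small")
```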
If a job's not worth doing, it's not worth doing right.
No.
I divided it by 24, because the entire calculation is in terms of bits. We have 24 bits per pixel. 2^24 possible colours, encoded as 24 bits. 24 colours would encode less than 5 bits.
What your calculation assumes is that we are storing two megabytes per pixel. I think you can see why this is impractical.
All the "proofs" in the comments that show this is a scam so far calculate how many dots can be printed/read from a piece of paper, and then map each dot to a bit of data. Well, guess what. The whole point of this thing is he's NOT USING DOTS. This may very well be bullshit, but the "proofs" against it are meaningless.
No, you simply don't understand very basic information theory. Printers print with dots. Any shapes you make on the paper are made up of dots. A 3x3 grid of dots (9 bits) can be marked in 512 different combinations, only 10 of which make squares (2), triangles (4), or lines (4) that can be easily differentiated. Using shapes does not increase the resolution, it limits it. You cannot represent 8 arbitrarily chosen bits of information if you've budgeted 7 bits of storage. At 1440dpi, 8 bit color, even assuming perfect readability, you cannot record more than 1.4GB of information, no matter what "shapes" you arrange for the dots to make.
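The point about shapes limiting rather than increasing capacity can be put in numbers. A minimal sketch, using the counts given in the reply (10 easily distinguished shapes out of 512 possible 3x3 patterns) and assuming single-colour dots:

```python
import math

GRID_DOTS = 3 * 3

# Every dot on or off: all patterns the grid can physically hold.
all_patterns = 2 ** GRID_DOTS            # 512
bits_using_dots = math.log2(all_patterns)

# Only the recognisable shapes: 2 squares, 4 triangles, 4 lines.
shape_patterns = 2 + 4 + 4               # 10
bits_using_shapes = math.log2(shape_patterns)

print(bits_using_dots, "bits per grid when every dot pattern is allowed")
print(bits_using_shapes, "bits per grid when only the shapes are allowed")
```

Restricting the grid to shapes drops it from 9 bits to a little over 3 bits of information.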
If a job's not worth doing, it's not worth doing right.
I believe it was "Dr. Dobb's Journal" that used to publish code that could be scanned, sort of a variant on barcodes.
Printed at the higher resolutions available to printers and scanners 15-20 years later, how much data could you store using that encoding format on paper? We've gone from about 100dpi to 600-1200, which actually means at least 36 times the data storage per square inch.
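The "at least 36 times" figure follows from areal density growing with the square of linear resolution. A small sketch (the helper name `areal_gain` is just for illustration):

```python
def areal_gain(old_dpi: float, new_dpi: float) -> float:
    """Factor by which dots per square inch grow when linear dpi changes."""
    return (new_dpi / old_dpi) ** 2

low = areal_gain(100, 600)    # 36.0
high = areal_gain(100, 1200)  # 144.0
print(low, high)
```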
I fail to see how a binary pixel can fail to take less space than a printed geometric shape. You can squirt an ink dot a lot smaller than you can a recognizable microscopic shape.
Colour filtering to provide "layers" of data is a form of bonding or multi-frequency processing. Even CMY alone triples the storage potential; the colour discrimination of the scanner is the only limit to the potential bandwidth multiplier (plus data loss due to fading.)
In other words, it sounds a lot more original than it actually is. Odds are the creator is too young to have even heard of Dr. Dobb's, much less have played with the code scanners to save typing time.
I do not fail; I succeed at finding out what does not work.
Here's an upper bound as a check on your numbers (not restricting ourselves to a small dictionary of shapes). I'll give away the punchline: My numbers agree with yours, but 256 GB is not possible using printers and paper.
Assumptions: I use your printer linear resolution of 1200 dpi, and assume that adjacent pixels can be resolved at this resolution. I also assume that 256 different colors can be distinguished, as you do, and that the paper we are using has an area of 96.6763 inches^2, also as you do.
Calculation: With a linear resolution of 1200 dpi, one can fit 1440000 dots per square inch (Check!), and so 139213872 dots on a sheet of A4. With 256 colors we can store a number as large as (number_of_colors)^(number_of_dots). So:
256^139213872 = 2^N (where N is the equivalent number of bits)
(2^8)^(139213872) = 2^N (recognizing that 256 = 2^8)
2^(8*139213872) = 2^N
N = 8*139213872 (bits)
(and if we just divide by 8 again to get bytes...) => 139213872 bytes = 139 MB
Discussion: This theoretical upper bound is three orders of magnitude smaller than what is being claimed by the article: It is not possible to store 256 GB on a sheet of A4. My result does however agree with your result in that the inequality (my_result)>(your_result) holds, as it should. And it's really not too shabby: Accounting for 8-to-14 conversion for some error correction, we can store slightly under 80 MB in this way.
Different assumptions: If I instead use your 2000 dpi laser printer figure, then I can fit 4000000 dots per square inch, and so 386705200 dots on a sheet of A4 and so almost 386 MB. (Including error correction, one might store 220 MB.) Pretty impressive!
The Absurd: Right now, many modern semiconductor fabs have working 90 nm photolithographic processes. That means that the smallest feature size is 3.54330709×10^(-6) in, and the linear resolution is about 282222 dpi. If all we print is the first metal layer, then a dot can either be "with metal" or "without metal" -- that is, one bit. And on a silicon wafer with an area the same as that of a sheet of A4 paper, we can then fit 7700207603555 dots, or 962 GB. Hard drives are about halfway there!
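The three cases in this comment (1200 dpi, 2000 dpi, and the 90 nm fab process) all use the same formula, so they can be sketched with one helper. The function names are illustrative, and the A4 area is the 96.6763 square inch figure used in the comment.

```python
A4_SQ_INCHES = 96.6763  # area figure used in the comment

def upper_bound_bytes(dpi: float, bits_per_dot: float) -> float:
    """Bytes on one A4 sheet if every dot is perfectly resolvable."""
    dots = dpi ** 2 * A4_SQ_INCHES
    return dots * bits_per_dot / 8

def after_8_to_14(raw_bytes: float) -> float:
    """Usable bytes if every 8 data bits are stored as 14 bits."""
    return raw_bytes * 8 / 14

inkjet = upper_bound_bytes(1200, 8)   # about 139 MB
laser = upper_bound_bytes(2000, 8)    # about 387 MB

# 90 nm feature size expressed as dots per inch, one bit per dot.
fab_dpi = 0.0254 / 90e-9              # about 282222 dpi
fab = upper_bound_bytes(fab_dpi, 1)   # about 962 GB

for name, value in [("1200 dpi", inkjet), ("2000 dpi", laser), ("90 nm", fab)]:
    print(name, value / 1e6, "MB raw,", after_8_to_14(value) / 1e6, "MB with 8-to-14")
```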
Depends on the pigments in the ink. Organic pigments get fried by UV, or even just trace ozone in the air, very quickly. But metal ion based pigments (lead, cadmium, iron...) can last almost forever. Too bad the used media would then be toxic waste.
Strictly speaking he could use many colors. The reasonable width of the color spectrum is around 1000 nm, and good optical filters will give you a window of 2 nm bandpass, so assuming he used 500 wavelengths/colors he could store 700 Gb per page. Also, I am aware of prototype, in-the-lab printers (by Canon) which do 9600 dpi (Google it), so pushing technology to its limits, and cost notwithstanding, you could write 31 Tb on an A4 sheet. And I am pretty sure one can make this work for not much more than a yearly budget of one National Lab in the US :).
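The 700 Gb and 31 Tb figures appear to scale an earlier estimate. The sketch below assumes, and this is an inference rather than something the comment states, that the starting point is the roughly 1.4 gigabits per page from the 1440 dpi calculation elsewhere in the thread, with one independent one-bit layer per wavelength.

```python
BASE_BITS = 1.4e9   # assumption: per-page figure from the 1440 dpi estimate
BASE_DPI = 1440     # assumption: resolution behind that figure

spectrum_nm = 1000
bandpass_nm = 2
channels = spectrum_nm // bandpass_nm          # 500 wavelengths/colors

# One independent layer of data per wavelength channel.
multiplexed_bits = BASE_BITS * channels        # about 700 Gb

# Scale areal density up to the 9600 dpi prototype printer.
scaled_bits = multiplexed_bits * (9600 / BASE_DPI) ** 2  # about 31 Tb

print(multiplexed_bits / 1e9, "Gb per page")
print(scaled_bits / 1e12, "Tb per page")
```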
Thanks for the link to weierstrass' Implicity blog. His diagram of the 102 unique two-color patterns is very useful in this context.
So 102 patterns using 2 colors multiplied by the number of combinations of two colors that can be drawn without replacement from a palette of 256 colors... it has been a while since I've worked combinations but I know how to use Google as a brain prosthesis:
Google this, guys: "256 choose 2" yields 32,640 unique two-color schemes.
So 32,640 * 102 patterns = 3,329,280 possible combinations within each 3x3 matrix. 85,000 such matrices on the 8.5x10 inch sheet yields 2.9*10^11 unique patterns possible on one page. That's a pretty long bit stream. Converting from bits to something understandable yields 33 GB per page.
This unsophisticated technique shows a sheet of paper can hold more than 1500 times the information that the one-bit-per-dot crowd was thinking was the max (22 MB iirc). It is still an order of magnitude lower than the reported achievement of Sainul Abideen-- but I am working as a Resource Support Assistant in a Community College and I don't profess to know much about combinatorial math or pattern recognition. I think it enlightening that Google says that "256 choose 3" gives 2,763,520 unique three-color schemes...
I do, however, know a thing or two about Google and how to use simple resources like it to make the world a little more understandable.
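The counts in this comment check out and can be reproduced with `math.comb` (Python 3.8+). The sketch also computes log2 of the per-matrix count, which is the usual information-theoretic measure of how many bits one matrix can carry; that comparison is added here for illustration and is not part of the comment. It uses the comment's own figures of 102 patterns and 85,000 matrices.

```python
import math

two_colour_schemes = math.comb(256, 2)      # 32640
three_colour_schemes = math.comb(256, 3)    # 2763520

patterns_per_matrix = two_colour_schemes * 102   # 3329280
matrices = 85_000

# The comment's figure: patterns per matrix times the number of matrices.
product = patterns_per_matrix * matrices         # about 2.8 * 10^11

# Bits one matrix can carry: log2 of its number of distinguishable states.
bits_per_matrix = math.log2(patterns_per_matrix)
total_bits = bits_per_matrix * matrices

print(product)
print(bits_per_matrix, "bits per matrix")
print(total_bits / 8 / 1000, "kB per page")
```

Read this way, each matrix carries about 21.7 bits, so 85,000 matrices hold on the order of 230 kB rather than 33 GB. This is the same distinction made earlier in the thread between 2^24 colours and 24 bits.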