Storing Very Large Files On Amazon's Unlimited Cloud Photo Storage
AmiMoJo writes: Last year Amazon started offering unlimited cloud storage for photos to customers who subscribed to its "Prime" service. Japanese user YDKK has developed a tool to store arbitrary data inside a .bmp file, which can then be uploaded to Amazon's service. A 1.44GB test image containing an executable file uploaded at over 250Mb/sec, far faster than typical cloud storage services that are rate limited and don't allow extremely large files.
This is why we can't have nice things.
almost a Japanese Zipper: YKK
My org had dozens of videos housed at Viddler.com's "free hosting" while it lasted. Viddler had trouble being free a couple of years ago and sent a big bill we couldn't pay. When we asked whether our videos had been deleted, Viddler tech support said they existed... somewhere... in Amazon.
Gently reply
Awesome, thanks. This is really informative.
I'll get right on that learning to read Japanese..
A better tool would split the data among smaller files. A 1.44 GB BMP is sure to attract attention. 1440 one-MB JPEGs aren't. Am I right? Peeps?
I think it's easier to validate that a JPG file is really a JPG than a BMP, or at least it's harder to store arbitrary data in a JPG and still have it decodable as a JPG.
So just store the data as 1 MB BMPs or TIFFs.
Wrong, that should be 1000 x 1.44MB BMPs. At least then they could say "they're floppy disk images!"
`echo $[0x853204FA81]|tr 0-9 ionbsdeaml`@gmail.com
First the article with the luser asking a help desk question, and now this with the link in Japanese.
I think that with the new overlords Timothy has gone full honey badger on us.
-- I have a private email server in my basement.
Yes. Millions of Slashdotters are literally shutting down internet communication as we know it while spreading the news... and eating hot grits off'n the Portman.
Happiness in intelligent people is the rarest thing I know.
Ernest Hemingway
That's great. And exactly how long do you think Amazon will allow this to go on before:
a. Amazon runs a script to test that file magic matches extension or delete?
b. A limit of 20MB per file is established?
c. The free service gets a 5GB cap; want more then pay?
d. Amazon shutters the service entirely?
This sort of crappy hack has already been done with other services. A proof of concept is no longer needed. At this point you are just abusing a service to the likely detriment of everyone.
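Option (a) in the list above is cheap to implement, for what it's worth. A minimal sketch in Python follows; the helper name `magic_matches_extension` and the hand-picked signature table are assumptions for illustration, and nothing here is known to reflect how Amazon actually classifies uploads:

```python
import os

# Leading "magic" bytes for a few common photo formats (illustrative subset).
SIGNATURES = {
    ".bmp": (b"BM",),
    ".png": (b"\x89PNG\r\n\x1a\n",),
    ".jpg": (b"\xff\xd8\xff",),
    ".jpeg": (b"\xff\xd8\xff",),
    ".tif": (b"II*\x00", b"MM\x00*"),
    ".tiff": (b"II*\x00", b"MM\x00*"),
}

def magic_matches_extension(filename: str, head: bytes) -> bool:
    """Return True if the first bytes of a file agree with its extension.

    Hypothetical helper: `head` is whatever prefix of the file was read.
    Unknown extensions are treated as "not a photo".
    """
    ext = os.path.splitext(filename)[1].lower()
    signatures = SIGNATURES.get(ext)
    if signatures is None:
        return False
    # bytes.startswith accepts a tuple of candidate prefixes.
    return head.startswith(signatures)
```

Note that a payload wrapped in a well-formed BMP header would still pass a test like this, so it only catches files that were simply renamed.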
Fuck YDKK!
And it's going to be pretty easy to figure out how to tell the difference between an image and something else... If Amazon starts to have space problems, you can bet they will be quick to find and delete such junk...
"File to fit, pound to insert, paint to match" - Aircraft Maintenance 101
I think it's easier to validate that a JPG file is really a JPG than a BMP
You can start with a real image, then modulate the pixels by the data.
Also, you can make it a lossless JPEG.
I think the reason to just use BMP is because it's less processing and computing time required (More efficient, and less space will be wasted too).
Here is my research...
Steganography & Amazon Cloud Drive:
http://bsmuir.kinja.com/stegan...
"And it's going to be pretty easy to figure out how to tell the difference between an image and something else"
Not really.
Firstly, how are you even going to load the image for image processing?
How are you going to determine if the image is "real"? Run some OpenCV algorithms on it? Face detection?
I'm not being an ass, it's actually a really difficult problem to solve, and Amazon would really be better off just capping the maximum file size and/or the total volume stored and be done with it.
Seems quite complicated.
If Amazon doesn't convert the images, he could just upload a PNG file with a lot of information stored in ancillary chunks... the png specification even allows creating custom/developer chunks which should be ignored by any parser that doesn't understand them (for compatibility with future versions of the standard)
For example, just abuse the hell out of iTXt or zTXt chunks in the format : http://www.libpng.org/pub/png/...
For private chunks, see this bit : http://www.libpng.org/pub/png/...
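To make the private-chunk idea concrete, here is a rough sketch in Python using only the standard library. The chunk type `stOr` and both function names are made up for this example (lowercase first letter marks the chunk ancillary, lowercase second letter marks it private), and whether any given service keeps such chunks intact is not something this sketch can tell you:

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Length (big-endian), type, data, then CRC-32 over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)

def tiny_png_with_payload(payload: bytes, chunk_type: bytes = b"stOr") -> bytes:
    """Build a 1x1 grayscale PNG carrying `payload` in a private ancillary chunk."""
    # IHDR: width=1, height=1, bit depth 8, color type 0 (grayscale),
    # compression 0, filter 0, interlace 0.
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    # One scanline: filter-type byte 0 followed by a single black pixel.
    idat = zlib.compress(b"\x00\x00")
    return (PNG_SIGNATURE
            + make_chunk(b"IHDR", ihdr)
            + make_chunk(chunk_type, payload)
            + make_chunk(b"IDAT", idat)
            + make_chunk(b"IEND", b""))

def read_chunks(png: bytes):
    """Yield (type, data) for every chunk; CRCs are not verified here."""
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        chunk_type = png[pos + 4:pos + 8]
        yield chunk_type, png[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
```

A decoder that does not know `stOr` should skip it and still show the one-pixel image. Anything that re-encodes the picture would most likely drop the chunk.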
Oh there are things in an image that you won't see in some data stuffed into an image container. It's not that hard, a little time consuming and processing intensive perhaps, but not that bad. Consider that they only really need to find that 1-2% who are doing this, abusing their terms of service and just toss them, one could even do it manually for awhile... Hire a bunch of folks to look at a "picture" and tell me if it's really a picture... Heck, make it a CAPTCHA task... Just start with the biggest files and work your way down...
"File to fit, pound to insert, paint to match" - Aircraft Maintenance 101
Back in the day, when I worked as a dev at a social networking site, we would resample old photos that hadn't been accessed in over some threshold (let's say it was 1 year, for the sake of argument). Anything older than the threshold would get re-encoded in JPEG to a poorer representation in order to save storage space.
So what stops Amazon from doing the same thing? Do their TOS say they won't?
Non-image data under those circumstances become pretty much useless, even if packaged so that they appear to be an image of off-station TV reception. Once you include a lossy recompression, your data are no longer data, but noise for real.
Put my fist through my alarm clock with its ding-dong death inside my ear. - The Blackjacks.
Do you want new terms of service? Because that's how you get new terms of service.
Amazon: your 1.44 GB files don't seem to be photos, and violate TOS. So we deleted them. I'm sorry, they were your only backups? Oh, you are right, a movie is a series of photos. And they were of you and your girlfriend? We'll try and recover them and we will all try to determine if they violate the TOS.
My floppy disks only hold 360K, you insensitive clod.
If you take the trouble to read through Amazon's TOS, and click to their actual rates, you can buy unlimited storage for photos, videos, AND ARBITRARY FILES for only $60 per year. Not only that, but Prime gets you 5 GB of videos and non-photo files for free.
Going through all the hassle of specially encoding your data files so that they masquerade as photos seems like a heapload of time better spent earning $60 so that you don't have the long-term headaches and potential for being banned from Amazon's service that such abuses flirt with. You want a real backup service? Buy it, it isn't expensive.
Backblaze, a darling of Slashdot, is only $50 per year. It isn't worth the hassle or time to beat the system for such low prices. Amazon Glacier is $0.007/GB/month. Both systems offer encrypted storage. Why work hard when someone else has done the figuring out for you?
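The price comparison is easy to check with a few lines of Python. This sketch uses only the figures quoted in the comments above ($0.007/GB/month for Glacier, $60 and $50 per year for the flat plans) and ignores any retrieval or request charges; the function names are just for illustration:

```python
GLACIER_USD_PER_GB_MONTH = 0.007  # figure quoted in the comment above

def glacier_cost_per_year(gigabytes: float) -> float:
    """Storage-only cost in USD for keeping `gigabytes` for 12 months."""
    return gigabytes * GLACIER_USD_PER_GB_MONTH * 12

def break_even_gb(flat_usd_per_year: float) -> float:
    """Stored size at which per-GB pricing equals a flat yearly price."""
    return flat_usd_per_year / (GLACIER_USD_PER_GB_MONTH * 12)

if __name__ == "__main__":
    print(f"1000 GB on Glacier: ${glacier_cost_per_year(1000):.2f} per year")
    print(f"Break-even vs $60/year: {break_even_gb(60):.0f} GB")
    print(f"Break-even vs $50/year: {break_even_gb(50):.0f} GB")
```

At those rates the per-GB service is cheaper below roughly 714 GB (versus $60) or 595 GB (versus $50), and the flat plans win above that.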
or they'll just start compressing uncompressed image files, corrupting the encoded data.
That's interesting. I wonder if that's why Google offers unlimited photo storage but at lower res?
This seemed like a reasonable sig at the time.
Downloading 13GB compressed backup files from Amazon S3 is incredibly slow at only 938 kilobytes per second using their AWS CLI tools... even slower at 214.5 kilobytes per second using their AWS Powershell tools.
I have a fast PC and connection, and the load time for Amazon Cloud sites is VERY slow.
And they've already got a way to do so cheaply
https://www.mturk.com/mturk/we...
From the soon-to-be-banned dept.
Those days are dead and gone and the eulogy was delivered by Perl. --Rob Pike
I'm pretty sure that was an OS/2 convention. Windows bitmaps can have a negative height value resulting in a (now more conventional) top down packing.
Note that bottom-up ordering is more in line with mathematical conventions.
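For anyone who wants to check a file, the sign of the height field can be read directly. A small sketch in Python, assuming the common 40-byte-or-larger info header where the height is a signed 32-bit value at byte offset 22; the function name is made up:

```python
import struct

def bmp_row_order(header: bytes) -> str:
    """Report whether a BMP stores its rows bottom-up or top-down.

    `header` needs to hold at least the first 26 bytes of the file.
    """
    if header[:2] != b"BM":
        raise ValueError("not a BMP file")
    (dib_size,) = struct.unpack_from("<I", header, 14)
    if dib_size < 40:
        # The older 12-byte core header uses unsigned 16-bit dimensions.
        raise ValueError("core header: height has no sign to inspect")
    (height,) = struct.unpack_from("<i", header, 22)
    return "top-down" if height < 0 else "bottom-up"
```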
There was a similar tool, featured on a site that shall not be named, that did this last year: https://github.com/tylerpitchf... . It chopped files and directories into user defined bitmaps according to the write up. Made for Google, but worked for Amazon too apparently according to the write up from the article. https://www.linkedin.com/pulse...
Just wait until they turn on their automatic convert-to-low-quality-JPEG functionality. :) All your .BMP files will be converted to 400 KiB .JPG files. Hope your executable is OK with lossy compression.
http://www.petitcolas.net/steg...
"MP3Stego will hide information in MP3 files during the compression process. The data is first compressed, encrypted and then hidden in the MP3 bit stream. Although MP3Stego has been written with steganographic applications in mind it might be used as a copyright marking system for MP3 files (weak but still much better than the MPEG copyright flag defined by the standard). Any opponent can uncompress the bit stream and recompress it; this will delete the hidden information â" actually this is the only attack we know yet â" but at the expense of severe quality loss.
The hiding process takes place at the heart of the Layer III encoding process namely in the inner_loop. The inner loop quantizes the input data and increases the quantiser step size until the quantized data can be coded with the available number of bits. Another loop checks that the distortions introduced by the quantization do not exceed the threshold defined by the psycho acoustic model. The part2_3_length variable contains the number of main_data bits used for scalefactors and Huffman code data in the MP3 bit stream. We encode the bits as its parity by changing the end loop condition of the inner loop. Only randomly chosen part2_3_length values are modified; the selection is done using a pseudo random bit generator based on SHA-1.
We have discussed earlier the power of parity for information hiding. MP3Stego is a practical example of it. There is still space for improvement but I thought that some people might be interested to have a look at it."
Isn't it time ISPs started offering symmetric speed DSL? Then we could upload/download those photos through our own home servers without needing the privacy-invading cloud servers. With the current pathetic upload speeds in most home DSL plans, it's difficult to download anything big from a home server.
It's one thing to abuse the service, and use it differently than intended to personal gain.
But 1.44GB is not even what I'd consider very large. I'm yet again the victim of click bait headlines. I was expecting sizes in the TBs.
Back in school one of our tasks in our informatics course was to analyze and explain an algorithm which hides arbitrary data in the least significant bits of 24-bit bitmap files. We did that in Turbo Pascal (o; It was very interesting to see this is possible. Of course, it was only text messages we hid there, as storage space was rather limited, to put it mildly.
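The least-significant-bit trick described here fits in a few lines. This is a generic sketch in Python, not the Turbo Pascal program from the course; it works on a raw buffer of pixel bytes (for a 24-bit bitmap, the data after the header), and both function names are made up for the example:

```python
def embed_lsb(pixels: bytes, message: bytes) -> bytearray:
    """Hide `message` in the lowest bit of successive pixel bytes."""
    # Most significant bit of each message byte goes first.
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("message too long for this image")
    out = bytearray(pixels)
    for index, bit in enumerate(bits):
        out[index] = (out[index] & 0xFE) | bit  # clear the low bit, then set it
    return out

def extract_lsb(pixels: bytes, n_bytes: int) -> bytes:
    """Recover `n_bytes` previously hidden with embed_lsb."""
    if n_bytes * 8 > len(pixels):
        raise ValueError("image too small to hold that many bytes")
    out = bytearray()
    for byte_index in range(n_bytes):
        value = 0
        for bit_index in range(8):
            value = (value << 1) | (pixels[byte_index * 8 + bit_index] & 1)
        out.append(value)
    return bytes(out)
```

Each hidden byte costs eight pixel bytes, which is why capacity is so limited: a 640x480 24-bit image holds 921,600 pixel bytes, so about 115,200 bytes of message. Every color channel changes by at most 1.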
Link or it didn't happen
2) It's huge. Bigger than is reasonable with any available consumer or professional camera.
In this case, it is trivially easy to write software to distinguish this "hack" from genuine photos.
Automatic conversion to JPEG would solve this problem and offer a benefit in terms of storage space and user download times.
Why are people griping about what this guy did? So he cheated the system. All of us here have cheated the system in one way or another. The real issue is that Amazon will now go back on its work with the unlimited photo storage. That's going to be the real problem: instead of finding a way to prevent this and punish those who abuse the system, they just punish everyone. That seems like a useless lesson. This is just a vicious circle, as happened with unlimited 3G data once people tethered to it and ran up the data. Why not just find a solution to the problem instead of punishing everyone?
I personally wrote a steganography tool for JPEG-2000 files for a graduate school project - it just stored data in the least damaging sections of the file. The resultant files were still perfectly legal image files, lossy compressed, and minimally visually damaged.
Kudos for the hands-on. I was fascinated some years ago with progressive GIF overlays and coded some stuff to produce them, not so concerned with steganography and hiding the presence of a message, but more with novel ways of presentation.
One example was embedding a public key into a GIF image. Starting with a standard base image and palette that was the same for everybody, like a shiny golden key floating over a smooth blue gradient... the key bits encoded as a series of overlays that when displayed, made the key sparkle and the background vary in color, all happening over ~10 seconds. The idea was that while most people didn't stand a chance memorizing much "BEGIN PGP PUBLIC KEY BLOCK" gobbledygook, we'd be better equipped to remember the distinct "sparkle" of an image. More of a style thing than a useful crypto concept.
I also experimented with things like encoding process/memory access and toyed with the idea of filesystem journals rendered as displayable GIFs. It was a fascinating foray into the realm of data structures and helped me to become the person I am today. I presently jet sewers for a living.
Wouldn't it be strange to see some future Slashdot shocker headline, "Bit Rot Discovered In Cloud, All Data Will Be Reduced to Gaussian Noise By 2030". And like the proverbial boiling frog we deny the problem or postpone dealing with it as everything progressively (but slowly) dissolves into static. People who try to raise consciousness and alarm are booed off Slashdot with comments like, "I can read it. What's wrong with you?" posted by folks who are also having trouble reading things but enjoy sniping at others more. Then as it reaches the final stages all electronic mediums are projecting mostly static but people are pretending they see and understand the messages perfectly. And most oddly, when we hit Peak Gaussian something resembling a modern society continues to function. Then unfettered by structure society literally melts into phantasmagorical goo. Something... like... THIS.
<blink>down the rabbit hole</blink>
Tivo used to distribute some data at night on a TV channel. I caught it one night in a fit of insomnia, it looked like a video stream comprised of QR codes. I'm guessing the Tivo box recorded it and then decoded the full frames and then stored whatever the data stream was.
Like QR codes, the "data" would seem fairly impervious to scaling and resampling provided that the "bits" or white/black blocks were large enough to survive downsampling. You wouldn't really care if they converted them to compressed image data because the image was the data but represented at a low enough practical resolution that downsampling or format conversion wouldn't change the image enough to inhibit decoding.
You could even do something like the color-enhanced HCC2D code "extension" of QR codes for greater image data density.
Each image file could then be a rough equivalent of a disk block or sector, allowing the client side to manage a file system of sorts.
to get around an overbearing corporate firewall that forbade not only executables, but archives containing executables as well. In order to be able to e-mail new versions of a program that the overbearing company had bought, he wrote a program that packed the .exe code in a BMP file.
I thought we could already do this? I remember hiding .rar archives in .jpg images. Is Amazon able to detect this magic?
It is available if you don't mind it being a symmetric 2 Mbps, or "up to" 5 Mbps.
Bond four phone lines if you can afford it, but that's some hundreds of euros/dollars a month.
Too damn expensive. If you can get cable internet that's highly asymmetrical but with 3 Mbps upload, that wouldn't be bad.
We just need fiber, alas fiber suffers from what I'd call the "last 100 meters problem", not just the last mile problem. As a society we're too cheap to wire the flats and homes themselves even though there is fiber lurking everywhere.
Consider that they only really need to find that 1-2% who are doing this
I think you overestimate the geek potential in the world... sure, maybe 1-2% of their users _could_ muster the technical skills to make this happen, but of them, I doubt even 1% would bother - putting the true abuse rate down around 1/10,000 or more.
What a pain it was to convert a BASIC game from bottom-up to top-down. In the end that game never worked on my computer anyway. Should have gone for a rewrite :)
I teach graduate CS courses at a university, and we get the occasional cheater. Sometimes, the cheating is blatant: three students just turned in exactly the same work. However, there are occasions where we suspect cheating, but they did a good job of disguising it. Of course, they do poorly on exams. If those students would spend their time and energy on learning the material, they would learn something and get a good grade.
Yup $50 a year is a great & easy solution - ....And didn't folks try doing this with GMail back in the day? Google offered unlimited email so somebody figured out a method to "uuencode" their harddrive backup and email it to themselves? Kind of like porn back in the nntp news-group days?
People are having fun building Rube Goldberg machines. Let us all doubt that this is a serious commercial solution - and just admit it is a fun "built it on a Raspberry Pi" toy solution.
A BMP consists of basically a simple header describing the file and the raw contents. I have done this several times to show, e.g., mistakes in encryption usage concepts (for example, to see the startled face of students when showing them the effect of using ECB when encrypting an image with repeated patterns). Where's the novelty in that?
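To illustrate how little is involved, here is a rough Python sketch that puts a standard 54-byte header for an uncompressed 24-bit image in front of arbitrary bytes. It is not claimed to be the layout YDKK's tool uses; the 4-byte length prefix, the default width, and the function names are all assumptions made for this example:

```python
import math
import struct

HEADER_SIZE = 54  # 14-byte file header + 40-byte info header

def wrap_in_bmp(payload: bytes, width: int = 256) -> bytes:
    """Store `payload` as the pixel data of an uncompressed 24-bit BMP."""
    if width <= 0 or width % 4:
        # A width divisible by 4 keeps every 3-byte-per-pixel row 4-byte aligned.
        raise ValueError("width must be a positive multiple of 4")
    row_bytes = width * 3
    body = struct.pack("<I", len(payload)) + payload  # length prefix (assumption)
    height = max(1, math.ceil(len(body) / row_bytes))
    pixels = body.ljust(height * row_bytes, b"\x00")  # zero-pad the last row
    file_header = struct.pack("<2sIHHI", b"BM", HEADER_SIZE + len(pixels), 0, 0, HEADER_SIZE)
    info_header = struct.pack("<IiiHHIIiiII",
                              40, width, height,  # header size, width, height
                              1, 24, 0,           # planes, bits per pixel, no compression
                              len(pixels),        # image size in bytes
                              2835, 2835, 0, 0)   # ~72 dpi, no palette
    return file_header + info_header + pixels

def unwrap_from_bmp(bmp: bytes) -> bytes:
    """Inverse of wrap_in_bmp: read the length prefix, return the payload."""
    magic, _size, _res1, _res2, offset = struct.unpack_from("<2sIHHI", bmp, 0)
    if magic != b"BM":
        raise ValueError("not a BMP file")
    (length,) = struct.unpack_from("<I", bmp, offset)
    return bmp[offset + 4:offset + 4 + length]
```

The file-size field in the header is a 32-bit value, so a single file built this way cannot describe more than 4 GiB.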
I could not read TFA since it was in Japanese. From Amazon.com's pages:
About Prime Photos: http://www.amazon.com/gp/help/...
In addition to the unlimited photo storage, you will also receive 5 GB of free storage space that can be used to store videos and files we can't recognize as photos.
Certain photo formats are excluded. For more information, go to Cloud Drive Photos & Videos File Requirements.
So apparently they decide what is a photo. Myself I'd not trust a third party to not degrade the quality; I'd opt for encrypted container with photos INSIDE it. The same page also restricts this to "personal use":
Prime Photos is for your personal, non-commercial use only. You may not use it in connection with a professional photography business or other commercial service.
Personally I think that sucks. By comparison, my VPS provider gives me a very cheap VPS which I can use for whatever purpose I want as long as I do not break any laws or disrupt other users. They price based on performance and bandwidth, not arbitrarily created market segmentations.
Cloud Drive Photos & Videos File Requirements: http://www.amazon.com/gp/help/...
Photos and videos you upload through your web browser on the Cloud Drive website must be 2GB in size or less.
File and folder names must contain less than 255 characters, and cannot include the incompatible characters listed below.
They list common supported formats; this includes RAW. And they do mention encryption:
For photos: JPEG, BMP, PNG and most TIFF files (these files typically have the .jpg, .jpeg, .bmp, .png or .tiff extensions). In addition, some RAW format photos can also be viewed. For more information, go to About RAW Photo Files.
For videos: MP4, Quicktime, AVI, MTS, MPG, ASF, WMV, Flash and OGG.
Note: The unlimited photos storage benefit for Prime members only applies to files recognized as photo files. Photo files that have been encrypted before they're uploaded will count against your storage quota.
About RAW Photo Files: http://www.amazon.com/gp/help/...
Nikon (NEF files) - Nikon D1, Nikon D1X, Nikon D4, Nikon Coolpix A, Nikon E5700, Nikon AW1, Nikon D800, Nikon D50, Nikon D610
Canon (CR2 Files**) - Canon 5D, Canon 1D, Canon 1D MarkIIN, Canon Rebel SL1, Canon 60D, Canon 5D MarkIII, Canon 1D MarkIV
**While Cloud Drive recognizes these files as photos, some of the information associated with the file (like the time and date the photo was taken) may not be recognized.
Sony (ARW files) - Sony A7, Sony A7R, Sony A6000, Sony NEX-5T, Sony NEX-3N, Sony NEX-6
I doubt RAW format pictures can be compressed lossily? Does anyone know this for a fact?
No lossless formats, images will be slightly recompressed. Free is free, what do you expect.
Liability is limited to $50, so after you spend months transferring your files over your limited upstream bandwidth, they can delete your files for whatever reason, including that they just don't want to encourage people "misusing" their service in this way. You complain, they hand you $50, and they're done. Arguably, they can also do that for the Unlimited Everything service, so even if you pay, they can terminate service for anyone they're not making money with. Ultimately, your files are only stored safely as long as it's cheaper to keep 'em than to delete 'em. That's the free market golden rule.
That's one thing I thought of after I saw the announcement, but I doubt that's the primary reason. Probably the main reason is just that they want to avoid the service getting eaten up by people who don't even understand what the "quality" and "resolution" settings on their cameras or other camera-enabled devices mean. Even with Google's compression, I imagine it's not too hard to use steganography to fit the Constitution, or a chunk of the Bible, or most of 1984, or the Kama Sutra, or the technical plans for a planet-destroying battle station in an image.
If a service like Google, Amazon, Facebook, or Yahoo! resizes and recompresses the image data, that's one thing. If they start stripping iTXt chunks that contain copyright or attribution information, that could be a serious legal problem; likewise if they reduce quality so much that it obscures a watermark containing a copyright or trademark notice.
Ahem. I uploaded about 1 TB of files to Amazon Cloud Drive and it only took a few days. Not sure where you get the idea it takes "months transferring your files."
Kriston
Technically, this is already done affordably by the Cable TV industry. Fiber is run to the neighborhood loop, and the homes are served via coaxial cable and DOCSIS to 100 megabit.
I have Verizon FiOS and the cable that is run into my house is exactly the same as my old Cable TV service. The "FiOS" modem in my house is exactly the same modem given to me as a Cable TV subscriber.
My old Cable TV service has speeds competitive to FiOS up to around 100 megabits, but for less than 1/20th of the cost that Verizon spent building out the FiOS plant.
And, today, DOCSIS 3.1 will allow the same speeds as FiOS over the copper Cable TV plant. The old and already existing Cable TV plant gives customers the same speeds as FiOS for 1/20th the cost that Verizon paid for building FiOS.
This is why Verizon is not building FiOS anymore and sold off a few regions to other companies. They're only building out Washington DC because they were forced to. Verizon will not see dime one for 25 years. Fiber-to-the-home was a mistake and they admitted as much.
Kriston
Yep I have buddies with cable at home and it's good, but 30 megabits down and a few up. The low latency is the most noticeable thing, web pages load as if it were on a LAN and that's hugely better. But I doubt they will be in a hurry to upgrade to DOCSIS 3.1 or a European equivalent.
Historically though, antenna TV was the norm for the vast majority of the population, even at the turn of the century. There was a fad of building cable TV in the 80s and 90s but that's in specific neighborhoods and some public subsidized housing. Then there was a fad of satellite TV (digital) in the 90s and 2000s which allowed a similar selection of many channels without the need for infrastructure. Later still, TV over DSL is very popular, starting from the mid 2000s.
So the cable infrastructure isn't there :) and what's left is to bring fiber. Perhaps in some cases fiber to the building, then very high speed and low range VDSL would be a good idea. But that has never been in the news.
I'm a believer at least in rural fiber. Yes it's useful and cheap there and there is a demography problem there as farmers are aging and everybody left. They should be scrambling to bring fiber there, it should be profitable and yet another capitalist crisis is looming because investors have too much money and nothing worth investing into.
Yes, but the point is that deploying fiber to the neighborhood, looped with RG-11 and dropped to the home with RG-59 cable, is far cheaper in manpower and materials than running fiber all the way to the home.
Kriston
Lucky. My floppy disks were read only, eight inches wide, and only held 80K.