Dropbox's New Policy of Scanning Files For DMCA Issues
Advocatus Diaboli (1627651) writes "This weekend a small corner of the Internet exploded with concern that Dropbox was going too far, actually scanning users' private and directly peer-shared files for potential copyright issues. What's actually going on is a little more complicated than that, but shows that sharing a file on Dropbox isn't always the same as sharing that file directly from your hard drive over something like e-mail or instant messenger. The whole kerfuffle started yesterday evening, when one Darrell Whitelaw tweeted a picture of an error he received when trying to share a link to a Dropbox file with a friend via IM. The Dropbox web page warned him and his friend that 'certain files in this folder can't be shared due to a takedown request in accordance with the DMCA.'"
It's been nice while it lasted; now on to other services!
If you are determined to use Dropbox, use open-source software such as 7-Zip to compress and encrypt your files before uploading. Otherwise, stop using Dropbox and move on to something else. One of the consequences of using the magical cloud is that you are bound by somebody else's rules for how they manage your data. Also note that those rules are subject to change at any time, and you don't have any say in those changes (I guess the only option is to vote with your wallet and move to greener pastures).
This is news, in the sense that Dropbox now actively crawls your files (DMCA takedowns already applied to publicly listed files anyway).
But my question is why people in the tech industry are still surprised by the fact that Dropbox does not encrypt its users' files and can read them outright...
That's how they do sharing between users, as well as file deduplication (Which probably works best for larger copyrighted files, funnily enough!)
I still use Dropbox, and promote it slightly: with the stern advice to use it simply as a convenient way of sharing crap, but treat it as a "public USB drive"!
Just never, ever, store sensitive data, like your business or evil masterplans, or your personal/bank/etc account details on it. But if you're sharing that MP3 you recorded on yesterday's block party, go right ahead!
All that's required of users is to use an encryption mechanism, even a weak one, to encrypt said files prior to uploading.
You could potentially even use an encryption key as weak as "password" because DropBox aren't going to be in the business of guessing encryption keys (won't have the CPU grunt) so anything is going to deceive them - potentially even just XOR. Or even use the file's name.
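As a rough illustration of the point about XOR, here is a minimal Python sketch. It is obfuscation rather than real encryption, and the helper name `xor_bytes` and the sample data are made up for the example. It only shows that the scrambled bytes no longer hash to the same value and that the operation is reversible with the same key.

```python
import hashlib
from itertools import cycle


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with a repeating key (obfuscation, NOT real encryption)."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


original = b"some file contents that a service might fingerprint"
key = b"password"  # deliberately weak, as in the comment above
scrambled = xor_bytes(original, key)

# The scrambled bytes produce a different content hash than the original.
hashes_differ = hashlib.sha256(original).digest() != hashlib.sha256(scrambled).digest()
# XOR-ing again with the same key restores the original bytes.
round_trips = xor_bytes(scrambled, key) == original
print(hashes_differ, round_trips)
```

Anyone who actually cares about confidentiality should use a proper tool (the 7-Zip suggestion earlier in the thread, for instance) instead of this.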
The only downside will be that DropBox will be just that little bit harder to use without some sort of application to make encryption and decryption of files easy.
The DMCA is concerned with whether Dropbox is hosting an infringing file, not who they may be hosting the file for or for what purpose. Unfortunately this approach is forced upon Dropbox by a US law passed in an era of dial-up modems.
No kidding!!! What do you say at this point?
But this isn't new; it's been going on since Dropbox implemented their DMCA violation checking system a few years ago, and you can see *why* they are doing it.
Let's clarify a few things for those who aren't going to RTFA - this isn't for private shared folders, or for folders within your own Dropbox. This is for when you create *public* links, by either using the "Shared Links" facility or when you create a public link from the old style Public folder.
That's it. The files Dropbox is including in these scans are *publicly linked* to - and they are fair game if Dropbox wants to stay ahead of the legal system on this front. Dropbox has no idea that you only intend to share it with yourself, or one other person, and there is no mechanism by which you can ensure that yourself anyway.
Yet again it's forced outrage against something that is basically common sense: if the file has been taken down before, it's going to be again, and the less manpower Dropbox expends handling DMCA requests, the better for them as a company.
This whole issue can be summarized as:
1) User wants to ignore copyright law and share something they have no legal right to via a public service
2) The public service has no idea how many people will want to access the shared resource, but it does know the file is copyrighted: it auto-matches everything uploaded so it can avoid keeping two separate copies of identical files and save storage space, and it previously received a DMCA takedown request for that same file.
3) The public service errs on the side of not getting its arse sued off by the various content-owner conglomerates' legal attack dogs and refuses to allow the file to be shared, even though the person who uploaded it can still see it.
All in all, that seems pretty reasonable. Until copyright law is changed (like that is ever going to happen), Dropbox has to follow it to the letter. I suppose they could have avoided the whole thing by storing more data and not doing the duplicate file scan at all, but even that is no guarantee against being sued into oblivion.
The only safe option for them that would also keep things private would be to use encryption keys that are kept only in the client. That way, if you needed to share a particular folder, you would store it under a different encryption key and give that key to the other person or people who need access.
The big problem with this is that it then becomes more awkward to provide web access to the files. People are comfortable remembering a username and password; they are not so comfortable remembering a bunch of encryption keys. If the service stores the encryption keys anywhere on its own servers, then it can access the files and therefore takes on the legal responsibility to make sure its system is not being used to flout copyright law. The only legal way to run this sort of service and not be liable for its misuse is to design it in such a way that you cannot see what is being stored at all.
I dont read
Publicly shared files that match known hashes are restricted, but not deleted, and any file can be shared to anyone privately without restriction, just not publicly to the world. Not much of a story. Read TFA.
What does somebody else's data have to do with your data?
There is no "your" data or "their" data. There is only Dropbox data. It seems that at the point you upload a file, they check to see if they already have a copy, and if they do, they just add a pointer to the existing file rather than store a fresh copy.
And what if there is a hash collision?
By the sounds of it they must actually do a direct file compare rather than rely on a hash alone. They probably use some kind of hash to narrow down the candidates to compare against, but in the fallback case of a hash collision with both files being exactly the same size, they would have to do an exact comparison. That probably does not happen very often, though, and it sounds like this process is only done once, at the point a file is stored.
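The guess above (hash to find candidates, then an exact comparison to rule out a collision) can be sketched in a few lines of Python. This is a toy in-memory model under that assumption, not a description of Dropbox's real storage layer; the class `DedupStore` and its method names are invented for the illustration.

```python
import hashlib


class DedupStore:
    """Toy deduplicating store: hash to find candidates, byte compare to confirm."""

    def __init__(self):
        self._blobs = {}  # digest -> list of distinct contents sharing that digest
        self._files = {}  # (user, name) -> (digest, index into that list)

    def put(self, user, name, data):
        digest = hashlib.sha256(data).hexdigest()
        bucket = self._blobs.setdefault(digest, [])
        for i, existing in enumerate(bucket):
            if existing == data:  # exact comparison guards against a hash collision
                break
        else:
            bucket.append(data)   # genuinely new content: store it once
            i = len(bucket) - 1
        self._files[(user, name)] = (digest, i)  # the user only gets a pointer
        return digest, i

    def get(self, user, name):
        digest, i = self._files[(user, name)]
        return self._blobs[digest][i]

    def blob_count(self):
        return sum(len(bucket) for bucket in self._blobs.values())


store = DedupStore()
store.put("alice", "song.mp3", b"same bytes")
store.put("bob", "copy.mp3", b"same bytes")
print(store.blob_count())  # both users point at one stored copy
```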
Do you know if Dropbox is trying to determine what is a DMCA violation and stopping the share, or if they have received actual takedown notices? I ask because if somebody shares something and Dropbox receives a takedown notice, then I would be okay with that. On the other hand, if they are trying to police what is out there, I'm not sure how they can make that determination or why they would stop at just shared content.
Not trying to troll or inflame the discussion, just actually wondering how the process works.
Dropbox is nothing more than a gussied-up repackaging of SFTP or FTPS with a nice fancy ol' GUI.
The post office is nothing but a gussied up repackaging of walking to your friend's house and giving him the letter yourself.
The fax machine is nothing but a waffle iron with a phone attached!
No, it's slightly more than that.
You set up a server for SFTP or FTPS and download a nice, friendly little program called FileZilla.
...and then? Will FileZilla run on startup, settle itself inconspicuously in the systray without a running window you could accidentally close, connect to the SFTP server, download files automatically to local directories so they're instantly accessible, then monitor, sync and notify you of any changes? Will it allow you to dish out invitations to share directories and files direct from your desktop, and manage those permissions for an unlimited number of users and directories?
systemd is Roko's Basilisk.
Part of it is in the 'terms of service', where you specifically allow Dropbox to do certain things (like deduplication and retention after you've deleted a file).
They're not actively searching *your* files to seek out these violations; they got a specific complaint about that file's data, which they are obliged to make publicly inaccessible. If you also share that file's data, then that too is, according to the DMCA, infringing and is prohibited from being shared.
About the hashes: they most certainly only use hashes to find candidates for deduplication. All files with the same hash are most likely first compared byte-for-byte before they're really considered the same.
The 'takedown' probably happens on the deduplicated file's entry in some database, where it's marked as a 'DMCA violation'. Any attempt to access it via a share will notice that flag and show the appropriate message. They wouldn't need to actually "go through your files" to look for violations, but if they want to, they can simply look up who has a reference to the deduplicated file and whether it's shared by them, in order to notify them of the fact (in that case they'd still not be going through your files, but just following the link back to your account).
They are actually handling it very correctly, since they only disable the sharing, not your access to the file (since that copy is yours and thus not necessarily infringing under the DMCA). They are just not allowing you to use their service to distribute a copyrighted work that they were told may not be distributed through them.
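To make that guess concrete, here is a small Python sketch of a takedown flag living on the deduplicated blob rather than on any one user's file. Everything in it (the `ShareService` class, its method names, the error message) is hypothetical and only models the behaviour described in this comment: the owner keeps access, the share link is refused, and the service can follow references back to accounts without reading anyone's files.

```python
import hashlib


class ShareService:
    """Toy model: one stored blob per digest, with a per-blob takedown flag."""

    def __init__(self):
        self.blobs = {}       # digest -> file bytes, stored once
        self.flagged = set()  # digests named in a takedown notice
        self.files = {}       # (user, filename) -> digest

    def upload(self, user, filename, data):
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)
        self.files[(user, filename)] = digest
        return digest

    def takedown(self, digest):
        # Flag the shared blob; no per-user scan is needed.
        self.flagged.add(digest)

    def read_own(self, user, filename):
        # The owner's private access is unaffected by the flag.
        return self.blobs[self.files[(user, filename)]]

    def read_shared(self, user, filename):
        digest = self.files[(user, filename)]
        if digest in self.flagged:
            raise PermissionError("sharing disabled due to a takedown request")
        return self.blobs[digest]

    def users_referencing(self, digest):
        # Follow the references back to accounts, e.g. to send notifications.
        return sorted({user for (user, _), d in self.files.items() if d == digest})
```

With this layout a takedown is a single set insertion, and every existing or future upload of the same bytes is covered automatically.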
"Moo!" -- Anonymous Cow
This is what OwnCloud is made for.
I know not everyone is able to set up their OwnCloud server. There are places that will host it and set it up for you.
OwnCloud is great, with one exception: the slightest change to a file necessitates an upload of the entire file. Dropbox does delta syncs using a modified version of rsync, so it only uploads the changed portions of a file.
For typical files and fast connections, the lack of delta sync is tolerable, but when you're dealing with large files or slower transfer speeds it's an issue. If, for example, you keep a large TrueCrypt container file in OwnCloud and make a change to a small file stored in the container, OwnCloud needs to reupload the entire container. Dropbox would just update the blocks that changed.
Until OwnCloud implements some sort of delta sync functionality it is considerably less practical than Dropbox.
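For anyone wondering what "just update the blocks that changed" means in practice, here is a simplified Python sketch. It compares fixed-position blocks only, which is much cruder than rsync's rolling checksum (an insertion that shifts data would mark every later block as changed). The function names and the tiny block size are assumptions chosen for the demonstration.

```python
import hashlib

BLOCK_SIZE = 4  # absurdly small, for demonstration only


def block_hashes(data, block_size=BLOCK_SIZE):
    """Hash each fixed-size block of the data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old, new, block_size=BLOCK_SIZE):
    """Return the indices of blocks in `new` that differ from `old` or are new."""
    old_hashes = block_hashes(old, block_size)
    new_hashes = block_hashes(new, block_size)
    return [
        i for i, h in enumerate(new_hashes)
        if i >= len(old_hashes) or old_hashes[i] != h
    ]


old = b"AAAABBBBCCCCDDDD"
new = b"AAAABBxBCCCCDDDD"  # one byte changed inside the second block
print(changed_blocks(old, new))  # only these blocks would need uploading
```

In-place edits to a container file (the TrueCrypt example above) are the friendly case for this scheme, since the surrounding blocks keep their positions.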
Or you could read the article and get answers immediately.
They check newly shared files against the hashes of files named in previous DMCA requests. If a file matches, it's blocked, just as in this situation.
It's not "policing", it's blacklisting the sharing of specific files by comparing the file's hash against a list of blacklisted hashes.
I just hope they're not using CRC16.
But computing a hash-value IS going through your files.
What if they use a hash that is computed like this:
1. compute md5sum of the data
2. make the last bit zero or one, depending on whether the file has some interesting property.
Suddenly, they can profile you based on "hash-value" alone.
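The hypothetical scheme in the two steps above is easy to write down, which is rather the point. A minimal Python sketch follows; the function names are invented, and nothing here suggests any real service does this.

```python
import hashlib


def tagged_digest(data: bytes, interesting: bool) -> bytes:
    """MD5 digest whose lowest bit is overwritten with a one-bit property flag."""
    digest = bytearray(hashlib.md5(data).digest())
    digest[-1] = (digest[-1] & 0xFE) | int(interesting)  # clear the last bit, then set it
    return bytes(digest)


def read_tag(digest: bytes) -> bool:
    """Recover the smuggled flag from the digest alone, without the file."""
    return bool(digest[-1] & 1)


d = tagged_digest(b"holiday photos", interesting=True)
print(read_tag(d))
```

The result still looks like an ordinary 16-byte hash, and all but one bit of it is the genuine MD5, so an outside observer could not easily tell that it leaks anything.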
If Pandora's box is destined to be opened, *I* want to be the one to open it.
> Viola!
I fail to understand what a stringed instrument, slightly larger than a violin, has to do with it...
It's called AppOps. It was hidden in Android, then removed, but it still ships in standard CyanogenMod.
The image of the error message did not say who, or which corporation, had made the DMCA complaint. I thought that in order for something to be taken down under the DMCA the user had to be told who was complaining.
In this case, the user admits that the file was something that he should not be sharing, but there have been cases where the DMCA has been used to take down legal files. In a case like that, the user must be told who is complaining so that they can challenge the DMCA complaint.
And DropBox is probably the most benign of mainstream cloud hosts. Google, Amazon, Apple and Microsoft all sell content and sign voluminous contracts for the sale of said content. It's not hard to imagine that they would or could be obliged to scan for infringing content and notify the content providers when they find any.
Change a character in the metadata fields, hash changes. If they're scanning the actual video portion of files, add a byte at the end. I don't think that would affect playback.
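The underlying property is simply that a cryptographic hash of the whole file changes completely when any byte does. A short Python sketch, with a made-up payload standing in for a real file, and saying nothing about whether playback survives the extra byte:

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


original = b"\x00\x01 pretend this is a video payload"
padded = original + b"\x00"  # one extra byte at the end

# Count how many of the 64 hex digits differ between the two digests.
differing = sum(a != b for a, b in zip(sha256_hex(original), sha256_hex(padded)))
print(sha256_hex(original) != sha256_hex(padded), differing)
```

This is why exact-hash matching only catches byte-identical copies; it says nothing about files that merely contain the same content.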
If you're not distributing copyrighted material I fail to see how this could be a problem in practice. You'd have to create a public link to your copyrighted file and that link would have to somehow wind up in the hands of the MPAA or other representative of copyright holders.
If you're not distributing copyrighted material I fail to see how this article is relevant at all. They wouldn't care.
He wasn't making an analogy between how you find a hash collision and how you win the lottery -- only comparing the odds.
Dropbox uses SHA-256 hashes. I'm assuming this is what they use for this feature, since it's what they use internally for file identification and deduplication. They actually hash 2 MB file chunks, which means that any file more than 2 MB produces multiple hashes (one per chunk, naturally).
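Per-chunk hashing as described here looks roughly like the following Python sketch. The 2 MB figure is taken from the comment above and not verified; the function name `chunk_hashes` is made up for the example.

```python
import hashlib

CHUNK_SIZE = 2 * 1024 * 1024  # 2 MB, the figure quoted in the comment above


def chunk_hashes(data: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    """Return one SHA-256 hex digest per chunk; the last chunk may be shorter."""
    return [
        hashlib.sha256(data[offset:offset + chunk_size]).hexdigest()
        for offset in range(0, len(data), chunk_size)
    ]


# Two files that share their first chunk also share their first chunk hash.
file_a = b"a" * CHUNK_SIZE + b"tail one"
file_b = b"a" * CHUNK_SIZE + b"a different tail"
hashes_a = chunk_hashes(file_a)
hashes_b = chunk_hashes(file_b)
print(len(hashes_a), hashes_a[0] == hashes_b[0], hashes_a[1] == hashes_b[1])
```

One consequence of hashing chunks rather than whole files is that deduplication can work on partial overlaps, not just on identical files.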
The "many chances of winning" you're referring to here is the birthday collision problem. A good, rough approximation is that for an N-bit hash, while the number of different hashes is 2^N, the number you can generate before risking a collision is about 2^(N/2). So, with SHA-256, we run no significant risk of collision until we've generated around 2^128 ~= 10^38 hashes.
The total amount of data stored worldwide is on the order of 1 ZB. That's room enough for about 10^15 2-MB chunks. Of course, some of our files might be smaller than this 2 MB chunk size, enabling us to be more efficient with storage. We might be able to get somewhere around 10^20 different files in there.
That's a strange and untenable use of all of the world's storage, and it still puts us about 18 orders of magnitude short of being able to risk a SHA-256 collision. If you had this giant set of a ton of different files, the probability of a collision existing is about 1 in 10^37.
So, short of a flaw in SHA-256, you can assume that a hash collision will never happen. We know of no such flaws. (If one is ever found, it will almost certainly be the case that a collision only occurs because one of the two files was specifically manipulated to produce it.)
On the other hand, the odds of winning the lottery are rarely worse than 1 in 10^9.
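The arithmetic in this comment can be checked with a few lines of Python, using the standard birthday approximation p ≈ n² / (2 · 2^N) for n items and an N-bit hash. The inputs (1 ZB of storage, 2 MB chunks, 10^20 files) are the commenter's figures, not measured values.

```python
HASH_BITS = 256
TOTAL_BYTES = 10**21          # 1 ZB, as assumed in the comment
CHUNK_BYTES = 2 * 10**6       # 2 MB chunks
N_FILES = 10**20              # the comment's generous upper bound on distinct files

# Number of hashes before a collision becomes likely: about 2^(N/2).
birthday_threshold = 2 ** (HASH_BITS // 2)

# How many 2 MB chunks fit in 1 ZB.
chunks_in_world = TOTAL_BYTES // CHUNK_BYTES

# Birthday approximation for the chance that any two of N_FILES hashes collide.
p_collision = N_FILES * (N_FILES - 1) / (2 * 2**HASH_BITS)

print(f"threshold  ~ {birthday_threshold:.2e} hashes")
print(f"chunks     ~ {chunks_in_world:.2e}")
print(f"p(collide) ~ {p_collision:.2e}")
```

The computed probability lands within an order of magnitude of the "1 in 10^37" quoted above, which is as close as this kind of estimate needs to be.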
Dropbox is not useful because of what it does - it's useful because of how it does it (seamless for a non-technical end user) and its integration into other, especially mobile, applications. Until you can roll-your-own references into commercial mobile apps, or make sharing a cloud file with a colleague with a different OS and no access to your private net available with a single click, whatever you hack together won't be Dropbox.
Is it just my observation, or are there way too many stupid people in the world?
I refused to use Dropbox ever since its "end to end encryption" claim was shown to be false, and they were de-duping your files. (De-duping required access to the original files, which Dropbox tried to claim they didn't have.)
Then they said they were changing that practice. But how far could you trust them, considering that they had already lied to everybody? Fool me once, and all that.
NOW, apparently they're checking your files -- which they once again claimed they weren't accessing -- for copyrighted content, which again requires access to your original files. (Even if you're just doing an MD5 hash or some such, you still need access to the original file to do it.)
So, yeah. For all those who didn't drop Dropbox when I did, maybe it's time.
You don't need public links for keeping your own account in sync.
That's how it used to be. Nowadays, copyright exists once a creative work is put into tangible form (this comment is creative enough to be covered, and is copyrighted as I type it, since computer memory is tangible). I believe this is the case in all Berne Convention countries.
That doesn't mean there aren't advantages to registering a copyright, but if you bother to look up the law, you'll find that this comment is copyrighted.
"When you have eliminated the unacceptable, whatever is left, however improbable, must be the truthiness" - Holmes