Slashdot Mirror


The Ultimate Weapon Against Censorship?

Erik Moeller writes "David Madore, mathematician at ENS, describes a method that might be the ultimate weapon in the battle against Internet censorship. In his paper A method of free speech on the Internet: random pads he introduces a system of so-called pads, chunks of random data that are used to encrypt controversial information.

Every byte in the source file is XOR'd with exactly one byte in the random file. The result file, by itself, is totally indistinguishable from white noise, provided that the pad used is truly random. Madore now suggests that users store pads on different servers and use several of them in combination to encrypt data.

A FTP or WWW site that stores one of the pads could argue that they are only storing random noise, and another might do the same. It would be mathematically impossible to prove them guilty of storing illegal information (unless there is a way to prove that one pad was created after the other). Only by the combination of the two (or more) files I am able to retrieve the original controversial information. The critical parts are the links to the pads I need to obtain the information, but those might be traded on a distributed system like Gnutella or FreeNet. Plus links take very little space and can be relocated easily to freespace ISPs.

The concept is a little more complicated than my summary here, so please read the paper (and mirror it, it's GPL'd!). There are already scripts and programs to create pads and restore the original files (including a GUI program for Win32). I might add that the idea of pad encryption is fairly old, already used in WWII -- its advantage is that it is mathematically safe if the pads are truly random and only used once, thus its name "One Time Pad"."
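
A minimal Python sketch of the XOR step summarized above, assuming the message is split so that all but one pad is fresh random data and the last pad is the message XORed with the others. The function names (xor_bytes, split_into_pads, combine_pads) are illustrative and are not taken from Madore's scripts.

```python
import secrets
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings, byte by byte."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def split_into_pads(message: bytes, n_pads: int) -> list:
    """Return n_pads byte strings whose XOR equals message.

    The first n_pads - 1 pads are fresh random data; the last one is the
    message XORed with all of them.
    """
    if n_pads < 2:
        raise ValueError("need at least two pads")
    random_pads = [secrets.token_bytes(len(message)) for _ in range(n_pads - 1)]
    final_pad = reduce(xor_bytes, random_pads, message)
    return random_pads + [final_pad]


def combine_pads(pads: list) -> bytes:
    """XOR every pad together to recover the original message."""
    return reduce(xor_bytes, pads)


message = b"controversial text"
pads = split_into_pads(message, 3)   # e.g. one pad per server
recovered = combine_pads(pads)
```

Each pad could then be stored on a different server; any proper subset of the pads is uniformly random, and only the XOR of the full set gives back the message.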

7 of 181 comments

  1. You can go further with secret sharing. by Paul+Crowley · · Score: 4

    "Secret sharing" allows you to break a piece of data (usually a secret key) into N "shares", such that you only need M < N shares to reconstruct the secret, but such that you don't have sufficient information to reconstruct the secret with M-1 shares (i.e. it's not just impractical, it's information-theoretically impossible). This means you could extend the scheme to keep working even if one or more of the participating sites go offline.

    However, I don't believe any such scheme will work. If it turns out that existing law is insufficient to prosecute participants, they'll extend the law so that acting in a way that could facilitate such a scheme is illegal, and that will include participating in FreeNet, Gnutella, the Eternity service, or whatever. That's why we need both the technology and the data havens.
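
    The M-of-N secret sharing described above can be sketched with Shamir's polynomial scheme, which is one standard construction (the comment does not name a specific one). The prime, the function names and the integer encoding of the secret are assumptions made for illustration.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime; the secret must be smaller than this


def make_shares(secret: int, m: int, n: int) -> list:
    """Split secret into n shares, any m of which reconstruct it."""
    if not (1 <= m <= n) or not (0 <= secret < PRIME):
        raise ValueError("need 1 <= m <= n and 0 <= secret < PRIME")
    # Random polynomial of degree m - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(m - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares: list) -> int:
    """Lagrange-interpolate the polynomial at x = 0 to get the secret."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


shares = make_shares(secret=1234567890, m=3, n=5)
recovered = recover_secret(shares[:3])  # any three of the five shares work
```

    With fewer than m shares, every candidate secret is equally consistent with what you hold, which is the information-theoretic property mentioned above.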
    --

  2. Ultimate Weapon Against Censorship?! by hypergeek · · Score: 4
    While this, in conjunction with Freenet may make censorship more difficult, and possibly more tricky from a PR point of view, it's a simple matter for a large governmental body to find and stamp out all the freenet-type servers in its jurisdiction.

    The best weapon against censorship is getting the general public rallied to your cause. Slinking around in the underground only makes you look more criminal to the average joe, and easier for any censorial body to sway public opinion against you. (Remember the panic about "hackers" from the early 90s to the present?)

    Failing that, though, the second best weapon, IMO, would be true anonymity. Would it be possible to have host addresses spontaneously, randomly generated, encrypted, and routed to the destination in a kind of virtual circuit?

    Then, when the connection is terminated (or even beforehand if constant generation of new addresses is part of the scheme), the address is discarded, never to be used again (except perhaps by coincidence).

    If someone wants to communicate, um, "nonymously" (as opposed to anonymously, of course :), they'd simply use digital signatures, but anonymity would be the default.

    Unfortunately, I'm not sure exactly how the non-addressing scheme would be implemented, and it would be of limited use to servers (which would require static addresses anyway), but with a shared client/server mechanism such as Gnutella, Freenet, (or OpenCOLA, for that matter ;), you could have a "swarm" network. Like a swarm of insects, you can definitely see that the swarm's there, you can tell when one insect bites you, but you can't track down that individual insect, as it gets lost in the swarm again.

    Or something like that. (It's 1:50-ish a.m., so I'm not exactly bright-eyed and bushy-tailed... :)

    --
    Stay up hacking each weekend. Sleep is for the week.
  3. NSA & Venona by Detritus · · Score: 4

    The NSA and its predecessors have been attacking problems like this for over fifty years. You take a bunch of intercepted messages, select two messages, overlay one message on top of another, subtract or exclusive-or the messages, look for a non-random result, shift or rotate one of the messages by a character or code group, and repeat. Continue until each message has been compared to every other message. The statistical anomalies indicate that two messages were encrypted with the same pad or additive. The NSA used this method to detect Soviet messages that had been encrypted with the same one-time-pad. The Soviets ran short of one-time-pads during World War II and issued duplicate pads to AMTORG and the KGB. It was also used to break naval codes that used a code book and random additive from a second book. Using multiple files makes the problem larger but the same techniques can be used.
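
    A toy Python sketch of the "XOR two messages and look for a non-random result" step, assuming ASCII plaintexts and byte-wise XOR pads. The statistic used here (the fraction of bytes with the top bit set) and the sample messages are illustrative assumptions, not a description of the historical method; the shifting and rotating step is left out.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings (truncates to the shorter one)."""
    return bytes(x ^ y for x, y in zip(a, b))


def high_bit_fraction(data: bytes) -> float:
    """Fraction of bytes with the top bit set; about 0.5 for random data."""
    return sum(b >> 7 for b in data) / len(data)


# Two made-up ASCII messages of equal length.
text_a = b"MEET THE COURIER AT THE NORTH GATE AT DAWN " * 8
text_b = b"SHIPMENT DELAYED UNTIL FURTHER NOTICE STOP " * 8
n = min(len(text_a), len(text_b))
text_a, text_b = text_a[:n], text_b[:n]

shared_pad = secrets.token_bytes(n)
other_pad = secrets.token_bytes(n)

# Same pad used twice: the pad cancels, leaving text_a XOR text_b.
reused = xor_bytes(xor_bytes(text_a, shared_pad), xor_bytes(text_b, shared_pad))
# Different pads: the result is still indistinguishable from noise.
fresh = xor_bytes(xor_bytes(text_a, shared_pad), xor_bytes(text_b, other_pad))

print(high_bit_fraction(reused), high_bit_fraction(fresh))
```

    ASCII bytes never have the top bit set, so the XOR of two ASCII texts does not either, while the XOR of independently padded messages sets it about half the time. That gap is the kind of statistical anomaly that flags a reused pad.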

    --
    Mea navis aericumbens anguillis abundat
  4. I'm not impressed. by rjh · · Score: 4

    I am an InfoSec professional IRL, but I am not speaking for my employer, yadda yadda, this is not professional advice, insert standard disclaimer.

    First: I've never heard of this fellow. I don't recall seeing his name in any of the crypto journals. I don't recall seeing any particularly clever attacks from him in the past. Protocol design is tough; it is an exceedingly nontrivial task, even harder than designing new algorithms. Anyone who says they have a great new protocol is most likely lying, unless they're a Tuchman, a Coppersmith or a Schneier.

    Always assume all new protocols are full of it, until enough time and attacks have gone by to give confidence that the protocol is only mostly full of it.

    Second: this system is not secure. Repeat after me: a one-time pad is secure as long as it's only used once. A birthday attack is orders of magnitude more likely than he's making it out to be. The reason is that Net traffic is not uniform; certain places tend to be "hot" and others "cold".

    Let's have a thought experiment. Let's say Slashdot begins to implement this system, and has a few thousand "pad blocks" available. This means a few hundred megabytes of purely random data--let's completely ignore the practical difficulties of purely random data for now and just assume we can do it.

    When Alice decides to store something unpopular and encrypts it with Pad(s) alpha, beta and gamma, so that Bob and Charlie can read it later, what's Alice going to do? -- Probably use one of the first twenty pads listed. Why? Because people are lousy at choosing random numbers. If you ask someone to pick a number at random, they're most likely to pick a number between one and ten, not one and fifty billion. Things that are at the head of a list get selected more often than those that aren't.

    Let's say that Slashdot randomizes these pads, though, so they always come up in an unpredictable order. (Never mind the practical difficulties in how to do this in the first place. It's a thought experiment. Just keep alive in the back of your mind the fact that (a) we've had to create hundreds of megabytes of purely random numbers, and (b) we have to present them to people in a purely random way.)

    After some mathematics, Alice's super-secret Neiman-Marcus cookie recipe is now pretty much totally obscured. She posts the recipe to a Website, and then tells Bob and Charlie, "Psst! I posted the information to this site. Find pads with IDs of [she recites their IDs] and use that to recover the information!"

    At that point the secret police storm in, having been eavesdropping on the entire conversation. They throw Alice, Bob and Charlie in jail. They go to the website, pull the information, get the pads and read the Neiman-Marcus Cookie Recipe for themselves. Guess what? This protocol has completely, totally and utterly failed.

    The naive response is to say "well, they wouldn't say it in the open... they'd use encrypted email to share the pad IDs!" Okay, fine. All that's happened is the encrypted email is the weak link in the security; if that goes, the entire scheme falls apart.

    Now recall those two extremely thorny problems from before. Hundreds of megabytes of purely random data are very hard to come by, and purely random presentation of random data is very hard to do. Add in the implementation weaknesses to the weakness of the communications channel between Alice, Bob and Charlie, and you've got a protocol which has very little merit.

    This protocol solves a problem which doesn't exist, as far as I can tell. Now, admittedly, I'm not the sharpest knife in the drawer and I'm also bone tired and I could be totally misunderstanding what the goal of the protocol is.

    But for a secret-sharing protocol, or as a way to securely store information in a way which is deniable, it's pretty dismal.

  5. The Real Ultimate Weapons Against Censorship... by istartedi · · Score: 4

    ...are a strong social framework, a tradition for the respect of individual rights, and a rational government working in harmony.

    Stop looking for technological fixes to problems that aren't technological.


    The regular .sig season will resume in the fall. Here are some re-runs:
    --
    For all intensive purposes, "whom" is no longer a word. That begs the question, "who cares"?
  6. One Time Pad Snake Oil by Effugas · · Score: 5

    *Sigh*

    Everybody loves the One Time Pad.

    Can't imagine why. It's like, couple words out of Shannon saying a system can be provably uncrackable, as long as it's far too annoying to actually use, and people convert that to:

    Let's just make it not annoying to use.

    Problem is, the security comes from that annoyance, and degrades ungracefully: Very, very ungracefully. As in, the moment one pad gets compromised, or even reused, boom. Game over. You're done.

    Compound that by having key material retrieved by the encryptor over a network (as this system depends on), and you're even more done. Let's analyze what's going on here a bit.

    All cryptosystems are essentially engines for extracting the secrecy from a set of data. Secrecy is something even more intangible than the raw data that itself is secret; a very large quantity of information can be stored and transferred, but a secret can only be transferred if that data can be understood. Cryptography essentially works by allowing the comprehensibility of data--though not the data itself--to be extracted and simplified down to some other piece of data.

    Now, often that data can be much, much smaller. Broadbridge Media, for instance, takes direct advantage of this for reasonably secure mass data distribution of music videos on CDs--some large ciphertext gets mass distributed on CDs or DVDs, while a small, personalized transaction over the Internet allows an individual to retrieve the key which decrypts the ciphertext into plaintext. The mass data is moved, but remains incomprehensible until a relatively tiny amount of key material is transmitted to the destination host.

    Madore's system is somewhat similar; he still has a chunk of extracted secrecy composed of a "recipe of pads" which, when XOR'ed together, reveal the plaintext. This recipe can be as small as literally two pads; an innocent "complete works of Shakespeare" page and some extension thereof.

    First problem? Madore gets his pad indexes from the first couple of bytes of whatever pad he's come across. PGP has survived reasonably well with a 2^32 complexity attack against its public keyspace indexes (it's called the DEADBEEF attack); Madore's system however is likely to find collisions in everyday use.
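
    A rough way to put numbers on the collision risk is the standard birthday approximation, sketched here in Python under the assumption that identifiers are uniformly random. The function name and the identifier widths tried (32 and 64 bits) are illustrative choices, not figures taken from the paper.

```python
import math


def collision_probability(n_items: int, id_bits: int) -> float:
    """Approximate chance that at least two of n_items random IDs collide.

    Uses the birthday approximation 1 - exp(-n(n-1) / (2 * 2**id_bits)).
    """
    space = 2 ** id_bits
    return -math.expm1(-n_items * (n_items - 1) / (2 * space))


for id_bits in (32, 64):
    for n_items in (1_000, 100_000, 10_000_000):
        p = collision_probability(n_items, id_bits)
        print(f"{id_bits}-bit IDs, {n_items:>10,} pads: p = {p:.3g}")
```

    The approximation assumes uniformly random identifiers. Identifiers copied from the start of structured data, or chosen by an attacker, would collide far more often than this estimate suggests.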

    It never ceases to amaze cryptographers that, for all the functionality of the fixed-output, one-way hash (password storage, small indexes to arbitrarily sized inputs), people don't use them. There really aren't that many flat out solved problems in all of crypto, this is one of them. IF YOU'RE NOT STORING YOUR PASSWORDS AS EITHER MD5 OR SHA-1 HASHES, YOU'RE WAITING TO GET HACKED. *sigh*

    Anyway, beyond that small chunk of data which gives the recipe of which block to use, there's also the censorworthy-but-XOR-obfuscated block which will supposedly diffuse itself throughout the network. Whereas Broadbridge got its incomprehensible data out the door on CDs, Madore's system invokes the distributed nature of many, many XORable keyblocks to hide which block on the network is the actual censor-worthy block.

    But how many blocks do I need to use for a recipe? Suppose I have 200 random blocks to choose from, and I download one block of random key material. Wait. Let's say I'm really paranoid, and I generate my own random block to XOR against, and upload it to a server. OK. So I've gotten my single block to XOR against, I do so, and I upload my data-containing block to the padservers.

    I've already lost.

    Whether I downloaded my keyblock from the network, or uploaded it to the network, anybody sniffing my network traffic will see the exact block I used to encrypt against. They'll either watch it leaving the keyserver or going back in.

    Worse, let's assume there was no sniffer--just 201 random blocks, any two of which can be XORed together to reach plaintext. The complexity isn't one of fifty billion, it's 201*201, or a good 40,401 operations. Use of two pads isn't particularly specified...but then, use of this as a viable encryption system isn't particularly specified either. You can tell, by this line:
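
    The exhaustive search over pad pairs can be sketched as follows, assuming a two-pad recipe and a plaintext that is recognizable as printable ASCII. The helper names, the sample plaintext and the choice of pad 57 are hypothetical. Since XOR is symmetric, only the 201*200/2 = 20,100 unordered pairs need trying, which is below the 201*201 upper bound quoted above.

```python
import secrets
from itertools import combinations


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def looks_like_text(data: bytes) -> bool:
    """Crude recognizer: every byte is printable ASCII or common whitespace."""
    return all(32 <= b < 127 or b in (9, 10, 13) for b in data)


def find_plaintext_pairs(pads: list) -> list:
    """Try every unordered pair of pads and keep the readable results."""
    hits = []
    for (i, a), (j, b) in combinations(enumerate(pads), 2):
        candidate = xor_bytes(a, b)
        if looks_like_text(candidate):
            hits.append((i, j, candidate))
    return hits


secret = b"the forbidden cookie recipe: flour, sugar, butter"
pads = [secrets.token_bytes(len(secret)) for _ in range(200)]  # 200 noise pads
pads.append(xor_bytes(secret, pads[57]))  # the 201st pad hides the secret

hits = find_plaintext_pairs(pads)
print(hits)
```

    The search finishes almost instantly on any modern machine, which is the point being made: with no secret involved in choosing the pads, the recipe adds essentially no work for a reader or a censor.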

    "Your first task is to locate an announcement stating that the data you want are recoverable by XORing such a set of pads."

    Oh, that's all.

    "Go find your key."

    Obviously, with no special complexity applied to locating your key, there's nothing that separates You As Reader from You As Censor. And, since whoever determines a key used *once* for secret information determines it for all time...boom.

    But, let's be fair. Madore's goal mainly seems to be able to give websites the capability to host information they can't recognize. Freenet did this; Madore doesn't actually even come close. Among other things, the system isn't particularly fault tolerant. Good secret sharing systems allow m-of-n functionality, i.e. retrieval of any m number of shares from n total (like 3-of-5) reveals the data. This system? Any block is missing--and there doesn't need to be more than two--and your data is gone. Loss of a single pad archive is likely to cause some data to disappear forever. Ouch.

    Honestly, I'm putting too much energy into this. Madore writes the following:

    The pads, of course, are just named by their 16-hex-digit names (thus, strictly speaking, the announcement makes it possible to recover the first eight characters of the data; but that should not be a problem).

    Any cryptosystem which leaks information about the plaintext in the key material never should have left the drawing boards. I congratulate Madore on noticing this, of many flaws in his design, but this really is Bad Crypto. It's timely, and it's useful, and it'll hopefully prevent people from falling for other Pad scams by sheer nature of the /. reaction, but it's still Bad Crypto.

    *Sigh* At least he wasn't trying to sell us anything.

    Yours Truly,

    Dan Kaminsky
    DoxPara Research
    http://www.doxpara.com

  7. Some replies to various criticisms by David+A.+Madore · · Score: 5

    Hi. I'm the author of the page in question, and victim unaware of the Slashdot effect (well, not truly unaware: Erik Moeller, who posted the story, was kind to notify me in time). I received many emails about it, which I've all read, as well as a good many posts in the current discussion. I can't possibly reply to them all, but I'll try to answer some of the most frequent or important comments here.

    First note that the page was written in February (2000/02/19 to 2000/02/23 to be precise), so it is not new. However, I do not claim any kind of originality, nor paternity of the idea: it is a small variation on the protocol described in section 6.3 ("Anonymous Message Broadcast") of Bruce Schneier's book on cryptography. In any case, I think it is pretty obvious in the first place. I am merely suggesting a few practical ideas to make it workable. There is nothing great or revolutionary about anything, and I never made that claim.

    One thing should be made clear from the start: the whole idea is not about obscuring what the data is (i.e. it is not strictly speaking cryptography) but about who is sending the data. And, even more specifically, it is about making legal conviction impossible so long as the presumption of innocence is maintained (whether the presumption of innocence still means anything in these dark days is another question :-/); thus, it is normal that the story appeared on Slashdot's "Your Rights Online" section.

    Please also note that I am not making a political statement. This is not a libertarian manifesto. I am not stating that you should use this system to send out assassination messages against the President / the Prime Minister / the King / the Pope / <insert your favorite assassination victim here>; I am merely stating that you can, and that this is none of my business.

    Many have pointed out that my suggested way of naming pads is bad. That's true: using the MD5 (or SHA1 or any other kind of hash) signature would be a better idea. But it doesn't really matter all that much what the pads are named unless we want the system to be resistant to malicious tampering, which was not one of my avowed goals. Indeed, we can get this almost for free, so we might as well. Let's say we could have a symlink pointing from pad_md5_whatever.dat to the pad of the given md5 for each pad in each repository, and "combination recipes" could be given with these links so as to make them resistant to tampering.
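
    The hash-based naming suggested above might look like this in Python. The pad_<algorithm>_<digest>.dat pattern follows the pad_md5_whatever.dat example; the function names and the optional algorithm parameter are assumptions added for illustration.

```python
import hashlib


def pad_name(pad: bytes, algorithm: str = "md5") -> str:
    """Derive a file name for a pad from a digest of its full contents."""
    digest = hashlib.new(algorithm, pad).hexdigest()
    return f"pad_{algorithm}_{digest}.dat"


def verify_pad(name: str, pad: bytes, algorithm: str = "md5") -> bool:
    """Check that a downloaded pad still matches the name in a recipe."""
    return name == pad_name(pad, algorithm)


pad = bytes(range(256))
name = pad_name(pad)
print(name, verify_pad(name, pad), verify_pad(name, pad + b"\x00"))
```

    MD5 is used as the default only because it is the hash named in the comment; practical collision attacks against MD5 have since been published, so a recipe meant to resist deliberate tampering would pass a stronger algorithm such as "sha256".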

    Similarly for secret sharing: my idea was not to have a system which is hard to censor (there are other, far better, solutions for this), but to have one which is hard to track.

    Another thing I should make quite clear is that the system in itself is not used to hide data: it is used to hide the origin of data. This is why all comments on the "OTP is secure as long as the pad is truly one-time" line, or all remarks to the effect that it is trivial to find all relevant data among the padset, are quite true but completely irrelevant. If you want to hide the data on top of hiding the origin, then you use a traditional cipher; for example, you encrypt your data using blowfish and you use that data (the ciphertext, which for all intents and purposes is random) as input to the pad system. So long as you don't release the key, nobody can tell that there's a blowfish-encrypted data hidden in the pad system. The two are completely orthogonal. (It is true that my remark about the difficulty of finding "recognizable data" in the pad system is very misleading and irrelevant. I should remove that: never mind that part.) As for my comment about the birthday effect, it is merely about accidental collisions, not at all about malicious action.

    Somebody asks what is wrong with storing all pads in the same place since anyone can download them all. That is true, but that is beside the point. The point is that as long as a site does not have a complete set of pads yielding readable data, it is not, by itself, breaking any law, and all it is distributing is white noise; whereas if it stores one complete set of pads, then it is distributing the forbidden document in some form. Naturally, if someone wants to collect a complete set of pads, it is a good idea; but to distribute it is dangerous.

    Finally, there is the central question of whether the legal argument (which is the crux of the matter) holds water. Presumably it doesn't, but that will at least prove one thing: the argument shows that any kind of law restricting free speech contradicts the presumption of innocence. Some have pointed out that one could monitor the pad system, and the last pad published in a set of pads would always be the culprit: this is not true, because it might have been delayed, or it might be provably innocent (which implies the former, actually), and you can never quite be sure.

    Imagine the following scenario: someone points out on some Usenet group that eight publicly available pads, when XORed together, give something like DeCSS code. Judge summons the 'someone' in question, who claims that he just noticed that by randomly XORing pads together; not unconvincing, so judge lets the guy go. Then judge summons the pad owners. Starts with the most recently published pad: but the owner explains "look, my pad is just an encryption using the key 'foobar' of the first 128kb of (some standard transcription of) Shakespeare's Tempest; the idea had been floating around for some time, I just decided to publish it". Judge checks statement: it's true. So apparently the data was "published" earlier than was thought, it just took some time to come out; that makes things rather difficult to track. Second owner similarly points out that his pad is just a sequence of decimals of pi in binary. Third owner is in a country over which judge has no jurisdiction, so nothing to do there. Fourth and fifth owners seem to have created their pads at the very same time, and both state obstinately that they generated pure white noise (following, say, a story on Slashdot about pads being a great idea). Sixth owner says he generated his pad by XORing another dozen other pads with an innocent message (which he shows to judge). Seventh owner refuses to answer judge's question. Eighth owner posted his pad before DeCSS even appeared, so must be innocent (or really?). Now what does judge do? Convict some owners? All? None? Problem is, judge is impressed with first poster's proof, and can't run the risk of convicting someone who might afterward prove that his pad was innocent. Presumption of innocence.
    Even if judge merely issues an injunction that the pads be taken off the network, every owner appeals on the ground that the pads were reused in making some other messages (innocuous ones) and that removing them would be a serious breach of first amendment (or whatever you call this thing about free speech).

    Anyhow, this is the summary: there's nothing new or revolutionary about the whole pad system; in fact, it's pretty trivial. But it does make one point: that information is fundamentally delocalized and that any attempt to pinpoint it or to find a culprit will fail. For the better or for the worse.