Brightnets are Owner Free File Systems

Posted by CmdrTaco on Monday June 30, 2008 @01:00AM from the something-to-think-about dept.

elucido writes "OFF, or the Owner-Free Filesystem, is a distributed filesystem in which everything is stored in reference to randomized data blocks, as opposed to a 1:1 copy of the original data being inserted. The creators of the Owner-Free Filesystem have coined a new term to define the network: A brightnet. Nobody shares any copyrighted files, and therefore nobody needs to hide away. OFF provides a platform through which data can be stored (publicly or otherwise) in a discreet, distributed manner. The system allows for personal privacy because data (blocks) being transferred from peer to peer do not bear any relation to the original data. Incidentally, no data passing through the network can be considered copyrighted because the means by which it is represented is truly random." Their main wiki page discusses a bit of what this means and how it might work as well. I've been saying that we need this for many years now, if only because we all have 10 gigs free on our machines and if we could RAID the internet we'd need fewer hard drives.

Re:Psst. Copyright doesn't work like that! by Richard_at_work · 2008-06-30 01:13 · Score: 4, Interesting

Yup, and attempted get-arounds like this are stuff courts love to slap down.
Re:Data != Information by iocat · 2008-06-30 01:18 · Score: 5, Interesting

You're right, but wouldn't this move the 'infringer' to the guy who had the URL to put all the little random chunks together into a Maroon Five file on his PC, not the girl who had one 128K chunk that *could be* used to represent the Maroon Five file -- or a shopping list -- on her PC?

--
Dude, I think I can see my house from here.
From the Wiki by Lord Bitman · 2008-06-30 01:20 · Score: 4, Interesting

"A simple analogy is seen in that every number has an infinite number of representations (3+2=5, 2*2+1=5, 10-5=5, 10/2=5, etc). Even if the number (file) in question can be copyrighted under current legislation, it is practically impossible and unreasonable to state that every other representation of that particular number is copyrighted."
Actually, no, it's not unreasonable or impractical. In fact, that's how it actually works. Star Wars is copyrighted as a DVD, film, MPEG, script, live performance, song, interpretive dance, etc... right?

--
-- 'The' Lord and Master Bitman On High, Master Of All
Re:Psst. Copyright doesn't work like that! by Richard W.M. Jones · 2008-06-30 01:22 · Score: 3, Interesting

You copyright the actual tangible information. Attempting to abstract the law into mathematics is pointless. They are not compatible.
You're dead right. What is interesting is that if you're "caught" with some of these random blocks on your disk, they're just random blocks of data. You can't decode them unless you have the key, hence there's no charge of copyright infringement.
One problem with the proposal (which, by the way, is very obvious, and is how Freenet and other systems work) is that their key needs to be the same length as the data, because it's effectively a One Time Pad. If it's any shorter than the original data, then there will be a way to decrypt the data without the key (proof by a simple counting argument).
Rich.
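A tiny sketch of that counting argument, on a toy 2-byte message. The helper names are made up for illustration, and stretching a short key by repeating it is an assumption (any deterministic stretching gives the same count). With a full-length key every possible plaintext is consistent with the ciphertext; with a shorter key most plaintexts can be ruled out without knowing the key.

```python
from itertools import product

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def reachable_plaintexts(ciphertext: bytes, key_len: int) -> set:
    """Every plaintext that some key of key_len bytes could decrypt to.

    Assumption: a short key is stretched by repeating it.
    """
    results = set()
    for key in product(range(256), repeat=key_len):
        stream = bytes(key[i % key_len] for i in range(len(ciphertext)))
        results.add(xor_bytes(ciphertext, stream))
    return results

ciphertext = bytes([0x5A, 0xC3])      # an arbitrary 2-byte "random block"
short_key = reachable_plaintexts(ciphertext, key_len=1)
full_key = reachable_plaintexts(ciphertext, key_len=2)
print(len(short_key), len(full_key))  # prints: 256 65536
```

With the 1-byte key only 256 of the 65536 possible 2-byte plaintexts remain, so the ciphertext leaks information; with the 2-byte key nothing is ruled out.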

--
libguestfs - tools for accessing and modifying virtual machine disk images
Worrisome... by zetazentra · 2008-06-30 01:42 · Score: 5, Interesting

http://wiki.offdev.org/Talk:Why_is_OFF_safe%3F :
Trojan detected with avg free
Another side to the safety issue. I'm hoping this is a false positive, as I like OFF
* avg free v7.5.516 virus base 269.17.13/1208 finds
  o Trojan Generic9.AKLU in
    + offsystem.exe from OFFStystem-0.18.00-win-installer.exe from sourceforge January 3 2008
This is worrisome...
Re:Encryption by Hal_Porter · 2008-06-30 01:47 · Score: 5, Interesting

Replying to my own post, but this IS just a sort of encryption - their main claim being because the data is encrypted, it's not copyright.
As has been pointed out below, the data transferred is not the thing copyrighted - it's what it represents. So it's an arduous and painful encryption, with high overhead, easy to crack and no plausible benefit. With some hand-wavy 'it annuls all badness from bad things' explanation.
Except that is probably bullshit to copyright lawyers.
There's a great explanation of why in this essay, "What Colour are your bits?". It's actually about another system based on the same sort of ideas.
http://ansuz.sooke.bc.ca/lawpoli/colour/2004061001.php

The fallacy of Monolith is that it's playing fast and loose with Colour, attempting to use legal rules one moment and math rules another moment as convenient. When you have a copyrighted file at the start, that file clearly has the "covered by copyright" Colour, and you're not cleared for it, Citizen. When it's scrambled by Monolith, the claim is that the resulting file has no Colour - how could it have the copyright Colour? It's just random bits! Then when it's descrambled, it still can't have the copyright Colour because it came from public inputs. The problem is that there are two conflicting sets of rules there. Under the lawyer's rules, Colour is not a mathematical function of the bits that you can determine by examining the bits. It matters where the bits came from. The scrambled file still has the copyright Colour because it came from the copyrighted input file. It doesn't matter that it looks like, or maybe even is bit-for-bit identical with, some other file that you could get from a random number generator. It happens that you didn't get it from a random number generator. You got it from copyrighted material; it is copyrighted. The randomly-generated file, even if bit-for-bit identical, would have a different Colour. The Colour inherits through all scrambling and descrambling operations and you're distributing a copyrighted work, you Commie Mutant Traitor.
To a computer scientist, on the other hand, bits are bits are bits and it is absolutely fundamental that two identical chunks of bits cannot be distinguished. Colour does not exist. I've seen computer people claim (indeed, one did this to me just today in the very discussion that inspired this posting) that copyright law inescapably leads to nonsense conclusions like "If I own copyright on one thing, and copyright inherits through XOR, then I own copyright on everything because everything can be obtained from my one thing by XORing it with the right file." That sounds profound only if you're a Colour-blind computer scientist; it would be boring nonsense to a lawyer because lawyers are trained to believe in and use Colour, and it's obvious to a lawyer that the Colour doesn't magically bleed to the entire universe through the hypothetical random files that might be created some day. You could create the file randomly, but you didn't. Maybe you could create a file identical to the complete works of Shakespeare by XORing together two files of apparently random garbage. "Why, so can I, or so can any man;" but that doesn't mean that I am William Shakespeare.

--
echo -e 'global _start\n _start:\n mov eax, 2\n int 80h\n jmp _start' > a.asm; nasm a.asm -f elf; ld a.o -o a;
Reinventing the wheel by kenp2002 · 2008-06-30 03:21 · Score: 3, Interesting

I love kids these days, always thinking they are clever.
A long time ago a man wrote a book, he then made an index of all the words in the book and listed them in alphabetical order.
He then re-copied the book as a reference to the index.
Original: "I am the king of scotland"
Index: AM,I,KING,OF,SCOTLAND,THE
Story: 2-1-6-3-4-5
Now this idea is nothing more than seeding a network with the index of the data; to rebuild a particular file, you pass in an index reference.
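For the curious, here is a rough sketch of that word-index scheme. The function names are invented for illustration, and uppercasing plus whitespace splitting are simplifying assumptions, not anything from the old book.

```python
def build_index(text: str) -> list:
    """Alphabetical list of the distinct words in the text (uppercased)."""
    return sorted({word.upper() for word in text.split()})

def encode(text: str, index: list) -> list:
    """Replace each word by its 1-based position in the index."""
    position = {word: i + 1 for i, word in enumerate(index)}
    return [position[word.upper()] for word in text.split()]

def decode(refs: list, index: list) -> str:
    """Rebuild the text from the index references."""
    return " ".join(index[r - 1] for r in refs)

original = "I am the king of scotland"
index = build_index(original)
story = encode(original, index)

print(",".join(index))             # AM,I,KING,OF,SCOTLAND,THE
print("-".join(map(str, story)))   # 2-1-6-3-4-5
print(decode(story, index))        # I AM THE KING OF SCOTLAND
```

Neither the index nor the reference list is the book on its own, but anyone holding both gets the text back, which is the whole point of the comparison.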
They would simply bust people for passing the index reference.
Ironically, that old book became the foundation for modern-day text compression schemes that use indexes, and for many of the key concepts that cryptography was born from.
Clever kids, if it was still the 1500s and you were trying to smuggle banned books under the nose of the Inquisition. They just burned people with the indexes just as if they had the books themselves.
Honestly, do they really think that people are that stupid? If I use a pencil to stab someone, I am going to jail just the same as if I had used a knife. If someone is smuggling something across the border, but I don't know what, I am still an accomplice to some degree.
Plausible deniability is a great idea, but the moment one of those indexes lands on your PC you're gonna get dinged for whatever the index points to.

--
-=[ Who Is John Galt? ]=-
Re:Encryption by maraist · 2008-06-30 11:36 · Score: 3, Interesting

I read all the wiki had to offer. I agree that this is a problem - I'm going to see if I can post to their complaint forum as well.
Basically you have a URL that contains a file name (largely meaningless except to the end user), a file size, and three 20-byte SHA-1 hashes (40 hex characters each in the URL), referred to here as a 3-tuple. You search the local disk cache for file names that match each of the SHA-1 digests. For every digest file not found, search the local network for a match and download the block locally (this is the peer-to-peer part).
You XOR the contents of the 3 blocks (which happen to be sized at 128K - no significance) to produce the decoded data.
The first decoded block (provided from the URL) is a sequential list of 60-byte 3-tuples (similar to the original URL). The contents of all of these 3-tuples are the desired data, except the last 3-tuple, which is a chain to the next descriptor block.
The file-size tells you when to stop obviously.
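To make the walk-through concrete, here is a toy sketch of the insert and retrieve steps as I understand them from the wiki. It is not the OFF code: every name in it is made up, the store is a plain dict standing in for the disk cache and the network, and the block size is shrunk to 180 bytes so a descriptor holds two data tuples plus one chain tuple. SHA-1 digests are 20 bytes, so a 3-tuple here is 60 bytes.

```python
import hashlib
import os

BLOCK = 180                 # toy block size (the post says real blocks are 128K)
DIGEST = 20                 # SHA-1 digest length in bytes
TUPLE = 3 * DIGEST          # one 3-tuple of digests
PER_BLOCK = BLOCK // TUPLE  # tuple slots per descriptor; the last slot is the chain

def xor3(a, b, c):
    return bytes(x ^ y ^ z for x, y, z in zip(a, b, c))

def put(store, block):
    """Store a block under its SHA-1 digest and return that digest."""
    name = hashlib.sha1(block).digest()
    store[name] = block
    return name

def encode(store, block):
    """Store one block as three random-looking blocks; return their 3-tuple."""
    block = block.ljust(BLOCK, b"\0")
    r1, r2 = os.urandom(BLOCK), os.urandom(BLOCK)
    return put(store, r1) + put(store, r2) + put(store, xor3(block, r1, r2))

def decode(store, tup):
    """Fetch the three blocks named by a 3-tuple and XOR them together."""
    a, b, c = (store[tup[i:i + DIGEST]] for i in range(0, TUPLE, DIGEST))
    return xor3(a, b, c)

def insert_file(store, data):
    """Return (root 3-tuple, file size): the information carried by the URL."""
    tuples = [encode(store, data[i:i + BLOCK]) for i in range(0, len(data), BLOCK)]
    per = PER_BLOCK - 1
    groups = [tuples[i:i + per] for i in range(0, len(tuples), per)]
    chain = b""
    for group in reversed(groups):   # build back to front so each can chain on
        chain = encode(store, b"".join(group) + chain)
    return chain, len(data)

def read_file(store, root, size):
    if size == 0:
        return b""
    out, slot = bytearray(), 0
    desc = decode(store, root)
    while len(out) < size:           # the file size tells you when to stop
        if slot == PER_BLOCK - 1:    # last slot: follow the chain
            desc, slot = decode(store, desc[slot * TUPLE:(slot + 1) * TUPLE]), 0
        out += decode(store, desc[slot * TUPLE:(slot + 1) * TUPLE])
        slot += 1
    return bytes(out[:size])

store = {}
payload = os.urandom(1000)
root, size = insert_file(store, payload)
print(read_file(store, root, size) == payload)   # prints: True
```

Real OFF reuses existing blocks as the random inputs instead of generating fresh ones each time; this sketch skips that part to keep the round trip readable.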
The 'theory' is that highly randomized data should be randomly reused by completely unrelated data: .mp3 and .txt files, for example. Moreover, there is 'no way to reconstruct' useful data w/o the 3-tuple AND the file size. However, small files will have a high probability of SHA-1 collisions (and thus corrupted data - they only talk about virus corruption, but there's the more important case of inadvertent collisions which overwrite valid data - BackupPC resolves this by creating MD5;1 MD5;2 file names). The large 128K block size should alleviate this, but also assures a low probability of block reuse.
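As a rough sanity check on the collision worry, here is the standard birthday approximation for a cache full of 128K blocks. It assumes SHA-1 digests behave like uniform random 160-bit values, which is an idealization and says nothing about deliberately constructed collisions; the function name is made up.

```python
def collision_probability(n_blocks: int, digest_bits: int = 160) -> float:
    """Birthday approximation: p ~ n(n-1)/2 / 2**bits, valid while p is small."""
    pairs = n_blocks * (n_blocks - 1) // 2
    return pairs / 2 ** digest_bits

blocks_in_1tb = 2 ** 40 // (128 * 1024)   # 8,388,608 blocks of 128 KiB
p = collision_probability(blocks_in_1tb)
print(blocks_in_1tb, p)
```

Under that assumption the chance of an accidental clash among roughly 8 million blocks comes out far below anything you would ever observe, so by the same token two unrelated inserts will not produce the same block by accident either.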
The problem I see is that data blocks are not inherently random by default. In order to be practically random, you'd have to take the recommended 1TB file system, randomize it (producing approx 8 million SHA-1 digests), then for each real-data insert, delete in an LRU fashion. Otherwise, if you only had a hundred thousand blocks, it would not be THAT difficult to grab the first 20 bytes (one SHA-1 digest) of every block and XOR them with several of the most recently inserted blocks until you found something that matched an existing file name. If matched, try the next 20 bytes, etc. Now you have a starting point AND the appropriate 3-tuple. You're only missing the file size. But if it spits out music in one of like 5 codecs, you've got a winner. It shouldn't be hard to use statistical analysis to tell random noise or an invalid media format from real data. Many files contain internal end-of-file signifiers (.zip, .gz for example).
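Something like the following is what I have in mind, as a brute-force sketch on a toy cache. All names are invented, the store is a dict of blocks keyed by SHA-1 digest, and it tries every triple of blocks rather than only the most recent ones, so it is only practical for a small cache.

```python
import hashlib
import os
from itertools import combinations

DIGEST = 20   # SHA-1 digest length in bytes
BLOCK = 64    # toy block size

def xor3(a, b, c):
    return bytes(x ^ y ^ z for x, y, z in zip(a, b, c))

def put(store, block):
    """Store a block under its SHA-1 digest and return that digest."""
    name = hashlib.sha1(block).digest()
    store[name] = block
    return name

def find_descriptor_candidates(store):
    """Triples of blocks whose XOR starts with the name of a stored block."""
    hits = []
    for (na, a), (nb, b), (nc, c) in combinations(store.items(), 3):
        head = xor3(a[:DIGEST], b[:DIGEST], c[:DIGEST])
        if head in store:            # looks like the first digest of a 3-tuple
            hits.append({na, nb, nc})
    return hits

# Toy cache: some noise blocks plus one hidden descriptor-like block.
store = {}
for _ in range(8):
    put(store, os.urandom(BLOCK))
target = put(store, os.urandom(BLOCK))
descriptor = target + os.urandom(BLOCK - DIGEST)  # begins with a stored name
r1, r2 = os.urandom(BLOCK), os.urandom(BLOCK)
planted = {put(store, r1), put(store, r2), put(store, xor3(descriptor, r1, r2))}

hits = find_descriptor_candidates(store)
print(planted in hits)   # prints: True
```

The work grows with the cube of the cache size, which is why a properly seeded multi-million-block cache makes this much harder than a hundred-thousand-block one.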
With 8 million records, that becomes hard(er) to do. But how long does it take to initialize that?
Now with respect to the network, there's no need to actually store the file-descriptor block remotely. Thus for highly sensitive files, you can probably encrypt the descriptor block and keep it locally (sharing on a private trusted network). But for text-based files, you'd probably still be wary of having network-stored, timestamp-ordered data blocks, as the contents of the last 100 blocks could easily be determined (text files are not as order sensitive as mp3s and zip files).
The stated goal is purely open, freely shared, perfectly legal data-store... Which allows the occasionally masked sensitive data. Though the RIAA/MPAA would read it as, a front for illegal data.
They say they have better bandwidth than obscured P2P networks, since you can allow open download by the RIAA as well as your clients, and it's all meaningless w/o the starting points/blocks. You do pay 3x the bandwidth of a pure HTTP/FTP download, as you have to download 3 blocks to XOR against each other to produce 1 block of data. They suggest that once you have a descriptor block you 'should download the tuples in random order to reduce pattern matching by ISPs', which furthers the notion that this is for illicit purposes.
I'm highly suspicious of the claim that the SHA-1 digests produce useful collisions and give you a bandwidth reduction via your local disk cache, for the reasons above.
I'm also

--
-Michael
Reductionism vs the law by Loki P · 2008-06-30 11:48 · Score: 3, Interesting

The OFF argument is akin to this: take a copyrighted work, let's say it's a novel. Cut it in half. Is that half still under copyright? Yes. OK, cut it in half again, and again, and again. At some point you'll get down to individual words, letters, or single bits. These do not have copyright in themselves, and so can be joined together with other words, letters or bits from other places and stored in 128K chunks which likewise don't themselves have any copyright. These chunks can then be distributed because they are just random-looking chunks of data.
The problem with this argument is it's reductionist. If you blend up 5 copyrighted works and pour them into 10 shot-glasses (the network), sure you can claim each individual shot-glass doesn't fall under the same copyright of any single one of the original works. But since you can extract each of the 5 original works from the collective set of the 10 shot-glasses, then the network as a whole does contain the copyrighted works, and does fall under copyright protection. In a sense, they have smeared each copyright out over many (possibly overlapping) chunks, but it's still there because the originals can still be retrieved. Banning the whole network seems a possible legal outcome, since non-infringing uses may still involve moving chunks which contain a partial copyright.
As much as the creators of OFF might claim their work is different to a darknet, actually it relies on very similar principles of obfuscation.