Domain: open-content.net
Stories and comments across the archive that link to open-content.net.
Comments · 28
-
Re:BT would be good for flat rate services
The most desirable, in my opinion, would be some sort of content-neutral caching mechanism that would move data closer to consumers.
I agree - something like a caching HTTP proxy, but with files identified by location-independent URNs rather than URLs, would save a huge amount of bandwidth. The question is how to deploy it incrementally.
There's already a spec for URNs based on hash trees, so we could start with a browser plugin that handles hash tree URNs, fetching them from the same server as the embedding page if no proxy has been configured. Maybe hash tree requests should go to a different port so that ISPs can see how much transit traffic they'd save by installing a proxy.
The proxy should deal with IP addresses, not DNS names, so that P2P requests can be proxied.
-
Gnutella should be OK
Gnutella uses SHA1 but they also verify swarmed downloads using THEX. I think this should defeat their attempt at spamming though it will consume processing and bandwidth cycles.
-
Re:Just an annoyance
I believe there is a hashing algorithm called TigerTree. TigerTree computes a single hash based on 1024-byte blocks. As the file is downloaded, each block can be independently verified.
So if they try to pollute a network by giving corrupt data for a valid file, all the downloader needs to do is notice that a particular client keeps sending corrupt parts. And of course if they send some real bits and some fake bits, the downloader will keep the real bits and discard the fake ones.
Don't ask me how it works, but I know that Shareaza makes use of this hash.
Link I ripped from the Shareaza wiki: Tree Hash EXchange format (THEX)
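The per-block check described in this comment could look roughly like the sketch below. It assumes the downloader already holds a trusted list of leaf hashes, uses SHA-256 as a stand-in because Python's hashlib has no Tiger, and borrows the 0x00 leaf prefix from THEX. The names leaf_hash, split_blocks, and filter_blocks are made up for illustration and do not come from Shareaza or any other client.

```python
import hashlib

BLOCK_SIZE = 1024  # block size mentioned in the comment


def leaf_hash(block: bytes) -> bytes:
    # Assumption: SHA-256 stands in for Tiger; the 0x00 prefix marks a leaf.
    return hashlib.sha256(b"\x00" + block).digest()


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list:
    # Cut the file into fixed-size blocks (the last one may be shorter).
    return [data[i:i + size] for i in range(0, len(data), size)] or [b""]


def filter_blocks(received, trusted_leaves):
    """received is a list of (peer, block_index, block_bytes) tuples.

    Returns the blocks that verified, keyed by index, plus a count of
    corrupt blocks per peer so repeat offenders can be ignored.
    """
    good = {}
    bad_peers = {}
    for peer, index, block in received:
        if leaf_hash(block) == trusted_leaves[index]:
            good[index] = block
        else:
            bad_peers[peer] = bad_peers.get(peer, 0) + 1
    return good, bad_peers
```

A downloader would re-request any index missing from the first return value and could stop asking peers that show up in the second.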
-
Self-Healing Data Transfer
For swarmstreaming, we use the Tree Hash EXchange format (THEX) to provide cryptographic integrity verification down to a single 1KB resolution so we can automatically repair the corruption.
-
Re: The "Detailed Summary"
The intuitive short-term solution, at least to me, is to use N different MD5sums for each file, taken at N different offsets within the file (each unique mod block size).
It won't do the offset thing, but tiger tree hashing uses blocks, and combines them into another checksum. If the blocks were small enough, they'd catch this. I'm not sure about the block size involved in either, though; I suppose with a bit more CPU, you could get arbitrarily small blocks.
-
Re:Can someone please explain to me
It's not centralization that prevents you from having to search through 30,000 random files, it's the ability to link to a particular file in a verifiable way. Merkle hash trees can achieve the same thing in any filesharing network. In a hash tree the file is broken up into equal-sized chunks. The chunks form the bottom layer of the tree. Each chunk is hashed, and the concatenated hashes form the next layer of the tree. Repeat until there's only one hash, and that's your filename. You can request branches of the tree in parallel from different peers, and every chunk can be verified as soon as it's downloaded.
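A minimal sketch of the hash tree construction described above, under a few assumptions: hashes are combined in pairs (a binary tree), SHA-256 is used in place of whatever hash a real network would pick, and leaves and internal nodes get different prefix bytes in the style of THEX. The function names are illustrative only.

```python
import hashlib


def chunk(data: bytes, size: int) -> list:
    # Break the file into equal-sized chunks (the last one may be shorter).
    return [data[i:i + size] for i in range(0, len(data), size)] or [b""]


def merkle_root(data: bytes, chunk_size: int = 1024) -> bytes:
    # Bottom layer: one hash per chunk.
    level = [hashlib.sha256(b"\x00" + c).digest() for c in chunk(data, chunk_size)]
    # Hash neighbouring pairs until a single hash remains.
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                pair = level[i] + level[i + 1]
                nxt.append(hashlib.sha256(b"\x01" + pair).digest())
            else:
                # An unpaired hash is carried up to the next layer unchanged.
                nxt.append(level[i])
        level = nxt
    return level[0]
```

The returned root is the value that would serve as the verifiable "filename"; any change to any chunk produces a different root.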
BitTorrent trackers just give you a way of finding peers who are downloading the same file - they are *not* necessary for data verification. A P2P search network like CRL would allow you to find peers that are interested in the same (verifiable) filename in a completely decentralized way. You could then use BitTorrent's parallel, incentive-based download mechanism to retrieve the file.
-
Re:Torrent pool
Sounds like OCN.
-
Gordon Mohr
-
P2P RSS Channels
The Tornado client for the Open Content Network has support for P2P download channels based on RSS.
Basically, you click on a link which will subscribe the peer to the channel, and the peer will automatically download/pre-cache any new items that are added to the RSS feed.
You simply have to create an RSS feed and create a link that converts that feed into a channel that is subscribable via the Open Content Network. I've set up an example of a movie trailer RSS feed here, and have linked it into the Open Content Network here.
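For illustration only, the "download any new items" step could be sketched as below. This is not how the Tornado client is known to work: the use of enclosure URLs with a fallback to the item link, and the name new_items, are assumptions.

```python
import xml.etree.ElementTree as ET


def new_items(rss_xml: str, seen: set) -> list:
    """Return item URLs from the feed that have not been seen before.

    Assumption: each item is identified by its <enclosure url="...">,
    falling back to <link> when there is no enclosure.
    """
    root = ET.fromstring(rss_xml)
    fresh = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        url = enclosure.get("url") if enclosure is not None else item.findtext("link")
        if url and url not in seen:
            seen.add(url)
            fresh.append(url)
    return fresh
```

A subscribed peer would call this each time it polls the feed and hand the returned URLs to its downloader.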
-
Re:The hackers .. what do they do?
Just a FYI, but:
- eDonkey uses a fixed chunksize of 9MB.
- BitTorrent uses a partsize which is a power of 2; with a default of 256KB (2**18).
- Many Gnutella clients use hash trees which have granularity variable from 1024bytes to 10MB.
Small chunksizes, like BitTorrent's, are great because you can pretty much be assured that you'll get the whole chunk from one uploader, so if it turns out to be corrupt, it doesn't take so long to redownload, and you can simply ignore the rogue (though I don't think BT clients do this yet). The downside of course is that the .torrent files double in size as you halve the hashed partsize (and currently .torrent files are usually hosted by central webhosts rather than distributed by BT clients).
Hash trees, on the other hand, are great because not everyone needs to store the entire fine-grained hashset down to 1K. If corruption mysteriously increases, you could simply ask for another level, making it easier to detect & ignore the rogues who are uploading corrupt chunks and/or hashes.
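The doubling is simple arithmetic: a .torrent carries one 20-byte SHA-1 digest per piece, so halving the piece size doubles the number of digests. A quick sketch (the 700 MiB file size in the usage note is just an example, and piece_hash_bytes is a made-up name):

```python
SHA1_BYTES = 20  # BitTorrent stores one SHA-1 digest per piece


def piece_hash_bytes(file_size: int, piece_size: int) -> int:
    # Number of pieces, rounded up, times the digest size.
    pieces = -(-file_size // piece_size)
    return pieces * SHA1_BYTES
```

For a 700 MiB file, piece_hash_bytes(700 * 2**20, 2**18) is 56000 bytes of hashes, and dropping to 128KB pieces (2**17) gives 112000.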
Eventually p2p networks will have to implement distributed webs-of-trust based on keypairs rather than spoofable IP addresses or user-hashes. Everybody would start with zero trust (unless you had a 'sponsor' that would give your pubkey initial trust), but bad behavior (like uploading corruption, or hammering queues) gets you on people's distributed shitlists, and good behavior (like uploading many gigabytes of illicit data) earns you trust. Over time, well-known "trustworthy" nodes would rise to the top (or rather, "move to the center of the web"), and can be used as a foundation similar to a central server.
-
The Technology is Here Already
I run a software company called Onion Networks that provides peer-to-peer content delivery technology to movie studios building VOD systems.
With fast P2P content delivery technology, MPEG-4 compression, and PVR-like time shifting devices - the speed, storage, and economics are there today to provide DVD-quality VOD.
The only problem is that it is taking the studios a long time to roll out their VOD solutions, but trust me, they'll be upon us in the near future.
For more information on the protocols that underlie these P2P content delivery systems, please check out the Open Content Network Specs
-
Re:The "About" information
The Tiger Tree specification was worked on and agreed by all of the members of the Gnutella community. The implementation itself is relatively trivial -- agreeing on the spec is the hard part. Even in this case, where there's a very strong, open specification for Tiger Tree exchange in the Tree Hash Exchange (THEX) protocol, Shareaza implements it slightly differently, breaking compatibility with other clients and flying in the face of the specification. You can find THEX at:
THEX is an open specification developed by Gordon Mohr and Justin Chapweske. It's not Gnutella-specific, but it certainly isn't Gnutella 2.
-
Re:Free book cost real money (for us)
You're exactly right, although a P2P network would only be part of it: someone without access to the client software should still be able to download the book.
I've seen situations where the P2P client is built into a browser plugin or Java app. For an example of this, see the Open Content Network, which provides distributed downloading free for content under an approved license.
-
P2P Video Blogging
If you're going to be video blogging, I would highly recommend checking out the Open Content Network which provides P2P distribution of web sites.
The Internet Archive currently uses it for distributing live concert recordings, so it should work great for video too.
-
Re:Photos
Oh, if only we had an Open Content Network.
-
the WWW *is* content-addressable
Regarding:
This document specifies HTTP extensions that bridge the current location-based Web with the Content-Addressable Web. -- HTTP Extensions for a Content-Addressable Web
The World Wide Web is "the universe of network-accessible information", i.e. anything with a URI, including URIs that are not tied to a particular hostname.
The Web already includes non-location-based URIs like mid: (for referring to message-ids), and urn:sha1: for referring to a specific set of bits by their checksum.
This proposal seems like a decent way of bridging HTTP-space with URN-space, but please remember that the Web is more than just HTTP. (see also: URIs, URLs, and URNs)
Anyway, it seems to me that sites that tend to suffer from slashdotting are:
-
those that use dynamically-generated pages for what is basically static content: this problem can be fixed by sites making sure their content is cacheable, and further deployment of HTTP caches. (I'm not convinced a p2p-style solution is the solution here.)
-
those with large bandwidth needs (kernel images, linux distribution .iso's, multimedia): as p2p software becomes more mature and widely deployed, everyone will have a urn:sha1: resolver on their desktop (pointing to their p2p software of choice), then whenever a new kernel is announced, the announcement can say:
Linux kernel version 2.4.20 has been released. It is available from:
Patch: ftp://ftp.kernel.org/pub/linux/kernel/v2.4/patch-2.4.20.gz
a.k.a. urn:sha1:OWXEOVAK2YJW3G6XSULXDWFCNWTX7B2K
Full source: ftp://ftp.kernel.org/pub/linux/kernel/v2.4/linux-2.4.20.tar.gz
a.k.a. urn:sha1:PPWXYMA32YNDNO35UD3IQTCWBVBYK5DC
and people can just fetch the files using urn:sha1 URIs instead of everyone hitting the same set of mirrors. (gtk-gnutella already supports searching on urn:sha1: URIs)
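For anyone wondering where those 32-character identifiers come from: a urn:sha1: value is the 20-byte SHA-1 digest of the file written in Base32. A small sketch with Python's standard library (the function names are made up for illustration):

```python
import base64
import hashlib


def sha1_urn(data: bytes) -> str:
    # 20 digest bytes are 160 bits, which is exactly 32 Base32 characters.
    digest = hashlib.sha1(data).digest()
    return "urn:sha1:" + base64.b32encode(digest).decode("ascii")


def matches_urn(data: bytes, urn: str) -> bool:
    # True when the downloaded bytes are the ones the URN names.
    return sha1_urn(data) == urn.strip()
```

Whatever mirror or peer the bytes came from, matches_urn tells you whether you got the announced file.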
-
-
Re:Obvious technical solution take 2
Nah, you can use a hash-tree, and check each segment as it comes down.
See here for details.
-ZK-
-
Tree Hash EXchange (THEX)
The crew at the Open Content Network have released a specification for serializing hash trees. The specification is called the Tree Hash EXchange (THEX) and is being implemented in both the Open Content Network and Gnutella. Furthermore, this specification is compatible with the TigerTree hashes used for Bitzi.
-
Reed Solomon Library and Swarmcast
We wrote an optimized Reed-Solomon library for Swarmcast that can do up to 65k shares. It's available under a BSD-style license. Also, since Swarmcast is a P2P content delivery system, the library also supports cryptographic integrity checking, so you can ensure that everything is intact.
Unfortunately the Swarmcast project has languished after 1.0, but we have started a new project called the "Open Content Network"
-
Open Content Network (P2P for open source)
Another complementary project in progress is the Open Content Network
The OCN provides an important piece of the puzzle with its metadata proxy servers. These servers automatically generate the verification information (SHA-1 hashes) necessary to perform secure P2P downloads.
It would be nice if this project leveraged the significant amount of work going into the OCN to provide a standard way to securely deliver any open source content across peer-to-peer networks.
Check out the OCN specifications here.
-
Re:Even though I'm not a big fan of copyright....
Bitzi offers a solution similar to the one proposed in the parent's parent (file ratings and other metadata associated with full file hashes). For partial/subrange verification, check out the proposed Tree Hash EXchange format.
-
Re:No, he doesn't want to legalise DoS attacks
The spoofers could still send the fingerprint of the good version before sending the bad version. Unless the service does several individual fingerprints on different parts of the mp3.
Tree Hash EXchange describes a cool way of doing this. That's a big reason Bitzi uses the top of a tiger hash tree in its bitprint file identifier (a sha1 hash is the other part).
-
Re:Need for Checksumming
It is part of the specs
1.2 Untrusted Caches
It is currently unsafe to download web objects from an untrusted cache or mirror because they can modify/corrupt the content at will. This becomes particularly problematic when trying to create public cooperative caching systems. This isn't a problem for private CDNs, like Akamai, where all of their servers are under Akamai's control and are assumed to be secure. But for a public CDN, the goal is to allow user-agents to retrieve content from completely untrusted hosts but be assured that they are receiving the content intact. The CAW solves this problem by using content addressing that includes integrity checking information.
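As a rough sketch of what that buys you: if the content address includes a digest, a client can try untrusted hosts in any order and keep the first response that checks out. The code below is an assumption-laden illustration, not the spec's mechanism: it uses a plain SHA-256 hex digest, and fetch_verified and the (name, fetch) source pairs are hypothetical.

```python
import hashlib


def fetch_verified(expected_sha256_hex: str, sources):
    """Return (source_name, data) from the first source whose bytes match.

    sources is a list of (name, fetch) pairs, where fetch() returns bytes
    from an untrusted cache or mirror.
    """
    for name, fetch in sources:
        data = fetch()
        if hashlib.sha256(data).hexdigest() == expected_sha256_hex:
            return name, data
    # Every source returned modified or corrupt content.
    raise ValueError("no source returned intact content")
```

A tampering cache can waste your bandwidth here, but it cannot get modified content accepted.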
-
Open Artificial Intelligence Network
Great. A big hurrah for the Open Content Network and the Creative Commons.
Artificial Minds will be able to spread all over the 'Net. They are already at:
http://mind.sourceforge.net/index.html
http://mentifex.virtualentity.com/jsaimind.html
http://users.resentment.org/ai/jsaimind.html
http://www.scn.org/~mentifex/jsaimind.html
http://victoria.tc.ca/~uj797/jsaimind.html
-
In case...
it gets slashdotted, which would be truly ironic:
What is the Open Content Network? We are in the process of creating the Open Content Network, which aims to be the world's largest content delivery network (CDN).
Users will soon be able to download open source and public domain software, movies, and music at incredibly fast speeds from this global, distributed network.
Using a new Peer-to-Peer technology, called the "Content-Addressable Web", individuals will be able to contribute to the open source movement by donating their spare bandwidth and disk space to the network.
btw, note how the "implementations" section is not active yet.
-
OK, but...
The Open Content site just announces a list of intentions. Anyone can put this kind of info up. It looks to me like nothing has been achieved yet, making this not really news.