Swarmcast GPLed
miguel writes "OpenCola has just released SwarmCast which is a very interesting mechanism for distributing software. At Ximian we are looking into integrating this into Red Carpet to accelerate software downloads by using their sharing software. The demo of their product is pretty amazing." Very clever: essentially it creates a peered network so larger files can be shuttled around faster. Each client can serve a small piece of data to other clients so that a massive centralized data center isn't necessary. Now the cola on the other hand...
I've had no success getting their software to work in Opera v5.11 for Windows. Anybody else gotten it to work?
10 PRINT CHR$(205.5+RND(1)); : GOTO 10
It isn't true that Mojo Nation is "not focused on performance". I'm one of the Mojo Nation hackers, and we care about performance. It is true, however, that Mojo Nation is pretty complex, providing data transfer (using a "swarm"-like technique), data storage, and a queryable search engine. The end result is something like a distributed, non-deletable World Wide Web. (Sort of like Freenet plus persistent data, or the earlier concept of Ross Anderson's "Eternity Service".)
Performance isn't that great on Mojo Nation right now, but it is good enough, in my experience, for daily use.
I'm pretty excited about the Swarmcast open source release, both because I think Swarmcast is a cool app in itself, and because I can now start taking ideas and code from Swarmcast to put into Mojo Nation, and vice versa.
In the long run, both Mojo Nation and Swarmcast will improve because of this sharing, as will other related open source projects like Freenet and Free Haven.
Regards,
Zooko
P.S. I've been talking with Justin Chapweske, the Swarmcast lead, on irc.openprojects.net, and he's already pointed out a potential bug that we need to avoid in future versions of the Evil Geniuses Transport Protocol...
Thing is, I don't see why anyone would use this. In reality, the transfer rates aren't as good as a single fast dedicated server
True. If you have an adequate dedicated server with plenty of bandwidth, it will be hard to top with swarmcast.
The benefit comes in terms of cost. Those beefy servers cost a shitload of money in terms of colocation and bandwidth.
A small, unfunded content shop may want to stream animations but cannot afford the cost of a central server to do it.
If they use swarmcast, however, the streaming is accomplished by utilizing bandwidth of everyone participating. No expensive server is needed, and performance is at least decent.
Real world uses for this technology are still lacking, so we shall see how swarmcast gets adopted IRL.
I'm not sure how you would deal with corrupted files, however. Any thoughts?
--Ben
You are basically correct, we are using SHA-1. I should put up a security FAQ.
-Justin Chapweske, Lead Swarmcast Developer
Cool, I'll make sure the FAQ gets cleared up.
Thanks!
-Justin Chapweske, Lead Swarmcast Developer
Swarmcast is neither a fragile chain structure nor a hierarchy; it is a many-to-many 'swarm' structure where peers send and receive data from many peers in parallel. The use of Forward Error Correction allows us to have a potentially huge number of unique packets in the mesh, where only a small subset of those packets is needed to recreate the original content. This allows the peers to swap data back and forth in a fairly random fashion to provide a high level of resilience against changing network conditions, very high throughput, and rapid scalability.
-Justin Chapweske, Lead Swarmcast Developer
Everything you need to both serve and download content is released under the GPL. Besides, it's peer-to-peer, so there really isn't that much of a "server" concept. The gateway is mostly for content management and permissions, the kind of stuff that companies pay money for so that I can keep my job and write more open source code.
-Justin Chapweske, Lead Swarmcast Developer
No.
With mojonation, I've never seen the performance that a solution like this should provide. Mojonation's not focused on performance, and Swarmcast is.
Mojonation's a lot more complex, too. With Mojonation, there's a searchable, virtual repository of files. Every host on the network is a peer.
With Swarmcast, only hosts downloading a given file are peers, and files are simply linked at web sites. (Personally, I think the proxy server approach is niftier, though.)
There's more tech info here: http://opencola.org/projects/swarmcast/swarm_tech1.shtml
Man, you don't usually see download rates like that from anything except Akamai. Problem was, the data got corrupted.
And no, it's not like a bucket brigade. It's more like building a house with more than one bricklayer.
Re: slow hosts-- if I understand correctly, all swarmcast hosts maximize their available bandwidth, automatically balancing the download.
Each node downloads packets one at a time. If a host is slow, it won't grab as many packets, and someone else will.
If machines A and B are downloading, and A is 5x as fast as B, then in the same time, A will download 5 packets and B will download 1. If there are 12 packets in the file, A will download 10, and B will download 2.
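The 10-to-2 split above can be checked with a small simulation. This is only a sketch of the "grab one packet at a time" idea as described in this comment, not Swarmcast's actual scheduler; the function name `split_packets` and the notion of a relative speed per peer are made up for illustration.

```python
def split_packets(total_packets, speeds):
    """Simulate peers pulling packets one at a time from a shared pool.

    speeds maps a (hypothetical) peer name to its relative download rate
    in packets per unit of time. Whichever peer would finish its next
    packet soonest gets the next packet.
    """
    counts = {name: 0 for name in speeds}
    for _ in range(total_packets):
        # Time at which each peer would finish one more packet.
        finish = {name: (counts[name] + 1) / speeds[name] for name in speeds}
        winner = min(finish, key=finish.get)
        counts[winner] += 1
    return counts


# A is 5x as fast as B, and the file has 12 packets.
print(split_packets(12, {"A": 5, "B": 1}))  # {'A': 10, 'B': 2}
```

No peer has to be told how fast it is: the slow host simply asks for packets less often, which is the automatic balancing described above.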
A secure hash function is the solution for validity. Get a trusted source to give you something like a Mojo Nation sharemap or a Freenet content hash key, which points you to the file you want by telling you its hash. Use whatever mechanism is provided by the system to query for a particular file, which the provider identifies by its hash. Download it, and re-compute the hash. If they don't match, throw it away (and mark the source as less trustworthy). If they do match, it's either exactly what the trusted source specified, or someone managed to get a hash collision (fairly unlikely with a secure hash).
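A minimal sketch of that download-then-verify loop, using Python's standard `hashlib`. SHA-1 is picked only because it is the hash mentioned elsewhere in this thread; the `trust` dictionary, the peer names, and the function name `verify_download` are hypothetical and not part of any of the systems discussed.

```python
import hashlib


def verify_download(data, expected_sha1_hex, trust, source):
    """Return True if data matches the hash published by the trusted source.

    On a mismatch the data should be thrown away, and the (hypothetical)
    trust score of the peer that supplied it is lowered.
    """
    actual = hashlib.sha1(data).hexdigest()
    if actual == expected_sha1_hex:
        return True
    trust[source] = trust.get(source, 0) - 1
    return False


original = b"release-1.0.tar.gz contents"
published_hash = hashlib.sha1(original).hexdigest()  # obtained from the trusted source

trust = {}
print(verify_download(original, published_hash, trust, "peer-1"))              # True
print(verify_download(b"tampered contents", published_hash, trust, "peer-2"))  # False
print(trust)                                                                   # {'peer-2': -1}
```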
A series of high-speed mirrors requires a lot of (expensive) fat pipes. Getting away from that requirement is the whole point behind swarm distribution.
You may find MojoNation a bit more to your liking; MN brokers transact with one another in a crypto e-cash currency called "Mojo". Using others' bandwidth costs Mojo, providing bandwidth to others earns you Mojo.
First of all, what type of security is going to be implemented to prevent hacks? It seems that it would be pretty easy to shore up a single server, or even several in a single datacenter, but it would be a daunting task to protect thousands of machines spread throughout the world against hackers.
So long as the swarmcast client software doesn't have any holes, how is this an issue? If it does have holes, yes, this is a big deal, but if you've got a machine on the net with other security holes, that's an entirely different problem.
Secondly, what type of redundancy is going to be built into this system? Again, if a file is going to be served from a single centralised machine, it should ideally be fairly reliable, with multiple connections, and RAID to ensure continuous uptime. However, if you're serving tiny pieces of a file from thousands of boxen, it seems to me that if even one of those fragments doesn't make it all the way downstream, the whole file would be worthless. Has Swarmcast done anything to prevent this from happening?
This is a pretty trivial problem to solve. There's still got to be a central server somewhere that tells you where to look for the peer bits. I don't know the details of their setup, but it certainly seems trivial to implement MD5-checking per chunk, so you can tell if there are bad chunks in your download. You just have to trust the main server's md5 signatures, and never end up trusting the peer servers.
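For illustration, here is roughly what per-chunk checking could look like. This is a sketch of the idea in this comment, not a description of what Swarmcast does: the chunk size, the manifest format, and the function names `make_manifest` and `bad_chunks` are all assumptions.

```python
import hashlib

CHUNK_SIZE = 4  # tiny on purpose, so the example is easy to follow


def make_manifest(data, chunk_size=CHUNK_SIZE):
    """What the trusted main server would publish: one MD5 digest per chunk."""
    return [hashlib.md5(data[i:i + chunk_size]).hexdigest()
            for i in range(0, len(data), chunk_size)]


def bad_chunks(chunks, manifest):
    """Return the indexes of downloaded chunks whose MD5 does not match."""
    return [i for i, (chunk, digest) in enumerate(zip(chunks, manifest))
            if hashlib.md5(chunk).hexdigest() != digest]


manifest = make_manifest(b"abcdefghij")      # chunks: abcd, efgh, ij
from_peers = [b"abcd", b"efXh", b"ij"]       # the middle chunk was corrupted by a peer
print(bad_chunks(from_peers, manifest))      # [1]
```

Only the chunk at index 1 has to be fetched again, and the peers never need to be trusted, only the server that published the manifest.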
This applies to the first point, too. If a serving machine is hacked, it's not a problem beyond that box because the real security comes from the MD5 key, or equivalent. It may be possible for a hacked server to serve bad code, but, given reasonable client design, it should be impossible for this bad data to actually be used by another machine. (Or, at least, should be impossible without big flashing lights and sirens screaming "You fool, don't do that!" :)
Anne: You are the weakest link, g'bye.
Ok, what is this, some kind of new "all your base" meme-storm?
Secondly, what type of redundancy is going to be built into this system? ...if you're serving tiny pieces of a file from thousands of boxen, it seems to me that if even one of those fragments doesn't make it all the way downstream, the whole file would be worthless. Has Swarmcast done anything to prevent this from happening?
One of the main benefits of distributed anything is that there's an incredible amount of redundancy because every node is willing to contribute a little towards achieving your goal.
Are you trolling?
First of all, what type of security is going to be implemented to prevent hacks?
Against what? To protect against file modification, checksums can be distributed from a single point since they're much smaller than the files themselves. All other security problems will be the same as in any other P2P system.
--
True peer to peer is a terrible way to distribute anything important. It becomes harder to prove the validity of anything. I'm sure md5 could help, but a series of high speed mirrors would work better.
The courts would laugh at this. It is distributing content that is legal to distribute. That is the issue with Napster, not peer-to-peer file sharing. In fact, I used to get all my MP3s in a client/server setup.
Maybe it's time these developers listed exactly what their software is being used for, and who is using it, to promote it, as opposed to waiting for groups like the RIAA and MPAA to cry foul over them. Doing so would provide a nice argument, such as the ones the EFF was looking for earlier.
As for the brief mention of security I browsed through, personally I don't see it as a big deal provided you know how to set perms and/or can configure some form of SSL behind it or something similar, perhaps with some rules on your firewall or IDS to ensure nothing gets broken along the way.
I can't wait to see how groups will react to cDc's Peekabooty; that's sure to be a kick in the ass for groups like the RIAA and MPAA.
Why not distribute the pieces on Freenet? Using SSKs you can be sure that each piece was inserted by the original poster. Maybe this would make a cool client for Freenet, though...
--
314-15-9265
Still a cool idea, but if OpenCola wants everybody to put eggs in their basket, maybe somebody should release a GPLed server too...
--
314-15-9265
Well, my first problem is that the FTP server for downloading the JWS file seems to be /.'ed, so I can't get that and can't install SwarmCast, which would have solved the problem in the first place. Hmph...ironic, in a Catch-22 kinda way, isn't it? ;)
nlh
Sure, their client may be under the GPL but some of the terms on their license agreement are quite unpleasant:
(a) Reverse assemble, reverse compile, or reverse engineer the Gateway (or any component or portion thereof), or otherwise attempt to discover any underlying Proprietary Information (defined below) of the Gateway;
Isn't reverse engineering explicitly granted by law?
(b) Sublicense, rent, sell, lease or otherwise transfer the Gateway (or any portion thereof) to any third party;
(c) Remove or alter any marks or designations indicating the ownership of copyrights, trademarks or other intellectual property rights of any party contained in the Gateway;
In fact I'd go so far as to say these conditions wouldn't feel out of place in a Microsoft EULA.
It'd be a fun academic exercise to implement something like this in Sun's JXTA environment...
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
Wasn't someone compiling a list of valid uses for P2P? This certainly sounds like one to me.
The EFF is seeking help in this area of finding legitimate uses of peer to peer technologies...
Maybe this might be a very good argument in court... content distribution at high speeds.
AFAIK, the origin server is always willing to send you more blocks. So if there are few (or no) other Swarmcast peers currently running, you just download most of the file from the origin server.
The cola is pretty good. I'm not going to give up Coke, though.
Swarmcast breaks a file into packets, which are encoded using Forward Error Correction (FEC), a mathematical technique that Swarmcast uses to make it easy to reconstruct the file. The encoded packets are then distributed randomly to the computers that have requested the file. These computers become nodes in the mesh (a temporary network) that Swarmcast creates for this download.
Each node receives only a portion of the original packets. But each node is also aware of some of the other nodes receiving packets. Even as a node receives a packet, it also rebroadcasts it to other nodes, so packets are rapidly swapped back and forth.
As each packet is received, the receiving node checks to see if it is useful in rebuilding the file; if it is, Swarmcast decodes that packet. As soon as a receiving node receives sufficient useful packets, it reconstructs the file. Thanks to FEC, a Swarmcast download is a bit like playing poker when every card in the deck is a wild card: as soon as you've received five cards, you can build a winning hand.
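The "any five cards make a winning hand" property can be demonstrated with a toy erasure code. The description above does not say which FEC scheme Swarmcast uses, so this is only an illustration of the general idea, written in the Reed-Solomon style (polynomial evaluation over a small prime field). The names `encode` and `decode`, and the one-byte-per-packet layout, are assumptions made to keep the sketch short. It needs Python 3.8 or later for `pow(x, -1, p)`.

```python
P = 257  # smallest prime above 255, so every byte value is a field element


def _interpolate(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) that
    passes through all (xi, yi) in points, working modulo P (Lagrange form)."""
    total = 0
    for j, (xj, yj) in enumerate(points):
        num, den = 1, 1
        for m, (xm, _) in enumerate(points):
            if m != j:
                num = num * (x - xm) % P
                den = den * (xj - xm) % P
        total = (total + yj * num * pow(den, -1, P)) % P
    return total


def encode(data, n):
    """Turn k data bytes into n >= k packets of the form (index, value).

    Packets 0..k-1 carry the data bytes themselves; the rest are extra
    points on the same polynomial.
    """
    k = len(data)
    assert k <= n <= P, "need k <= n <= 257 distinct evaluation points"
    base = list(enumerate(data))
    return [(i, _interpolate(base, i)) for i in range(n)]


def decode(packets, k):
    """Rebuild the k data bytes from any k packets with distinct indexes."""
    chosen = packets[:k]
    assert len({i for i, _ in chosen}) == k, "need k distinct packets"
    return bytes(_interpolate(chosen, i) for i in range(k))


data = b"swarm"                      # k = 5 "original" packets
packets = encode(data, 9)            # 9 encoded packets float around the mesh
received = [packets[8], packets[1], packets[6], packets[3], packets[5]]
print(decode(received, len(data)))   # b'swarm'
```

Which five packets arrive, and in what order, does not matter. That is why nodes can swap packets at random without coordinating who holds which piece.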
So there seems to be some amount of redundancy built in, provided enough servers are running at a time. When I tried it out, there was just one provider, but at 500Kbps.
Using GPL'd code may put you into severe financial jeopardy. Make sure to check with your local Microsoft representative to make sure you have paid the appropriate "GPL tax". Remember: this applies even if you don't own any Microsoft software.
There are stringent limitations on the ways you can redistribute GPL'd code. If you redistribute any GPL'd tool in whole or in part, you must put a link to Midori's home page in the About box of your program.
The GPL is "viral", and can cause your program to become GPL'd too. However, instead of including the code directly in your new program, use a C trick called "linking". To do this, all of the GPL'd code you wish to use must be in a "library". If it isn't, take all the parts you need and compile them into a library, then distribute that with your program.
If you follow these tips, you can make great use of other people's hard work-- without running the risk of having to share any of your own. Good luck!
First of all, what type of security is going to be implemented to prevent hacks? It seems that it would be pretty easy to shore up a single server, or even several in a single datacenter, but it would be a daunting task to protect thousands of machines spread throughout the world against hackers.
Well, that's an easy one. One centralized server tells the client which file to get, AND also gives the MD5 sum of the file and possibly of its fragments (512k each, maybe).
Has Swarmcast done anything to prevent this from happening?
Now, to get an answer to that, you should read the article.
--
This saves the content provider bandwidth at my expense. Bandwidth is, of course, not free, and somebody has to pay for it. Ximian, take note: if you don't want to pay for the bandwidth involved in being in the software distribution business, get out of that business. Joe Average, who installs Red Carpet without understanding that it now includes this new feature, is going to be mighty pissed when he gets his bandwidth bill and discovers that he's been serving 10 gigabytes to strangers.
;-)
And tie up more bandwidth.
Although I acknowledge that this is the opposite of what is intended by the system.
Maybe they ought to have a few distros on the system so we can help them test it out.
"It is a greater offense to steal men's labor, than their clothes"
This sounds like a great idea and all, but what incentive do I have to let other people use MY bandwidth, MY disk space, etc.? I know this sounds trollish, but one of the reasons Napster was successful is that you could download music from anybody, but you couldn't prevent people from downloading from you. Therefore you were willing to share your MP3s in return for being able to download some music. Unless you can come up with some reason for people to run the software, I don't see why they would want to run it. (Referring specifically to Red Carpet.)
No matter how good you think humanity really is, I sure don't see it. I'm a greedy bastard and I want my bandwidth.
First of all, what type of security is going to be implemented to prevent hacks? It seems that it would be pretty easy to shore up a single server, or even several in a single datacenter, but it would be a daunting task to protect thousands of machines spread throughout the world against hackers.
Secondly, what type of redundancy is going to be built into this system? Again, if a file is going to be served from a single centralised machine, it should ideally be fairly reliable, with multiple connections, and RAID to ensure continuous uptime. However, if you're serving tiny pieces of a file from thousands of boxen, it seems to me that if even one of those fragments doesn't make it all the way downstream, the whole file would be worthless. Has Swarmcast done anything to prevent this from happening?
-atrowe: Card-carrying Mensa member. I have no toleranse for stupidity.
The previous poster makes a very good point. The system seems to depend on a quorum which may not exist. How is quorum maintained? This is unclear.
This is the same problem currently being experienced with Mojo Nation, a P2P file sharing system which implements a similar method of delivering blocks of data from various locations, then reassembling the requested file.
Perhaps OpenCola's solution to this problem is explained somewhere on their website, but I looked and didn't see it.
--CTH
--
-- Chris
$email=~s/[^a-zA-Z0-9@.]//g;
You've successfully quoted the Sun Java Web Start license. Now please click the word 'license' in Swarmcast's about box and you will see that there are no such reservations or claims.
You are the weakest link. Goodbye!
Thing is, I don't see why anyone would use this. In reality, the transfer rates aren't as good as a single fast dedicated server (I can easily get 75-80KB/sec on this line), so there's really no gain on the client end from using this. On the server end, yeah, you're using a lot less bandwidth. This might be useful for open source projects or other products that are downloaded by knowledgeable people, but your average computer luser isn't going to want to download 6MB of Java Runtime/Swarmcast Client just so they can save your company money by getting slower-than-normal downloads. (And I shudder to think what a distributed p2p network composed mostly of 56k modems would be like; at least at the moment it seems most of the users online are broadband.)
Anyways, pessimism aside, here's what I'd like to see:
OK, I'm done for the moment. If they play this right I can see it helping out smaller outfits with knowledgeable users, maybe eventually even going mainstream if they can convince people it's worth their while to install. It's definitely sparse on information in its current form, but hey, it's a beta. It's a good idea, and I wish them luck.
-Jade E.
Actually, I'm working with a few guys on a program similar to this, and the technology is a "killer app". You don't need to have X users online in order for all the packets to be available. Essentially, everyone who has downloaded a "swarmcasted" file will have all the packets necessary to produce a complete file. The more people with all the packets, the more routes the packets can take. If only two clients are up, you can DL from two routes simultaneously. If 5, then you can go 5-way, etc.
This helps when, say, each of 5 routes is bottlenecked at 10k/s. You can effectively get a transfer rate of 50k/s if you simultaneously DL from all 5 routes, or, if you like the traditional way, you can take your pick of which source to DL from and get stuck at 10k/s. Imagine if you could simultaneously DL all packets from all your neighbors for a Linux distro (assuming they all have the distro you want) instead of having to go through some central server a thousand miles away with bottlenecks all over.
Even if you have a T-3 and they all have DSL, you effectively get a throughput of N*M (N = number of neighbors with the distro, M = the max DTR of the DSL modems). The upper limit is either the max combined DTR of all the DSL lines or the max DTR of your T-3, whichever is lower. Of course, that is a best-case scenario, but you get the picture. The technology gives you a minimum transfer rate equal to the fastest connection with a complete set of packets, and a max that is hugely scalable (in theory) to nearly infinity.
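The best-case arithmetic in this comment boils down to "sum of the sources, capped by your own pipe". A back-of-the-envelope sketch follows; the function name `swarm_rate` and the example numbers other than the 5 x 10k/s case are made up, and real transfers will fall short of this because of protocol overhead and contention.

```python
def swarm_rate(source_rates, my_downlink):
    """Best-case aggregate download rate, in the same units as the inputs.

    source_rates: the rate each source can deliver to you.
    my_downlink:  the most your own connection can receive.
    """
    return min(sum(source_rates), my_downlink)


# Five routes bottlenecked at 10k/s each, on a link that could take far more.
print(swarm_rate([10, 10, 10, 10, 10], my_downlink=5000))  # 50

# With enough sources, your own link becomes the limit instead.
print(swarm_rate([100] * 80, my_downlink=5000))            # 5000
```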
...
string* plamenessFilter =
*plamenessFilter = "Flaming Death!!";
This is a great idea! It'll let high traffic sites like kernel.org distribute the work out to others. I mean it's nice that they have a 100 Mbit pipe thanks to donated bandwidth, but now they can more optimally use it by allowing users to share portions of files between each other.
Don't get me wrong, there are plenty of problems with this system also. Security, lowering the bar for newbies and other problems have been brought up in other posts. I won't rehash them here.
However, this will go a long way towards helping the open source community more easily share files -- without requiring that you beg some private money making companies to donate bandwidth. And that in my mind outweighs some of the negatives.
I especially like that they forbade sharing any files for which you don't have a copyright. Yes, I'm sure some people were thinking of using this for mp3's, warez, etc, but that would have killed the general acceptance of what is a great way to scale file downloading.
Just my $0.02 worth.