Metalinks Tries to Simplify Downloads
ant_tmwx writes "Metalinks collect information about files in an XML format used by programs that download. The information includes mirror lists, ways to retrieve the file on P2P networks, checksums for verifying and correcting downloads, operating system, language, and other details. Using Metalinks details the Free Software programs you can use to download them with. There are also clients on Mac and Windows. With a list of multiple ways to download a file, programs can switch to another method if one goes down. Or a file can be downloaded from multiple mirrors at once, usually making the download go much faster. Downloads can be repaired during transfer to guarantee no errors. All this makes things automatic which are usually not possible or at least difficult, and increases efficiency, availability, and reliability over regular download links. OpenOffice.org, openSUSE, and other Linux/BSD distributions use them for large downloads."
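A minimal sketch of the "switch to another method if one goes down" idea from the summary, assuming the client already has a list of source URLs and a whole-file SHA-1 taken from the metalink. The function name `download_with_failover` and the injected `fetch` callable are hypothetical, not part of any real Metalink client.

```python
import hashlib
from typing import Callable, Iterable, Optional


def download_with_failover(
    urls: Iterable[str],
    expected_sha1: str,
    fetch: Callable[[str], bytes],
) -> Optional[bytes]:
    """Try each source in order; return the first payload whose SHA-1 matches."""
    for url in urls:
        try:
            data = fetch(url)
        except OSError:
            # Source is down or unreachable: fall through to the next one.
            continue
        if hashlib.sha1(data).hexdigest() == expected_sha1:
            return data
        # Checksum mismatch: treat this source as corrupted and keep going.
    return None
```

For real HTTP or FTP sources, `fetch` could be something like `lambda u: urllib.request.urlopen(u).read()`, since `urllib.error.URLError` is a subclass of `OSError`. Real clients also split a file across mirrors, which this sketch leaves out.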
http://www.exampleURL.com/file.metalink
Or, depending on your location, use one of these mirrors:
http://uk.exampleURL.com/file.metalink
http://nl.exampleURL.com/file.metalink
http://de.exampleURL.com/file.metalink
Seriously though, I like the basic idea, but the system does add an extra point of failure.
Are there clients that integrate (i.e., extensions) with Firefox, IE, Safari, and Opera? If there is proper integration with these browsers (meaning seamless downloading without opening third-party download managers), this might actually go well.
It's bad enough when I tell my dad to download a torrent and he complains that a torrent manager client pops up; especially when he doesn't realize that closing the window may not stop the torrent.
Help! I'm a slashdot refugee.
You can do that with computers? Honestly, how hard is it to pick your OS out of a list of download links?
I don't think it's a bad idea, I just don't get why it's on /. :-)
Or in the case of us Gentoo folk, just build the damn program and eliminate any doubt.
Tom
Someday, I'll have a real sig.
From their page:
Why should you use it?
Users
Your downloads will be simpler, faster, and more reliable...without you doing anything differently.
BitTorrent already does this just about as effectively as this idea will.
Developers
It's a neutral framework that doesn't favor any one program, Operating system, or group, and is easy to implement.
Once again, BitTorrent is just as easy. And it's OS-agnostic.
Site owners
Resume and recover from single servers going down.
Sorta an issue with BitTorrent, but not really. House the seed in multiple locations. Or better yet, have your clients take a copy of the seed and share that with their peers in the case of a downed server.
Downloads can automatically be split between sources (mirrors/P2P) and all downloads will be verified.
More people can get access to your files easier, more reliably, even at the most heavily accessed times.
This means less retries and cheaper bandwidth and support bills. Saving money = good.
Once again, this is where BitTorrent shines. A lot of people going after your files? Great, that means better availability on your torrent and more bandwidth for everyone.
To me, this looks like a solution in search of a problem.
Mod me down with all of your hatred and your journey towards the dark side will be complete!
If this standard is accepted by IE and Firefox, *then* it will be news. Until then it's just buzzword compliant crap.
It will be great for sites that have a LOT of mirrors. Particularly SourceForge, of which no single mirror is reliably "fast enough" for me.
It doesn't look like it excludes ANY type of file transfer; as far as I can tell, if your client supports a protocol, you can use it.
Example - Metalink XML contains the following formats:
5 different HTTP sites
2 FTP sites
3 BitTorrent Trackers
eMule/eDonkey hash
Example - Client One has implemented:
HTTP, FTP and BitTorrent
Example - Client Two has implemented:
BitTorrent and eMule
Example - Client Three has implemented:
HTTP, FTP, BitTorrent and eMule
I'm surprised it's taken this long to come up with this sort of client-independent format.
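The three-client example above can be sketched as a simple filter: each client keeps only the sources whose transfer method it implements. The dictionary keys (`http`, `ftp`, `bittorrent`, `ed2k`) and the helper name `usable_sources` are labels assumed for illustration, not identifiers from the Metalink format.

```python
from typing import Dict, Set

# Source counts from the example metalink above, keyed by transfer method.
METALINK_SOURCES: Dict[str, int] = {
    "http": 5,
    "ftp": 2,
    "bittorrent": 3,
    "ed2k": 1,
}


def usable_sources(sources: Dict[str, int], supported: Set[str]) -> Dict[str, int]:
    """Keep only the sources a given client knows how to download from."""
    return {method: count for method, count in sources.items() if method in supported}


client_one = usable_sources(METALINK_SOURCES, {"http", "ftp", "bittorrent"})
client_two = usable_sources(METALINK_SOURCES, {"bittorrent", "ed2k"})
client_three = usable_sources(METALINK_SOURCES, {"http", "ftp", "bittorrent", "ed2k"})
```

The same file serves all three clients; each one ignores the entries it cannot handle.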
Jonah HEX
Horror & SciFi Erotic Nudes
Strangely enough, the OpenOffice distribution page links to a pay-for Metalink client for the Mac, but fortunately, just two links above, there's a cross-platform open source client. That's confusing. I clicked on the Mac client, thinking I would end up at a Mac-oriented free (even open source?) client, but no, one must choose the cross-platform one. Nothing really wrong there, only that it's confusing.
Unrelated: I've seen numerous attempts at such integrated P2P downloads. The part that got me on the Metalink main page: "Metalink is an Open Standard". That makes me think I'll jump on the bandwagon. And yup, the Wikipedia Metalink page is (surprisingly?) informative.
Animoog.org
Ok. I just had to follow some more links to realize that the open source cross-platform alternative doesn't offer Mac OS X binaries yet; one has to compile from source. That's a showstopper for most Mac users... well, with Metalink getting more attention, I'm not worried, mature clients will come soon.
Animoog.org
I tried to find additional info about location embedding but haven't succeeded so far. From the Wikipedia page, the XML code includes <url type="http" location="uk" preference="90">http://www.example2.com/example.ext</url>. The part I don't get, as a geospatial professional (see sig), is why the location is encoded as a country code. What are the reasons? Wouldn't it make more sense to encode location with simple lat-lon values (as, say, the GeoRSS standard does)? Some countries being so large, I fail to see the country as a good indicator of distance between computers. What did I get wrong?
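For illustration, here is a rough sketch of how a client might use the `location` and `preference` attributes from that snippet. The ordering rule (same-country sources first, then higher preference) is an assumption, not something taken from the spec. The `de` and `ftp` entries, the bare `<resources>` wrapper, and the `rank_urls` helper are made up for the example.

```python
import xml.etree.ElementTree as ET
from typing import List

# Only the first <url> comes from the Wikipedia example; the rest are invented.
SNIPPET = """
<resources>
  <url type="http" location="uk" preference="90">http://www.example2.com/example.ext</url>
  <url type="http" location="de" preference="100">http://de.example2.com/example.ext</url>
  <url type="ftp" location="uk" preference="50">ftp://uk.example2.com/example.ext</url>
</resources>
"""


def rank_urls(xml_text: str, client_country: str) -> List[str]:
    """Order sources: same country code first, then by descending preference."""
    root = ET.fromstring(xml_text)
    entries = []
    for el in root.iter("url"):
        same_country = el.get("location", "") == client_country
        preference = int(el.get("preference", "0"))
        entries.append((same_country, preference, el.text.strip()))
    # True sorts above False, and larger preference above smaller.
    entries.sort(key=lambda e: (e[0], e[1]), reverse=True)
    return [url for _, _, url in entries]
```

A country code only allows a coarse "same country or not" test like this one, which is the limitation the question is getting at. A lat-lon pair would allow sorting by actual distance.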
Animoog.org
I've looked around the site, and I've found no document for the specification of a metalink file. IMO, this will easily lead to many conflicts with different clients each having their own version of "Metalinks".
Is there a metalink repository somewhere? A place where people upload a metalink file for all the data sources of a popular file?
When servers are bogged down I'm often looking for a collection of mirrors and download sources (a metalink file) but I don't know where to go; most sites don't provide metalink files themselves.
I have learned to become very sceptical when some technology built around XML claims to bring "simplicity". That's often just not the case.
Take AJAX. Anyone who has developed even a trivial AJAX application knows it gets ungodly complex, even just when dealing with the XML sent between the client and the server. The use of one of the numerous toolkits is basically necessary if you wish to maintain your sanity.
Apache Ant is another example. Traditional makefiles are quite a bit easier to read, design and manipulate than the XML-based build scripts of Ant. Not only that, but a subset of the make functionality is available on just about every single system in existence. Ant is limited to platforms with a suitably modern JDK, which in the greater picture are actually very few platforms. Size is also a problem. The Ant 1.7.0 binaries, in a zip archive, are 11 MB! The gmake binary on my Linux system, on the other hand, is a mere 115 KB.
Anyone who has dealt with some of the J2EE frameworks has also experienced the problems XML brings. Many of the frameworks use various XML files for configuration. These configuration files are often extremely verbose, and difficult for humans to read and manipulate. The non-XML configuration files of Apache and X.org, for instance, are far easier to work with.
Maybe the only place where XML has had a beneficial impact is on XHTML. It has brought some degree of consistency. But that's only because HTML 4.x and earlier were such godawful messes. But then again, XML-style documentation markup is not nearly as clear to deal with as LaTeX or even nroff/troff/groff files.
So I hope they reconsider their use of XML. So far it has proven to do little but hassle users.
This is a good idea. It attempts to formalize something that's been done many times before. We do it manually when we download from SourceForge; Yum does it automatically with its list of mirrors. A standard would be nice. I would like to see a new web protocol for it, i.e.:
metalink://host.com/file.ml
Then inside file.ml simply a list of URLs and weights...
ftp://host1.com/file.rpm 10
http://host2.com/file.rpm 10
torrent://host3.com/file.rpm 20
etc
XML doesn't help.
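A rough sketch of how the plain-text format proposed above could be read, assuming each line is a URL followed by whitespace and an integer weight, and that a weight is that source's relative share of the traffic. Both function names are hypothetical, and `torrent://` is the poster's own invented scheme.

```python
from typing import Dict, List, Tuple


def parse_weighted_list(text: str) -> List[Tuple[str, int]]:
    """Parse lines of the form '<url> <weight>' into (url, weight) pairs."""
    sources = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        url, weight = line.rsplit(None, 1)  # split on the last run of whitespace
        sources.append((url, int(weight)))
    return sources


def traffic_share(sources: List[Tuple[str, int]]) -> Dict[str, float]:
    """Turn weights into the fraction of the download each source would serve."""
    total = sum(weight for _, weight in sources)
    return {url: weight / total for url, weight in sources}


EXAMPLE = """\
ftp://host1.com/file.rpm 10
http://host2.com/file.rpm 10
torrent://host3.com/file.rpm 20
"""
shares = traffic_share(parse_weighted_list(EXAMPLE))
```

With the weights shown, the two 10s would each carry a quarter of the download and the 20 would carry half. Checksums, OS, and language would need extra fields that a flat list like this has no room for.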
But "simplify" and "XML format" in one sentence does not always "return true;". If the amount of information stored in the XML grows, how much CPU time and storage will it require? Wouldn't it be better to put it into a database and provide an XML-based API? Querying XML is a bit slower than querying any DB. Even SQLite.
Rocksteady, are you ready to ska?
Well, the summary makes it sound divine - one link, one bit of software that accesses P2P, FTP etc interchangeably to maximize download speed.
That seems like a logical growth of BitTorrent.
Trying to figure out exactly what is needed though was another matter. After a half hour and three or four web sites I wound up with the wxDownload Fast Windows download manager and a Metamirrors Firefox plugin.
Is it all working as advertised? Well, stuff is downloading (openSUSE 10.2) but I have no idea if it's faster, or even if it's also uploading in P2P fashion.
For God's sake, is it too much to ask that the people behind stuff like this include a simple checklist?
To download using Metalinks you need:
a) A download client (here are links to a few),
b) This Firefox plugin and
c) then do THIS.
Three Squirrels
This seems like an omission. I've read through the full spec and see no mention of segment hashes. Oh, you can include hashes for the whole file easily: md5, sha-whatever, and it should be extendable to whatever you wish. But only for the whole file. The metalink has no way to include hashes for individual segments. This bit of data would be very useful for detecting inconsistent sources (corrupted mirrors) before they do any damage, rather than obtaining a four-gig file before seeing the "hash failed, please redownload" error. It would also allow the importing of partial files, for instance when downloading just the repairs for something corrupted.
A few weeks ago I downloaded a set of files off BT. They got halfway, then stalled; every single file was net-dead and incomplete though the tracker was still up. Seeing the 150 other users stuck, I did the polite thing and, based on the filenames and sizes, located the matching files on eMule, then imported those into the BT client and reverified. I've been uploading since, watching the files slowly become fully available. I feel like a hero, but things like that could be much easier with segment hashes.
Oddly, the specification does recommend that clients which are able to download by both BT and other means, specifically HTTP, should be able to use the segment hashes obtained from the BT tracker to verify data obtained from other sources. So why make it dependent on the BT tracker, rather than just including the relevant segment hashes as part of the metalink file?
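A minimal sketch of what per-segment hashes would buy, assuming fixed-size segments and SHA-1. The helper names, the segment size, and the choice of hash are illustrative assumptions, not taken from the Metalink spec or from any client.

```python
import hashlib
from typing import List


def segment_hashes(data: bytes, segment_size: int) -> List[str]:
    """Hash each fixed-size segment of the data separately."""
    return [
        hashlib.sha1(data[i:i + segment_size]).hexdigest()
        for i in range(0, len(data), segment_size)
    ]


def bad_segments(data: bytes, expected: List[str], segment_size: int) -> List[int]:
    """Return the indexes of segments whose hash differs from the expected list.

    Assumes the data has the same length as the original file.
    """
    actual = segment_hashes(data, segment_size)
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]
```

With a list like this a client only has to re-fetch the segments that come back from `bad_segments`, and a partial file imported from another network can be checked segment by segment instead of failing one whole-file hash at the end.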
If it's a pretty horrible protocol, why has it become so popular? Is it possible that all or some of those design "flaws" you mentioned are necessary for it to do what it does? And if not, why have you not created your own, awesomer fileswarming protocol? In fact, why has nobody else?
And I'm really asking here. Not just disguising my attacks as questions.
It seems like we might as well just use magnet links, since they can include different hash types of the file and different locations to download from. I already have them for certain larger files on my website, and they work fine if you have a client that supports them.
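As a rough illustration of that point, the sketch below pulls the hashes and download locations out of a magnet link. It assumes the commonly used magnet parameters `xt` (exact topic, the hash), `as` (acceptable source) and `xs` (exact source). The example URI, its placeholder hash, and the `parse_magnet` helper are made up.

```python
from typing import Dict, List
from urllib.parse import parse_qs


def parse_magnet(uri: str) -> Dict[str, List[str]]:
    """Split a magnet URI into its hash URNs and its download locations."""
    scheme, _, query = uri.partition("?")
    if scheme != "magnet:":
        raise ValueError("not a magnet link")
    params = parse_qs(query)  # also decodes percent-encoded values
    return {
        "hashes": params.get("xt", []),
        "sources": params.get("as", []) + params.get("xs", []),
    }


# Placeholder hash and URL, for illustration only.
EXAMPLE = (
    "magnet:?xt=urn:sha1:AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"
    "&as=http%3A%2F%2Fwww.example.com%2Ffile.iso"
)
info = parse_magnet(EXAMPLE)
```

Repeating `xt` or `as` in one link is how several hash types or several locations would be listed.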
(\(\
(=_=) Bani!
(")")
If this application doesn't fuck up my router like BitTorrent does, I'm there.
Please, for the good of Humanity, vote Obama.
M users simultaneously download from N servers individually. Or M users simultaneously download from N servers in parallel. Aside from some load balancing, can you really gain dramatically? The same number of bytes must be transferred from the same number of servers.
"It's the height of ridiculousness to say for those 9 lines you get hundreds of millions."
And it exists. Either...
"Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
See here for a little experiment that illustrates the point and that you can reproduce easily.
And if you follow the replies on that link you get to see why server admins don't like clients that grab more resources than necessary.
It works as long as you are the only one doing it, but when everyone starts using segmented downloads, everyone loses. Not only do you end up back where you were before, you also waste more resources than the same number of people would have if they had stuck to one connection per download.
There have been discussions on the OpenOffice distribution lists about this too. Not everyone is convinced that having each client open loads and loads of connections per download is the desired solution in the end.
-- I'm as unique as everyone else.
These worked really great for the openSUSE release. Why is it good? It takes out the hassle of having to track down a working/fast mirror. Want to download a large DVD ISO *quickly*? Then this is the way. The small metalink file will have a populated list of around 50 mirrors. The client you're using works by making multiple connections to all of these, so you will pretty much ALWAYS MAX OUT YOUR CONNECTION. And still get a really safe download. Why? Because it checks checksums on, IIRC, 4-megabyte chunks.
:)
Reliable and fast, and it spreads out the weight between mirrors, AND it can work with torrents. Metalinks are indeed awesome. Wget and kget support for it is in the works, so stay tuned.