Kernel.org Needs Some Help, Perl Foundation Got Some
Dante wrote in to say "I just read this on the Linux Kernel mailing list, it's from Peter Anvin, one of the ftp.kernel.org maintainers...
H. Peter Anvin writes: "The recent troubles we've had at kernel.org pretty much highlight the issues with having an offsite system with no easy physical access. This begs the question if we could establish another primary kernel.org site; this would not only reduce the load on any one site but deal with any one failure in a much more graceful way.
Anyone have any ideas of some organization who would be willing to host a second kernel.org server? Such an organization should expect around 25 Mbit/s sustained traffic, and up to 40-100 Mbit/s peak traffic (this one can be adjusted to fit the available resources.) If so, please contact me."
In related news, mbadolato wrote in to tell us that "there's a press release over at dyndns.org announcing that they've donated $20,000 to the Perl Foundation!
'Thanks primarily to Perl and other Open Source technologies, we are able to provide DNS services to over 180,000 members of the Internet community,' said Tim Wilde, founder and chief executive officer of DynDNS.org. 'This is our way of giving back to some of the people whose tireless devotion to writing quality software has enabled us to provide our services to the Internet community over the past three years.'
The donation page for the Perl Foundation can be found here
To persuade RH, AOL willing to sponsor second kernel.org site.
-- Hasbullah bin Pit (sebol)
What is the composition of these bandwidth requirements? I mean, if it's primarily file download, the requirements could maybe be met by a decentralized system. Just curious...
Perl Foundation Got Some
Well it's about time! I couldn't bear to think about those 45 year old GNU hippie geek virgins working at the Perl Foundation anymore.
-Metrollica
Read my UPDATED journals!
Wow! Because of this donation to the Perl guys and gals, my check is in the mail.
I use DynDNS, and have been thinking about sending them *something*. I don't have much, but to see them donate a little something in return is nice. Any donation is cheaper than getting a 'real' domain name. Plus *.ath.cx is kinda cool, I wonder if goatse.ath.cx is available?
I just hope all these donations don't go to stuff like strippers. I could be spending my money on that.
Get your Unix fortune now!
But I would imagine that every time a new kernel is released, world+dog go to view the site. I seriously doubt that everyone who goes to the site downloads; most just read, and lots of people reading still requires a fair old chunk of bandwidth.
Realistically they are victims of their own success: people want information about the new kernel and doubtless they want to download stuff too.
As these once limited-interest sites become more mainstream, it's clear that they need to maintain quality of service, and that means no /. effects which stop people going to your site and could potentially discourage them from going there in the future.
As we saw earlier in the week, the /. effect seemed to bring that site to its knees, but as regular news sites see Linux as more and more relevant to their audience they too will link people in, adding to the problem.
Just my 2c
Is kernel.org a 501(c)(3) org, so that whoever decides to donate bandwidth can have 'some' help and not take on all the expense of moving that much traffic?
The solution to the problem is really quite simple. As Larry McVoy, who maintains the powerful but non-free BitKeeper RCS system and knows a thing or two about patches, has hinted, kernel.org may be better off not providing a tarball for each release, and instead providing some kind of utility that downloads the latest available full kernel only if necessary, plus patches. I'd be all for it. In the meantime, there are a number of incremental patching systems for the Linux kernel that automatically download patches, verify their signatures and patch the kernel, which may be worth looking into to save time, bandwidth and resources:
Of course, it goes without saying that everyone should still use their local mirror, particularly as kernel.org will only be accessible to mirrors for the foreseeable future.
If only there were some organization out there with a vested interest in Linux. One that owes its existence to Linux. Preferably one with a history of involvement in the Linux community. Maybe even some corporation that runs its own websites dealing with open source issues. And while we're wishing, maybe even some entity with experience dealing with massive traffic requirements similar to the dreaded 'slashdot effect.'
Nah, nothing comes to mind. Shame.
The Revolution. Now available as a convenient six tape series from PBS.
Maybe Microsoft could host kernel.org.
Then again, maybe not...
They made a lot of claims that they support open source etc.
Owner of a Mensa membership card.
25 Mb/s = 3.125 MB/s = 187.5 MB/min = 11.25 GB/hr = 270 GB/day = 8.1 TB/month
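For anyone who wants to redo that arithmetic for a different link speed, here is a minimal sketch. It assumes decimal units (1 GB = 1000 MB) and a 30-day month, which is what the figures above appear to use; the function name `transfer_volumes` is made up for illustration.

```python
def transfer_volumes(mbit_per_s, days_per_month=30):
    """Convert a sustained rate in Mbit/s into transfer volumes.

    Assumes decimal units (1 GB = 1000 MB, 1 TB = 1000 GB).
    """
    mb_per_s = mbit_per_s / 8            # 8 bits per byte
    mb_per_min = mb_per_s * 60
    gb_per_hr = mb_per_min * 60 / 1000
    gb_per_day = gb_per_hr * 24
    tb_per_month = gb_per_day * days_per_month / 1000
    return {
        "MB/s": mb_per_s,
        "MB/min": mb_per_min,
        "GB/hr": gb_per_hr,
        "GB/day": gb_per_day,
        "TB/month": tb_per_month,
    }


if __name__ == "__main__":
    for unit, value in transfer_volumes(25).items():
        print(f"{value:g} {unit}")
```

Calling `transfer_volumes(100)` gives the corresponding figures for the upper end of the quoted peak range.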
Looking for any old 8-bit Heathkit/Zenith software/hardware - http://heathkit.garlanger.com
Didn't google say recently how they save so much money with Open Source, etc etc etc?
:/
They probably have that kind of bw...
This is just another case, where the Freenet Project could help, in the future.
Besides being an anonymous (but authentic) information store, it is also highly distributed.
In this case, that would mean there would be no "bottleneck"; instead, the kernel tar.gz would be distributed in small blocks.
Too bad it's still under development, but it's getting better and better.
How 'bout if people who use P2P like Edonkey, who downloaded the kernel-source, just put it in their shared directory? That would distribute the load a little bit
Could some form of broadcast or streaming help?
What about Net-News? It is an existing system that could distribute the patch to many people within a day.
The new kernel could be released, and mirrors and approved developers could have access to kernel.org for the first 3 days. Then there would only be patch downloads from kernel.org for the next 4 days.
BUT it would go out through Net-News, and most people would have it in a day.
Asking for a big chunk of bandwidth and centralized management is the problem. It's expensive. Instead:
- Use the existing file sharing networks
- Netnews (I can get the file faster from my ISP's news server than anywhere else), and software like pan makes getting all the pieces trivial.
- Are there any open file sharing projects that we could use? Something limited to a single download per user wouldn't be onerous. There are lots of cable/DSL Linux users.
Can You Say Linux? I Knew That You Could.
Before starting a download, people should get the answers to two questions:
1) Who *REALLY* needs to update, and why?
2) How to patch an older kernel.
Perhaps I'm uneducated about what is out there currently. But it seems to me that, with a common base of GNU and Open Source software building a public commonwealth of computer operating systems that benefits everyone around the world, there should be some sort of sponsorship program or organization that would help streamline the matching of corporate sponsors to software projects.
Would it be so bad that in return the Sponsor gets a mention in the source code and perhaps even in any "about this program" information box or command line option?
An old paper of mine that might generate some ideas
How about some of those porn sites that use Perl extensively donating some of their profits?
Of course, maybe they do - if I was getting bucks from porn people I might not be issuing press releases about it :)
1) Only allow access by mirrors and those ACTUALLY working on the kernel (ie, the kernel maintainers).
2) Get more mirrors. We're talking like several thousand here. As an ISP, I know I would not mind hosting a mirror, but I cannot afford $25,000/month in bandwidth. Splitting up the load using a large number of mirrors would make it MUCH cheaper to mirror the kernel files.
3) Use a highly-efficient load-sharing/balancing mechanism to direct people to mirror sites. Make it so the user can browse/select the files from the main kernel.org site, but the downloads are redirected from there to the mirrors.
4) Use a better patch process to reduce the size of the average download: 1) The x.x.0 release is the only full download, 2) use a patch system that downloads all the necessary updates, applies them to the x.x.0 version (or whatever the version the user already has) to get the latest version, and 3) MD5 checksums EVERY file to verify that it was patched correctly.
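The "MD5 checksums EVERY file" step in point 4 could look roughly like this. It is only a sketch: the manifest format (a mapping of relative path to expected hex digest) and the names `md5_of` and `verify_tree` are assumptions for illustration, not anything kernel.org publishes.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_tree(root, manifest):
    """Check a patched source tree against a manifest.

    manifest maps relative paths to expected MD5 hex digests
    (a hypothetical format). Returns the sorted list of paths
    that are missing or whose digest does not match.
    """
    bad = []
    for rel_path, expected in manifest.items():
        candidate = Path(root) / rel_path
        if not candidate.is_file() or md5_of(candidate) != expected:
            bad.append(rel_path)
    return sorted(bad)
```

An empty list from `verify_tree` would mean every listed file patched cleanly; anything else tells the tool which files to fetch again.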
-SS "Teach the ignorant, care for the dumb, and punish the stupid."
It's been up on the kernel.org site for a while now, and it only sends binary diffs, so it's quite light. The overhead is like 1%, plus the diffs. In fact, Tridge wrote this exact case (rsyncing the linux kernel) up in one of his early papers on the need for rsync. rsync.samba.org for more information.
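For the curious, the reason it can get away with sending only the differences is a cheap "rolling" checksum that can slide along a file one byte at a time. Below is a sketch of that idea as described in the rsync paper. It is not rsync's actual code, and the real tool also confirms candidate matches with a stronger hash, which this sketch replaces with a plain byte comparison.

```python
M = 1 << 16  # modulus for both checksum halves


def weak_checksum(block):
    """Compute the two-part weak checksum (a, b) of a block of bytes."""
    n = len(block)
    a = sum(block) % M
    # The first byte gets weight n, the last byte gets weight 1.
    b = sum((n - i) * x for i, x in enumerate(block)) % M
    return a, b


def roll(a, b, out_byte, in_byte, n):
    """Slide an n-byte window one position: drop out_byte, add in_byte."""
    a_new = (a - out_byte + in_byte) % M
    b_new = (b - n * out_byte + a_new) % M
    return a_new, b_new


def find_block(data, block):
    """Return the first offset of block inside data, or -1 if absent."""
    n = len(block)
    if n == 0 or len(data) < n:
        return -1
    target = weak_checksum(block)
    a, b = weak_checksum(data[:n])
    for start in range(len(data) - n + 1):
        # Cheap checksum comparison first, full comparison to confirm.
        if (a, b) == target and data[start:start + n] == block:
            return start
        if start + n < len(data):
            a, b = roll(a, b, data[start], data[start + n], n)
    return -1
```

The point of `roll` is that moving the window costs a couple of additions instead of re-summing the whole block, so the receiver's blocks can be searched for at every byte offset of the sender's file.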
I'd rather download my latest kernel from a known and reputable source.
P2P is not the way to distribute a critical thing like the kernel source. It only takes one individual with malicious intent to spread a virus in the kernel itself! Linux has been virus free for over 10 years and I would personally like to keep it that way.
Just do what a number of larger sites like Yahoo do: ask the mirror list (currently 100+ sites) to act as full mirrors, and round-robin the DNS.
Further, make kernel.org alpha.kernel.org, and have alpha be the site everyone mirrors from, and restrict access to it to only core kernel developers.
Overnight, you'd have taken care of the problem.
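To illustrate the round-robin part: the name server hands out the same list of mirror addresses each time but rotates the order, so successive clients try a different mirror first. A toy model of that rotation follows; the class name and the addresses are made up, and a real deployment would do this in the DNS server's configuration rather than in application code.

```python
from collections import deque


class RoundRobinPool:
    """Toy model of round-robin DNS: rotate the answer list per query."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def answer(self):
        """Return the address list in its current order, then rotate it."""
        current = list(self._addresses)
        self._addresses.rotate(-1)  # move the first address to the end
        return current


if __name__ == "__main__":
    # Documentation-range addresses, purely illustrative.
    pool = RoundRobinPool(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
    for _ in range(4):
        print(pool.answer()[0])
```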
GPL'd web-based tradewars themed space game
That said... An idea struck me. Suppose kernel.org develops a system where incoming requests are sent to a server; the routing is based on the preferences of the admin of that server. For example, let's say I work at a small webhosting company, and have a couple of T3s. (I don't really.) All our servers run Linux, and I want to give back to the community, and show everyone how cool I am. But I'm gonna go out of business if I allow 90 Mbps of bandwidth to be going to kernel.myfakelittlehostingcompany.com, because my customers wouldn't have any bandwidth.
So I decide "Well... I can spare 10 Mbps at the most." I could tell the kernel.org admins this, and when you went to kernel.org, you would be redirected to a site, based on what the mirror sites wanted.
I'm willing to be that companies like OSDN, RedHat, Mandrake, Rackspace, etc. might be willing to let a kernel.org mirror have a small bit of their bandwidth, if they had a way of knowing that it would be controlled.
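A rough sketch of how such a redirector might share requests out in proportion to what each mirror admin has offered. Everything here is hypothetical: the class name, the whole-number Mbps caps, and the idea of counting requests as a stand-in for bandwidth are assumptions for illustration only.

```python
from fractions import Fraction


class MirrorDirector:
    """Send each request to the mirror with the lowest load relative to its cap."""

    def __init__(self, caps_mbps):
        # caps_mbps maps mirror name to the whole number of Mbps its admin offered.
        if not caps_mbps or any(cap <= 0 for cap in caps_mbps.values()):
            raise ValueError("every mirror needs a positive bandwidth cap")
        self._caps = dict(caps_mbps)
        self._sent = {name: 0 for name in caps_mbps}

    def pick(self):
        """Choose the mirror that stays furthest below its share after this request."""
        name = min(
            self._caps,
            key=lambda m: Fraction(self._sent[m] + 1, self._caps[m]),
        )
        self._sent[name] += 1
        return name
```

With caps of 30 and 10, forty requests end up split 30 to 10, so the small host never carries more than its stated quarter of the traffic.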
________________________________________________
suwain_2
Now that you mention it, it is such a good idea that I will set one up today. I can't publish the access to it. I only have 2 Mbps uncommitted and that won't go far on a slashdotted kernel loader. :-)
Rsync is going to want to work on uncompressed tarballs or plain old unpacked source trees. (Diffing .gz or .bz2 files doesn't work well; your first change usually causes the entire remainder of the file to be different.) That is fine for bandwidth because it will compress the data before sending, but you do need to watch out for CPU use. My very rough estimate is that pumping out 50 Mbit/sec of traffic with rsync is going to take something like a pair of top notch ia386 cpus.
I think it would still be a win. CPUs are cheap compared to bandwidth, but it does change the hosting dynamic a bit. You can't just use a nasty old desktop PC or virtual server to soak up the excess bandwidth. You need something with a little meat to it. (Not to disparage virtual servers, but they usually have paying clients that care if their CPU gets saturated.)
I suppose I will settle for rsyncing the tar file around. It is seductive to rsync the unpacked source tree, but if I turn on --delete then it will whack my .o files and header links and I'll always be doing a full build, plus if I need to do a quick 'forgot a module' build my kernel version will have changed. If I do not turn on --delete it would mostly work, but I could accumulate obsolete files and there is a danger of date stamp problems.
In FreeBSD, CVSup is used to keep source trees in sync. It's a very efficient way of keeping several hundred megs of source code up to date.
The reason I mention this is because of the support infrastructure available in FreeBSD:
1. Install cvsup (once)
2. edit /etc/make.conf (once)
3. cd /usr/src && make update
I realise that CVSup is oriented towards CVS trees, which the Linux kernel isn't, but even an rsync server would be better than continuously downloading the patch.
CVSup is available at http://www.polstra.com/projects/freeware/CVSup/
Why not AOL?
Because AOL Time Warner funded the DMCA.
Will I retire or break 10K?
If iBiblio is willing to host Propaganda, I'm sure they're more than willing to host a kernel.org mirror. In my experience, they've been wonderfully good hosts and run a very professional operation. Better still, they aren't hiding ulterior motives by hosting free software projects, unlike the two-letter chameleon we've all grown to hate over the past year or two.
As for SourceForge, I wouldn't bother. The company that runs it turned its back on the community that made its existence even possible. That alone should dissuade anyone. More tangible perhaps would be that the company has only one product (which they can't sell), and only enough cash on hand to last another year at most.
Cheers,
Bowie J. Poag
Burris
I'm willing to be [sic] that companies like OSDN, RedHat, Mandrake, Rackspace, etc.
Not Mandrake at least. They, wisely, don't host a thing. It's all mirrors and it works well, especially since most people are downloading 650MB ISO images. Something kernel.org should think about. The only problem with that is they need fast syncing of the mirrors, because a lot of -pre patches are only tested for a few days until the next one comes out....
"Karma can only be portioned out by the cosmos." -Homer Simpson
Twenty thousand dollars is, what, two months of the burdened cost of a single mid-range software engineer? Why is this worth an exclamation point? Many organizations pay several full-time programmers to work on open source projects -- any one of these organizations exceeds this tiny donation in a week.
Tim
I would suggest Canadians start using the Canadian Communications Research Centre's servers. They do have the bandwidth, especially to university students (CA*Net III and other academic/research networks), who probably make up a large share of the users of the Linux kernel.
Incidentally, just some of the files available via rsync from ftp.crc.ca (which, sadly, has an anon-ftp limit of 25 users):
Perl CPAN mirror
GNOME desktop and utilities
Linux HowTo's
KDE desktop and utilities
XFree86
ALSA Linux sound drivers
Debian Linux
Debian Linux ISO images
FreeBSD
Alexey Kuznetsov's IP Routing Tools for Linux
Blackdown's port of JAVA for Linux
CRC's Linux Kernel Archive (I wonder if this is different from the standard kernel? they don't say "CRC's" on everything)
CRC's RedHat mirror
CRC's RedHat Contrib (interesting)
Slackware Linux
SUSE Linux
TurboLinux
CRC's VQEG Digital Video Experiments
CRC's XAnim mirror
So if you are Canadian and use any of these software packages (or the others on the page I linked), PLEASE use this site, it's extremely fast on broadband and even more so to university students. I used it for my Debian packages until they dropped the limit on FTP users. Maybe if I ask real nice they'll give me a login....
The site itself is interesting too. Neat stuff.
--Dan
That's an idea! Linus should ask Bill G to front the green for the Linux kernel site. I know Billy-boy would do it. He's all for helping the community... ;-)
"The recent troubles we've had at kernel.org pretty much highlight the issues with having an offsite system with no easy physical access. This begs the question if we could establish another primary kernel.org site; this would not only reduce the load on any one site but deal with any one failure in a much more graceful way."
I'm confused; I don't see how it begs the question. I don't see an attempt to prove anything with itself.