Bandwidth without servers isn't worth much. Do you know any hosting provider where several billion dynamic pageviews and terabytes of content would fit into a $750k/year bill? I'd be glad to hear of one! :)
Obviously donated money doesn't go into someone's Porsche budget. All expenses are shown in the public budget reports, and all purchases in the purchase reports. All of them can be seen at http://wikimediafoundation.org/ - it's quite transparent there.
Running a read-only site would be much easier; we could do that with a much smaller budget. What the money is spent on is supporting the collaboration infrastructure. We're running on 100 servers now, all quite cheap and efficient. We're pumping out 500 Mbps of information now, and we're still doing that on a low budget. But it all needs to grow and scale, and though the software is doing that quite well, resources are needed.
This is a very low-budget operation compared to other huge sites. There's no corporate funding, no huge revenue streams. I've seen sites running on the same budget with only 1% of Wikipedia's load. A donation will go into collaboration infrastructure rather than being forgotten forever. A donation may allow thousands of articles to be created, extended and viewed. There is a price for information, but you won't find lower margins ;-)
I did find the X++ trademark amusing. They could review lots of code and find trademark infringements. On the other hand, it is a trademark for the letter Y. Poor Yahoo.
AAA! Every time someone suggests a peer-to-peer Wikipedia, we bang our heads against the wall. Think of the lag times... How fast do you get a .jpg file from BitTorrent? Or... Kazaa, or whatever.
Total bandwidth at peak times is now ~200 Mbps (our text content is usually compressed).
We need distributed caches for speedy content delivery, not some bloated and overhyped p2p.
:-) hey, come and help us do that! all input is greatly appreciated. if you managed to sync data between two schemas without any downtime, it would be great. no sarcasm, we need that functionality.
and... regarding db servers. we have like... 6 working boxes, dealing with 6000-9000 SQL queries per second. and we have got 200 GB of data.
we want global reach. pages from a server 1000 km away load _much_ faster than from one 10000 km away. hey, the speed of light is a limit, and packets travelling back and forth add delays, etc.
a new cluster was deployed in Amsterdam, which serves the whole of Europe, cutting page load times there several-fold.
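To put rough numbers on the distance argument, here is a back-of-the-envelope sketch. The figures are my assumptions, not measurements from our clusters: signals in fibre travel at about 200,000 km/s, and a hypothetical page load needs 4 sequential round trips (handshake, request, a couple of dependent fetches). Real paths are longer and routers add their own delay, so treat these as lower bounds.

```python
# Best-case latency from distance alone.
# Assumptions (not from the post): ~200,000 km/s signal speed in fibre,
# and a hypothetical page load that needs 4 sequential round trips.

FIBRE_KM_PER_S = 200_000  # roughly two thirds of the speed of light in vacuum


def min_round_trip_ms(distance_km: float) -> float:
    """Best-case round-trip time in milliseconds over a straight fibre path."""
    return 2 * distance_km * 1000 / FIBRE_KM_PER_S


def min_page_load_ms(distance_km: float, round_trips: int = 4) -> float:
    """Best-case latency cost of a page load made of sequential round trips."""
    return round_trips * min_round_trip_ms(distance_km)


if __name__ == "__main__":
    for km in (1_000, 10_000):
        print(f"{km:>6} km: RTT >= {min_round_trip_ms(km):.0f} ms, "
              f"page >= {min_page_load_ms(km):.0f} ms")
```

Under those assumptions a single round trip costs at least 10 ms at 1000 km and at least 100 ms at 10000 km, and the gap multiplies with every extra round trip a page needs.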
Well, first of all, everything grows. The number of users increases all the time - it doubles every two or three months. The number of pageviews increases as well. And last but not least, there are more and more, bigger and bigger articles with more and more history.
Wikipedia is growing and it is running on really low-budget hardware. And... every time we make the site run faster, more users come and use up the available resources. Therefore, we can do two things: optimize our software platform and increase our hardware capacity.
There are questions about why boxes are being added in Seoul. We're trying to bring content as close to people as possible. Light speed means slow in the information age. We already have a donated cluster in Amsterdam, which serves all of Europe, and we want the same or better capabilities in Asia. And sure, we're constantly improving our main cluster in the USA.
Why do we really need that many CPUs? A wiki is a website whose content could have been edited a second ago. It can't be out of sync, as editing outdated content isn't that sane. Also, it doesn't simply serve HTML content. In a wiki all documents are related, links are tracked, document quality is observed, etc. Therefore, for a task that might look quite simple, we need quite a lot of servers. We could serve those poor 2500 requests / second (~1500 pageviews per second) with two or three web servers, but.. hey, EDIT THIS PAGE.
Well, distributing a wiki is a bit more complex than distributing a search index (async!) or seti@home (async). With async data sets you don't care whether the packet you sent to some node is an hour or a day old. You do care about that in a wiki, because any user may press the 'edit' button, and the data should be consistent everywhere.
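A toy illustration of why stale data hurts once editing is involved. This is only a sketch of a generic revision check, not how MediaWiki actually handles edits; `Page`, `save_edit` and `EditConflict` are names made up for the example.

```python
# Sketch: an edit is accepted only if it was based on the current revision.
# All names here are hypothetical and exist only for this illustration.

from dataclasses import dataclass


class EditConflict(Exception):
    """Raised when an edit was based on an outdated revision."""


@dataclass
class Page:
    text: str
    revision: int = 1


def save_edit(page: Page, base_revision: int, new_text: str) -> int:
    """Store new_text if base_revision is current; return the new revision."""
    if base_revision != page.revision:
        raise EditConflict(
            f"edit based on revision {base_revision}, current is {page.revision}"
        )
    page.text = new_text
    page.revision += 1
    return page.revision


if __name__ == "__main__":
    page = Page(text="old text")
    save_edit(page, base_revision=1, new_text="edit from user A")
    try:
        # User B was shown revision 1 by an out-of-date node.
        save_edit(page, base_revision=1, new_text="edit from user B")
    except EditConflict as err:
        print("rejected:", err)
```

A node that is an hour behind hands every editor an old base revision, so every save either conflicts or silently overwrites newer work. A search index or seti@home work unit has no such step.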
We are working on distribution.
Distributed caches - the majority of hits are now served by caches, and some of them are offsite. It was a pilot project for a while and now we're trying to design and build a scalable infrastructure for that. But still, lots of edits are served uncached.
Distributed file systems - are there any? NFS is a single-server system, MS has something, PVFS has no redundancy, GoogleFS is closed and not released, and Coda, AFS - all of those just don't work. Right now we're trying to develop a MogileFS (the Perl-based app-level file storage by LiveJournal) store, and sure, there are other ideas.
Distributed database - there are no proper open-source multi-master solutions for large databases. MySQL with replication and a transactional data store is used. In this event it would be great to have a second datacenter nearby with additional DB replicas and a gigabit interconnection, but that costs money. App-level bidirectional replication is planned for both MySQL and PostgreSQL. And SAN deployment is too costly.
And yes, MediaWiki code has PostgreSQL support, but migrating from one database to another without proper tests, benchmarks and insurance wouldn't be very mature.
You're quite right. Every time the cluster works fast, more users come and load it ;-) Running Wikipedia is a tough exercise in constant resource planning and distribution, fast emergency response and so on. Sure, in a corporate environment such a site would run on hundreds of servers, but it would still require a number of developers ;-) The whole project - content, software, hardware - is evolving in a way that not many companies have gone through. IMO it's Wikipedia that trains skilled engineers, and those are ready for hardcore environments ;-)
And... sure, there were errors and things that could have been done better, but that's what open source is - bringing the best ideas together and giving them away.
-bash-2.05b$ telnet 64.94.110.11 25
Trying 64.94.110.11...
Connected to sitefinder-idn.verisign.com.
Escape character is '^]'.
220 snubby1-wceast Snubby Mail Rejector Daemon v1.3 ready
ehlo dammit
250 OK
mail from:
250 OK
rcpt to:
550 User domain does not exist.
I have hundreds of CDs stuffed with various software (I'm a developer of some of it as well) and ISP accounting data (just in order to keep the whole history). CD-Rs and CD-RWs are really affordable media for data you don't use often but have to keep.
So...
How will the recording industry pay free software developers?
How will the recording industry pay those who keep their backups, accounting data and similar stuff that has nothing to do with music?
How will the recording industry pay independent artists?
Why not charge for floppy disks, MiniDiscs, Zip disks, etc.? They're recordable as well, and can hold files, documents and of course mp3s :)
I can't find the answer. I hope the Canadian government does. BTW, notebooks are also quite good mp3 players, and they've got HUGE hard drives. I could also mention mp3 workstations or... mp3 servers with terabyte RAID arrays.
Re:All domains resolve! on .biz Open For Biz · Score: 5, Informative
It will have plenty of negative impact.
The SMTP sender check is no longer valid.
Instead of bouncing with 'domain not found', SMTP servers will have to wait until port 25 responds (argh, tons of mail in queues, tons of bounces later).
Scripts checking for existing domain (host) names will have to be redone to check for the stupid damn undocumented IP address instead of the normal NULL answer from resolver libs... Guys, this isn't only a domain registration (more banners for the registrars' page) thing!
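For the record, this is the kind of workaround such scripts would need. It is a rough sketch only: the address in `WILDCARD_ADDRESSES` is the one from the telnet session quoted in this thread, and treating it as "the" wildcard answer is my assumption, since it is undocumented and could change at any time. The helper names are made up for the example.

```python
# Sketch: treat the wildcard answer as "domain does not exist".
# WILDCARD_ADDRESSES is an assumption based on the address seen in this
# thread; the function names are hypothetical.

import socket
from typing import Callable, Optional

WILDCARD_ADDRESSES = {"64.94.110.11"}


def system_resolver(hostname: str) -> Optional[str]:
    """Return the first IPv4 address for hostname, or None if lookup fails."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


def domain_exists(
    hostname: str,
    resolve: Callable[[str], Optional[str]] = system_resolver,
) -> bool:
    """True only if hostname resolves to something other than the wildcard."""
    address = resolve(hostname)
    return address is not None and address not in WILDCARD_ADDRESSES


if __name__ == "__main__":
    # Offline demo with a fake resolver; the hostnames are placeholders.
    fake_dns = {"real.example": "192.0.2.10", "typo.example": "64.94.110.11"}
    for name in ("real.example", "typo.example", "missing.example"):
        print(name, domain_exists(name, resolve=fake_dns.get))
```

Before the wildcard, the `address is not None` test alone was enough. Now every script has to carry a hard-coded list like this and keep it up to date by hand.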
Rest in peace, John, but the organisation you left isn't listening to your advice and design. Your voice is needed here.
It is not standards-conformant. The Internet should be based on standards, because without a common language we won't be able to talk.
I am working for an ISP in .lt that also focuses on web hosting and mail solutions, and in both spheres I really hate what is happening right now. The same goes for the unneeded .biz TLD - maybe it has some urgency in the USA, where guys forgot they've got a local TLD (.us) and even classified ones.
I hated new.net with their new suffixes, but right now I'm really angry about what official institutions (ICANN) are doing - adding banners to non-existent A records. Pals, how non-existent pages should be displayed is a matter for the application, not for the network information backend.
You could have put in a link to the expense report, then I could have modded you 'funny'. Now you're just a troll ;-)
The iPod wasn't just providing the same stuff with a click wheel. It brought a rather unused concept to the masses.
Therefore, the iPod may only be killed by a new concept. Let it be... direct audio->brainwave projection or audio pills.
in two or three days... ;-)
sysadmins are developers. developers are sysadmins. this is the way to keep a site running low budget and high profile.