Slashdot Mirror


Wikipedia Hits 300,000 Articles

Raul654 writes "Today Wikipedia reached the 300,000 article mark. Wikipedia is a 3-year-old non-profit project to build an encyclopedia using WikiWiki software. All text is licensed under the GFDL. It has everything that a traditional encyclopedia would, but also many things that would never get written about, such as Crushing by elephant and the GNU/Linux naming controversy. For size comparisons, the English Wikipedia has 90.1 million words across 300,000 articles, compared to Britannica's 55 million words across 85,000 articles. (All the languages combined together reach 790,000 articles.) For much of the first half of 2004, Wikipedia's growth has outstripped server capacity - however, the shortage of PHP/MySQL developers is probably the biggest long term problem facing the project. Slashdot had previously reported when Wikipedia reached the 200,000 mark."

47 of 507 comments

  1. Congrats! by dn15 · · Score: 3, Informative

    So this isn't very informative but I just wanted to say how much I like Wikipedia. I've used it countless times and I consider it an invaluable resource. I only wish more people knew about it. :)

    1. Re:Congrats! by Anonymous Coward · · Score: 2, Informative
    2. Re:Congrats! by CurlyG · · Score: 2, Informative

      I doubt anyone but your archetypal American redneck would argue much with the Wikipedia entry for 'Jihad'. Read it, and you may even learn something yourself.

      --
      You know they call 'em fingers but I've never seen 'em fing. Oh, there they go.
    3. Re:Congrats! by datan · · Score: 5, Informative

      hm...I learnt a lot about slashdot from wikipedia. particularly about the various arts of trolling on slashdot, the bad Russia/Natalie/BSD/Beowulf jokes and of course our good and sorely missed friend goatse. I even found out about the anti-slash site, where the trolls gather to plot their strategy (Do your civic duty and um... visit this site).

      Here are some very informative links (no surprises, I promise :))

      Slashdot
      First posts and other trolls
      Hall of fame
      The coming of Evil
      A History lesson
      Slash and Burn
      On the AC
      More than just a discussion board
      Our fearless leader

    4. Re:Congrats! by David Gerard · · Score: 3, Informative

      What tends to happen in an edit war is that either (a) a compromise is approached and the article stabilised (b) someone beats the participants upside the head and locks the article until (a) is achieved. Severely antisocial participants can get banned from editing, though this is avoided as long as possible.

      --
      http://rocknerd.co.uk
    5. Re:Congrats! by Trolling4Dollars · · Score: 2, Informative

      That issue is addressed by the fact that they do leave room for differences by stating the controversies or conflicts. They do this by noting that an entry is "controversial" or "disputed". See the entry on anti-zionism for a disputed entry. See the entry on fascism for a controversial entry. This approach is pretty fair as it does give others the opportunity to be represented. The other thing is that you can't beat the hypertext format for an encyclopedia. In a word, it rocks.

  2. Note the new features by gangz · · Score: 2, Informative

    Definitely, the new-look Wikipedia is wonderful to use. The latest news, the selected anniversaries and the "Did you know" section are nice features thought up by the folks there. Also, the browse-by-section feature can be very handy. I have found Wikipedia's explanations on a wide range of topics very useful. It goes to show how an open collaboration model can be made to produce wonderful results. And congratulations to the people at Wikipedia for achieving this landmark. I hope this prompts more people to contribute.

  3. The Parent Poster by tarunthegreat2 · · Score: 4, Informative

    should also have mentioned that Wikipedia has a whole article on Slashdot Subculture where n00bs like me cut our teeth. Plus The Economist mentions Wikipedia as a successful example of Open Source in this already slash-dotted article

  4. Re:Celebration! by mandalayx · · Score: 4, Informative

    actually Wikipedia is busier than slashdot, according to Alexa.

    And for good reason. (disclaimer: I am a Wikipedia contributor.) Also recommend Wikitravel.

  5. consider donating... by Anonymous Coward · · Score: 5, Informative

    Looks like they can use a few donations:
    http://wikimediafoundation.org/fundraising
    (tax deductible too!)

  6. Re:Funding? by tanveer1979 · · Score: 5, Informative

    No, they haven't. There are frequent shutdowns. The best way is to make a donation. The amount of knowledge on Wikipedia dwarfs other encyclopedias.

    --
    My Aurora : http://www.youtube.com/watch?v=o91ZsGwJYyg
    FB : https://www.facebook.com/TanveersPhotography
  7. Re:Funding? by Anonymous Coward · · Score: 2, Informative
    It was temporarily resolved, but they have problems with maintaining funding stability in the long term. Here is the fundraising page which explains their financial situation (9k dollars -> not much) and is also the place to donate to Wikipedia

    Almost all the money disappears immediately on servers to keep the online editing system going.

  8. Re:DMCA Anyone by Shalom · · Score: 5, Informative

    The 1911 Britannica, from which most of the articles you mention were "ripped," is in the public domain. And most of those articles were used as starting points for people to work off, as intended. Knowledge has changed a bit since 1911, man.

  9. Re:Goatse by CGP314 · · Score: 3, Informative

    I'm proud to say I contributed to the goatse.cx article.

    Don't be shy, post the link.

  10. Re:from the GNU/World departement by CountBrass · · Score: 2, Informative

    I had a quick read of that article and the two sides can be summarised as: GNU/Linux: "Credit where credit is due please" and "Linux is inaccurately applied". Linux: "It's the term commonly used therefore we shouldn't change it". Have to admit Linus' quote did make him appear a right little shit.

    --
    Bad analogies are like waxing a monkey with a rainbow.
  11. Re:Goverment Funding by Anonymous Coward · · Score: 2, Informative

    actually, in the last few weeks a group of wikipedians began working on obtaining government grants. See: http://meta.wikipedia.org/wiki/Grants

  12. Re:Copyright by MaelstromX · · Score: 5, Informative

    Actually, there is a system in place to combat this potential problem. This page shows some of the recent instances of possible copyright infringement that will be fixed.

    I personally was responsible for pointing out an entry that was copied wholesale from an author's (copyrighted) web page containing electronic versions of his work. I did so after I noticed some of the language was kind of suspect, and Googling some of the phrases found the copyrighted work.

    With the massive amounts of traffic Wikipedia gets, and as a result more people like me reading the pages, this problem tends to fix itself rather quickly. The same goes for fears of massive vandalism -- it gets fixed very soon.

  13. Re:Wikipedia Interview by presroi · · Score: 2, Informative

    Actually, there was an interview years ago (/. seems to be an early adopter :))

    here is the announcement and here's the interview.

    Well, it could be time for an update on what has happened within the last three years.

  14. donation-based wikipedia by Anonymous Coward · · Score: 4, Informative

    Please mod this up out of importance.

    Please don't forget that Wikipedia is totally advertisement free and free information. In order to make this possible, your donations are greatly needed. Please donate and help to keep this information free and available for all of us.

  15. Re:Celebration! by Anonymous Coward · · Score: 2, Informative

    Note that the Alexa statistic only counts traffic from Alexa toolbar users. Given the different, if overlapping, target demographics of Slashdot and Wikipedia, this result should not be identified with actual traffic.

  16. Wikipedia keymark by Anonymous Coward · · Score: 4, Informative
    I use the following as a Mozilla keymark to quickly access a Wikipedia article. It takes advantage of Google's "I'm feeling lucky" feature to generate a redirect to the page I want. Name the keyword wiki (Right-click -> Properties -> Keyword) and type wiki search terms in the location bar.
    http://www.google.com/search?q=%s site:en.wikipedia.org&btnI=I'm+Feeling+Lucky
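    A minimal sketch of what the browser does with a keyword bookmark like this one, assuming it simply swaps the URL-encoded search terms in for %s. The helper name expand_keyword is made up for illustration; real browsers may encode the terms slightly differently.

```python
from urllib.parse import quote_plus

# Template copied from the comment above; %s marks where the search terms go.
TEMPLATE = "http://www.google.com/search?q=%s site:en.wikipedia.org&btnI=I'm+Feeling+Lucky"


def expand_keyword(template: str, terms: str) -> str:
    """Replace the %s placeholder with URL-encoded search terms (hypothetical helper)."""
    return template.replace("%s", quote_plus(terms))


if __name__ == "__main__":
    # Typing "wiki crushing by elephant" in the location bar would, under this
    # assumption, load the URL printed here.
    print(expand_keyword(TEMPLATE, "crushing by elephant"))
```

    The same helper works for any template that contains a single %s, so other keyword bookmarks can be tried by passing a different template string.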
    1. Re:Wikipedia keymark by danila · · Score: 2, Informative
      Personally I use the following Opera shortcut:
      http://en.wikipedia.org/wiki/%s

      Much simpler if you ask me, and also faster.
      --
      Future Wiki -- If you don't think about the future, you cannot have one.
  17. Funding - situation, what we spent the money on. by Anonymous Coward · · Score: 5, Informative
    The funding isn't resolved. We're about to spend another $20,000 or so on more equipment and that will exhaust the currently available funds. If you look at the Ganglia cluster stats you can see that the web servers are pretty heavily loaded.

    Longer term we're working on how to scale the databases (which of the many options to use). We're using three at the moment, one primary for writes, one for slow queries and one for backup, the latter two both being replicating children. For data see:

    1. Squid statistics showing total and cache hits. You can see the rise when the listing here appeared at about 09:30 UTC.
    2. Ganglia cluster stats showing load. The ones which are mostly blue are the web servers, the red/blue mixed are the squid caches (about 60% max is right for max load for them or connect time suffers) and Suda and Ariel are the most heavily loaded database servers. Suda is disk-limited. Ariel is memory/CPU bandwidth limited because it has faster disks, more cache and different workload.
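
    A rough sketch of how queries might be split across the three databases described above (one primary for writes, one replica for slow queries, one replica kept for backup). The role names, the Query type and the route function are illustrative assumptions, not the actual Wikipedia code, and sending ordinary reads to the primary is also a guess.

```python
from dataclasses import dataclass

# Role names are placeholders for this sketch, not real host names.
PRIMARY = "primary"
SLOW_REPLICA = "slow-query-replica"
BACKUP_REPLICA = "backup-replica"  # replicates for backup only; never routed to here

WRITE_VERBS = {"INSERT", "UPDATE", "DELETE", "REPLACE"}


@dataclass
class Query:
    sql: str
    expected_slow: bool = False  # caller's hint, e.g. for search or report queries


def route(query: Query) -> str:
    """Pick a database role for a query (hypothetical routing rule)."""
    words = query.sql.split()
    verb = words[0].upper() if words else ""
    if verb in WRITE_VERBS:
        return PRIMARY        # all writes go to the single primary
    if query.expected_slow:
        return SLOW_REPLICA   # keep slow reads off the primary
    return PRIMARY            # assumption: ordinary reads stay on the primary


if __name__ == "__main__":
    print(route(Query("UPDATE page SET title = 'x'")))
    print(route(Query("SELECT * FROM page", expected_slow=True)))
```

    Keeping the rule in one small function makes it easy to change the split later, for instance to send ordinary reads to additional replicas.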

    For what we did with the previous donations from the start of the year see:

    Our growth is pretty simple: when we're fast we grow to use all the capacity until we're slow again. Still no sign of us hitting the limit on demand, so it appears that we'd have no problem at all serving more people if we had another $50,000-100,000 to spend - there are ballpark growth estimates suggesting that we'd end up doing that by the end of the year if we could stay fast until then.

    If anyone wants to donate, as one of the hardware people, I'd rather see monthly recurring payments of a smaller amount than a lump sum. It makes it easier for me to try to predict what we can buy based on some moderate predictability of available funds.

    One common question: can we use commodity PCs as web servers? We'd like to but fitting them in the colo isn't currently practical. We're going for dual CPU 1U boxes as the next most cost-effective option for subsequent web server purchases. The Jan purchase was in part about getting enough boxes so we'd be able to switch them around to cover for failures, so those were cheaper per box 1U boxes. We've enough of those now, so it's CPU power/density time.

    If anyone has any suggestions please feel free to drop comments on the talk page - we've a dozen or so people on the technical team and more input is always welcome, since we're after the most effective options we can find! Jamesday (author of much of the April planning document, one of the technical team members)

  18. Re:Size doesn't matters by Gadzinka · · Score: 5, Informative

    Yes, but Britannica's 85,000 articles are credible and verified for accuracy, while some of Wikipedia's content should be questioned.

    Verified by whom? Like all generalisations, this one is also not true ;)

    When it comes to some controversial topics, Britannica gives usually only one theory, presented as a god-given truth. Sometimes it isn't even the most agreed upon theory among scientists of the relevant field.

    I haven't used B. for a long time, since it started to charge for access. Last time I did, it showed ``Aryan invasion'' as the only theory of Indo-European languages appearing in India.

    Wikipedia on the other hand shows other theories, even some very unorthodox ones from Indian nationalists. But it clearly states that ``Aryan invasion'' hasn't been highly regarded at least since the fifties.

    Same goes for ``Balto-Slavic theory'', breaking of Enigma before WW2 etc.

    Go, look for yourself.

    Robert

    --
    Bastard Operator From 193.219.28.162
  19. Er, What about E2? by Noodlenose · · Score: 3, Informative
    Everything2 has been around since 2000, currently has 445301 entries, is editor- and peer-reviewed and has much better inter-user communication facilities. There is also a strong sense of community, and it lacks any editorial wars.

    A much more enlightened and pleasant place to be.

    Oh yes, and we have the EDB.

    1. Re:Er, What about E2? by pilkul · · Score: 2, Informative
      It's strange that Everything has changed to the point where people are actually comparing it to Wikipedia. I was an Everything user in its very first days, and back then we noded any nonsense we wanted for fun. But the editors got more and more serious. I left when they made the transition to Everything2. Writing long articles went against the entire spirit of the original Everything, and having people vote you down was nasty. Since then I occasionally revisit and see how many of my old nodes have been deleted.

      Despite sucking all the fun out of noding, Everything is still fundamentally not built to become a useful reference like Wikipedia. The voting system only allows deletion, it's not nearly as powerful as a wiki for peer reviewing. Everything lacks Wikipedia's clear content guidelines and NPOV policy, so much of it is still subjective nonsense. I don't think it's very enlightened at all --- nowadays, Everything is neither fun (to me at least) nor useful.

  20. Re:Exactly how big is this thing? by Seumas · · Score: 2, Informative

    Well, you can download the entire wikipedia database in SQL and do whatever you want with it. That'd also be a good idea to find out how much space it would take up.

  21. Re:Exactly how big is this thing? by CGP314 · · Score: 2, Informative
  22. Re:Exactly how big is this thing? by Yath · · Score: 2, Informative

    At 625 MB, you could fit the text of the current database on a CD. The images will jack it up another 3.6 GB. So you could reasonably fit the current revision on one DVD. If you also want the full record of changes and revisions, it's about 15 GB just for the text.

    You can download this stuff easily, and it's obvious from recent Google searches that many people do.

    --
    I always mod up spelling trolls.
  23. Re:Copyright by Anonymous Coward · · Score: 2, Informative

    It happens routinely. We have something called the Recent Changes Patrol (people who watch the list of recent changes) and just general watchers who notice, check and report as possible copyright infringements.

    If a copyright holder does find something, it's easy enough to edit it out and say why. Or ask us to remove it. We respect polite requests and if we received one, we'd respect a DMCA takedown notice as well. We're after completely legitimate work.

    Not a big deal, overall. Easy enough to handle and it's mostly picked up during the normal anti-vandalism watching that goes on.

    Jamesday

  24. Re:Exactly how big is this thing? by IamTheRealMike · · Score: 4, Informative
    From their site:

    Currently a full database dump total size 14,828MB (501MB for just current revisions). If you thought that's 14.48 gigabytes, you're absolutely correct! At a v.90 modem connection, it will take you only 500 years! (Actually it would take 29 days if you got the full 50,000 bps, but that's usually not the case).

    So, the full encyclopedia would currently fit on a CD, but only the most current versions of each page. Bear in mind that's just the database dump though. If you wanted to pre-render it to HTML you'd probably need a lot more space, so it'd be simpler to just ship MySQL and a decent local web server on the CD.
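
    The figures quoted from the site can be checked with a few lines of arithmetic. This is a sketch that assumes binary megabytes (1 MB = 1024 * 1024 bytes) and a steady 50,000 bps line; the function names are made up for the example.

```python
DUMP_MB = 14_828            # full dump size quoted above, in MB
MODEM_BPS = 50_000          # the "full 50,000 bps" from the quote
BYTES_PER_MB = 1024 * 1024  # assumption: binary megabytes
SECONDS_PER_DAY = 86_400


def dump_size_gb(mb: int = DUMP_MB) -> float:
    """Convert the dump size from MB to GB (1 GB = 1024 MB)."""
    return mb / 1024


def download_days(mb: int = DUMP_MB, bps: int = MODEM_BPS) -> float:
    """Days needed to transfer the dump at a constant bit rate."""
    bits = mb * BYTES_PER_MB * 8
    return bits / bps / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"{dump_size_gb():.2f} GB")     # about 14.48 GB
    print(f"{download_days():.1f} days")  # a little under 29 days
```

    Under these assumptions both quoted numbers hold up: 14,828 MB is about 14.48 GB, and the transfer takes roughly 29 days.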

  25. Re:Size doesn't matters by KjetilK · · Score: 4, Informative
    Actually, Britannica has historically had a lot of problems too. Take for example the alleged evolution of Ptolemy's geocentric system. In 1910, the entry on Ptolemy was pretty good. Not anything like modern research, but at least it was a reflection of the general consensus among contemporary historians.

    In the 1950s, some got the weird idea that epicycles were added on epicycles throughout the middle ages. This was based on some very bad early research that historians of 1910 may have been aware of, but did not find worthy of elaborate comment.

    Britannica was the publication that really took this to its extreme; at some point they wrote that 40-80 epicycles were added per planet! Not only is it horrendously wrong, it is completely absurd: nobody in the middle ages had either the observational capacity or the mathematical methods to deal with anything like that.

    Britannica is largely to blame that this myth could get into university curriculums world-wide as an example of "ad hoc hypothesis gone wrong".

    If you have a good research library available, look for articles by Owen Gingerich on Ptolemy for details on this. The fact is that Ptolemy's system was hardly modified at all.

    It was moderated in the 1980s, and the most horrendous claims were removed. Around 1995, I still found the articles lacking, as the gist of the articles was that the addition of epicycles was a good example of "ad hoc hypothesis gone wrong", and I exchanged a few e-mails with the editors about it.

    It has been a few years since I last checked these articles, but last time I checked, they still did not reflect general consensus among contemporary historians.

    So, there is very much reason to question articles you read in Britannica as well, not only Wikipedia. The bottom line is that critical reading of any source is a vital survival skill.

    Hm, I'm wondering what Wikipedia has to say about this... Unfortunately, I don't have any time to kill. What am I doing on /.? ;-)

    --
    Employee of Inrupt, Project Release Manager and Community Manager for Solid
  26. Re:Celebration! by arvindn · · Score: 2, Informative

    I think the link you meant to give was the comparison graph between slashdot and wikipedia. Wikipedia passed slashdot in traffic early this year, and the difference has widened since then. By now wikipedia gets two to three times as much traffic.

  27. mod parent up by tanveer1979 · · Score: 4, Informative
    I guess tech support from slashdot will help Wikipedia a long way. As for the community page, the link is this.

    The best ways to help, without donating are:


    Every article you contribute also adds to the wealth
    --
    My Aurora : http://www.youtube.com/watch?v=o91ZsGwJYyg
    FB : https://www.facebook.com/TanveersPhotography
  28. Re:Funding - situation, what we spent the money on by BReflection · · Score: 5, Informative

    In regards to adverts, check out this e-mail by Wikipedia founder Jimbo Wales (It's old but states his position)

    "With the resignation of Larry, there is a much less pressing need for funds. Therefore, all plans to put advertising of any kind on the wikipedia is called off for now. We will move forward with plans for a nonprofit foundation to own wikipedia, and possibly to solicit donations and grants to help us carry out our mission. (Ironically, I think that grant money would come with many annoying strings attached, which we could not accept, compared to advertising money, which is virtually 100% string-free.) Just as the National Geographic Society is supported in large part by advertisements in the National Geographic Magazine, I expect this to be a potentially necessary thing at some point in the future, if we wish to have an impact beyond our own little corner of the Internet. (And, I think we all do.) But for now, there's no pressing need unless and until we find chaos descending on us from the lack of constant oversight. The hosting of Wikipedia I can continue to do for no charge for the foreseeable future. Even if Wikipedia traffic were to grow by a factor of 10, I would be willing to absorb all the bandwidth and hardware costs. If it grows beyond a factor of 100 or 1000, obviously, alternative solutions would have to be found."

    --
    python -c "x='python -c %sx=%s; print x%%(chr(34),repr(x),chr(34))%s'; print x%(chr(34),repr(x),chr(34))"
  29. If you're using Mozilla/Firefox... by pilkul · · Score: 5, Informative

    You can add Wikipedia to your search bar. Pretty convenient when you know it's going to be better than Google :).

  30. Re:One thing I've missed with Wikipedia... by Jon Chatow · · Score: 4, Informative

    You can find links to previous revisions through the history page for each article; this will remain available unless the article and its history is deleted, normally due to being full of nonsense and/or a copyright violation. For example, this is just the current revision of the /. article, whereas a version from February this year is here.

    --
    James F.
  31. There is verification. by Famatra · · Score: 3, Informative

    "There is nearly nothing in the way of verification on Wikipedia."

    Are you joking?

    First of all, people may not be generally smart, but usually people are smart, very smart, at at least one thing, and usually it is because it is a topic they are interested in. Such people navigate to their topic of interest on Wikipedia and can easily see if there are any factual problems. Second, there is nothing illegal about cross-referencing a Wikipedia article with other sources or encyclopedias to *verify* the facts. The only no-no is copying material directly. Third, there are many 'professionals', professors and other university graduates, who also contribute. There are probably more volunteering for Wikipedia than the total number working at other encyclopedias.

    Plus, if you think there are any factual errors you raise the point in the article discussion page, and within hours the issue probably has been reviewed by dozens of people. Believe me, from experience, if someone puts nonsense or nonfactual information into an article, people immediately engage in discussion on the point. People, including me :), can be really quite anal if they think someone is being blatantly false.

  32. And...? by Safety Cap · · Score: 3, Informative

    Wil's site is WilWheaton.net. The shock site is WilWheaton.org.

    --
    Yeah, right.
  33. Re:An 8 ton elephant? by o'reor · · Score: 2, Informative

    Not exactly, this article specifies that the Asian elephant can weigh up to 7500 kg. Although exceptional, 8 tons does not sound impossible. And African elephants tend to top out at 12000 kg (biggest elephant shot, Angola 1974), not 4000 as you said.

    --
    In Soviet Russia, our new overlords are belong to all your base.
  34. Re:Size doesn't matters by 6Yankee · · Score: 2, Informative

    ...assuming the existence of an infinite number of people...

  35. Re:Size doesn't matters by CanadaDave · · Score: 2, Informative

    With Wikipedia it may well have been written by some guy with spare time on his hands, enthusiasm, but not much knowledge. Or worse, it may have been written by an expert and then "corrected" by Jo Schmo.

    Trust me, this doesn't happen. Jo Schmos don't have the time to create bogus articles or correct real articles. If there isn't an article on something, it doesn't get written until someone qualified writes it. In some cases Wikipedians write a short article (stub) with not much info and maybe a few external links until someone more knowledgeable comes along to make it better. There are trolls, and these are easily identifiable, as on Slashdot for example. So far vandalism has been a manageable issue. Having a complete revision control system for every article helps tremendously. Changes by vandals are backed out all the time.

  36. Re:Size doesn't matters by clap_hands · · Score: 2, Informative

    Just a nitpick: while I agree with you that Wikipedia is unique in its presentation of an array of controversial theories, and that that's a good thing, the pre-WWII history of the breaking of Enigma isn't particularly controversial or unorthodox. Britannica Online has: "The Enigma code was first broken by the Poles in the early 1930s".

  37. Re:Funding - situation, what we spent the money on by Anthere · · Score: 2, Informative

    Thanks for your feedback :-)

    Yes, we are planning to make it possible for people to have small amounts automatically debited once a month.

  38. Re:wiki = falsehoods? by lawpoop · · Score: 2, Informative

    The thing for you to do is to change it. If it gets changed back, find out why. Consider the possibility that you might be mistaken.

    --
    Computers are useless. They can only give you answers.
    -- Pablo Picasso
  39. Re:Size doesn't matters by henrygb · · Score: 2, Informative
    Try Annan Plan for Cyprus, the last two parts of which look as if they have been written by a Greek Cypriot opponent of the plan who simply does not understand the world reaction. The Talk page shows this bewilderment.

    That being said, the article used to be even worse for most of June.

  40. Re:Why PHP? by Jamesday · · Score: 2, Informative

    Sorry, I didn't address the database part of your comment. The site often is database-bound. The current high web server load will just move the slowest point to the database, the number of visits will go up and we'll see where the next pressure point is. Today we were sufficiently database-bound that we temporarily turned off local search and used Google instead for a while.

    At the moment:

    One Squid cache server loss hurts responsiveness, so we'll be getting at least one more so we can stand one failure there. For two months of growth that probably means at least two to keep up.

    At least several web servers are needed to remove them as a choke point for a while (my guess is that five or so dual Opterons will handle traffic growth here for two months or so).

    More database servers are needed and more load spreading between them. The load balancing work is ongoing. Since there are differences about how to handle this (how to spread the load) I'll abstain from describing possible options here until there's more general agreement. Will certainly involve slaves offloading some queries and slaves offloading search from one or more primary servers.

    There's no sign yet that the growth is growing (we're in a seasonal relatively low load period at the moment though) and that means that we'll continue to see stress points moving around.