Domain: kuro5hin.org
Stories and comments across the archive that link to kuro5hin.org.
Stories · 88
-
RIP Kuro5hin (kuro5hin.org)
themusicgod1 writes: Can we please get a moment of silence? Long-time sister site to Slashdot, Kuro5hin has finally gone offline. -
Persistent Terminals For a Dedicated Computing Box?
Theovon writes "I just built a high-end quad-core Linux PC dedicated to number-crunching. Its job is to sit in the corner with no keyboard, mouse, or monitor and do nothing but compute (genetic algorithms, neural nets, and other research). My issue is that I would like to have something like persistent terminal sessions. I've considered using Xvnc in a completely headless configuration (some useful documentation here, here, here, and here). However, for most of my uses, this is overkill. Total waste of memory and compute time. However, if I decided to run FPGA synthesis software under WINE, this will become necessary. Unfortunately, I can't quite figure out how to get persistent X11 session where I'm automatically logged in (or can stay logged in), while maintaining enough security that I don't mind opening the VNC port on my firewall (with a changed port number, of course). I'm also going to check out Xpra, but I've only just heard about it and have no idea how to use it. For the short term, the main need is just terminals. I'd like to be able to connect and see how something is going. One option is to just run things with nohup and then login and 'tail -f' to watch the log file. I've also heard of screen, but I'm unfamiliar with it. Have other Slashdot users encountered this situation? What did you use? What's hard, what's easy, and what works well?" -
Who Will Google Buy Next?
Androsynth writes "Kuro5hin is running an article entitled Who Will Google Buy Next?, which features a list of all Google's previous buyouts and some interesting suggestions for the future." A Google-buyout betting pool seems in order. -
Web Developer's Handbook
vitaly.friedman writes "Bookmarks for web-developers, a list of useful tools supposed to make the life of web-workers easier, includes over 500 manually selected resources, tutorials, references and examples related to web-development. The "Web-Dev-Bookmarks" also includes links to other catalogs and lists related to web-design and web-development." -
Burnout and Depression Among IT Workers?
Cultural Sublimation asks: "All of us working in IT seem to be especially prone to problems like burnout and depression. Could part of the reason be directly related to our professions? Recently, there have been a number of interesting features on Kuro5hin which have focused precisely on this issue. From people claiming that "The Internet Is Driving Me Crazy", to an in-depth two-part series trying to demystify depression, the message is that too much information might be making us sick. What are the experiences of fellow Slashdot readers on this topic?" -
The Pseudoscience of Intelligent Design
Mime Narrator writes "An article over at Kuro5hin discusses the controversy over the Intelligent Design movement. The Dover, Pennsylvania school board recently adopted a policy requiring that high school science teachers teaching evolution tell their students that evolutionary theory, a theory that has been shown to explain the origins of life time and time again, is flawed, and that intelligent design is a valid alternative. The ACLU, along with the AUSCS (Americans United for the Separation of Church and State), and 11 parents, are suing the school board, accusing the board of violating the separation of church and state." -
The Early History of Nupedia and Wikipedia, Part II
Today, read the continuation of Larry Sanger's account of the early history of Nupedia and Wikipedia (below), in which Sanger talks about the difficulties of governance in a large, free-wheeling project, some final attempts to save Nupedia, and how he came to resign from the organization. (And if you missed it, you might want to start with yesterday's installment.)
Contents:
Why Wikipedia started working
A series of controversies
The governance challenge
My resignation and final few months with the project
Some final attempts to save Nupedia
Conclusions
Why Wikipedia started working
This is a good place to explain why Wikipedia actually got started and why it worked (and still does work, at least as well as it does). The explanation involves a combination of quite a few factors, some borrowed from the open source movement, some borrowed from wiki software and culture, and some more idiosyncratic:
- Open content license. We promised contributors that their work would always remain free for others to read. This, as is well known, motivates people to work for the good of the world--and for the many people who would like to teach the whole world, that's a pretty strong motivation.
- Focus on the encyclopedia. We said that we were creating an encyclopedia, not a dictionary, etc., and we encouraged people to stick to creating the encyclopedia and not use the project as a debate forum.
- Openness. Anyone could contribute. Everyone was specifically made to feel welcome. (E.g., we encouraged the habit of writing on new contributors' user pages, "Welcome to Wikipedia!" etc.) There was no sense that someone would be turned away for not being bright enough, or not being a good enough writer, or whatever.
- Ease of editing. Wikis are pretty easy for most people to figure out. In other collaborative systems (like Nupedia), you have to learn all about the system first. Wikipedia had an almost flat learning curve.
- Collaborate radically; don't sign articles. Radical collaboration, in which (in principle) anyone can edit any part of anyone else's work, is one of the great innovations of the open source software movement. On Wikipedia, radical collaboration made it possible for work to move forward on all fronts at the same time, to avoid the big bottleneck that is the individual author, and to burnish articles on popular topics to a fine luster.
- Offer unedited, unapproved content for further development. This is required if one wishes to collaborate radically. We encouraged putting up their unfinished drafts--as long as they were at least roughly correct--with the idea that they can only improve if there are others collaborating. This is a classic principle of open source software. It helped get Wikipedia started and helped keep it moving. This is why so many original drafts of Wikipedia articles were basically garbage (no offense to anyone--some of my own drafts were sometimes garbage), and also why it is surprising to the uninitiated that many articles have turned out very well indeed.
- Neutrality. A firm neutrality policy made it possible for people of widely divergent opinions to work together, without constantly fighting. It's a way to keep the peace.
- Start with a core of good people. I think it was essential that we began the project with a core group of intelligent good writers who understood what an encyclopedia should look like, and who were basically decent human beings.
- Enjoy the Google effect. We had little to do with this, but had Google not sent us an increasing amount of traffic each time they spidered the growing website, we would not have grown nearly as fast as we did. (See below.)
That's pretty much it. The focus on the encyclopedia provided the task and the open content license provided a natural motivation: people work hard if they believe they are teaching the world stuff. Openness and ease of editing made it easy for new people to join in and get to work. Collaboration helped move work forward quickly and efficiently, and posting unedited drafts made collaboration possible. The fact that we started with a core of good people from Nupedia meant that the project could develop a functional, cooperative community. Neutrality made it easy for people to work together with relatively little conflict. And the Google effect provided a steady supply of "fresh blood"--who in turn supplied increasing amounts of content.
Probably, all or nearly all other project rules were either optional, or straightforward applications of these principles. The project probably would still have succeeded nicely even if it had moderated or tweaked some of the above principles. For instance, radical openness, that is, being open even to those who brazenly flouted and disrespected the project's mission, was surely not necessary; after all, without them, the project would have been more welcoming to the many people who felt they could not work with such difficult people. And if we had required people to sign in, that would not have made very much difference (although it probably would have made some in the beginning; the project wouldn't have grown as fast). Of course we didn't have to use the GNU FDL for the license. Certainly, we did not need to set the community up initially as an anarchy governed by some vague consensus: instead, we could have adopted a charter from the very start. The project could have been managed quite differently; there could have been specially-designated and well-qualified editors. The project could have officially encouraged and deferred to experts. An article approval process could have been adopted without threatening the principle of posting unedited content for collaboration. Certainly, many of the later bells and whistles--the arbitration committee, a three-revert rule, having administrators with the particular configuration of rights they have, etc.--were not absolutely necessary to adopt in the precise forms they took. These differences would not have threatened the basic principles that made the project work, listed above.
So the basic principles that explain why Wikipedia could start working--and still does work--are relatively simple, few in number, and above all general. The more specific principles that Wikipedia wound up with were a matter of historical accident. There was a great deal of "wiggle room." Those intent on studying or replicating the Wikipedia model would do well to bear that in mind.
A series of controversies
So much for the very early history of Wikipedia; the next phase involved rapid growth and some serious internal controversies over policy and authority. If Wikipedia's basic policy was settled upon in the first nine months, its culture was solidified into something closer to its present form in the next nine.
The project continued to grow. We had 6000 articles by July 8; 8000 by August 7; 11,200 by September 9; and 13,000 by October 4. Consulting the website logs, we noted a Google effect: each time Google spidered the website, more pages would be indexed; the greater the number of pages indexed, the more people arrived at the project; the more people involved in the project, the more pages there were to index. In addition to this source of new contributors, Wikipedia was Slashdotted several times, and had large influxes of new users particularly after two articles I wrote for Kuro5hin were posted on Slashdot: "Britannica or Nupedia? The Future of Free Encyclopedias" (July 25, 2001) and "Wikipedia is wide open. Why is it growing so fast? Why isn't it full of nonsense?" (September 24, 2001).
This growth brought difficult challenges, challenges that perhaps I did not sufficiently anticipate and plan for. Some of our earliest contributors were academics and other highly-qualified people, and it seems to me that they were slowly worn down and driven away by having to deal with difficult people on the project. I hope they will not mind that I mention their names, but the two that stick in my mind are J. Hoffman Kemp and Michael Tinkler, a couple of Ph.D. historians. They helped to set what I think was a good precedent for the project in that they wrote about their own areas of expertise, and they contributed under their own, real names. The latter has the salutary effect of making the contributor more serious and more apt to take responsibility for his or her contributions. They are also very nice people, but did not "suffer fools gladly," as the phrase goes. Consequently, they wound up in some pretty silly disputes that would have driven less patient people away instantly. So there was a growing problem: persistent and difficult contributors tend to drive away many better, more valuable contributors; Kemp and Tinkler were only two examples. There were many more who quietly came and quietly left. Short of removing the problem contributors altogether--which we did only in the very worst cases--there was no easy solution, under the system as we had set it up. And I am sorry to have to admit that those aspects of the system that led to this problem were as much my responsibility as anyone else's. Obviously, I would not design the system the same way if given the chance again.
As a result, I grew both more protective of the project and increasingly sensitive to abuse of the system. As I tried to exercise what little authority I claimed, as a corrective to such abuse, many newer arrivals on the scene made great sport of challenging my authority. One of the earliest challenges happened in late summer, 2001. The front page of Wikipedia--then open to anyone to edit, like any other page on the project--was occasionally vandalized with infantile graffiti. Someone then tried to make an archive of the vandalism that had been done to the front page of Wikipedia. I maintained that to make such an archive would be to encourage such vandalism, so I deleted the archive. This occasioned much debate. Then a user made the archive a "subpage" of his own user page--and user pages were generally held to be the bailiwick of the user. Consequently I deleted that subpage, which occasioned a further hue and cry that, perhaps, I was abusing my authority. The vandalism-enshrining user in question proceeded to create a "deleted pages" page, on which the deleted vandalism archives were listed, as if to accuse me of trying to act without public scrutiny; but this was, of course, perfectly acceptable to me. At the time, I thought that this controversy was just as silly as it will sound to most people reading this; I thought that I needed only to "put my foot down" a little harder and, as had happened for the first six months of the project, participants would fall into line. What I did not realize was that this was to be only the first in a long series of controversies, the ultimate upshot of which was to undermine my own moral authority over the project and to make the project as safe as possible for the most abusive and contentious contributors.
Throughout this and other early controversies, much of the debate about project policy was conducted on the wiki itself. Other debates were conducted on mailing lists, Wikipedia-L and then later, for the English language project, WikiEN-L. In addition, people had taken to putting their own essays on Wikipedia, as subpages of their user pages. These too were occasioning debate. It seemed to me, and many other contributors, that this debate was distracting the community from our main goal: to create an encyclopedia. Consequently I proposed that we move the debate to another wiki that was to be created specifically for that purpose--what became known as the "meta wiki." This proposal was very widely supported, so we set it up.
As it happened, the meta-wiki became even more uncontrolled than Wikipedia itself, and for many months was continually infested with contributions by people that can only be called "trolls." That epithet came to be discouraged, however, for reasons soon to be explained. The existence of trolls was a problem we felt we should tolerate--and deal with only verbally, not with harsh penalties--for the sake of encouraging the broadest amount of participation. In the first years, only the worst trolls were ever expelled from the project. I do not know whether this policy has been changed as a result of the operation of the much-later installed Arbitration Committee.
The reasons the meta-wiki became (at least temporarily) more uncontrolled are not far to seek. First, it had no specific purpose, other than to host project debate and essays that do not belong on the main wiki--which was not enough to make anyone care very much about it. Second, because many people did not care what happened on the meta-wiki, they did not do the very necessary weeding that takes place on Wikipedia; besides, as the meta-wiki was a repository of opinion, people felt less comfortable editing or deleting what was, after all, only opinion.
What happened was that project policy discussions moved almost exclusively to the project mailing lists. There is a reason why this was a superior solution to having much debate on an uncontrolled, "unmoderated" wiki. On a wiki, contributions exist in perpetuity, as it were, or until they are deleted or radically changed; consequently, anyone new to a discussion sees the first contribution first. So whoever starts a new page for discussion also, to a great extent, sets the tone and agenda of the discussion. Moreover, nasty, heated exchanges live on forever on a wiki, festering like an open wound, unless deliberately toned down afterwards; if the same exchange takes place on a mailing list, it slips mercifully and quietly into the archives.
At about the same time that we decided to start the meta-wiki, and soon after the vandalism archive affair, I was thinking a great deal about Wikipedia's apparent anarchy, and I wrote an essay titled, "Is Wikipedia an experiment in anarchy?" This and the discussion that ensued tended to ossify positions with regard to the authority issue: I and a few others agreed that Jimmy and I should have special authority within the system, to settle policy issues that needed settling. Jimmy was relatively quiet about this issue; this, I think, was probably because his authority was generally not in question, but mine was, because I was "in the trenches" and continuing to encourage good habits and solidify policy positions.
By November or December of 2001, Wikipedia was growing fast and was the subject of regular news reporting, even by the likes of The New York Times and MIT's Technology Review; after the two major Slashdottings earlier in the year, we knew that large influxes of members could have a tendency to change the nature of the project, and not necessarily for the better. If there were some major news coverage--an evening news story in the U.S., for example--there might be many new people who would need to be taught about Wikipedia's standards and positive cultural aspects. So I proposed what I thought was a humorously-named "Wikipedia Militia" which would manage new (and very welcome) "invasions" by new contributors. By this time, however, there was a small core group of people who were constantly on the watch for anything that smacked the least bit of authoritarianism; consequently, the name, and various aspects of how the proposal was presented, were vigorously debated. Eventually, we switched to "The Wikipedia Welcoming Committee" and finally, the "Volunteer Fire Department"--which eventually, it seems, fell into disuse.
The governance challenge
After the September Slashdotting, I composed a page originally called "Our Replies to Our Critics" (and now called "Replies to Common Objections"), in which I addressed the problem that "cranks and partisans" might abuse the system:
Moreover--and this is something that you might not be able to understand very well if you haven't actually experienced it--there is a fair bit of (mostly friendly) peer pressure, and community standards are constantly being reinforced. The cranks and partisans, etc., are not simply outgunned. They also receive considerable opprobrium if they abuse the system.
This reflects very well the conception I had in September 2001 of Wikipedia's culture; the reply above was as much hopeful and prescriptive as descriptive. But it turned out to be only partly true. As difficult users began to have more of a "run of the place," in late 2001 and 2002, opprobrium was in fact meted out only piecemeal and inconsistently. It seemed that participation in the community was becoming increasingly a struggle over principles, rather than a shared effort toward shared goals. Any attempt to enforce what should have been set policy--neutrality, no original research, and no wholesale deletion without explanation--was frequently if not usually met with resistance. It was difficult to claim the moral high ground in a dispute, because the basic project principles were constantly coming under attack. Consequently, Wikipedia's environment was not cooperative but instead competitive, and the competition often concerned what sort of community Wikipedia should be: radically anarchical and uncontrolled, or instead more singlemindedly devoted to building an encyclopedia. Sadly, few among those who would love to work on Wikipedia could thrive in such a protean environment.
It is one thing to lack any equivalent to "police" and "courts" that can quickly and effectively eliminate abuse; such enforcement systems were rarely entertained in Wikipedia's early years, because according to the wiki ideal, users can effectively police each other. It is another thing altogether to lack a community ethos that is unified in its commitment to its basic ideals, so that the community's champions could claim a moral high ground. So why was there no such unified community ethos and no uncontroversial "moral high ground"? I think it was a simple consequence of the fact that the community was to be largely self-organizing and to set its own policy by consensus. Any loud minority, even a persistent minority of one person, can remove the appearance of consensus. In fact, I recall that (in October 2002, after I resigned) I felt compelled by ongoing controversies to request that Jimmy declare that certain policies were in fact non-negotiable, which he did. Unfortunately, this declaration was too little, too late, in my opinion.
By late 2001, I had gained both friends and detractors. I think I had become, within the project, a symbol of opposition to anarchism, of the enforcement of standards, and consequently of the exercise of authority in a radically open project. But I was still trying to manage the project as I always had--by force of personality and "moral" authority. So when people arrived who clearly and openly disrespected established policy, I was, in my frustration, very short with them; and when the project continued to try to establish new policies, my role in articulating those policies and actually establishing them (attempting to express a "consensus") was challenged. This undermined what moral authority I had. I felt my job was on the line, and the project continued in turmoil day in and day out; from my point of view, fires were spreading everywhere, and as I had become a somewhat controversial figure, I did not have quite enough allies to help me put them out. Consequently I was rather too peremptory and short with some users. This, however, exacerbated the problem, because the attitude could not be backed up by punishment; harsh words from a leader are empty threats if unenforceable; I thereby handed my anti-authoritarian "wiki-anarchist" opponents an advantage, because--ironically--they were able to portray me as dictatorial, when I was anything but. I came to the view, finally and belatedly, that it would be better to "ignore the trolls." But as it turns out, this is particularly hard to do on a wiki, because, again, unlike on an e-mail list, trollish contributions do not just disappear into the archives; they sit out in the open, as available as the first day they appeared, and "festering." Attempts to delete or radically edit such contributions were often met by reposting the earlier, problem version: the ability to do that is a necessary feature of collaboration. 
Persistent trolls could, thus, be a serious problem, particularly if they were able to draw a sympathetic audience. And there was often an audience of sympathizers: contributors who philosophically were opposed to nearly any exercise of authority, but who were not trolls themselves.
It is surely very ironic that it was I personally who (initially) so strongly supported the lack of any enforceable rules in the community. Some legal theorists would maintain that a community that lacks enforceable rules lacks any law at all. In retrospect it is clear that there was a fundamental problem with my role in the system: to have real authority, I needed both to be able to enforce the rules and, for both fairness and the perception of fairness, there needed to be clear rules from the beginning. But, by my own design, I had very early on rejected the label "editor-in-chief" and much real enforcement authority; a year into the game, it would have been difficult if not impossible to claim enforcement authority over active but problem users. Moreover, I was the author of the "ignore all rules" rule. My early rejection of any enforcement authority, my attempt to portray myself and behave as just another user who happened to have some special moral authority in the project, and my rejection of rules--these were all clearly mistakes on my part. They did, I think, help the project get off the ground; but I really needed a more subtle and forward-looking understanding of how an extremely open, decentralized project might work.
In retrospect, I wish I had taken Teddy Roosevelt's advice: "Speak softly and carry a big stick." Since my "stick" was very small, I suppose I felt compelled to "speak loudly," which I regret. (This was not such a problem, by the way, on Nupedia; partly, that was because there were not nearly as many problem users on Nupedia, but partly it was because there was clear enforcement authority.) As it turns out, it was Jimmy who spoke softly and carried the big stick; he first exercised "enforcement authority." Since he was relatively silent throughout these controversies, he was the "good cop," and I was the "bad cop": that, in fact, is precisely how he (privately) described our relationship. Eventually, I became sick of this arrangement. Because Jimmy had remained relatively toward the background in the early days of the project, and showed that he was willing to exercise enforcement authority upon occasion, he was never so ripe for attack as I was.
Perhaps the root cause of the governance problem was that we did not realize well enough that a community would form, nor did we think carefully about what this entailed. For months I denied that Wikipedia was a community, claiming that it was, instead, only an encyclopedia project, and that there should not be any serious governance problems if people would simply stick to the task of making an encyclopedia. This was strictly wishful thinking. In fact, Wikipedia was from the beginning and is both a community and an encyclopedia project. And for a community attempting to achieve something, to be serious, effective, and fair, a charter seems necessary. In short, a collaborative community would do well to think of itself as a polity with everything that that entails: a representative legislative, a competent and fair judiciary, and an effective executive, all defined in advance by a charter. There are special requirements of nearly every serious community, however, best served by relevant experts; and so I think a prominent role for the relevant experts should be written into the charter. I would recommend all of this to anyone launching a serious online community. But indeed, in January 2001, we were in both "uncharted" and "unchartered" territory. The world, I think, will be able to benefit from this and our other initial mistakes.
But in fairness to ourselves, it was a good idea to allow the community to decide by experience and consensus what article content rules to endorse. This allowed us to generate a very sensible set of article content rules. To be clear, I think it was not such a good idea to apply the same thinking to the organization of the community itself; we should have acknowledged that a community would form, that it would have certain persistent and difficult issues that would need to be solved, and that a lack of any effective founding community charter might result in chaos.
My resignation and final few months with the project
Throughout the governance controversy, I was preparing for my wedding, which happened December 1, 2001. A few days after I arrived back from my honeymoon, I was informed that I should probably start looking for another job, because Bomis was having to lay off most of its workers; they had 10-12 workers at the end of 2000, and by the beginning of 2002 they were back to their original 4-5. My salary was reduced in December and then halved in January. This seemed inevitable because Wikipedia was not bringing in any money at all for Bomis, even if Wikipedia was becoming even more of a publicly-recognized, if still modest success. Our first anniversary came just before we announced having 20,000 articles, and I was invited to talk about the project at Stanford on January 16 (text here; you might notice that I was still plugging the notion of using Nupedia to vet Wikipedia articles, as an answer to the objection that Wikipedia articles are unreliable).
I was officially laid off at the beginning of February, which I announced a few weeks later. I had continued on as a volunteer; Wikipedia and Nupedia were, after all, volunteer projects. But I was laboring in the aftermath of the governance controversies of the previous fall and winter, which promised to make the job of a volunteer project leader even more difficult. Moreover, I had to look for a real job. So throughout the month of February I considered resigning altogether.
But Jimmy had told me the previous December that Bomis would start trying to sell ads on Wikipedia in order to pay for my job. Even in that horrible market for Internet advertising, there were already enough pageviews on Wikipedia that advertising proceeds might have provided me a very meager living. We knew that this would be extremely controversial, because so many of the people who are involved in open source and open content projects absolutely hate the idea of advertising on the web pages of free projects, even to support project organizers. In fact, when this advertising plan was announced, in late February of 2002, the Spanish Wikipedia was forked (something I urged them not to do).
Bomis was not successful in selling any ads for Wikipedia anyway--you might recall that early 2002 was at about the very bottom of the market for Internet advertising. I also had some hope that we might, finally, set up the project's managing nonprofit, which we had discussed doing for a long time (and which eventually did come into being: Wikimedia). The job of setting up the nonprofit was left to me, but ongoing controversies seemed to eat up any time I had for Wikipedia, and frankly I had no idea where to begin. So, after a month without pay, I announced my general resignation; I completely stayed away from the project for a few months.
Just by the way, Wikipedia's offshoot projects--a dictionary, a textbook project, a quotation project, a public domain book repository, etc.--were all started in 2002 or later, and I cannot claim any credit for them. I did supply the name "Wiktionary" in April 2001, more or less on a whim. I quickly disavowed any responsibility for leading any such project, and it seems the Wiktionary project did not start up for another year and a half (December 12, 2002). My view now is that Webster's and the OED are quite good enough as far as English dictionaries go, and there will always be excellent free dictionaries in every language online. To try to develop a dictionary by collaboration among random Internet users, particularly in a completely uncontrolled wiki format, now strikes me as a nonstarter. I confess I am now puzzled why I didn't think so instantly; it was no doubt because I simply was throwing out ideas as they occurred to me, and also because we had too many dictionary definition-type entries in Wikipedia. (So why not give people a place to put their dictionary definitions?--Perhaps that's what I was thinking, but it hardly seems like a good justification for starting a project.) But Jimmy's first reaction was properly skeptical regarding the use of wikis and Ruth Ifcher made a stronger criticism very nicely. Dictionaries, even more than encyclopedias, must be extremely reliable to be even minimally usable; without direct oversight by linguists, a public dictionary project seems pointless. As to the other projects, they are mostly conducted using wikis and according to some of the basic founding principles of Wikipedia. But other sorts of project--for example, textbook projects, quotation repositories, and archives--necessarily require quite different specifications from those of an encyclopedia. For example, the fact that the wiki format works for encyclopedia development hardly means that it is appropriate for the hosting of public domain books. 
Since the same texts are available in many other places online, such as the wonderful Project Gutenberg, why would anyone choose to read The Iliad on a wiki, which could have been subtly changed by any random passer-by, without any oversight by someone who had access to an authoritative text? There is a fact about the way the text actually reads; so is editing via wiki software more apt to increase or reduce the number of errors over other systems, such as Project Gutenberg's? I do not mean to dismiss any such efforts. I simply think that considerable thought needs to be put into exactly how those other projects should be organized: the wiki format is not a magic pill that somehow makes all problems go away. Wiki is just one software paradigm, which must be adapted, supplemented, changed, or replaced in order to solve the unique set of problems a project faces.
In the spring, a controversy erupted. Caring as I did--and as I still do--about the future of free encyclopedias, I felt compelled to get involved. The controversy featured a troll who was putting up huge numbers of screeds on the "meta-wiki" and on Wikipedia as well. The controversy began with a discussion of what to do about, and how to react to, this particular troll. I maintained that one should not "feed the troll," and that the troll should be "outed" (it was an anonymous user, but it was not hard to use Google to determine the identity of the troll) and shamed.
There resulted a broader controversy about how to treat problem users generally. There were, as I recall, two main schools of thought. One, to which I adhered and still adhere, was that bona fide trolls should be "named and shamed" and, if they were unresponsive to shaming, they should be removed from the project (by a fair process) sooner rather than later. We held that a collaborative project requires commitment to ethical standards which are--as all ethical standards ultimately are--socially established by pointing out violations of those standards. Hence naming and shaming. A second school of thought held that all Wikipedia contributors, even the most difficult, should be treated respectfully and with so-called WikiLove. Hence trolls were not to be identified as such (since "troll" is a term of abuse), and were to be removed from the project only after a long (and painful) public discussion. For the latter school, it seemed to me, the only really egregious faux pas one could commit in the project was to suggest that there were objective standards that could be enforced via "shaming."
I felt at the time that the prevalence of the second school entailed rejection of both objective standards and rules-based authority. It is impossible to explain why one is removing some partisan screed from the wiki without, in some way, identifying it as a partisan screed, and pointing out that such productions are inconsistent with the neutrality policy. This will necessarily be received as less than respectful and "loving," especially if one must engage the troll himself in a long, drawn-out dispute; in a very long dispute with any trollish type, it is only a matter of time before some epithet gets bandied about, since they are so darned useful (and accurate) when applied to trollish types. More generally, the very application of rules, or laws, entails a moral judgment, or what for its effectiveness must have the force of a moral judgment. I suppose I agree with those legal theorists who say that there is necessarily, in its core, a moral component to the law. Consequently, the new policy of "WikiLove" handed trolls and other difficult users a very effective weapon for purposes of combatting those who attempted to enforce rules. After all, any forthright declaration that a user is doing something that is clearly against established conventions--posting screeds, falsehoods, nonsense, personal opinion, etc.--is nearly always going to appear disrespectful, because such a declaration involves a moral accusation. The only way to avoid such an appearance of disrespect, perhaps, is to step very lightly and use much flattery and qualifications: "Now don't get me wrong, I think you're doing a good job overall, but it seems to me that in this particular case, your contribution is slightly inconsistent with the neutrality policy." Suppose the offender replies: "So what? I disagree with the neutrality policy." Or: "I disagree. What I wrote is perfectly neutral. Who do you think you are, anyway?" It is a very rare person who can practice "WikiLove" in such a case.
In Wikipedia's developing culture, if anyone reacted out of frustration, or merely attempted to apply the law as a moral instrument, as laws typically are applied, he would become the problem, and a much more serious problem than mere violations of the neutrality policy, say. The result is that, on pain of becoming persona non grata in the community, one had to treat brazen, self-conscious violators of basic policy with particular respect. It was a perfect coup for the resident wiki-anarchists. I again left the project for several months.
In fall of 2002, I had started teaching at a local community college, and with some extra time on my hands, I started editing Wikipedia a little and engaging in mailing list discussions. I think my first new post to Wikipedia-L, from September 1, 2002, was "Why the free encyclopedia movement needs to be more like the free software movement." In it I argued that the free software movement is led and dominated by highly-qualified programmers, and that the "free encyclopedia movement"--that is, Wikipedia, Nupedia, and other newer projects--needs to move in that direction. I suggested that Nupedia be redesigned to release "approved" versions of Wikipedia articles; Wikipedia itself was not to be touched. This proposal met with a very cool reception. After a few months of discussion, Jimmy himself was "intending to revive Nupedia in the near future" and "thinking very much along the lines of what is being discussed here." Unfortunately, this never happened.
By November or December, I think, I proposed, and Magnus Manske very helpfully coded, an expert-controlled approval process for Wikipedia that was in fact to be independent of both Nupedia and Wikipedia. It would not have affected the Wikipedia editorial process. It would have lived in a separate namespace or domain, as an independent add-on project for Wikipedia. Without explaining the details, expert reviewers, the recruitment of which I would organize, would examine Wikipedia articles and approve or disapprove of particular versions of those articles. We set up a mailing list, Sifter-L (archives no longer online, apparently), which for several weeks discussed policy issues.
There was not a great deal of support for the proposal on Wikipedia-L. There was little or no excitement that the new project might bring into Wikipedia a fresh crop of subject area specialists. But that was fine as far as I was concerned, since the project was to operate independently of Wikipedia. Still, I had the very distinct sense that any specialists arriving on the scene would not necessarily be met with open arms--particularly if before approving an article they wished to make whatever changes to articles that they felt necessary. There were even a few Wikipedians who made it clear that experts should not expect to be treated any differently than anyone else, even when writing about their areas of expertise.
I then considered whether the interaction between Wikipedians and the new reviewers might be a problem after all. Surely, I thought, most specialists would want to edit even very good articles before approving them (in the independent system). This would require that the reviewers interact with Wikipedians. Wikipedia's culture had become such that disrespect of expertise was tolerated, and, again, trolls were merely warned, but very politely (in keeping with the policy of WikiLove), that they please ought to stop their inflammatory behavior. Trolls would certainly find ripe targets in expert reviewers, I thought. I recalled that patient, well-educated Wikipedians like J. Hoffmann Kemp and Michael Tinkler had been driven off the project not only by trolls but by some of the more abrasive and disrespectful regulars. I then considered: could I in good conscience really ask academics, who are very busy, to engage in this activity that would probably annoy most of them and do nothing to contribute to their academic careers? Recruiting for Nupedia was very easy by comparison, and caused me no such pangs of conscience.
I believe it was this problem that finally prompted me, in, I believe, January of 2003, to inform Jimmy as follows (by private e-mail): I was breaking with the project altogether; the only way he could prevent this, I told him, was to personally crack down on problem users and make the project more officially welcoming to experts. I also told him that I did not expect this information to change his mind, and that I did not mean to issue an ultimatum. And in fact our exchange did not change his mind. I concluded that we had a fundamental philosophical disagreement about how the project should be run. I respected and still respect his view. That is where matters ended, and it was then that I broke with Wikipedia altogether.
Some final attempts to save Nupedia
Nevertheless, I was interested in pursuing Nupedia's development. It still seemed rescuable to me.
I recall two incidents in which I tried to have Nupedia revived, in 2002 or 2003, though I don't recall exactly when. First, I approached Jimmy with the offer to try to find a buyer/managing organization for Nupedia. The suggestion was that, since Bomis did not have enough money to support it, and since Jimmy did not appear to have any specific intentions with the project other than to let it run on the system set up in 2000-1, I might be able to find a university or other organization that would take on the responsibility. I do not recall the details, but we did not pursue this possibility. Second, and later, I offered to buy Nupedia myself--that is, the domain name, the membership list, and whatever other proprietary material Bomis might have controlled. I wanted to start it up again as a simpler, more streamlined, but still fully peer-reviewed project; I thought, moreover, that if I owned it I might be able to give it to a suitable sponsoring educational or nonprofit institution. Jimmy seemed cool to the idea, and did not ask for any specific offers.
Perhaps it is, therefore, not entirely accurate to say that Nupedia died due to the inefficiency of its system. To some extent it was also allowed to die, even after it was clear that its former editor-in-chief had expressed an interest in continuing the project under an entirely different system. The result was that, without a leader or organization that could support its mission, Nupedia died a slow death. The server it lived on had some trouble in 2003, and as a result the website went offline. For whatever reason, the website was never brought up again after that.
I obviously cannot speak for Jimmy, but I will say that, if he was worried that Nupedia would essentially fork Wikipedia--again, I don't claim that he had that concern--then it seems to me that such a concern would not have justified letting Nupedia wither untended. The projects, Wikipedia and Nupedia, were naturally complementary parts of a single, symbiotic whole. That at least is how I always regarded them, indeed, from the very founding of Wikipedia. From the founding of Wikipedia, I always thought Wikipedia without Nupedia would have been unreliable, and that Nupedia without Wikipedia would have been unproductive. Together they were to be an "unstoppable high-quality article-creation juggernaut."
It is still disappointing to me that we made plans and promises to thousands of Nupedians, including hundreds of extremely well-qualified people, some of them leaders in their fields. We spent many thousands of person-hours, all told, on the project. I apologize to those people, and I can only hope that they will find some future open content encyclopedia project worthy of their participation, one that will show the world the potential that Nupedia had.
Conclusions
I have some advice for anyone who would like to start new projects on the model of Wikipedia.
You can learn from Wikipedia's success; so, first and most importantly, see above for considerations about why Wikipedia works.
But you can also learn from our mistakes. The following primarily concerns project governance, because governance issues are, in my opinion, the primary failing of Wikipedia. Bear in mind, also, that these are only rough guidelines, for those who are starting projects that have enough resemblance to Wikipedia. These are not perfectly general rules:
- If you intend to create a very large, complex project, establish early on that there will be some non-negotiable policy. Wikis and collaborative projects necessarily build communities, and once a community becomes large enough, it absolutely must have rules to keep order and to keep people at work on the mission of the project. "Force of personality" might be enough to make a small group of people hang together; for better or worse, however, clearly enunciated rules are needed to make larger groups of people hang together.
- Some policy can, with forethought, easily be predicted to be necessary. Articulate this policy as soon as possible. Indeed, consider making a project charter to make it clear from the beginning what the basic principles governing the project will be. This will help the community to run more smoothly and allow participants to self-select correctly.
- Establish any necessary authority early and clearly. Managers should not be afraid to enforce the project charter by removing people from the project; as soon as it becomes necessary, it should be done. Standards that are not enforced in any way do not exist in any robust sense. Do not tolerate deliberate disruption from those who oppose your aims; tell them to start their own project; there's a potentially infinite amount of cyberspace.
- As any disagreements among project managers are apt to be publicly visible in a collaborative project, and as this is apt to undermine the (very important) moral authority of at least one manager, make sure management is on the same page from the beginning--preferably before launch. This requires a great deal of thinking through issues together.
- In knowledge-creation projects, and perhaps many other kinds of projects, make special roles for experts from the very beginning; do not attempt to add those roles later, as an afterthought. Specialists are one of your most important resources, and it is irrational not to use them as much as you can. Preferably, design the charter so that they are included and encouraged. Moreover, make the volunteer project management a meritocracy, and not based on longevity but based on the ability to lead and contribute to the project; that is the only condition under which very many of the best qualified people will want to participate.
Another point needs more in-depth development.
Radical and untried new ideas require constant refinement and adaptation in order to succeed; the first proposal is very rarely the best, and project designers must learn from their mistakes and constantly redesign better projects. Nupedia's Advisory Board failed to admit to inherent flaws in its system, and its delay in admission shut the window of opportunity to its improvement. And it seems to me that the Wikipedia community fell into a mistake by thinking that just one or two features--the wiki feature and the neutrality policy and a few other things--explained Wikipedia's success, and that those features can thus be applied with no significant changes to new projects. But there is no substitute for constant creativity and problem-solving--nor for honesty about what problems need solving. The honesty to recognize problems and creativity in solving them are, after all, what made Wikipedia succeed in the first place.
This is a crucial point: if you use a tool or model from another project, think through very carefully how that tool or model should be adapted. Do not assume that you need to use every feature, or every aspect of the surrounding culture, that you are borrowing. Wikipedia borrowed rather too much from (1) the culture of wikis, (2) unmoderated online discussions, and (3) free-wheeling online culture generally. To be sure, Wikipedia is also a product of those cultures, and works as well as it does largely because of what it borrowed from those cultures. But it also shares some of its more serious current flaws with such cultures. Those planning new projects, or wanting to overhaul old ones, might well bear in mind that a certain cultural context, including the context that has grown up around a tool, just might not be right for that project. Let me elaborate.
(1) Consider first the culture of wikis. On the one hand, I said we wanted to determine the best rules, and experience would help us determine that; so we had no rules to begin with. On the other hand, one might add that another reason we began without rules was that we were partaking in the extremely uncontrolled, free-wheeling nature of "traditional" wikis. I think that's right. But there is an excellent reason why an encyclopedia project should not partake in that extremely uncontrolled nature of wiki culture, and why it should adopt actually enforceable rules: unlike traditional wikis, encyclopedia projects have a very specific aim, with very specific constraints, and efficient work toward that aim, within those constraints, practically requires the adoption of enforceable rules. The mere fact that most wikis, when Wikipedia was created, did not have enforceable rules hardly meant that one could not innovate further, and create one that did have rules.
(2) Moreover, Jimmy and I and most of the first participants on Wikipedia were veterans of unmoderated Internet discussion groups, and hence, naturally, we could appreciate the advantages of letting a virtual community develop in the absence of any real (enforcement) authority. In unmoderated forums there is often found a sense, among some participants, that any attempt to oust a particularly troublesome user amounts to unjustifiable censorship. The result is that the existence of many unmoderated forums online has created a small army of people militantly opposed to the slightest restriction on speech, who feel that they do and should have a right to say whatever they like, wherever they like, online. Any attempt to create and enforce rules for Internet projects, when that small army is ready to cry "censorship," will seem daring or even outrageous in many contexts online. But there is an excellent reason why such anarchy is inappropriate for many projects, including encyclopedia projects, even one that is self-policing like a wiki: there simply must be a way to enforce rules in order for rules to be effective. Given that encyclopedia project development happens almost entirely using words, nearly any rules will also be restrictions on speech. Anyone who advocates many enforceable rules on a collaborative project, in the cultural context of an Internet filled with so many unmoderated discussion groups, can be made to seem reactionary. But this is only a result of that cultural context; in any other context, the existence of rules would be perfectly natural and unobjectionable.
(3) Finally, and generally speaking, the Internet is a great leveller. Since social interaction can proceed among complete strangers who cannot so much as see each other, things that seem to matter in many "meatspace" discussions, such as age, social status, and level of education, are often dismissed as unimportant online. Many Internet forums, chatrooms, and blogs are populated by people who are identified by only a "handle," and any suggestion that communication should be restricted or in any way altered in accordance with "expertise" or "authority" is likely to be met with outrage, in most forums. But there are several excellent and obvious reasons why expertise does need special consideration in an encyclopedia project, and in other collaborative projects. First, there are many subjects that dilettantes cannot write about credibly; I, for example, could not write very credibly about astronomy or speleology, but I have a passing interest in both. If I am working only with other dilettantes, our articles are apt to remain amateurish at best; we can fill in the gaps in each other's knowledge, and do research, but the results will remain problematic until someone with more knowledge of the subject contributes. Second, there are very many specialized subjects about which no one but experts has any significant knowledge at all. Third, it is only the opinions of experts that will be trusted by most of the public as authoritative in determining whether an article is generally reliable or not. Moreover, the standards of public credibility are not likely to be changed by the widespread use of Wikipedia or by online debate about the reliability of Wikipedia. Like them or hate them, those are the facts. But if one points these facts out online, culturally "levelled" as it is, particularly in forums or projects like Wikipedia which go out of their way to ignore individual differences among people, one finds a frosty reception at best.
Consider, if you will, that it was because Wikipedia was started in the context of the ingrained cultures of wikis, of unmoderated discussion forums, and of the levelling, anti-elitist influence of the Internet at large, that it was very difficult for us to exercise the maximal amount of creativity that a maximally successful project would require. In establishing a new cultural context, we were deeply constrained by the old. Now, to be sure, I have said above and many times elsewhere that Wikipedia did not have to adopt the particular conjunction of policies that it did. But it is not surprising that it did adopt its particular conjunction of policies, considering the conjunction of influences on its development. So it would have required much more explanation and persuasion, and indeed, much more struggle, for us to, for example, have persuaded potential participants that some persons, even in a wiki environment, should have special rights that others do not. So powerful is the influence of cultural context that there are quite a few people whose lack of imagination is such that they believe I simply must not understand "why Wikipedia works" if I am willing to suggest that it does not have to work in precisely the way it does work. Constantly-reinforced cultural habits die very hard indeed, and place very strong constraints upon what can be imagined, and what bare possibilities seem even worth thinking about.
But it was our willingness to exercise our creativity and follow our imagination, and create what is, to some extent, a new kind of culture, that led to Wikipedia's success. For the overall project of creating open content encyclopedias--and indeed, for the fantastic collaborative Internet that has yet to be created--to reach its full potential, the process of identifying mistakes honestly and creatively seeking solutions must be ramped up and continued unabated.
Many thanks to Larry Sanger and to O'Reilly for this memoir. -
The Early History of Nupedia and Wikipedia, Part II
Today, read the continuation of Larry Sanger's account of the early history of Nupedia and Wikipedia (below), in which Sanger talks about the difficulties of governance in a large, free-wheeling project, some final attempts to save Nupedia, and how he came to resign from the organization. (And if you missed it, you might want to start with yesterday's installment.)
Contents:
Why Wikipedia started working
A series of controversies
The governance challenge
My resignation and final few months with the project
Some final attempts to save Nupedia
Conclusions
Why Wikipedia started working
This is a good place to explain why Wikipedia actually got started and why it worked (and still does work, at least as well as it does). The explanation involves a combination of quite a few factors, some borrowed from the open source movement, some borrowed from wiki software and culture, and some more idiosyncratic:
- Open content license. We promised contributors that their work would always remain free for others to read. This, as is well known, motivates people to work for the good of the world--and for the many people who would like to teach the whole world, that's a pretty strong motivation.
- Focus on the encyclopedia. We said that we were creating an encyclopedia, not a dictionary, etc., and we encouraged people to stick to creating the encyclopedia and not use the project as a debate forum.
- Openness. Anyone could contribute. Everyone was specifically made to feel welcome. (E.g., we encouraged the habit of writing on new contributors' user pages, "Welcome to Wikipedia!" etc.) There was no sense that someone would be turned away for not being bright enough, or not being a good enough writer, or whatever.
- Ease of editing. Wikis are pretty easy for most people to figure out. In other collaborative systems (like Nupedia), you have to learn all about the system first. Wikipedia had an almost flat learning curve.
- Collaborate radically; don't sign articles. Radical collaboration, in which (in principle) anyone can edit any part of anyone else's work, is one of the great innovations of the open source software movement. On Wikipedia, radical collaboration made it possible for work to move forward on all fronts at the same time, to avoid the big bottleneck that is the individual author, and to burnish articles on popular topics to a fine luster.
- Offer unedited, unapproved content for further development. This is required if one wishes to collaborate radically. We encouraged people to put up their unfinished drafts--as long as they were at least roughly correct--with the idea that they can only improve if there are others collaborating. This is a classic principle of open source software. It helped get Wikipedia started and helped keep it moving. This is why so many original drafts of Wikipedia articles were basically garbage (no offense to anyone--some of my own drafts were sometimes garbage), and also why it is surprising to the uninitiated that many articles have turned out very well indeed.
- Neutrality. A firm neutrality policy made it possible for people of widely divergent opinions to work together, without constantly fighting. It's a way to keep the peace.
- Start with a core of good people. I think it was essential that we began the project with a core group of intelligent good writers who understood what an encyclopedia should look like, and who were basically decent human beings.
- Enjoy the Google effect. We had little to do with this, but had Google not sent us an increasing amount of traffic each time they spidered the growing website, we would not have grown nearly as fast as we did. (See below.)
That's pretty much it. The focus on the encyclopedia provided the task and the open content license provided a natural motivation: people work hard if they believe they are teaching the world stuff. Openness and ease of editing made it easy for new people to join in and get to work. Collaboration helped move work forward quickly and efficiently, and posting unedited drafts made collaboration possible. The fact that we started with a core of good people from Nupedia meant that the project could develop a functional, cooperative community. Neutrality made it easy for people to work together with relatively little conflict. And the Google effect provided a steady supply of "fresh blood"--who in turn supplied increasing amounts of content.
Probably, all or nearly all other project rules were either optional, or straightforward applications of these principles. The project probably would still have succeeded nicely even if it had moderated or tweaked some of the above principles. For instance, radical openness, that is, being open even to those who brazenly flouted and disrespected the project's mission, was surely not necessary; after all, without them, the project would have been more welcoming to the many people who felt they could not work with such difficult people. And if we had required people to sign in, that would not have made very much difference (although it probably would have made some in the beginning; the project wouldn't have grown as fast). Of course we didn't have to use the GNU FDL for the license. Certainly, we did not need to set the community up initially as an anarchy governed by some vague consensus: instead, we could have adopted a charter from the very start. The project could have been managed quite differently; there could have been specially-designated and well-qualified editors. The project could have officially encouraged and deferred to experts. An article approval process could have been adopted without threatening the principle of posting unedited content for collaboration. Certainly, many of the later bells and whistles--the arbitration committee, a three-revert rule, having administrators with the particular configuration of rights they have, etc.--were not absolutely necessary to adopt in the precise forms they took. These differences would not have threatened the basic principles that made the project work, listed above.
So the basic principles that explain why Wikipedia could start working--and still does work--are relatively simple, few in number, and above all general. The more specific principles that Wikipedia wound up with were a matter of historical accident. There was a great deal of "wiggle room." Those intent on studying or replicating the Wikipedia model would do well to bear that in mind.
A series of controversies
So much for the very early history of Wikipedia; the next phase involved rapid growth and some serious internal controversies over policy and authority. If Wikipedia's basic policy was settled upon in the first nine months, its culture was solidified into something closer to its present form in the next nine.
The project continued to grow. We had 6000 articles by July 8; 8000 by August 7; 11,200 by September 9; and 13,000 by October 4. Consulting the website logs, we noted a Google effect: each time Google spidered the website, more pages would be indexed; the greater the number of pages indexed, the more people arrived at the project; the more people involved in the project, the more pages there were to index. In addition to this source of new contributors, Wikipedia was Slashdotted several times, and had large influxes of new users particularly after two articles I wrote for Kuro5hin were posted on Slashdot: "Britannica or Nupedia? The Future of Free Encyclopedias" (July 25, 2001) and "Wikipedia is wide open. Why is it growing so fast? Why isn't it full of nonsense?" (September 24, 2001).
This growth brought difficult challenges, challenges that perhaps I did not sufficiently anticipate and plan for. Some of our earliest contributors were academics and other highly-qualified people, and it seems to me that they were slowly worn down and driven away by having to deal with difficult people on the project. I hope they will not mind that I mention their names, but the two that stick in my mind are J. Hoffman Kemp and Michael Tinkler, a couple of Ph.D. historians. They helped to set what I think was a good precedent for the project in that they wrote about their own areas of expertise, and they contributed under their own, real names. The latter has the salutary effect of making the contributor more serious and more apt to take responsibility for his or her contributions. They are also very nice people, but did not "suffer fools gladly," as the phrase goes. Consequently, they wound up in some pretty silly disputes that would have driven less patient people away instantly. So there was a growing problem: persistent and difficult contributors tend to drive away many better, more valuable contributors; Kemp and Tinkler were only two examples. There were many more who quietly came and quietly left. Short of removing the problem contributors altogether--which we did only in the very worst cases--there was no easy solution, under the system as we had set it up. And I am sorry to have to admit that those aspects of the system that led to this problem were as much my responsibility as anyone else's. Obviously, I would not design the system the same way if given the chance again.
As a result, I grew both more protective of the project and increasingly sensitive to abuse of the system. As I tried to exercise what little authority I claimed, as a corrective to such abuse, many newer arrivals on the scene made great sport of challenging my authority. One of the earliest challenges happened in late summer, 2001. The front page of Wikipedia--then open to anyone to edit, like any other page on the project--was occasionally vandalized with infantile graffiti. Someone then tried to make an archive of the vandalism that had been done to the front page of Wikipedia. I maintained that to make such an archive would be to encourage such vandalism, so I deleted the archive. This occasioned much debate. Then a user made the archive a "subpage" of his own user page--and user pages were generally held to be the bailiwick of the user. Consequently I deleted that subpage, which occasioned a further hue and cry that, perhaps, I was abusing my authority. The vandalism-enshrining user in question proceeded to create a "deleted pages" page, on which the deleted vandalism archives were listed, as if to accuse me of trying to act without public scrutiny; but this was, of course, perfectly acceptable to me. At the time, I thought that this controversy was just as silly as it will sound to most people reading this; I thought that I needed only to "put my foot down" a little harder and, as had happened for the first six months of the project, participants would fall into line. What I did not realize was that this was to be only the first in a long series of controversies, the ultimate upshot of which was to undermine my own moral authority over the project and to make the project as safe as possible for the most abusive and contentious contributors.
Throughout this and other early controversies, much of the debate about project policy was conducted on the wiki itself. Other debates were conducted on mailing lists, Wikipedia-L and then later, for the English language project, WikiEN-L. In addition, people had taken to putting their own essays on Wikipedia, as subpages of their user pages. These too were occasioning debate. It seemed to me, and many other contributors, that this debate was distracting the community from our main goal: to create an encyclopedia. Consequently I proposed that we move the debate to another wiki that was to be created specifically for that purpose--what became known as the "meta wiki." This proposal was very widely supported, so we set it up.
As it happened, the meta-wiki became even more uncontrolled than Wikipedia itself, and for many months was continually infested with contributions by people that can only be called "trolls." That epithet came to be discouraged, however, for reasons soon to be explained. The existence of trolls was a problem we felt we should tolerate--and deal with only verbally, not with harsh penalties--for the sake of encouraging the broadest amount of participation. In the first years, only the worst trolls were ever expelled from the project. I do not know whether this policy has been changed as a result of the operation of the much-later installed Arbitration Committee.
The reasons the meta-wiki became (at least temporarily) more uncontrolled are not far to seek. First, it had no specific purpose, other than to host project debate and essays that do not belong on the main wiki--which was not enough to make anyone care very much about it. Second, because many people did not care what happened on the meta-wiki, they did not do the very necessary weeding that takes place on Wikipedia; besides, as the meta-wiki was a repository of opinion, people felt less comfortable editing or deleting what was, after all, only opinion.
What happened was that project policy discussions moved almost exclusively to the project mailing lists. There is a reason why this was a superior solution to having much debate on an uncontrolled, "unmoderated" wiki. On a wiki, contributions exist in perpetuity, as it were, or until they are deleted or radically changed; consequently, anyone new to a discussion sees the first contribution first. So whoever starts a new page for discussion also, to a great extent, sets the tone and agenda of the discussion. Moreover, nasty, heated exchanges live on forever on a wiki, festering like an open wound, unless deliberately toned down afterwards; if the same exchange takes place on a mailing list, it slips mercifully and quietly into the archives.
At about the same time that we decided to start the meta-wiki, and soon after the vandalism archive affair, I was thinking a great deal about Wikipedia's apparent anarchy, and I wrote an essay titled, "Is Wikipedia an experiment in anarchy?" This and the discussion that ensued tended to ossify positions with regard to the authority issue: I and a few others agreed that Jimmy and I should have special authority within the system, to settle policy issues that needed settling. Jimmy was relatively quiet about this issue; this, I think, was probably because his authority was generally not in question, but mine was, because I was "in the trenches" and continuing to encourage good habits and solidify policy positions.
By November or December of 2001, Wikipedia was growing very fast and was the subject of regular news reporting, even by the likes of The New York Times and MIT's Technology Review; after the two major Slashdottings earlier in the year, we knew that large influxes of members could have a tendency to change the nature of the project, and not necessarily for the better. If there were some major news coverage--an evening news story in the U.S., for example--there might be many new people who would need to be taught about Wikipedia's standards and positive cultural aspects. So I proposed what I thought was a humorously-named "Wikipedia Militia" which would manage new (and very welcome) "invasions" by new contributors. By this time, however, there was a small core group of people who were constantly on the watch for anything that smacked the least bit of authoritarianism; consequently, the name, and various aspects of how the proposal was presented, were vigorously debated. Eventually, we switched to "The Wikipedia Welcoming Committee" and finally, the "Volunteer Fire Department"--which eventually, it seems, fell into disuse.
The governance challenge
After the September Slashdotting, I composed a page originally called "Our Replies to Our Critics" (and now called "Replies to Common Objections"), in which I addressed the problem that "cranks and partisans" might abuse the system:
Moreover--and this is something that you might not be able to understand very well if you haven't actually experienced it--there is a fair bit of (mostly friendly) peer pressure, and community standards are constantly being reinforced. The cranks and partisans, etc., are not simply outgunned. They also receive considerable opprobrium if they abuse the system.
This reflects very well the conception I had in September 2001 of Wikipedia's culture; the reply above was as much hopeful and prescriptive as descriptive. But it turned out to be only partly true. As difficult users began to have more of a "run of the place," in late 2001 and 2002, opprobrium was in fact meted out only piecemeal and inconsistently. It seemed that participation in the community was becoming increasingly a struggle over principles, rather than a shared effort toward shared goals. Any attempt to enforce what should have been set policy--neutrality, no original research, and no wholesale deletion without explanation--was frequently if not usually met with resistance. It was difficult to claim the moral high ground in a dispute, because the basic project principles were constantly coming under attack. Consequently, Wikipedia's environment was not cooperative but instead competitive, and the competition often concerned what sort of community Wikipedia should be: radically anarchical and uncontrolled, or instead more singlemindedly devoted to building an encyclopedia. Sadly, few among those who would love to work on Wikipedia could thrive in such a protean environment.
It is one thing to lack any equivalent to "police" and "courts" that can quickly and effectively eliminate abuse; such enforcement systems were rarely entertained in Wikipedia's early years, because according to the wiki ideal, users can effectively police each other. It is another thing altogether to lack a community ethos that is unified in its commitment to its basic ideals, so that the community's champions could claim a moral high ground. So why was there no such unified community ethos and no uncontroversial "moral high ground"? I think it was a simple consequence of the fact that the community was to be largely self-organizing and to set its own policy by consensus. Any loud minority, even a persistent minority of one person, can remove the appearance of consensus. In fact, I recall that (in October 2002, after I resigned) I felt compelled by ongoing controversies to request that Jimmy declare that certain policies were in fact non-negotiable, which he did. Unfortunately, this declaration was too little, too late, in my opinion.
By late 2001, I had gained both friends and detractors. I think I had become, within the project, a symbol of opposition to anarchism, of the enforcement of standards, and consequently of the exercise of authority in a radically open project. But I was still trying to manage the project as I always had--by force of personality and "moral" authority. So when people arrived who clearly and openly disrespected established policy, I was, in my frustration, very short with them; and when the project continued to try to establish new policies, my role in articulating those policies and actually establishing them (attempting to express a "consensus") was challenged. This undermined what moral authority I had. I felt my job was on the line, and the project continued in turmoil day in and day out; from my point of view, fires were spreading everywhere, and as I had become a somewhat controversial figure, I did not have quite enough allies to help me put them out. Consequently I was rather too peremptory and short with some users. This, however, exacerbated the problem, because the attitude could not be backed up by punishment; harsh words from a leader are empty threats if unenforceable; I thereby handed my anti-authoritarian "wiki-anarchist" opponents an advantage, because--ironically--they were able to portray me as dictatorial, when I was anything but. I came to the view, finally and belatedly, that it would be better to "ignore the trolls." But as it turns out, this is particularly hard to do on a wiki, because, again, unlike on an e-mail list, trollish contributions do not just disappear into the archives; they sit out in the open, as available as the first day they appeared, and "festering." Attempts to delete or radically edit such contributions were often met by reposting the earlier, problem version: the ability to do that is a necessary feature of collaboration. 
Persistent trolls could, thus, be a serious problem, particularly if they were able to draw a sympathetic audience. And there was often an audience of sympathizers: contributors who philosophically were opposed to nearly any exercise of authority, but who were not trolls themselves.
It is surely very ironic that it was I personally who (initially) so strongly supported the lack of any enforceable rules in the community. Some legal theorists would maintain that a community that lacks enforceable rules lacks any law at all. In retrospect it is clear that there was a fundamental problem with my role in the system: to have real authority, I needed to be able to enforce the rules and, for both fairness and the perception of fairness, there needed to be clear rules from the beginning. But, by my own design, I had very early on rejected the label "editor-in-chief" and much real enforcement authority; a year into the game, it would have been difficult if not impossible to claim enforcement authority over active but problem users. Moreover, I was the author of the "ignore all rules" rule. My early rejection of any enforcement authority, my attempt to portray myself and behave as just another user who happened to have some special moral authority in the project, and my rejection of rules--these were all clearly mistakes on my part. They did, I think, help the project get off the ground; but I really needed a more subtle and forward-looking understanding of how an extremely open, decentralized project might work.
In retrospect, I wish I had taken Teddy Roosevelt's advice: "Speak softly and carry a big stick." Since my "stick" was very small, I suppose I felt compelled to "speak loudly," which I regret. (This was not such a problem, by the way, on Nupedia; partly, that was because there were not nearly as many problem users on Nupedia, but partly it was because there was clear enforcement authority.) As it turns out, it was Jimmy who spoke softly and carried the big stick; he first exercised "enforcement authority." Since he was relatively silent throughout these controversies, he was the "good cop," and I was the "bad cop": that, in fact, is precisely how he (privately) described our relationship. Eventually, I became sick of this arrangement. Because Jimmy had remained relatively in the background in the early days of the project, and showed that he was willing to exercise enforcement authority upon occasion, he was never so ripe for attack as I was.
Perhaps the root cause of the governance problem was that we did not realize well enough that a community would form, nor did we think carefully about what this entailed. For months I denied that Wikipedia was a community, claiming that it was, instead, only an encyclopedia project, and that there should not be any serious governance problems if people would simply stick to the task of making an encyclopedia. This was strictly wishful thinking. In fact, Wikipedia was from the beginning and is both a community and an encyclopedia project. And for a community attempting to achieve something, to be serious, effective, and fair, a charter seems necessary. In short, a collaborative community would do well to think of itself as a polity with everything that that entails: a representative legislature, a competent and fair judiciary, and an effective executive, all defined in advance by a charter. There are special requirements of nearly every serious community, however, best served by relevant experts; and so I think a prominent role for the relevant experts should be written into the charter. I would recommend all of this to anyone launching a serious online community. But indeed, in January 2001, we were in both "uncharted" and "unchartered" territory. The world, I think, will be able to benefit from this and our other initial mistakes.
But in fairness to ourselves, it was a good idea to allow the community to decide by experience and consensus what article content rules to endorse. This allowed us to generate a very sensible set of article content rules. To be clear, I think it was not such a good idea to apply the same thinking to the organization of the community itself; we should have acknowledged that a community would form, that it would have certain persistent and difficult issues that would need to be solved, and that a lack of any effective founding community charter might result in chaos.
My resignation and final few months with the project
Throughout the governance controversy, I was preparing for my wedding, which happened December 1, 2001. A few days after I arrived back from my honeymoon, I was informed that I should probably start looking for another job, because Bomis was having to lay off most of its workers; they had 10-12 workers at the end of 2000, and by the beginning of 2002 they were back to their original 4-5. My salary was reduced in December and then halved in January. This seemed inevitable because Wikipedia was not bringing in any money at all for Bomis, even if Wikipedia was becoming even more of a publicly-recognized, if still modest, success. Our first anniversary came just before we announced having 20,000 articles, and I was invited to talk about the project at Stanford on January 16 (text here; you might notice that I was still plugging the notion of using Nupedia to vet Wikipedia articles, as an answer to the objection that Wikipedia articles are unreliable).
I was officially laid off at the beginning of February, which I announced a few weeks later. I had continued on as a volunteer; Wikipedia and Nupedia were, after all, volunteer projects. But I was laboring in the aftermath of the governance controversies of the previous fall and winter, which promised to make the job of a volunteer project leader even more difficult. Moreover, I had to look for a real job. So throughout the month of February I considered resigning altogether.
But Jimmy had told me the previous December that Bomis would start trying to sell ads on Wikipedia in order to pay for my job. Even in that horrible market for Internet advertising, there were already enough pageviews on Wikipedia that advertising proceeds might have provided me a very meager living. We knew that this would be extremely controversial, because so many of the people who are involved in open source and open content projects absolutely hate the idea of advertising on the web pages of free projects, even to support project organizers. In fact, when this advertising plan was announced, in late February of 2002, the Spanish Wikipedia was forked (something I urged them not to do).
Bomis was not successful in selling any ads for Wikipedia anyway--you might recall that early 2002 was at about the very bottom of the market for Internet advertising. I also had some hope that we might, finally, set up the project's managing nonprofit, which we had discussed doing for a long time (and which eventually did come into being: Wikimedia). The job of setting up the nonprofit was left to me, but ongoing controversies seemed to eat up any time I had for Wikipedia, and frankly I had no idea where to begin. So, after a month without pay, I announced my general resignation; I completely stayed away from the project for a few months.
Just by the way, Wikipedia's offshoot projects--a dictionary, a textbook project, a quotation project, a public domain book repository, etc.--were all started in 2002 or later, and I cannot claim any credit for them. I did supply the name "Wiktionary" in April 2001, more or less on a whim. I quickly disavowed any responsibility for leading any such project, and it seems the Wiktionary project did not start up for another year and a half (December 12, 2002). My view now is that Webster's and the OED are quite good enough as far as English dictionaries go, and there will always be excellent free dictionaries in every language online. To try to develop a dictionary by collaboration among random Internet users, particularly in a completely uncontrolled wiki format, now strikes me as a nonstarter. I confess I am now puzzled why I didn't think so instantly; it was no doubt because I simply was throwing out ideas as they occurred to me, and also because we had too many dictionary definition-type entries in Wikipedia. (So why not give people a place to put their dictionary definitions?--Perhaps that's what I was thinking, but it hardly seems like a good justification for starting a project.) But Jimmy's first reaction was properly skeptical regarding the use of wikis and Ruth Ifcher made a stronger criticism very nicely. Dictionaries, even more than encyclopedias, must be extremely reliable to be even minimally usable; without direct oversight by linguists, a public dictionary project seems pointless. As to the other projects, they are mostly conducted using wikis and according to some of the basic founding principles of Wikipedia. But other sorts of project--for example, textbook projects, quotation repositories, and archives--necessarily require quite different specifications from those of an encyclopedia. For example, the fact that the wiki format works for encyclopedia development hardly means that it is appropriate for the hosting of public domain books. 
Since the same texts are available in many other places online, such as the wonderful Project Gutenberg, why would anyone choose to read The Iliad on a wiki, which could have been subtly changed by any random passer-by, without any oversight by someone who had access to an authoritative text? There is a fact about the way the text actually reads; so is editing via wiki software more apt to increase or reduce the number of errors over other systems, such as Project Gutenberg's? I do not mean to dismiss any such efforts. I simply think that considerable thought needs to be put into exactly how those other projects should be organized: the wiki format is not a magic pill that somehow makes all problems go away. Wiki is just one software paradigm, which must be adapted, supplemented, changed, or replaced in order to solve the unique set of problems a project faces.
In the spring, a controversy erupted. Caring as I did--and as I still do--about the future of free encyclopedias, I felt compelled to get involved. The controversy featured a troll who was putting up huge numbers of screeds on the "meta-wiki" and on Wikipedia as well. The controversy began with a discussion of what to do about, and how to react to, this particular troll. I maintained that one should not "feed the troll," and that the troll should be "outed" (it was an anonymous user, but it was not hard to use Google to determine the identity of the troll) and shamed.
There resulted a broader controversy about how to treat problem users generally. There were, as I recall, two main schools of thought. One, to which I adhered and still adhere, was that bona fide trolls should be "named and shamed" and, if they were unresponsive to shaming, they should be removed from the project (by a fair process) sooner rather than later. We held that a collaborative project requires commitment to ethical standards which are--as all ethical standards ultimately are--socially established by pointing out violations of those standards. Hence naming and shaming. A second school of thought held that all Wikipedia contributors, even the most difficult, should be treated respectfully and with so-called WikiLove. Hence trolls were not to be identified as such (since "troll" is a term of abuse), and were to be removed from the project only after a long (and painful) public discussion. For the latter school, it seemed to me, the only really egregious faux pas one could commit in the project was to suggest that there were objective standards that could be enforced via "shaming."
I felt at the time that the prevalence of the second school entailed rejection of both objective standards and rules-based authority. It is impossible to explain why one is removing a partisan screed from the wiki without, in some way, identifying it as a partisan screed, and pointing out that such productions are inconsistent with the neutrality policy. This will necessarily be received as less than respectful and "loving," especially if one must engage the troll himself in a long, drawn-out dispute; in a very long dispute with any trollish type, it is only a matter of time before some epithet gets bandied about, since they are so darned useful (and accurate) when applied to trollish types. More generally, the very application of rules, or laws, entails a moral judgment, or what for its effectiveness must have the force of a moral judgment. I suppose I agree with those legal theorists who say that there is necessarily, in its core, a moral component to the law. Consequently, the new policy of "WikiLove" handed trolls and other difficult users a very effective weapon for purposes of combatting those who attempted to enforce rules. After all, any forthright declaration that a user is doing something that is clearly against established conventions--posting screeds, falsehoods, nonsense, personal opinion, etc.--is nearly always going to appear disrespectful, because such a declaration involves a moral accusation. The only way to avoid such an appearance of disrespect, perhaps, is to step very lightly and use much flattery and many qualifications: "Now don't get me wrong, I think you're doing a good job overall, but it seems to me that in this particular case, your contribution is slightly inconsistent with the neutrality policy." Suppose the offender replies: "So what? I disagree with the neutrality policy." Or: "I disagree. What I wrote is perfectly neutral. Who do you think you are, anyway?" It is a very rare person who can practice "WikiLove" in such a case.
In Wikipedia's developing culture, if anyone reacted out of frustration, or merely attempted to apply the law as a moral instrument, as laws typically are applied, he would become the problem, and a much more serious problem, than mere violations of the neutrality policy, say. The result is that, on pain of becoming persona non grata in the community, one had to treat brazen, self-conscious violators of basic policy with particular respect. It was a perfect coup for the resident wiki-anarchists. I again left the project for several months.
In fall of 2002, I had started teaching at a local community college, and with some extra time on my hands, I started editing Wikipedia a little and engaging in mailing list discussions. I think my first new post to Wikipedia-L, from September 1, 2002, was "Why the free encyclopedia movement needs to be more like the free software movement." In it I argued that the free software movement is led and dominated by highly-qualified programmers, and that the "free encyclopedia movement"--that is, Wikipedia, Nupedia, and other newer projects--needs to move in that direction. I suggested that Nupedia be redesigned to release "approved" versions of Wikipedia articles; Wikipedia itself was not to be touched. This proposal met with a very cool reception. After a few months of discussion, Jimmy himself was "intending to revive Nupedia in the near future" and "thinking very much along the lines of what is being discussed here." Unfortunately, this never happened.
By November or December, I think, I proposed, and Magnus Manske very helpfully coded, an expert-controlled approval process for Wikipedia that was in fact to be independent of both Nupedia and Wikipedia. It would not have affected the Wikipedia editorial process. It would have lived in a separate namespace or domain, as an independent add-on project for Wikipedia. Without explaining the details, expert reviewers, the recruitment of which I would organize, would examine Wikipedia articles and approve or disapprove of particular versions of those articles. We set up a mailing list, Sifter-L (archives no longer online, apparently), which for several weeks discussed policy issues.
There was not a great deal of support for the proposal on Wikipedia-L. There was little or no excitement that the new project might bring into Wikipedia a fresh crop of subject area specialists. But that was fine as far as I was concerned, since the project was to operate independently of Wikipedia. Still, I had the very distinct sense that any specialists arriving on the scene would not necessarily be met with open arms--particularly if before approving an article they wished to make whatever changes to articles that they felt necessary. There were even a few Wikipedians who made it clear that experts should not expect to be treated any differently than anyone else, even when writing about their areas of expertise.
I then considered whether the interaction between Wikipedians and the new reviewers might be a problem after all. Surely, I thought, most specialists would want to edit even very good articles before approving them (in the independent system). This would require that the reviewers interact with Wikipedians. Wikipedia's culture had become such that disrespect of expertise was tolerated, and, again, trolls were merely warned, but very politely (in keeping with the policy of WikiLove), that they please ought to stop their inflammatory behavior. Trolls would certainly find ripe targets in expert reviewers, I thought. I recalled that patient, well-educated Wikipedians like J. Hoffmann Kemp and Michael Tinkler had been driven off the project not only by trolls but by some of the more abrasive and disrespectful regulars. I then considered: could I in good conscience really ask academics, who are very busy, to engage in this activity that would probably annoy most of them and do nothing to contribute to their academic careers? Recruiting for Nupedia was very easy by comparison, and caused me no such pangs of conscience.
I think it was this problem that finally prompted me, in (I believe) January of 2003, to inform Jimmy as follows (by private e-mail): I was breaking with the project altogether; the only way he could prevent this, I told him, was to personally crack down on problem users and make the project more officially welcoming to experts. I also told him that I did not expect this information to change his mind, and that I did not mean to issue an ultimatum. And in fact our exchange did not change his mind. I concluded that we had a fundamental philosophical disagreement about how the project should be run. I respected and still respect his view. That is where matters ended, and it was then that I broke with Wikipedia altogether.
Some final attempts to save Nupedia
Nevertheless, I was interested in pursuing Nupedia's development. It still seemed rescuable to me.
I recall two incidents in which I tried to have Nupedia revived, in 2002 or 2003; I don't recall exactly when. First, I approached Jimmy with the offer to try to find a buyer/managing organization for Nupedia. The suggestion was that, since Bomis did not have enough money to support it, and since Jimmy did not appear to have any specific intentions with the project other than to let it run on the system set up in 2000-1, I might be able to find a university or other organization that would take on the responsibility. I do not recall the details, but we did not pursue this possibility. Second, and later, I offered to buy Nupedia myself--that is, the domain name, the membership list, and whatever other proprietary material Bomis might have controlled. I wanted to start it up again as a simpler, more streamlined, but still fully peer-reviewed project; I thought, moreover, that if I owned it I might be able to give it to a suitable sponsoring educational or nonprofit institution. Jimmy seemed cool to the idea, and did not ask for any specific offers.
Perhaps it is, therefore, not entirely accurate to say that Nupedia died due to the inefficiency of its system. To some extent it was also allowed to die, even after it was clear that its former editor-in-chief expressed an interest in continuing the project under an entirely different system. The result was that, without a leader or organization that could support its mission, Nupedia died a slow death. The server it lived on had some trouble in 2003, and as a result the website went offline. For whatever reason, the website was never brought up again after that.
I obviously cannot speak for Jimmy, but I will say that, if he was worried that Nupedia would essentially fork Wikipedia--again, I don't claim that he had that concern--then it seems to me that such a concern would not have justified letting Nupedia wither untended. The projects, Wikipedia and Nupedia, were naturally complementary parts of a single, symbiotic whole. That at least is how I always regarded them, indeed, from the very founding of Wikipedia. From the founding of Wikipedia, I always thought Wikipedia without Nupedia would have been unreliable, and that Nupedia without Wikipedia would have been unproductive. Together they were to be an "unstoppable high-quality article-creation juggernaut."
It is still disappointing to me that we made plans and promises to thousands of Nupedians, including hundreds of extremely well-qualified people, some of them leaders in their fields. We spent many thousands of person-hours, all told, on the project. I apologize to those people, and I can only hope that they will find some future open content encyclopedia project worthy of their participation, one that will show the world the potential that Nupedia had.
Conclusions
I have some advice for anyone who would like to start new projects on the model of Wikipedia.
You can learn from Wikipedia's success; so, first and most importantly, see above for considerations about why Wikipedia works.
But you can also learn from our mistakes. The following primarily concerns project governance, because governance issues are, in my opinion, the primary failing of Wikipedia. Bear in mind, also, that these are only rough guidelines, for those who are starting projects that have enough resemblance to Wikipedia. These are not perfectly general rules:
- If you intend to create a very large, complex project, establish early on that there will be some non-negotiable policy. Wikis and collaborative projects necessarily build communities, and once a community becomes large enough, it absolutely must have rules to keep order and to keep people at work on the mission of the project. "Force of personality" might be enough to make a small group of people hang together; for better or worse, however, clearly enunciated rules are needed to make larger groups of people hang together.
- Some policy can, with forethought, easily be predicted to be necessary. Articulate this policy as soon as possible. Indeed, consider making a project charter to make it clear from the beginning what the basic principles governing the project will be. This will help the community to run more smoothly and allow participants to self-select correctly.
- Establish any necessary authority early and clearly. Managers should not be afraid to enforce the project charter by removing people from the project; as soon as it becomes necessary, it should be done. Standards that are not enforced in any way do not exist in any robust sense. Do not tolerate deliberate disruption from those who oppose your aims; tell them to start their own project; there's a potentially infinite amount of cyberspace.
- As any disagreements among project managers are apt to be publicly visible in a collaborative project, and as this is apt to undermine the (very important) moral authority of at least one manager, make sure management is on the same page from the beginning--preferably before launch. This requires a great deal of thinking through issues together.
- In knowledge-creation projects, and perhaps many other kinds of projects, make special roles for experts from the very beginning; do not attempt to add those roles later, as an afterthought. Specialists are one of your most important resources, and it is irrational not to use them as much as you can. Preferably, design the charter so that they are included and encouraged. Moreover, make the volunteer project management a meritocracy, and not based on longevity but based on the ability to lead and contribute to the project; that is the only condition under which very many of the best qualified people will want to participate.
Another point needs more in-depth development.
Radical and untried new ideas require constant refinement and adaptation in order to succeed; the first proposal is very rarely the best, and project designers must learn from their mistakes and constantly redesign better projects. Nupedia's Advisory Board failed to admit to inherent flaws in its system, and its delay in admission shut the window of opportunity for its improvement. And it seems to me that the Wikipedia community fell into a mistake by thinking that just one or two features--the wiki feature and the neutrality policy and a few other things--explained Wikipedia's success, and that those features can thus be applied with no significant changes to new projects. But there is no substitute for constant creativity and problem-solving--nor for honesty about what problems need solving. The honesty to recognize problems and creativity in solving them are, after all, what made Wikipedia succeed in the first place.
This is a crucial point: if you use a tool or model from another project, think through very carefully how that tool or model should be adapted. Do not assume that you need to use every feature, or every aspect of the surrounding culture, that you are borrowing. Wikipedia borrowed rather too much from (1) the culture of wikis, (2) unmoderated online discussions, and (3) free-wheeling online culture generally. To be sure, Wikipedia is also a product of those cultures, and works as well as it does largely because of what it borrowed from those cultures. But it also shares some of its more serious current flaws with such cultures. Those planning new projects, or wanting to overhaul old ones, might well bear in mind that a certain cultural context, including the context that has grown up around a tool, just might not be right for that project. Let me elaborate.
(1) Consider first the culture of wikis. On the one hand, I said we wanted to determine the best rules, and experience would help us determine that; so we had no rules to begin with. On the other hand, one might add that another reason we began without rules was that we were partaking in the extremely uncontrolled, free-wheeling nature of "traditional" wikis. I think that's right. But there is an excellent reason why an encyclopedia project should not partake in that extremely uncontrolled nature of wiki culture, and why it should adopt actually enforceable rules: unlike traditional wikis, encyclopedia projects have a very specific aim, with very specific constraints, and efficient work toward that aim, within those constraints, practically requires the adoption of enforceable rules. The mere fact that most wikis, when Wikipedia was created, did not have enforceable rules hardly meant that one could not innovate further, and create one that did have rules.
(2) Moreover, Jimmy and I and most of the first participants on Wikipedia were veterans of unmoderated Internet discussion groups, and hence, naturally, we could appreciate the advantages of letting a virtual community develop in the absence of any real (enforcement) authority. In unmoderated forums there is often found a sense, among some participants, that any attempt to oust a particularly troublesome user amounts to unjustifiable censorship. The result is that the existence of many unmoderated forums online has created a small army of people militantly opposed to the slightest restriction on speech, who feel that they do and should have a right to say whatever they like, wherever they like, online. Any attempt to create and enforce rules for Internet projects, when that small army is ready to cry "censorship," will seem daring or even outrageous in many contexts online. But there is an excellent reason why such anarchy is inappropriate for many projects, including encyclopedia projects, even one that is self-policing like a wiki: there simply must be a way to enforce rules in order for rules to be effective. Given that encyclopedia project development happens almost entirely using words, nearly any rules will also be restrictions on speech. Anyone who advocates many enforceable rules on a collaborative project, in the cultural context of an Internet filled with so many unmoderated discussion groups, can be made to seem reactionary. But this is only a result of that cultural context; in any other context, the existence of rules would be perfectly natural and unobjectionable.
(3) Finally, and generally speaking, the Internet is a great leveller. Since social interaction can proceed among complete strangers who cannot so much as see each other, things that seem to matter in many "meatspace" discussions, such as age, social status, and level of education, are often dismissed as unimportant online. Many Internet forums, chatrooms, and blogs are populated by people who are identified by only a "handle," and any suggestion that communication should be restricted or in any way altered in accordance with "expertise" or "authority" is likely to be met with outrage, in most forums. But there are several excellent and obvious reasons why expertise does need special consideration in an encyclopedia project, and in other collaborative projects. First, there are many subjects that dilettantes cannot write about credibly; I, for example, could not write very credibly about astronomy or speleology, but I have a passing interest in both. If I am working only with other dilettantes, our articles are apt to remain amateurish at best; we can fill in the gaps in each other's knowledge, and do research, but the results will remain problematic until someone with more knowledge of the subject contributes. Second, there are very many specialized subjects about which no one but experts has any significant knowledge at all. Third, it is only the opinions of experts that will be trusted by most of the public as authoritative in determining whether an article is generally reliable or not. Moreover, the standards of public credibility are not likely to be changed by the widespread use of Wikipedia or by online debate about the reliability of Wikipedia. Like them or hate them, those are the facts. But if one points these facts out online, culturally "levelled" as it is, particularly in forums or projects like Wikipedia which go out of their way to ignore individual differences among people, one finds a frosty reception at best.
Consider, if you will, that it was because Wikipedia was started in the context of the ingrained cultures of wikis, of unmoderated discussion forums, and of the levelling, anti-elitist influence of the Internet at large, that it was very difficult for us to exercise the maximal amount of creativity that a maximally successful project would require. In establishing a new cultural context, we were deeply constrained by the old. Now, to be sure, I have said above and many times elsewhere that Wikipedia did not have to adopt the particular conjunction of policies that it did. But it is not surprising that it did adopt its particular conjunction of policies, considering the conjunction of influences on its development. So it would have required much more explanation and persuasion, and indeed, much more struggle, for us to, for example, have persuaded potential participants that some persons, even in a wiki environment, should have special rights that others do not. So powerful is the influence of cultural context that there are quite a few people whose lack of imagination is such that they believe I simply must not understand "why Wikipedia works" if I am willing to suggest that it does not have to work in precisely the way it does work. Constantly-reinforced cultural habits die very hard indeed, and place very strong constraints upon what can be imagined, and what bare possibilities seem even worth thinking about.
But it was our willingness to exercise our creativity and follow our imagination, and create what is, to some extent, a new kind of culture, that led to Wikipedia's success. For the overall project of creating open content encyclopedias--and indeed, for the fantastic collaborative Internet that has yet to be created--to reach its full potential, the process of identifying mistakes honestly and creatively seeking solutions must be ramped up and continued unabated.
Many thanks to Larry Sanger and to O'Reilly for this memoir. -
The Early History of Nupedia and Wikipedia, Part II
Today, read the continuation of Larry Sanger's account of the early history of Nupedia and Wikipedia (below), in which Sanger talks about the difficulties of governance in a large, free-wheeling project, some final attempts to save Nupedia, and how he came to resign from the organization. (And if you missed it, you might want to start with yesterday's installment.)
Contents:
Why Wikipedia started working
A series of controversies
The governance challenge
My resignation and final few months with the project
Some final attempts to save Nupedia
Conclusions
Why Wikipedia started working
This is a good place to explain why Wikipedia actually got started and why it worked (and still does work, at least as well as it does). The explanation involves a combination of quite a few factors, some borrowed from the open source movement, some borrowed from wiki software and culture, and some more idiosyncratic:
- Open content license. We promised contributors that their work would always remain free for others to read. This, as is well known, motivates people to work for the good of the world--and for the many people who would like to teach the whole world, that's a pretty strong motivation.
- Focus on the encyclopedia. We said that we were creating an encyclopedia, not a dictionary, etc., and we encouraged people to stick to creating the encyclopedia and not use the project as a debate forum.
- Openness. Anyone could contribute. Everyone was specifically made to feel welcome. (E.g., we encouraged the habit of writing on new contributors' user pages, "Welcome to Wikipedia!" etc.) There was no sense that someone would be turned away for not being bright enough, or not being a good enough writer, or whatever.
- Ease of editing. Wikis are pretty easy for most people to figure out. In other collaborative systems (like Nupedia), you have to learn all about the system first. Wikipedia had an almost flat learning curve.
- Collaborate radically; don't sign articles. Radical collaboration, in which (in principle) anyone can edit any part of anyone else's work, is one of the great innovations of the open source software movement. On Wikipedia, radical collaboration made it possible for work to move forward on all fronts at the same time, to avoid the big bottleneck that is the individual author, and to burnish articles on popular topics to a fine luster.
- Offer unedited, unapproved content for further development. This is required if one wishes to collaborate radically. We encouraged people to put up their unfinished drafts--as long as they were at least roughly correct--with the idea that they can only improve if there are others collaborating. This is a classic principle of open source software. It helped get Wikipedia started and helped keep it moving. This is why so many original drafts of Wikipedia articles were basically garbage (no offense to anyone--some of my own drafts were sometimes garbage), and also why it is surprising to the uninitiated that many articles have turned out very well indeed.
- Neutrality. A firm neutrality policy made it possible for people of widely divergent opinions to work together, without constantly fighting. It's a way to keep the peace.
- Start with a core of good people. I think it was essential that we began the project with a core group of intelligent good writers who understood what an encyclopedia should look like, and who were basically decent human beings.
- Enjoy the Google effect. We had little to do with this, but had Google not sent us an increasing amount of traffic each time they spidered the growing website, we would not have grown nearly as fast as we did. (See below.)
That's pretty much it. The focus on the encyclopedia provided the task and the open content license provided a natural motivation: people work hard if they believe they are teaching the world stuff. Openness and ease of editing made it easy for new people to join in and get to work. Collaboration helped move work forward quickly and efficiently, and posting unedited drafts made collaboration possible. The fact that we started with a core of good people from Nupedia meant that the project could develop a functional, cooperative community. Neutrality made it easy for people to work together with relatively little conflict. And the Google effect provided a steady supply of "fresh blood"--who in turn supplied increasing amounts of content.
Probably, all or nearly all other project rules were either optional, or straightforward applications of these principles. The project probably would still have succeeded nicely even if it had moderated or tweaked some of the above principles. For instance, radical openness, that is, being open even to those who brazenly flouted and disrespected the project's mission, was surely not necessary; after all, without them, the project would have been more welcoming to the many people who felt they could not work with such difficult people. And if we had required people to sign in, that would not have made very much difference (although it probably would have made some in the beginning; the project wouldn't have grown as fast). Of course we didn't have to use the GNU FDL for the license. Certainly, we did not need to set the community up initially as an anarchy governed by some vague consensus: instead, we could have adopted a charter from the very start. The project could have been managed quite differently; there could have been specially-designated and well-qualified editors. The project could have officially encouraged and deferred to experts. An article approval process could have been adopted without threatening the principle of posting unedited content for collaboration. Certainly, many of the later bells and whistles--the arbitration committee, a three-revert rule, having administrators with the particular configuration of rights they have, etc.--were not absolutely necessary to adopt in the precise forms they took. These differences would not have threatened the basic principles that made the project work, listed above.
So the basic principles that explain why Wikipedia could start working--and still does work--are relatively simple, few in number, and above all general. The more specific principles that Wikipedia wound up with were a matter of historical accident. There was a great deal of "wiggle room." Those intent on studying or replicating the Wikipedia model would do well to bear that in mind.
A series of controversies
So much for the very early history of Wikipedia; the next phase involved rapid growth and some serious internal controversies over policy and authority. If Wikipedia's basic policy was settled upon in the first nine months, its culture was solidified into something closer to its present form in the next nine.
The project continued to grow. We had 6000 articles by July 8; 8000 by August 7; 11,200 by September 9; and 13,000 by October 4. Consulting the website logs, we noted a Google effect: each time Google spidered the website, more pages would be indexed; the greater the number of pages indexed, the more people arrived at the project; the more people involved in the project, the more pages there were to index. In addition to this source of new contributors, Wikipedia was Slashdotted several times, and had large influxes of new users particularly after two articles I wrote for Kuro5hin were posted on Slashdot: "Britannica or Nupedia? The Future of Free Encyclopedias" (July 25, 2001) and "Wikipedia is wide open. Why is it growing so fast? Why isn't it full of nonsense?" (September 24, 2001).
This growth brought difficult challenges, challenges that perhaps I did not sufficiently anticipate and plan for. Some of our earliest contributors were academics and other highly-qualified people, and it seems to me that they were slowly worn down and driven away by having to deal with difficult people on the project. I hope they will not mind that I mention their names, but the two that stick in my mind are J. Hoffman Kemp and Michael Tinkler, a couple of Ph.D. historians. They helped to set what I think was a good precedent for the project in that they wrote about their own areas of expertise, and they contributed under their own, real names. The latter has the salutary effect of making the contributor more serious and more apt to take responsibility for his or her contributions. They are also very nice people, but did not "suffer fools gladly," as the phrase goes. Consequently, they wound up in some pretty silly disputes that would have driven less patient people away instantly. So there was a growing problem: persistent and difficult contributors tend to drive away many better, more valuable contributors; Kemp and Tinkler were only two examples. There were many more who quietly came and quietly left. Short of removing the problem contributors altogether--which we did only in the very worst cases--there was no easy solution, under the system as we had set it up. And I am sorry to have to admit that those aspects of the system that led to this problem were as much my responsibility as anyone else's. Obviously, I would not design the system the same way if given the chance again.
As a result, I grew both more protective of the project and increasingly sensitive to abuse of the system. As I tried to exercise what little authority I claimed, as a corrective to such abuse, many newer arrivals on the scene made great sport of challenging my authority. One of the earliest challenges happened in late summer, 2001. The front page of Wikipedia--then open to anyone to edit, like any other page on the project--was occasionally vandalized with infantile graffiti. Someone then tried to make an archive of the vandalism that had been done to the front page of Wikipedia. I maintained that to make such an archive would be to encourage such vandalism, so I deleted the archive. This occasioned much debate. Then a user made the archive a "subpage" of his own user page--and user pages were generally held to be the bailiwick of the user. Consequently I deleted that subpage, which occasioned a further hue and cry that, perhaps, I was abusing my authority. The vandalism-enshrining user in question proceeded to create a "deleted pages" page, on which the deleted vandalism archives were listed, as if to accuse me of trying to act without public scrutiny; but this was, of course, perfectly acceptable to me. At the time, I thought that this controversy was just as silly as it will sound to most people reading this; I thought that I needed only to "put my foot down" a little harder and, as had happened for the first six months of the project, participants would fall into line. What I did not realize was that this was to be only the first in a long series of controversies, the ultimate upshot of which was to undermine my own moral authority over the project and to make the project as safe as possible for the most abusive and contentious contributors.
Throughout this and other early controversies, much of the debate about project policy was conducted on the wiki itself. Other debates were conducted on mailing lists, Wikipedia-L and then later, for the English language project, WikiEN-L. In addition, people had taken to putting their own essays on Wikipedia, as subpages of their user pages. These too were occasioning debate. It seemed to me, and many other contributors, that this debate was distracting the community from our main goal: to create an encyclopedia. Consequently I proposed that we move the debate to another wiki that was to be created specifically for that purpose--what became known as the "meta wiki." This proposal was very widely supported, so we set it up.
As it happened, the meta-wiki became even more uncontrolled than Wikipedia itself, and for many months was continually infested with contributions by people who can only be called "trolls." That epithet came to be discouraged, however, for reasons soon to be explained. The existence of trolls was a problem we felt we should tolerate--and deal with only verbally, not with harsh penalties--for the sake of encouraging the broadest amount of participation. In the first years, only the worst trolls were ever expelled from the project. I do not know whether this policy has been changed as a result of the operation of the much-later installed Arbitration Committee.
The reasons the meta-wiki became (at least temporarily) more uncontrolled are not far to seek. First, it had no specific purpose, other than to host project debate and essays that do not belong on the main wiki--which was not enough to make anyone care very much about it. Second, because many people did not care what happened on the meta-wiki, they did not do the very necessary weeding that takes place on Wikipedia; besides, as the meta-wiki was a repository of opinion, people felt less comfortable editing or deleting what was, after all, only opinion.
What happened was that project policy discussions moved almost exclusively to the project mailing lists. There is a reason why this was a superior solution to having much debate on an uncontrolled, "unmoderated" wiki. On a wiki, contributions exist in perpetuity, as it were, or until they are deleted or radically changed; consequently, anyone new to a discussion sees the first contribution first. So whoever starts a new page for discussion also, to a great extent, sets the tone and agenda of the discussion. Moreover, nasty, heated exchanges live on forever on a wiki, festering like an open wound, unless deliberately toned down afterwards; if the same exchange takes place on a mailing list, it slips mercifully and quietly into the archives.
At about the same time that we decided to start the meta-wiki, and soon after the vandalism archive affair, I was thinking a great deal about Wikipedia's apparent anarchy, and I wrote an essay titled, "Is Wikipedia an experiment in anarchy?" This and the discussion that ensued tended to ossify positions with regard to the authority issue: I and a few others agreed that Jimmy and I should have special authority within the system, to settle policy issues that needed settling. Jimmy was relatively quiet about this issue; this, I think, was probably because his authority was generally not in question, but mine was, because I was "in the trenches" and continuing to encourage good habits and solidify policy positions.
By November or December of 2001, Wikipedia was growing fast and was the subject of regular news reporting, even by the likes of The New York Times and MIT's Technology Review; after the two major Slashdottings earlier in the year, we knew that large influxes of members could have a tendency to change the nature of the project, and not necessarily for the better. If there were some major news coverage--an evening news story in the U.S., for example--there might be many new people who would need to be taught about Wikipedia's standards and positive cultural aspects. So I proposed what I thought was a humorously-named "Wikipedia Militia" which would manage new (and very welcome) "invasions" by new contributors. By this time, however, there was a small core group of people who were constantly on the watch for anything that smacked the least bit of authoritarianism; consequently, the name, and various aspects of how the proposal was presented, were vigorously debated. Eventually, we switched to "The Wikipedia Welcoming Committee" and finally, the "Volunteer Fire Department"--which eventually, it seems, fell into disuse.
The governance challenge
After the September Slashdotting, I composed a page originally called "Our Replies to Our Critics" (and now called "Replies to Common Objections"), in which I addressed the problem that "cranks and partisans" might abuse the system:
Moreover--and this is something that you might not be able to understand very well if you haven't actually experienced it--there is a fair bit of (mostly friendly) peer pressure, and community standards are constantly being reinforced. The cranks and partisans, etc., are not simply outgunned. They also receive considerable opprobrium if they abuse the system.
This reflects very well the conception I had in September 2001 of Wikipedia's culture; the reply above was as much hopeful and prescriptive as descriptive. But it turned out to be only partly true. As difficult users began to have more of a "run of the place," in late 2001 and 2002, opprobrium was in fact meted out only piecemeal and inconsistently. It seemed that participation in the community was becoming increasingly a struggle over principles, rather than a shared effort toward shared goals. Any attempt to enforce what should have been set policy--neutrality, no original research, and no wholesale deletion without explanation--was frequently if not usually met with resistance. It was difficult to claim the moral high ground in a dispute, because the basic project principles were constantly coming under attack. Consequently, Wikipedia's environment was not cooperative but instead competitive, and the competition often concerned what sort of community Wikipedia should be: radically anarchical and uncontrolled, or instead more singlemindedly devoted to building an encyclopedia. Sadly, few among those who would love to work on Wikipedia could thrive in such a protean environment.
It is one thing to lack any equivalent to "police" and "courts" that can quickly and effectively eliminate abuse; such enforcement systems were rarely entertained in Wikipedia's early years, because according to the wiki ideal, users can effectively police each other. It is another thing altogether to lack a community ethos that is unified in its commitment to its basic ideals, so that the community's champions could claim a moral high ground. So why was there no such unified community ethos and no uncontroversial "moral high ground"? I think it was a simple consequence of the fact that the community was to be largely self-organizing and to set its own policy by consensus. Any loud minority, even a persistent minority of one person, can remove the appearance of consensus. In fact, I recall that (in October 2002, after I resigned) I felt compelled by ongoing controversies to request that Jimmy declare that certain policies were in fact non-negotiable, which he did. Unfortunately, this declaration was too little, too late, in my opinion.
By late 2001, I had gained both friends and detractors. I think I had become, within the project, a symbol of opposition to anarchism, of the enforcement of standards, and consequently of the exercise of authority in a radically open project. But I was still trying to manage the project as I always had--by force of personality and "moral" authority. So when people arrived who clearly and openly disrespected established policy, I was, in my frustration, very short with them; and when the project continued to try to establish new policies, my role in articulating those policies and actually establishing them (attempting to express a "consensus") was challenged. This undermined what moral authority I had. I felt my job was on the line, and the project continued in turmoil day in and day out; from my point of view, fires were spreading everywhere, and as I had become a somewhat controversial figure, I did not have quite enough allies to help me put them out. Consequently I was rather too peremptory and short with some users. This, however, exacerbated the problem, because the attitude could not be backed up by punishment; harsh words from a leader are empty threats if unenforceable; I thereby handed my anti-authoritarian "wiki-anarchist" opponents an advantage, because--ironically--they were able to portray me as dictatorial, when I was anything but. I came to the view, finally and belatedly, that it would be better to "ignore the trolls." But as it turns out, this is particularly hard to do on a wiki, because, again, unlike on an e-mail list, trollish contributions do not just disappear into the archives; they sit out in the open, as available as the first day they appeared, and "festering." Attempts to delete or radically edit such contributions were often met by reposting the earlier, problem version: the ability to do that is a necessary feature of collaboration. 
Persistent trolls could, thus, be a serious problem, particularly if they were able to draw a sympathetic audience. And there was often an audience of sympathizers: contributors who philosophically were opposed to nearly any exercise of authority, but who were not trolls themselves.
It is surely very ironic that it was I personally who (initially) so strongly supported the lack of any enforceable rules in the community. Some legal theorists would maintain that a community that lacks enforceable rules lacks any law at all. In retrospect it is clear that there was a fundamental problem with my role in the system: to have real authority, I needed to be able to enforce the rules, and, for both fairness and the perception of fairness, there needed to be clear rules from the beginning. But, by my own design, I had very early on rejected the label "editor-in-chief" and much real enforcement authority; a year into the game, it would have been difficult if not impossible to claim enforcement authority over active but problem users. Moreover, I was the author of the "ignore all rules" rule. My early rejection of any enforcement authority, my attempt to portray myself and behave as just another user who happened to have some special moral authority in the project, and my rejection of rules--these were all clearly mistakes on my part. They did, I think, help the project get off the ground; but I really needed a more subtle and forward-looking understanding of how an extremely open, decentralized project might work.
In retrospect, I wish I had taken Teddy Roosevelt's advice: "Speak softly and carry a big stick." Since my "stick" was very small, I suppose I felt compelled to "speak loudly," which I regret. (This was not such a problem, by the way, on Nupedia; partly, that was because there were not nearly as many problem users on Nupedia, but partly it was because there was clear enforcement authority.) As it turns out, it was Jimmy who spoke softly and carried the big stick; he first exercised "enforcement authority." Since he was relatively silent throughout these controversies, he was the "good cop," and I was the "bad cop": that, in fact, is precisely how he (privately) described our relationship. Eventually, I became sick of this arrangement. Because Jimmy had remained relatively in the background in the early days of the project, and showed that he was willing to exercise enforcement authority upon occasion, he was never so ripe for attack as I was.
Perhaps the root cause of the governance problem was that we did not realize well enough that a community would form, nor did we think carefully about what this entailed. For months I denied that Wikipedia was a community, claiming that it was, instead, only an encyclopedia project, and that there should not be any serious governance problems if people would simply stick to the task of making an encyclopedia. This was strictly wishful thinking. In fact, Wikipedia was from the beginning, and still is, both a community and an encyclopedia project. And for a community attempting to achieve something, to be serious, effective, and fair, a charter seems necessary. In short, a collaborative community would do well to think of itself as a polity with everything that that entails: a representative legislature, a competent and fair judiciary, and an effective executive, all defined in advance by a charter. There are special requirements of nearly every serious community, however, best served by relevant experts; and so I think a prominent role for the relevant experts should be written into the charter. I would recommend all of this to anyone launching a serious online community. But indeed, in January 2001, we were in both "uncharted" and "unchartered" territory. The world, I think, will be able to benefit from this and our other initial mistakes.
But in fairness to ourselves, it was a good idea to allow the community to decide by experience and consensus what article content rules to endorse. This allowed us to generate a very sensible set of article content rules. To be clear, I think it was not such a good idea to apply the same thinking to the organization of the community itself; we should have acknowledged that a community would form, that it would have certain persistent and difficult issues that would need to be solved, and that a lack of any effective founding community charter might result in chaos.
My resignation and final few months with the project
Throughout the governance controversy, I was preparing for my wedding, which happened December 1, 2001. A few days after I arrived back from my honeymoon, I was informed that I should probably start looking for another job, because Bomis was having to lay off most of its workers; they had 10-12 workers at the end of 2000, and by the beginning of 2002 they were back to their original 4-5. My salary was reduced in December and then halved in January. This seemed inevitable because Wikipedia was not bringing in any money at all for Bomis, even if Wikipedia was becoming even more of a publicly-recognized, if still modest, success. Our first anniversary came just before we announced having 20,000 articles, and I was invited to talk about the project at Stanford on January 16 (text here; you might notice that I was still plugging the notion of using Nupedia to vet Wikipedia articles, as an answer to the objection that Wikipedia articles are unreliable).
I was officially laid off at the beginning of February, which I announced a few weeks later. I had continued on as a volunteer; Wikipedia and Nupedia were, after all, volunteer projects. But I was laboring in the aftermath of the governance controversies of the previous fall and winter, which promised to make the job of a volunteer project leader even more difficult. Moreover, I had to look for a real job. So throughout the month of February I considered resigning altogether.
But Jimmy had told me the previous December that Bomis would start trying to sell ads on Wikipedia in order to pay for my job. Even in that horrible market for Internet advertising, there were already enough pageviews on Wikipedia that advertising proceeds might have provided me a very meager living. We knew that this would be extremely controversial, because so many of the people who are involved in open source and open content projects absolutely hate the idea of advertising on the web pages of free projects, even to support project organizers. In fact, when this advertising plan was announced, in late February of 2002, the Spanish Wikipedia was forked (something I urged them not to do).
Bomis was not successful in selling any ads for Wikipedia anyway--you might recall that early 2002 was at about the very bottom of the market for Internet advertising. I also had some hope that we might, finally, set up the project's managing nonprofit, which we had discussed doing for a long time (and which eventually did come into being: Wikimedia). The job of setting up the nonprofit was left to me, but ongoing controversies seemed to eat up any time I had for Wikipedia, and frankly I had no idea where to begin. So, after a month without pay, I announced my general resignation; I completely stayed away from the project for a few months.
Just by the way, Wikipedia's offshoot projects--a dictionary, a textbook project, a quotation project, a public domain book repository, etc.--were all started in 2002 or later, and I cannot claim any credit for them. I did supply the name "Wiktionary" in April 2001, more or less on a whim. I quickly disavowed any responsibility for leading any such project, and it seems the Wiktionary project did not start up for another year and a half (December 12, 2002). My view now is that Webster's and the OED are quite good enough as far as English dictionaries go, and there will always be excellent free dictionaries in every language online. To try to develop a dictionary by collaboration among random Internet users, particularly in a completely uncontrolled wiki format, now strikes me as a nonstarter. I confess I am now puzzled why I didn't think so instantly; it was no doubt because I simply was throwing out ideas as they occurred to me, and also because we had too many dictionary definition-type entries in Wikipedia. (So why not give people a place to put their dictionary definitions?--Perhaps that's what I was thinking, but it hardly seems like a good justification for starting a project.) But Jimmy's first reaction was properly skeptical regarding the use of wikis and Ruth Ifcher made a stronger criticism very nicely. Dictionaries, even more than encyclopedias, must be extremely reliable to be even minimally usable; without direct oversight by linguists, a public dictionary project seems pointless. As to the other projects, they are mostly conducted using wikis and according to some of the basic founding principles of Wikipedia. But other sorts of project--for example, textbook projects, quotation repositories, and archives--necessarily require quite different specifications from those of an encyclopedia. For example, the fact that the wiki format works for encyclopedia development hardly means that it is appropriate for the hosting of public domain books. 
Since the same texts are available in many other places online, such as the wonderful Project Gutenberg, why would anyone choose to read The Iliad on a wiki, which could have been subtly changed by any random passer-by, without any oversight by someone who had access to an authoritative text? There is a fact about the way the text actually reads; so is editing via wiki software more apt to increase or reduce the number of errors over other systems, such as Project Gutenberg's? I do not mean to dismiss any such efforts. I simply think that considerable thought needs to be put into exactly how those other projects should be organized: the wiki format is not a magic pill that somehow makes all problems go away. Wiki is just one software paradigm, which must be adapted, supplemented, changed, or replaced in order to solve the unique set of problems a project faces.
In the spring, a controversy erupted. Caring as I did--and as I still do--about the future of free encyclopedias, I felt compelled to get involved. The controversy featured a troll who was putting up huge numbers of screeds on the "meta-wiki" and on Wikipedia as well. The controversy began with a discussion of what to do about, and how to react to, this particular troll. I maintained that one should not "feed the troll," and that the troll should be "outed" (it was an anonymous user, but it was not hard to use Google to determine the identity of the troll) and shamed.
There resulted a broader controversy about how to treat problem users generally. There were, as I recall, two main schools of thought. One, to which I adhered and still adhere, was that bona fide trolls should be "named and shamed" and, if they were unresponsive to shaming, they should be removed from the project (by a fair process) sooner rather than later. We held that a collaborative project requires commitment to ethical standards which are--as all ethical standards ultimately are--socially established by pointing out violations of those standards. Hence naming and shaming. A second school of thought held that all Wikipedia contributors, even the most difficult, should be treated respectfully and with so-called WikiLove. Hence trolls were not to be identified as such (since "troll" is a term of abuse), and were to be removed from the project only after a long (and painful) public discussion. For the latter school, it seemed to me, the only really egregious faux pas one could commit in the project was to suggest that there were objective standards that could be enforced via "shaming."
I felt at the time that the prevalence of the second school entailed rejection of both objective standards and rules-based authority. It is impossible to explain why one is removing a partisan screed from the wiki without, in some way, identifying it as a partisan screed, and pointing out that such productions are inconsistent with the neutrality policy. This will necessarily be received as less than respectful and "loving," especially if one must engage the troll himself in a long, drawn-out dispute; in a very long dispute with any trollish type, it is only a matter of time before some epithet gets bandied about, since such epithets are so darned useful (and accurate) when applied to trollish types. More generally, the very application of rules, or laws, entails a moral judgment, or what for its effectiveness must have the force of a moral judgment. I suppose I agree with those legal theorists who say that there is necessarily, in its core, a moral component to the law. Consequently, the new policy of "WikiLove" handed trolls and other difficult users a very effective weapon for purposes of combatting those who attempted to enforce rules. After all, any forthright declaration that a user is doing something that is clearly against established conventions--posting screeds, falsehoods, nonsense, personal opinion, etc.--is nearly always going to appear disrespectful, because such a declaration involves a moral accusation. The only way to avoid such an appearance of disrespect, perhaps, is to step very lightly and use much flattery and many qualifications: "Now don't get me wrong, I think you're doing a good job overall, but it seems to me that in this particular case, your contribution is slightly inconsistent with the neutrality policy." Suppose the offender replies: "So what? I disagree with the neutrality policy." Or: "I disagree. What I wrote is perfectly neutral. Who do you think you are, anyway?" It is a very rare person who can practice "WikiLove" in such a case. 
In Wikipedia's developing culture, if anyone reacted out of frustration, or merely attempted to apply the law as a moral instrument, as laws typically are applied, he would become the problem, and a much more serious problem, than mere violations of the neutrality policy, say. The result is that, on pain of becoming persona non grata in the community, one had to treat brazen, self-conscious violators of basic policy with particular respect. It was a perfect coup for the resident wiki-anarchists. I again left the project for several months.
In fall of 2002, I had started teaching at a local community college, and with some extra time on my hands, I started editing Wikipedia a little and engaging in mailing list discussions. I think my first new post to Wikipedia-L, from September 1, 2002, was "Why the free encyclopedia movement needs to be more like the free software movement." In it I argued that the free software movement is led and dominated by highly-qualified programmers, and that the "free encyclopedia movement"--that is, Wikipedia, Nupedia, and other newer projects--needs to move in that direction. I suggested that Nupedia be redesigned to release "approved" versions of Wikipedia articles; Wikipedia itself was not to be touched. This proposal met with a very cool reception. After a few months of discussion, Jimmy himself was "intending to revive Nupedia in the near future" and "thinking very much along the lines of what is being discussed here." Unfortunately, this never happened.
By November or December, I think, I proposed, and Magnus Manske very helpfully coded, an expert-controlled approval process for Wikipedia that was in fact to be independent of both Nupedia and Wikipedia. It would not have affected the Wikipedia editorial process. It would have lived in a separate namespace or domain, as an independent add-on project for Wikipedia. Without explaining the details, expert reviewers, the recruitment of which I would organize, would examine Wikipedia articles and approve or disapprove of particular versions of those articles. We set up a mailing list, Sifter-L (archives no longer online, apparently), which for several weeks discussed policy issues.
There was not a great deal of support for the proposal on Wikipedia-L. There was little or no excitement that the new project might bring into Wikipedia a fresh crop of subject area specialists. But that was fine as far as I was concerned, since the project was to operate independently of Wikipedia. Still, I had the very distinct sense that any specialists arriving on the scene would not necessarily be met with open arms--particularly if before approving an article they wished to make whatever changes to articles that they felt necessary. There were even a few Wikipedians who made it clear that experts should not expect to be treated any differently than anyone else, even when writing about their areas of expertise.
I then considered whether the interaction between Wikipedians and the new reviewers might be a problem after all. Surely, I thought, most specialists would want to edit even very good articles before approving them (in the independent system). This would require that the reviewers interact with Wikipedians. Wikipedia's culture had become such that disrespect of expertise was tolerated, and, again, trolls were merely warned, but very politely (in keeping with the policy of WikiLove), that they please ought to stop their inflammatory behavior. Trolls would certainly find ripe targets in expert reviewers, I thought. I recalled that patient, well-educated Wikipedians like J. Hoffmann Kemp and Michael Tinkler had been driven off the project not only by trolls but by some of the more abrasive and disrespectful regulars. I then considered: could I in good conscience really ask academics, who are very busy, to engage in this activity that would probably annoy most of them and do nothing to contribute to their academic careers? Recruiting for Nupedia was very easy by comparison, and caused me no such pangs of conscience.
I believe it was this problem that finally prompted me, in what I think was January of 2003, to inform Jimmy as follows (by private e-mail): I was breaking with the project altogether; the only way he could prevent this, I told him, was for him personally to crack down on problem users and make the project more officially welcoming to experts. I also told him that I did not expect this information to change his mind, and that I did not mean to issue an ultimatum. And in fact our exchange did not change his mind. I concluded that we had a fundamental philosophical disagreement about how the project should be run. I respected and still respect his view. That is where matters ended, and it was then that I broke with Wikipedia altogether.
Some final attempts to save Nupedia
Nevertheless, I was interested in pursuing Nupedia's development. It still seemed rescuable to me.
I recall two incidents in which I tried to have Nupedia revived, in 2002 or 2003; I don't recall exactly when. First, I approached Jimmy with the offer to try to find a buyer/managing organization for Nupedia. The suggestion was that, since Bomis did not have enough money to support it, and since Jimmy did not appear to have any specific intentions with the project other than to let it run on the system set up in 2000-1, I might be able to find a university or other organization that would take on the responsibility. I do not recall the details, but we did not pursue this possibility. Second, and later, I offered to buy Nupedia myself--that is, the domain name, the membership list, and whatever other proprietary material Bomis might have controlled. I wanted to start it up again as a simpler, more streamlined, but still fully peer-reviewed project; I thought, moreover, that if I owned it I might be able to give it to a suitable sponsoring educational or nonprofit institution. Jimmy seemed cool to the idea, and did not ask for any specific offers.
Perhaps it is, therefore, not entirely accurate to say that Nupedia died due to the inefficiency of its system. To some extent it was also allowed to die, even after it was clear that its former editor-in-chief expressed an interest in continuing the project under an entirely different system. The result was that, without a leader or organization that could support its mission, Nupedia died a slow death. The server it lived on had some trouble in 2003, and as a result the website went offline. For whatever reason, the website was never brought up again after that.
I obviously cannot speak for Jimmy, but I will say that, if he was worried that Nupedia would essentially fork Wikipedia--again, I don't claim that he had that concern--then it seems to me that such a concern would not have justified letting Nupedia wither untended. The projects, Wikipedia and Nupedia, were naturally complementary parts of a single, symbiotic whole. That at least is how I always regarded them, indeed, from the very founding of Wikipedia. From the founding of Wikipedia, I always thought Wikipedia without Nupedia would have been unreliable, and that Nupedia without Wikipedia would have been unproductive. Together they were to be an "unstoppable high-quality article-creation juggernaut."
It is still disappointing to me that we made plans and promises to thousands of Nupedians, including hundreds of extremely well-qualified people, some of them leaders in their fields. We spent many thousands of person-hours, all told, on the project. I apologize to those people, and I can only hope that they will find some future open content encyclopedia project worthy of their participation, one that will show the world the potential that Nupedia had.
Conclusions
I have some advice for anyone who would like to start new projects on the model of Wikipedia.
You can learn from Wikipedia's success; so, first and most importantly, see above for considerations about why Wikipedia works.
But you can also learn from our mistakes. The following primarily concerns project governance, because governance issues are, in my opinion, the primary failing of Wikipedia. Bear in mind, also, that these are only rough guidelines, for those who are starting projects that have enough resemblance to Wikipedia. These are not perfectly general rules:
- If you intend to create a very large, complex project, establish early on that there will be some non-negotiable policy. Wikis and collaborative projects necessarily build communities, and once a community becomes large enough, it absolutely must have rules to keep order and to keep people at work on the mission of the project. "Force of personality" might be enough to make a small group of people hang together; for better or worse, however, clearly enunciated rules are needed to make larger groups of people hang together.
- Some policy can, with forethought, easily be predicted to be necessary. Articulate this policy as soon as possible. Indeed, consider making a project charter to make it clear from the beginning what the basic principles governing the project will be. This will help the community to run more smoothly and allow participants to self-select correctly.
- Establish any necessary authority early and clearly. Managers should not be afraid to enforce the project charter by removing people from the project; as soon as it becomes necessary, it should be done. Standards that are not enforced in any way do not exist in any robust sense. Do not tolerate deliberate disruption from those who oppose your aims; tell them to start their own project; there's a potentially infinite amount of cyberspace.
- As any disagreements among project managers are apt to be publicly visible in a collaborative project, and as this is apt to undermine the (very important) moral authority of at least one manager, make sure management is on the same page from the beginning--preferably before launch. This requires a great deal of thinking through issues together.
- In knowledge-creation projects, and perhaps many other kinds of projects, make special roles for experts from the very beginning; do not attempt to add those roles later, as an afterthought. Specialists are one of your most important resources, and it is irrational not to use them as much as you can. Preferably, design the charter so that they are included and encouraged. Moreover, make the volunteer project management a meritocracy, and not based on longevity but based on the ability to lead and contribute to the project; that is the only condition under which very many of the best qualified people will want to participate.
Another point needs more in-depth development.
Radical and untried new ideas require constant refinement and adaptation in order to succeed; the first proposal is very rarely the best, and project designers must learn from their mistakes and constantly redesign better projects. Nupedia's Advisory Board failed to admit to inherent flaws in its system, and its delay in admission shut the window of opportunity to its improvement. And it seems to me that the Wikipedia community fell into a mistake by thinking that just one or two features--the wiki feature and the neutrality policy and a few other things--explained Wikipedia's success, and that those features can thus be applied with no significant changes to new projects. But there is no substitute for constant creativity and problem-solving--nor for honesty about what problems need solving. The honesty to recognize problems and creativity in solving them are, after all, what made Wikipedia succeed in the first place.
This is a crucial point: if you use a tool or model from another project, think through very carefully how that tool or model should be adapted. Do not assume that you need to use every feature, or every aspect of the surrounding culture, that you are borrowing. Wikipedia borrowed rather too much from (1) the culture of wikis, (2) unmoderated online discussions, and (3) free-wheeling online culture generally. To be sure, Wikipedia is also a product of those cultures, and works as well as it does largely because of what it borrowed from those cultures. But it also shares some of its more serious current flaws with such cultures. Those planning new projects, or wanting to overhaul old ones, might well bear in mind that a certain cultural context, including the context that has grown up around a tool, just might not be right for that project. Let me elaborate.
(1) Consider first the culture of wikis. On the one hand, I said we wanted to determine the best rules, and experience would help us determine that; so we had no rules to begin with. On the other hand, one might add that another reason we began without rules was that we were partaking in the extremely uncontrolled, free-wheeling nature of "traditional" wikis. I think that's right. But there is an excellent reason why an encyclopedia project should not partake in that extremely uncontrolled nature of wiki culture, and why it should adopt actually enforceable rules: unlike traditional wikis, encyclopedia projects have a very specific aim, with very specific constraints, and efficient work toward that aim, within those constraints, practically requires the adoption of enforceable rules. The mere fact that most wikis, when Wikipedia was created, did not have enforceable rules hardly meant that one could not innovate further, and create one that did have rules.
(2) Moreover, Jimmy and I and most of the first participants on Wikipedia were veterans of unmoderated Internet discussion groups, and hence, naturally, we could appreciate the advantages of letting a virtual community develop in the absence of any real (enforcement) authority. In unmoderated forums there is often found a sense, among some participants, that any attempt to oust a particularly troublesome user amounts to unjustifiable censorship. The result is that the existence of many unmoderated forums online has created a small army of people militantly opposed to the slightest restriction on speech, who feel that they do and should have a right to say whatever they like, wherever they like, online. Any attempt to create and enforce rules for Internet projects, when that small army is ready to cry "censorship," will seem daring or even outrageous in many contexts online. But there is an excellent reason why such anarchy is inappropriate for many projects, including encyclopedia projects, even one that is self-policing like a wiki: there simply must be a way to enforce rules in order for rules to be effective. Given that encyclopedia project development happens almost entirely using words, nearly any rules will also be restrictions on speech. Anyone who advocates many enforceable rules on a collaborative project, in the cultural context of an Internet filled with so many unmoderated discussion groups, can be made to seem reactionary. But this is only a result of that cultural context; in any other context, the existence of rules would be perfectly natural and unobjectionable.
(3) Finally, and generally speaking, the Internet is a great leveller. Since social interaction can proceed among complete strangers who cannot so much as see each other, things that seem to matter in many "meatspace" discussions, such as age, social status, and level of education, are often dismissed as unimportant online. Many Internet forums, chatrooms, and blogs are populated by people who are identified by only a "handle," and any suggestion that communication should be restricted or in any way altered in accordance with "expertise" or "authority" is likely to be met with outrage, in most forums. But there are several excellent and obvious reasons why expertise does need special consideration in an encyclopedia project, and in other collaborative projects. First, there are many subjects that dilettantes cannot write about credibly; I, for example, could not write very credibly about astronomy or speleology, but I have a passing interest in both. If I am working only with other dilettantes, our articles are apt to remain amateurish at best; we can fill in the gaps in each other's knowledge, and do research, but the results will remain problematic until someone with more knowledge of the subject contributes. Second, there are very many specialized subjects about which no one but experts has any significant knowledge at all. Third, it is only the opinions of experts that will be trusted by most of the public as authoritative in determining whether an article is generally reliable or not. Moreover, the standards of public credibility are not likely to be changed by the widespread use of Wikipedia or by online debate about the reliability of Wikipedia. Like them or hate them, those are the facts. But if one points these facts out online, culturally "levelled" as it is, particularly in forums or projects like Wikipedia which go out of their way to ignore individual differences among people, one finds a frosty reception at best.
Consider, if you will, that it was because Wikipedia was started in the context of the ingrained cultures of wikis, of unmoderated discussion forums, and of the levelling, anti-elitist influence of the Internet at large, that it was very difficult for us to exercise the maximal amount of creativity that a maximally successful project would require. In establishing a new cultural context, we were deeply constrained by the old. Now, to be sure, I have said above and many times elsewhere that Wikipedia did not have to adopt the particular conjunction of policies that it did. But it is not surprising that it did adopt its particular conjunction of policies, considering the conjunction of influences on its development. So it would have required much more explanation and persuasion, and indeed, much more struggle, for us to, for example, have persuaded potential participants that some persons, even in a wiki environment, should have special rights that others do not. So powerful is the influence of cultural context that there are quite a few people whose lack of imagination is such that they believe I simply must not understand "why Wikipedia works" if I am willing to suggest that it does not have to work in precisely the way it does work. Constantly-reinforced cultural habits die very hard indeed, and place very strong constraints upon what can be imagined, and what bare possibilities seem even worth thinking about.
But it was our willingness to exercise our creativity and follow our imagination, and create what is, to some extent, a new kind of culture, that led to Wikipedia's success. For the overall project of creating open content encyclopedias--and indeed, for the fantastic collaborative Internet that has yet to be created--to reach its full potential, the process of identifying mistakes honestly and creatively seeking solutions must be ramped up and continued unabated.
Many thanks to Larry Sanger and to O'Reilly for this memoir. -
The Early History of Nupedia and Wikipedia, Part II
Today, read the continuation of Larry Sanger's account of the early history of Nupedia and Wikipedia (below), in which Sanger talks about the difficulties of governance in a large, free-wheeling project, some final attempts to save Nupedia, and how he came to resign from the organization. (And if you missed it, you might want to start with yesterday's installment.)
Contents:
Why Wikipedia started working
A series of controversies
The governance challenge
My resignation and final few months with the project
Some final attempts to save Nupedia
Conclusions
Why Wikipedia started working
This is a good place to explain why Wikipedia actually got started and why it worked (and still does work, at least as well as it does). The explanation involves a combination of quite a few factors, some borrowed from the open source movement, some borrowed from wiki software and culture, and some more idiosyncratic:
- Open content license. We promised contributors that their work would always remain free for others to read. This, as is well known, motivates people to work for the good of the world--and for the many people who would like to teach the whole world, that's a pretty strong motivation.
- Focus on the encyclopedia. We said that we were creating an encyclopedia, not a dictionary, etc., and we encouraged people to stick to creating the encyclopedia and not use the project as a debate forum.
- Openness. Anyone could contribute. Everyone was specifically made to feel welcome. (E.g., we encouraged the habit of writing on new contributors' user pages, "Welcome to Wikipedia!" etc.) There was no sense that someone would be turned away for not being bright enough, or not being a good enough writer, or whatever.
- Ease of editing. Wikis are pretty easy for most people to figure out. In other collaborative systems (like Nupedia), you have to learn all about the system first. Wikipedia had an almost flat learning curve.
- Collaborate radically; don't sign articles. Radical collaboration, in which (in principle) anyone can edit any part of anyone else's work, is one of the great innovations of the open source software movement. On Wikipedia, radical collaboration made it possible for work to move forward on all fronts at the same time, to avoid the big bottleneck that is the individual author, and to burnish articles on popular topics to a fine luster.
- Offer unedited, unapproved content for further development. This is required if one wishes to collaborate radically. We encouraged people to put up their unfinished drafts--as long as they were at least roughly correct--with the idea that they can only improve if there are others collaborating. This is a classic principle of open source software. It helped get Wikipedia started and helped keep it moving. This is why so many original drafts of Wikipedia articles were basically garbage (no offense to anyone--some of my own drafts were sometimes garbage), and also why it is surprising to the uninitiated that many articles have turned out very well indeed.
- Neutrality. A firm neutrality policy made it possible for people of widely divergent opinions to work together, without constantly fighting. It's a way to keep the peace.
- Start with a core of good people. I think it was essential that we began the project with a core group of intelligent good writers who understood what an encyclopedia should look like, and who were basically decent human beings.
- Enjoy the Google effect. We had little to do with this, but had Google not sent us an increasing amount of traffic each time they spidered the growing website, we would not have grown nearly as fast as we did. (See below.)
That's pretty much it. The focus on the encyclopedia provided the task and the open content license provided a natural motivation: people work hard if they believe they are teaching the world stuff. Openness and ease of editing made it easy for new people to join in and get to work. Collaboration helped move work forward quickly and efficiently, and posting unedited drafts made collaboration possible. The fact that we started with a core of good people from Nupedia meant that the project could develop a functional, cooperative community. Neutrality made it easy for people to work together with relatively little conflict. And the Google effect provided a steady supply of "fresh blood"--who in turn supplied increasing amounts of content.
Probably, all or nearly all other project rules were either optional, or straightforward applications of these principles. The project probably would still have succeeded nicely even if it had moderated or tweaked some of the above principles. For instance, radical openness, that is, being open even to those who brazenly flouted and disrespected the project's mission, was surely not necessary; after all, without them, the project would have been more welcoming to the many people who felt they could not work with such difficult people. And if we had required people to sign in, that would not have made very much difference (although it probably would have made some in the beginning; the project wouldn't have grown as fast). Of course we didn't have to use the GNU FDL for the license. Certainly, we did not need to set the community up initially as an anarchy governed by some vague consensus: instead, we could have adopted a charter from the very start. The project could have been managed quite differently; there could have been specially-designated and well-qualified editors. The project could have officially encouraged and deferred to experts. An article approval process could have been adopted without threatening the principle of posting unedited content for collaboration. Certainly, many of the later bells and whistles--the arbitration committee, a three-revert rule, having administrators with the particular configuration of rights they have, etc.--were not absolutely necessary to adopt in the precise forms they took. These differences would not have threatened the basic principles that made the project work, listed above.
So the basic principles that explain why Wikipedia could start working--and still does work--are relatively simple, few in number, and above all general. The more specific principles that Wikipedia wound up with were a matter of historical accident. There was a great deal of "wiggle room." Those intent on studying or replicating the Wikipedia model would do well to bear that in mind.
A series of controversies
So much for the very early history of Wikipedia; the next phase involved rapid growth and some serious internal controversies over policy and authority. If Wikipedia's basic policy was settled upon in the first nine months, its culture was solidified into something closer to its present form in the next nine.
The project continued to grow. We had 6000 articles by July 8; 8000 by August 7; 11,200 by September 9; and 13,000 by October 4. Consulting the website logs, we noted a Google effect: each time Google spidered the website, more pages would be indexed; the greater the number of pages indexed, the more people arrived at the project; the more people involved in the project, the more pages there were to index. In addition to this source of new contributors, Wikipedia was Slashdotted several times, and had large influxes of new users particularly after two articles I wrote for Kuro5hin were posted on Slashdot: "Britannica or Nupedia? The Future of Free Encyclopedias" (July 25, 2001) and "Wikipedia is wide open. Why is it growing so fast? Why isn't it full of nonsense?" (September 24, 2001).
This growth brought difficult challenges, challenges that perhaps I did not sufficiently anticipate and plan for. Some of our earliest contributors were academics and other highly-qualified people, and it seems to me that they were slowly worn down and driven away by having to deal with difficult people on the project. I hope they will not mind that I mention their names, but the two that stick in my mind are J. Hoffman Kemp and Michael Tinkler, a couple of Ph.D. historians. They helped to set what I think was a good precedent for the project in that they wrote about their own areas of expertise, and they contributed under their own, real names. The latter has the salutary effect of making the contributor more serious and more apt to take responsibility for his or her contributions. They are also very nice people, but did not "suffer fools gladly," as the phrase goes. Consequently, they wound up in some pretty silly disputes that would have driven less patient people away instantly. So there was a growing problem: persistent and difficult contributors tend to drive away many better, more valuable contributors; Kemp and Tinkler were only two examples. There were many more who quietly came and quietly left. Short of removing the problem contributors altogether--which we did only in the very worst cases--there was no easy solution, under the system as we had set it up. And I am sorry to have to admit that those aspects of the system that led to this problem were as much my responsibility as anyone else's. Obviously, I would not design the system the same way if given the chance again.
As a result, I grew both more protective of the project and increasingly sensitive to abuse of the system. As I tried to exercise what little authority I claimed, as a corrective to such abuse, many newer arrivals on the scene made great sport of challenging my authority. One of the earliest challenges happened in late summer, 2001. The front page of Wikipedia--then open to anyone to edit, like any other page on the project--was occasionally vandalized with infantile graffiti. Someone then tried to make an archive of the vandalism that had been done to the front page of Wikipedia. I maintained that to make such an archive would be to encourage such vandalism, so I deleted the archive. This occasioned much debate. Then a user made the archive a "subpage" of his own user page--and user pages were generally held to be the bailiwick of the user. Consequently I deleted that subpage, which occasioned a further hue and cry that, perhaps, I was abusing my authority. The vandalism-enshrining user in question proceeded to create a "deleted pages" page, on which the deleted vandalism archives were listed, as if to accuse me of trying to act without public scrutiny; but this was, of course, perfectly acceptable to me. At the time, I thought that this controversy was just as silly as it will sound to most people reading this; I thought that I needed only to "put my foot down" a little harder and, as had happened for the first six months of the project, participants would fall into line. What I did not realize was that this was to be only the first in a long series of controversies, the ultimate upshot of which was to undermine my own moral authority over the project and to make the project as safe as possible for the most abusive and contentious contributors.
Throughout this and other early controversies, much of the debate about project policy was conducted on the wiki itself. Other debates were conducted on mailing lists, Wikipedia-L and then later, for the English language project, WikiEN-L. In addition, people had taken to putting their own essays on Wikipedia, as subpages of their user pages. These too were occasioning debate. It seemed to me, and many other contributors, that this debate was distracting the community from our main goal: to create an encyclopedia. Consequently I proposed that we move the debate to another wiki that was to be created specifically for that purpose--what became known as the "meta wiki." This proposal was very widely supported, so we set it up.
As it happened, the meta-wiki became even more uncontrolled than Wikipedia itself, and for many months was continually infested with contributions by people that can only be called "trolls." That epithet came to be discouraged, however, for reasons soon to be explained. The existence of trolls was a problem we felt we should tolerate--and deal with only verbally, not with harsh penalties--for the sake of encouraging the broadest amount of participation. In the first years, only the worst trolls were ever expelled from the project. I do not know whether this policy has been changed as a result of the operation of the much-later installed Arbitration Committee.
The reasons the meta-wiki became (at least temporarily) more uncontrolled are not far to seek. First, it had no specific purpose, other than to host project debate and essays that do not belong on the main wiki--which was not enough to make anyone care very much about it. Second, because many people did not care what happened on the meta-wiki, they did not do the very necessary weeding that takes place on Wikipedia; besides, as the meta-wiki was a repository of opinion, people felt less comfortable editing or deleting what was, after all, only opinion.
What happened was that project policy discussions moved almost exclusively to the project mailing lists. There is a reason why this was a superior solution to having much debate on an uncontrolled, "unmoderated" wiki. On a wiki, contributions exist in perpetuity, as it were, or until they are deleted or radically changed; consequently, anyone new to a discussion sees the first contribution first. So whoever starts a new page for discussion also, to a great extent, sets the tone and agenda of the discussion. Moreover, nasty, heated exchanges live on forever on a wiki, festering like an open wound, unless deliberately toned down afterwards; if the same exchange takes place on a mailing list, it slips mercifully and quietly into the archives.
At about the same time that we decided to start the meta-wiki, and soon after the vandalism archive affair, I was thinking a great deal about Wikipedia's apparent anarchy, and I wrote an essay titled, "Is Wikipedia an experiment in anarchy?" This and the discussion that ensued tended to ossify positions with regard to the authority issue: I and a few others agreed that Jimmy and I should have special authority within the system, to settle policy issues that needed settling. Jimmy was relatively quiet about this issue; this, I think, was probably because his authority was generally not in question, but mine was, because I was "in the trenches" and continuing to encourage good habits and solidify policy positions.
By November or December of 2001, Wikipedia was growing fast and had become the subject of regular news reporting, even by the likes of The New York Times and MIT's Technology Review; after the two major Slashdottings earlier in the year, we knew that large influxes of members could have a tendency to change the nature of the project, and not necessarily for the better. If there were some major news coverage--an evening news story in the U.S., for example--there might be many new people who would need to be taught about Wikipedia's standards and positive cultural aspects. So I proposed what I thought was a humorously-named "Wikipedia Militia" which would manage new (and very welcome) "invasions" by new contributors. By this time, however, there was a small core group of people who were constantly on the watch for anything that smacked the least bit of authoritarianism; consequently, the name, and various aspects of how the proposal was presented, were vigorously debated. Eventually, we switched to "The Wikipedia Welcoming Committee" and finally, the "Volunteer Fire Department"--which eventually, it seems, fell into disuse.
The governance challenge
After the September Slashdotting, I composed a page originally called "Our Replies to Our Critics" (and now called "Replies to Common Objections"), in which I addressed the problem that "cranks and partisans" might abuse the system:
Moreover--and this is something that you might not be able to understand very well if you haven't actually experienced it--there is a fair bit of (mostly friendly) peer pressure, and community standards are constantly being reinforced. The cranks and partisans, etc., are not simply outgunned. They also receive considerable opprobrium if they abuse the system.
This reflects very well the conception I had in September 2001 of Wikipedia's culture; the reply above was as much hopeful and prescriptive as descriptive. But it turned out to be only partly true. As difficult users began to have more of a "run of the place," in late 2001 and 2002, opprobrium was in fact meted out only piecemeal and inconsistently. It seemed that participation in the community was becoming increasingly a struggle over principles, rather than a shared effort toward shared goals. Any attempt to enforce what should have been set policy--neutrality, no original research, and no wholesale deletion without explanation--was frequently if not usually met with resistance. It was difficult to claim the moral high ground in a dispute, because the basic project principles were constantly coming under attack. Consequently, Wikipedia's environment was not cooperative but instead competitive, and the competition often concerned what sort of community Wikipedia should be: radically anarchical and uncontrolled, or instead more singlemindedly devoted to building an encyclopedia. Sadly, few among those who would love to work on Wikipedia could thrive in such a protean environment.
It is one thing to lack any equivalent to "police" and "courts" that can quickly and effectively eliminate abuse; such enforcement systems were rarely entertained in Wikipedia's early years, because according to the wiki ideal, users can effectively police each other. It is another thing altogether to lack a community ethos that is unified in its commitment to its basic ideals, so that the community's champions could claim a moral high ground. So why was there no such unified community ethos and no uncontroversial "moral high ground"? I think it was a simple consequence of the fact that the community was to be largely self-organizing and to set its own policy by consensus. Any loud minority, even a persistent minority of one person, can remove the appearance of consensus. In fact, I recall that (in October 2002, after I resigned) I felt compelled by ongoing controversies to request that Jimmy declare that certain policies were in fact non-negotiable, which he did. Unfortunately, this declaration was too little, too late, in my opinion.
By late 2001, I had gained both friends and detractors. I think I had become, within the project, a symbol of opposition to anarchism, of the enforcement of standards, and consequently of the exercise of authority in a radically open project. But I was still trying to manage the project as I always had--by force of personality and "moral" authority. So when people arrived who clearly and openly disrespected established policy, I was, in my frustration, very short with them; and when the project continued to try to establish new policies, my role in articulating those policies and actually establishing them (attempting to express a "consensus") was challenged. This undermined what moral authority I had. I felt my job was on the line, and the project continued in turmoil day in and day out; from my point of view, fires were spreading everywhere, and as I had become a somewhat controversial figure, I did not have quite enough allies to help me put them out. Consequently I was rather too peremptory and short with some users. This, however, exacerbated the problem, because the attitude could not be backed up by punishment; harsh words from a leader are empty threats if unenforceable; I thereby handed my anti-authoritarian "wiki-anarchist" opponents an advantage, because--ironically--they were able to portray me as dictatorial, when I was anything but. I came to the view, finally and belatedly, that it would be better to "ignore the trolls." But as it turns out, this is particularly hard to do on a wiki, because, again, unlike on an e-mail list, trollish contributions do not just disappear into the archives; they sit out in the open, as available as the first day they appeared, and "festering." Attempts to delete or radically edit such contributions were often met by reposting the earlier, problem version: the ability to do that is a necessary feature of collaboration. 
Persistent trolls could, thus, be a serious problem, particularly if they were able to draw a sympathetic audience. And there was often an audience of sympathizers: contributors who philosophically were opposed to nearly any exercise of authority, but who were not trolls themselves.
It is surely very ironic that it was I personally who (initially) so strongly supported the lack of any enforceable rules in the community. Some legal theorists would maintain that a community that lacks enforceable rules lacks any law at all. In retrospect it is clear that there was a fundamental problem with my role in the system: to have real authority, I needed both to be able to enforce the rules and, for both fairness and the perception of fairness, there needed to be clear rules from the beginning. But, by my own design, I had very early on rejected the label "editor-in-chief" and much real enforcement authority; a year into the game, it would have been difficult if not impossible to claim enforcement authority over active but problem users. Moreover, I was the author of the "ignore all rules" rule. My early rejection of any enforcement authority, my attempt to portray myself and behave as just another user who happened to have some special moral authority in the project, and my rejection of rules--these were all clearly mistakes on my part. They did, I think, help the project get off the ground; but I really needed a more subtle and forward-looking understanding of how an extremely open, decentralized project might work.
In retrospect, I wish I had taken Teddy Roosevelt's advice: "Speak softly and carry a big stick." Since my "stick" was very small, I suppose I felt compelled to "speak loudly," which I regret. (This was not such a problem, by the way, on Nupedia; partly, that was because there were not nearly as many problem users on Nupedia, but partly it was because there was clear enforcement authority.) As it turns out, it was Jimmy who spoke softly and carried the big stick; he first exercised "enforcement authority." Since he was relatively silent throughout these controversies, he was the "good cop," and I was the "bad cop": that, in fact, is precisely how he (privately) described our relationship. Eventually, I became sick of this arrangement. Because Jimmy had remained relatively toward the background in the early days of the project, and showed that he was willing to exercise enforcement authority upon occasion, he was never so ripe for attack as I was.
Perhaps the root cause of the governance problem was that we did not realize well enough that a community would form, nor did we think carefully about what this entailed. For months I denied that Wikipedia was a community, claiming that it was, instead, only an encyclopedia project, and that there should not be any serious governance problems if people would simply stick to the task of making an encyclopedia. This was strictly wishful thinking. In fact, Wikipedia was from the beginning and is both a community and an encyclopedia project. And for a community attempting to achieve something, to be serious, effective, and fair, a charter seems necessary. In short, a collaborative community would do well to think of itself as a polity with everything that that entails: a representative legislature, a competent and fair judiciary, and an effective executive, all defined in advance by a charter. There are special requirements of nearly every serious community, however, best served by relevant experts; and so I think a prominent role for the relevant experts should be written into the charter. I would recommend all of this to anyone launching a serious online community. But indeed, in January 2001, we were in both "uncharted" and "unchartered" territory. The world, I think, will be able to benefit from this and our other initial mistakes.
But in fairness to ourselves, it was a good idea to allow the community to decide by experience and consensus what article content rules to endorse. This allowed us to generate a very sensible set of article content rules. To be clear, I think it was not such a good idea to apply the same thinking to the organization of the community itself; we should have acknowledged that a community would form, that it would have certain persistent and difficult issues that would need to be solved, and that a lack of any effective founding community charter might result in chaos.
My resignation and final few months with the project
Throughout the governance controversy, I was preparing for my wedding, which happened December 1, 2001. A few days after I arrived back from my honeymoon, I was informed that I should probably start looking for another job, because Bomis was having to lay off most of its workers; they had 10-12 workers at the end of 2000, and by the beginning of 2002 they were back to their original 4-5. My salary was reduced in December and then halved in January. This seemed inevitable because Wikipedia was not bringing in any money at all for Bomis, even if Wikipedia was becoming even more of a publicly-recognized, if still modest success. Our first anniversary came just before we announced having 20,000 articles, and I was invited to talk about the project at Stanford on January 16 (text here; you might notice that I was still plugging the notion of using Nupedia to vet Wikipedia articles, as an answer to the objection that Wikipedia articles are unreliable).
I was officially laid off at the beginning of February, which I announced a few weeks later. I had continued on as a volunteer; Wikipedia and Nupedia were, after all, volunteer projects. But I was laboring in the aftermath of the governance controversies of the previous fall and winter, which promised to make the job of a volunteer project leader even more difficult. Moreover, I had to look for a real job. So throughout the month of February I considered resigning altogether.
But Jimmy had told me the previous December that Bomis would start trying to sell ads on Wikipedia in order to pay for my job. Even in that horrible market for Internet advertising, there were already enough pageviews on Wikipedia that advertising proceeds might have provided me a very meager living. We knew that this would be extremely controversial, because so many of the people who are involved in open source and open content projects absolutely hate the idea of advertising on the web pages of free projects, even to support project organizers. In fact, when this advertising plan was announced, in late February of 2002, the Spanish Wikipedia was forked (something I urged them not to do).
Bomis was not successful in selling any ads for Wikipedia anyway--you might recall that early 2002 was at about the very bottom of the market for Internet advertising. I also had some hope that we might, finally, set up the project's managing nonprofit, which we had discussed doing for a long time (and which eventually did come into being: Wikimedia). The job of setting up the nonprofit was left to me, but ongoing controversies seemed to eat up any time I had for Wikipedia, and frankly I had no idea where to begin. So, after a month without pay, I announced my general resignation; I completely stayed away from the project for a few months.
Just by the way, Wikipedia's offshoot projects--a dictionary, a textbook project, a quotation project, a public domain book repository, etc.--were all started in 2002 or later, and I cannot claim any credit for them. I did supply the name "Wiktionary" in April 2001, more or less on a whim. I quickly disavowed any responsibility for leading any such project, and it seems the Wiktionary project did not start up for another year and a half (December 12, 2002). My view now is that Webster's and the OED are quite good enough as far as English dictionaries go, and there will always be excellent free dictionaries in every language online. To try to develop a dictionary by collaboration among random Internet users, particularly in a completely uncontrolled wiki format, now strikes me as a nonstarter. I confess I am now puzzled why I didn't think so instantly; it was no doubt because I simply was throwing out ideas as they occurred to me, and also because we had too many dictionary definition-type entries in Wikipedia. (So why not give people a place to put their dictionary definitions?--Perhaps that's what I was thinking, but it hardly seems like a good justification for starting a project.) But Jimmy's first reaction was properly skeptical regarding the use of wikis and Ruth Ifcher made a stronger criticism very nicely. Dictionaries, even more than encyclopedias, must be extremely reliable to be even minimally usable; without direct oversight by linguists, a public dictionary project seems pointless. As to the other projects, they are mostly conducted using wikis and according to some of the basic founding principles of Wikipedia. But other sorts of project--for example, textbook projects, quotation repositories, and archives--necessarily require quite different specifications from those of an encyclopedia. For example, the fact that the wiki format works for encyclopedia development hardly means that it is appropriate for the hosting of public domain books. 
Since the same texts are available in many other places online, such as the wonderful Project Gutenberg, why would anyone choose to read The Iliad on a wiki, which could have been subtly changed by any random passer-by, without any oversight by someone who had access to an authoritative text? There is a fact about the way the text actually reads; so is editing via wiki software more apt to increase or reduce the number of errors over other systems, such as Project Gutenberg's? I do not mean to dismiss any such efforts. I simply think that considerable thought needs to be put into exactly how those other projects should be organized: the wiki format is not a magic pill that somehow makes all problems go away. Wiki is just one software paradigm, which must be adapted, supplemented, changed, or replaced in order to solve the unique set of problems a project faces.
In the spring, a controversy erupted. Caring as I did--and as I still do--about the future of free encyclopedias, I felt compelled to get involved. The controversy featured a troll who was putting up huge numbers of screeds on the "meta-wiki" and on Wikipedia as well. The controversy began with a discussion of what to do about, and how to react to, this particular troll. I maintained that one should not "feed the troll," and that the troll should be "outed" (it was an anonymous user, but it was not hard to use Google to determine the identity of the troll) and shamed.
There resulted a broader controversy about how to treat problem users generally. There were, as I recall, two main schools of thought. One, to which I adhered and still adhere, was that bona fide trolls should be "named and shamed" and, if they were unresponsive to shaming, they should be removed from the project (by a fair process) sooner rather than later. We held that a collaborative project requires commitment to ethical standards which are--as all ethical standards ultimately are--socially established by pointing out violations of those standards. Hence naming and shaming. A second school of thought held that all Wikipedia contributors, even the most difficult, should be treated respectfully and with so-called WikiLove. Hence trolls were not to be identified as such (since "troll" is a term of abuse), and were to be removed from the project only after a long (and painful) public discussion. For the latter school, it seemed to me, the only really egregious faux pas one could commit in the project was to suggest that there were objective standards that could be enforced via "shaming."
I felt at the time that the prevalence of the second school entailed rejection of both objective standards and rules-based authority. It is impossible to explain why one is removing a partisan screed from the wiki without, in some way, identifying it as a partisan screed, and pointing out that such productions are inconsistent with the neutrality policy. This will necessarily be received as less than respectful and "loving," especially if one must engage the troll himself in a long, drawn-out dispute; in a very long dispute with any trollish type, it is only a matter of time before some epithet gets bandied about, since they are so darned useful (and accurate) when applied to trollish types. More generally, the very application of rules, or laws, entails a moral judgment, or what for its effectiveness must have the force of a moral judgment. I suppose I agree with those legal theorists who say that there is necessarily, in its core, a moral component to the law. Consequently, the new policy of "WikiLove" handed trolls and other difficult users a very effective weapon for purposes of combatting those who attempted to enforce rules. After all, any forthright declaration that a user is doing something that is clearly against established conventions--posting screeds, falsehoods, nonsense, personal opinion, etc.--is nearly always going to appear disrespectful, because such a declaration involves a moral accusation. The only way to avoid such an appearance of disrespect, perhaps, is to step very lightly and use much flattery and qualifications: "Now don't get me wrong, I think you're doing a good job overall, but it seems to me that in this particular case, your contribution is slightly inconsistent with the neutrality policy." Suppose the offender replies: "So what? I disagree with the neutrality policy." Or: "I disagree. What I wrote is perfectly neutral. Who do you think you are, anyway?" It is a very rare person who can practice "WikiLove" in such a case.
In Wikipedia's developing culture, if anyone reacted out of frustration, or merely attempted to apply the law as a moral instrument, as laws typically are applied, he would become the problem, and a much more serious problem than mere violations of the neutrality policy, say. The result is that, on pain of becoming persona non grata in the community, one had to treat brazen, self-conscious violators of basic policy with particular respect. It was a perfect coup for the resident wiki-anarchists. I again left the project for several months.
In fall of 2002, I had started teaching at a local community college, and with some extra time on my hands, I started editing Wikipedia a little and engaging in mailing list discussions. I think my first new post to Wikipedia-L, from September 1, 2002, was "Why the free encyclopedia movement needs to be more like the free software movement." In it I argued that the free software movement is led and dominated by highly-qualified programmers, and that the "free encyclopedia movement"--that is, Wikipedia, Nupedia, and other newer projects--needs to move in that direction. I suggested that Nupedia be redesigned to release "approved" versions of Wikipedia articles; Wikipedia itself was not to be touched. This proposal met with a very cool reception. After a few months of discussion, Jimmy himself was "intending to revive Nupedia in the near future" and "thinking very much along the lines of what is being discussed here." Unfortunately, this never happened.
By November or December, I think, I proposed, and Magnus Manske very helpfully coded, an expert-controlled approval process for Wikipedia that was in fact to be independent of both Nupedia and Wikipedia. It would not have affected the Wikipedia editorial process. It would have lived in a separate namespace or domain, as an independent add-on project for Wikipedia. Without explaining the details, expert reviewers, the recruitment of which I would organize, would examine Wikipedia articles and approve or disapprove of particular versions of those articles. We set up a mailing list, Sifter-L (archives no longer online, apparently), which for several weeks discussed policy issues.
There was not a great deal of support for the proposal on Wikipedia-L. There was little or no excitement that the new project might bring into Wikipedia a fresh crop of subject area specialists. But that was fine as far as I was concerned, since the project was to operate independently of Wikipedia. Still, I had the very distinct sense that any specialists arriving on the scene would not necessarily be met with open arms--particularly if before approving an article they wished to make whatever changes to articles that they felt necessary. There were even a few Wikipedians who made it clear that experts should not expect to be treated any differently than anyone else, even when writing about their areas of expertise.
I then considered whether the interaction between Wikipedians and the new reviewers might be a problem after all. Surely, I thought, most specialists would want to edit even very good articles before approving them (in the independent system). This would require that the reviewers interact with Wikipedians. Wikipedia's culture had become such that disrespect of expertise was tolerated, and, again, trolls were merely warned, but very politely (in keeping with the policy of WikiLove), that they please ought to stop their inflammatory behavior. Trolls would certainly find ripe targets in expert reviewers, I thought. I recalled that patient, well-educated Wikipedians like J. Hoffmann Kemp and Michael Tinkler had been driven off the project not only by trolls but by some of the more abrasive and disrespectful regulars. I then considered: could I in good conscience really ask academics, who are very busy, to engage in this activity that would probably annoy most of them and do nothing to contribute to their academic careers? Recruiting for Nupedia was very easy by comparison, and caused me no such pangs of conscience.
I believe it was this problem that finally prompted me, in I believe January of 2003, to inform Jimmy as follows (by private e-mail): I was breaking with the project altogether; the only way he could prevent this, I told him, was that he personally crack down on problem users, and make the project more officially welcoming to experts. I also told him that I did not expect this information to change his mind, and that I did not mean to issue an ultimatum. And in fact our exchange did not change his mind. I concluded that we had a fundamental philosophical disagreement about how the project should be run. I respected and still respect his view. That is where matters ended, and it was then that I broke with Wikipedia altogether.
Some final attempts to save Nupedia
Nevertheless, I was interested in pursuing Nupedia's development. It still seemed rescuable to me.
I recall two incidents in which I tried to have Nupedia revived, in 2002 or 2003, but I don't recall exactly. First, I approached Jimmy with the offer to try to find a buyer/managing organization for Nupedia. The suggestion was that, since Bomis did not have enough money to support it, and since Jimmy did not appear to have any specific intentions with the project other than to let it run on the system set up in 2000-1, I might be able to find a university or other organization that would take on the responsibility. I do not recall the details, but we did not pursue this possibility. Second, and later, I offered to buy Nupedia myself--that is, the domain name, the membership list, and whatever other proprietary material Bomis might have controlled. I wanted to start it up again as a simpler, more streamlined, but still fully peer-reviewed project; I thought, moreover, that if I owned it I might be able to give it to a suitable sponsoring educational or nonprofit institution. Jimmy seemed cool to the idea, and did not ask for any specific offers.
Perhaps it is, therefore, not entirely accurate to say that Nupedia died due to the inefficiency of its system. To some extent it was also allowed to die, even after it was clear that its former editor-in-chief expressed an interest in continuing the project under an entirely different system. The result was that, without a leader or organization that could support its mission, Nupedia died a slow death. The server it lived on had some trouble in 2003, and as a result the website went offline. For whatever reason, the website was never brought up again after that.
I obviously cannot speak for Jimmy, but I will say that, if he was worried that Nupedia would essentially fork Wikipedia--again, I don't claim that he had that concern--then it seems to me that such a concern would not have justified letting Nupedia wither untended. The projects, Wikipedia and Nupedia, were naturally complementary parts of a single, symbiotic whole. That at least is how I always regarded them, indeed, from the very founding of Wikipedia. From the founding of Wikipedia, I always thought Wikipedia without Nupedia would have been unreliable, and that Nupedia without Wikipedia would have been unproductive. Together they were to be an "unstoppable high-quality article-creation juggernaut."
It is still disappointing to me that we made plans and promises to thousands of Nupedians, including hundreds of extremely well-qualified people, some of them leaders in their fields. We spent many thousands of person-hours, all told, on the project. I apologize to those people, and I can only hope that they will find some future open content encyclopedia project worthy of their participation, one that will show the world the potential that Nupedia had.
Conclusions
I have some advice for anyone who would like to start new projects on the model of Wikipedia.
You can learn from Wikipedia's success; so, first and most importantly, see above for considerations about why Wikipedia works.
But you can also learn from our mistakes. The following primarily concerns project governance, because governance issues are, in my opinion, the primary failing of Wikipedia. Bear in mind, also, that these are only rough guidelines, for those who are starting projects that have enough resemblance to Wikipedia. These are not perfectly general rules:
- If you intend to create a very large, complex project, establish early on that there will be some non-negotiable policy. Wikis and collaborative projects necessarily build communities, and once a community becomes large enough, it absolutely must have rules to keep order and to keep people at work on the mission of the project. "Force of personality" might be enough to make a small group of people hang together; for better or worse, however, clearly enunciated rules are needed to make larger groups of people hang together.
- There is some policy that, with forethought, can easily be predicted to be necessary. Articulate this policy as soon as possible. Indeed, consider making a project charter to make it clear from the beginning what the basic principles governing the project will be. This will help the community to run more smoothly and allow participants to self-select correctly.
- Establish any necessary authority early and clearly. Managers should not be afraid to enforce the project charter by removing people from the project; as soon as it becomes necessary, it should be done. Standards that are not enforced in any way do not exist in any robust sense. Do not tolerate deliberate disruption from those who oppose your aims; tell them to start their own project; there's a potentially infinite amount of cyberspace.
- As any disagreements among project managers are apt to be publicly visible in a collaborative project, and as this is apt to undermine the (very important) moral authority of at least one manager, make sure management is on the same page from the beginning--preferably before launch. This requires a great deal of thinking through issues together.
- In knowledge-creation projects, and perhaps many other kinds of projects, make special roles for experts from the very beginning; do not attempt to add those roles later, as an afterthought. Specialists are one of your most important resources, and it is irrational not to use them as much as you can. Preferably, design the charter so that they are included and encouraged. Moreover, make the volunteer project management a meritocracy, and not based on longevity but based on the ability to lead and contribute to the project; that is the only condition under which very many of the best qualified people will want to participate.
Another point needs more in-depth development.
Radical and untried new ideas require constant refinement and adaptation in order to succeed; the first proposal is very rarely the best, and project designers must learn from their mistakes and constantly redesign better projects. Nupedia's Advisory Board failed to admit to inherent flaws in its system, and its delay in admission shut the window of opportunity to its improvement. And it seems to me that the Wikipedia community fell into a mistake by thinking that just one or two features--the wiki feature and the neutrality policy and a few other things--explained Wikipedia's success, and that those features can thus be applied with no significant changes to new projects. But there is no substitute for constant creativity and problem-solving--nor for honesty about what problems need solving. The honesty to recognize problems and creativity in solving them are, after all, what made Wikipedia succeed in the first place.
This is a crucial point: if you use a tool or model from another project, think through very carefully how that tool or model should be adapted. Do not assume that you need to use every feature, or every aspect of the surrounding culture, that you are borrowing. Wikipedia borrowed rather too much from (1) the culture of wikis, (2) unmoderated online discussions, and (3) free-wheeling online culture generally. To be sure, Wikipedia is also a product of those cultures, and works as well as it does largely because of what it borrowed from those cultures. But it also shares some of its more serious current flaws with such cultures. Those planning new projects, or wanting to overhaul old ones, might well bear in mind that a certain cultural context, including the context that has grown up around a tool, just might not be right for that project. Let me elaborate.
(1) Consider first the culture of wikis. On the one hand, I said we wanted to determine the best rules, and experience would help us determine that; so we had no rules to begin with. On the other hand, one might add that another reason we began without rules was that we were partaking in the extremely uncontrolled, free-wheeling nature of "traditional" wikis. I think that's right. But there is an excellent reason why an encyclopedia project should not partake in that extremely uncontrolled nature of wiki culture, and why it should adopt actually enforceable rules: unlike traditional wikis, encyclopedia projects have a very specific aim, with very specific constraints, and efficient work toward that aim, within those constraints, practically requires the adoption of enforceable rules. The mere fact that most wikis, when Wikipedia was created, did not have enforceable rules hardly meant that one could not innovate further, and create one that did have rules.
(2) Moreover, Jimmy and I and most of the first participants on Wikipedia were veterans of unmoderated Internet discussion groups, and hence, naturally, we could appreciate the advantages of letting a virtual community develop in the absence of any real (enforcement) authority. In unmoderated forums there is often found a sense, among some participants, that any attempt to oust a particularly troublesome user amounts to unjustifiable censorship. The result is that the existence of many unmoderated forums online has created a small army of people militantly opposed to the slightest restriction on speech, who feel that they do and should have a right to say whatever they like, wherever they like, online. Any attempt to create and enforce rules for Internet projects, when that small army is ready to cry "censorship," will seem daring or even outrageous in many contexts online. But there is an excellent reason why such anarchy is inappropriate for many projects, including encyclopedia projects, even one that is self-policing like a wiki: there simply must be a way to enforce rules in order for rules to be effective. Given that encyclopedia project development happens almost entirely using words, nearly any rules will also be restrictions on speech. Anyone who advocates many enforceable rules on a collaborative project, in the cultural context of an Internet filled with so many unmoderated discussion groups, can be made to seem reactionary. But this is only a result of that cultural context; in any other context, the existence of rules would be perfectly natural and unobjectionable.
(3) Finally, and generally speaking, the Internet is a great leveller. Since social interaction can proceed among complete strangers who cannot so much as see each other, things that seem to matter in many "meatspace" discussions, such as age, social status, and level of education, are often dismissed as unimportant online. Many Internet forums, chatrooms, and blogs are populated by people who are identified by only a "handle," and any suggestion that communication should be restricted or in any way altered in accordance with "expertise" or "authority" is likely to be met with outrage, in most forums. But there are several excellent and obvious reasons why expertise does need special consideration in an encyclopedia project, and in other collaborative projects. First, there are many subjects that dilettantes cannot write about credibly; I, for example, could not write very credibly about astronomy or speleology, but I have a passing interest in both. If I am working only with other dilettantes, our articles are apt to remain amateurish at best; we can fill in the gaps in each other's knowledge, and do research, but the results will remain problematic until someone with more knowledge of the subject contributes. Second, there are very many specialized subjects about which no one but experts has any significant knowledge at all. Third, it is only the opinions of experts that will be trusted by most of the public as authoritative in determining whether an article is generally reliable or not. Moreover, the standards of public credibility are not likely to be changed by the widespread use of Wikipedia or by online debate about the reliability of Wikipedia. Like them or hate them, those are the facts. But if one points these facts out online, culturally "levelled" as it is, particularly in forums or projects like Wikipedia which go out of their way to ignore individual differences among people, one finds a frosty reception at best.
Consider, if you will, that it was because Wikipedia was started in the context of the ingrained cultures of wikis, of unmoderated discussion forums, and of the levelling, anti-elitist influence of the Internet at large, that it was very difficult for us to exercise the maximal amount of creativity that a maximally successful project would require. In establishing a new cultural context, we were deeply constrained by the old. Now, to be sure, I have said above and many times elsewhere that Wikipedia did not have to adopt the particular conjunction of policies that it did. But it is not surprising that it did adopt its particular conjunction of policies, considering the conjunction of influences on its development. So it would have required much more explanation and persuasion, and indeed, much more struggle, for us to, for example, have persuaded potential participants that some persons, even in a wiki environment, should have special rights that others do not. So powerful is the influence of cultural context that there are quite a few people whose lack of imagination is such that they believe I simply must not understand "why Wikipedia works" if I am willing to suggest that it does not have to work in precisely the way it does work. Constantly-reinforced cultural habits die very hard indeed, and place very strong constraints upon what can be imagined, and what bare possibilities seem even worth thinking about.
But it was our willingness to exercise our creativity and follow our imagination, and create what is, to some extent, a new kind of culture, that led to Wikipedia's success. For the overall project of creating open content encyclopedias--and indeed, for the fantastic collaborative Internet that has yet to be created--to reach its full potential, the process of identifying mistakes honestly and creatively seeking solutions must be ramped up and continued unabated.
Many thanks to Larry Sanger and to O'Reilly for this memoir. -
The Early History of Nupedia and Wikipedia: A Memoir
Larry Sanger was one of the moving forces behind the pioneering Nupedia project. That makes him one of the people to thank for Wikipedia, which has been enjoying more and more visibility of late. Sanger has prepared a lengthy, informative account of the early history of Nupedia and Wikipedia, including some cogent observations on project management, online legitimacy, dealing with trolls, and other hazards of running a large, collaborative project over the Internet. As Sanger writes, "A virtually identical version of this memoir is due to appear this summer in Open Sources 2.0, published by O'Reilly and edited by Chris DiBona, Danese Cooper, and Mark Stone. The volume is to be a successor to Open Sources: Voices from the Open Source Revolution (1999)." Read on below for the story (continued tomorrow). Update: 04/20 19:19 GMT by T : Here's a link to the continuation of Sanger's memoir.
Contents:
Preface
Some recent press reports
Nupedia
The origins of Wikipedia
Wikipedia's first few months
Preface
An impassioned debate has been raging, particularly since about the summer of 2004, about the merits of Wikipedia and the future of free online encyclopedias. This discussion has not benefitted by much detailed, accurate consideration of the origins of Wikipedia and of its parent project, Nupedia. But it seems to me that those origins are very important -- crucial, even -- to forming a proper judgment of the current state and best future direction of free encyclopedias.
Wikipedia as it stands is a fantastic project; it has produced enormous amounts of content, thousands of excellent articles, and now, after just four years, is getting high-profile, international recognition as a new way of obtaining at least a rough and ready idea about very many topics. Its surprising success may be attributed, briefly, to its free, open, and collaborative nature.
This has been my attitude toward Wikipedia practically since its founding. But a few months ago I wrote an article critical of certain aspects of the Wikipedia project, 'Why Wikipedia Must Jettison Its Anti-Elitism', which occasioned much debate. I have also been quoted, as co-founder of Wikipedia, in many recent news articles about the project, making various other critical remarks. I am afraid I am getting an undeserved reputation as someone who is opposed to everything Wikipedia stands for. This is completely incorrect. In fact, I am one of Wikipedia's strongest supporters. I am partly responsible for bringing it into the world (as I will explain), and I still love it and want only the best for it. But if a better job can be done, a better job should be done. Wikipedia has shown fantastic potential, and it is open content--and so if the project has problems (or features) which will keep it from being the maximally authoritative, broad, and deep reference that I believe could exist, I firmly believe that the world has the right to, and should, improve upon it.
Wikipedia's predecessor, which I was also employed to organize, was Nupedia. Nupedia was to be a highly reliable, peer-reviewed resource that fully appreciated and employed the efforts of subject area experts, as well as the general public. When the more free-wheeling Wikipedia took off, Nupedia was left to wither. It might appear to have died of its own weight and complexity. But, as I will explain, it could have been redesigned and adapted--it could have, as it were, "learned from its mistakes" and from Wikipedia's successes. Thousands of people who had signed up and who wanted to contribute to the Nupedia system were left disappointed. I believe this was unfortunate and unnecessary; I always wanted Nupedia and Wikipedia working together to be not only the world's largest but also the world's most reliable encyclopedia. I hope that this memoir will help to justify this stance. Hopefully, too, I will manage to persuade some people that collaboration between an expert project and a public project is the correct approach to the overall project of creating open content encyclopedias.
I am not writing to request that Nupedia be resuscitated now, as nice as that would be. But I would like to tell the story of Nupedia and the first couple years of Wikipedia, as I remember it. A more complete history of the projects, as opposed to a memoir, must await a careful study of the Nupedia and Wikipedia archives--if early archives of them still exist (I have no idea if they do)--or else these entries from the "Wayback Machine." Interviews with many of those heavily involved in the projects would also help a great deal, so long as interviews were done of people on different sides of the disputes that helped to shape the project.
By the way, the "overall project of creating open content encyclopedias" is something of which I have been writing since at least 2001. For example, in July of 2001, while still working on both Wikipedia and Nupedia, I wrote, "if some other open source project proves to be more competitive, then it should and will take the lead in creating a body of free encyclopedic knowledge." Since Wikipedia is open content and hence may be reproduced and improved upon by anyone, I have always been cognizant that it might not end up being the only or best version. My personal devotion has always been to the ideal project as I have envisioned it, not necessarily to particular incarnations of Nupedia or Wikipedia; and I think this attitude is fully consistent with the (very positive) spirit of open source collaboration generally.
This being said, let me also emphasize strongly that, throughout this discussion, I am not suggesting that Wikipedia needs to be replaced with something better. I do, however, think that it needs to be supplemented by a broader, more ambitious, and more inclusive vision of the overall project.
Some recent press reports
The following memoir seems all the more important to publish now because the early history of Nupedia and Wikipedia has been mischaracterized in the press recently. If there were only a few inaccuracies, which made no difference, I would be happy to leave well enough alone. But some of the mischaracterizations I've seen do make a difference, because they give the public the impression that Nupedia failed because it was run by snobbish experts whose standards were too high. As the following should make clear, that is not quite correct. One might also gather from some reports that the idea for Wikipedia sprang fully grown from Jimmy Wales' head. Jimmy, of course, deserves enormous credit for investing in and guiding Wikipedia. But a more refined idea of how Wikipedia originated and evolved is crucial to have, if one wants to appreciate fully why it works now, and why it has the policies that it does have.
For example, in the Nov. 1, 2004 issue of Newsweek, in "It's Like a Blog, But It's a Wiki," reporter Brad Stone writes:
[Jimmy] Wales first tried to rewrite the rules of the reference-book business five years ago with a free online encyclopedia called Nupedia. Anyone could submit articles, but they were vetted in a seven-step review process. After investing thousands of his own dollars and publishing only 24 articles, Wales reconsidered. He scrapped the review process and began using a popular kind of online Web site called a "wiki," which allows its readers to change the content.
This capsule history is, of course, very brief and so should be expected not to have every relevant detail. But some of the claims made here are not just vague, they are actually misleading, and so several clarifications are in order (all of this is elaborated below):
- The article makes it sound as if Jimmy were the only person making the relevant decisions. That is incorrect; the Nupedia system (indeed, seven steps) was established via negotiation with Nupedia's volunteer Advisory Board, mostly Ph.D. volunteers, who served as editors and peer reviewers. I articulated our decisions in Nupedia's "Editorial Policy Guidelines." Jimmy started and broadly authorized it all, but as to the details, he really had little to do with them.
- Nupedia's Advisory Board might be surprised to learn that Jimmy (alone!) "scrapped the review process." Jimmy was certainly disappointed with the process (as were many people), and he did not actively support it after 2001 or so. But in fairness to the people actually working on Nupedia, the fact is that work on Nupedia gradually petered out in 2001-2. I in particular was stretched thin--in 2001, I was both chief organizer of Wikipedia and editor-in-chief of Nupedia--and my own slowing work on Nupedia was obvious to all active Nupedia contributors. It might be better to say that Nupedia withered due to neglect--which was largely due to a lack of sufficient funds for paid organizers--which was as much due to the bursting of the Internet bubble as anything else.
- Also, to the best of my knowledge, the "thousands of his own dollars" invested in these projects were, if I am not very mistaken, the dollars of Bomis.com, which is jointly owned by three partners, Jimmy, Tim Shell, and Michael Davis. (The money for Wikipedia now comes from donations.) But again, Jimmy was the prime motivating force within Bomis.
- Moreover, Nupedia had fewer than 24 articles when Wikipedia launched, being not quite a year old at that time. The idea of adapting wiki technology to the task of building an encyclopedia was mine, and my main job in 2001 was managing and developing the community and the rules according to which Wikipedia was run. Jimmy's role, at first, was one of broad vision and oversight; this was the management style he preferred, at least as long as I was involved. But, again, credit goes to Jimmy alone for getting Bomis to invest in the project, and for providing broad oversight of the fantastic and world-changing project of an open content, collaboratively-built encyclopedia. Credit also of course goes to him for overseeing its development after I left, and guiding it to the success that it is today.
A March 2005 Wired Magazine article by Daniel Pink also got a number of things wrong, despite being, in other respects, an excellent article:
With Sanger as editor in chief, Nupedia essentially replicated the One Best way model. He assembled a roster of academics to write articles. (Participants even had to fax in their degrees as proof of their expertise.) And he established a seven-stage process of editing, fact-checking, and peer review. "After 18 months and more than $250,000," Wales said, "we had 12 articles."
Then an employee told Wales about Wiki software. On January 15, 2001, they launched a Wiki-fied version and within a month, they had 200 articles. In a year, they had 18,000. ... Sanger left the project in 2002. "In the Nupedia mode, there was room for an editor in chief," Wales says. "The Wiki model is too distributed for that."
This too needs clarifications:
- The "roster of academics" (the aforementioned Nupedia Advisory Board) was not limited to academics; they were experts in their fields, in any case. Moreover, they were editors and peer reviewers; the general public was able to propose and write articles on subjects about which they had some knowledge. (Consult the old assignment policy if you are interested.)
- It is incorrect to say that participants had to fax their degrees as proof of their expertise; we did verify bona fides by matching the names and e-mail addresses of editors and reviewers with a web page--often, but not always, an academic web page. Indeed there was one (but only one) case that I recall in which I asked someone, who had no web page or any other easy way to prove who he was, to fax a degree. Verifying bona fides seemed like a good idea especially when initially building what was to be an academically-respectable project.
- Again, I did not establish the editorial process alone; I had considerable assistance (for which I am still grateful) from Nupedia's excellent Advisory Board.
- And as I wrote on July 25, 2001 for Kuro5hin, "Britannica or Nupedia? The Future of Free Encyclopedias," Nupedia had "just over 20" articles--not 12--after 18 months. We always suspected that we would wind up scrapping our first attempts to design an editorial system, and that we would learn a great deal from those first attempts; and that's essentially what happened. But Nupedia could have evolved, and would have, had we continued working on it.
- The second paragraph begins, "Then an employee told Wales about Wiki software." I don't know how Jimmy first learned about wikis, but as I will explain below, I proposed to him and to the Nupedia community at large that we start a wiki-based encyclopedia.
- The context of the line "Sanger left the project in 2002"--particularly with Jimmy quoted as saying, "In the Nupedia mode there was room for an editor in chief"--makes it sound as if I were let go specifically because I was working only on Nupedia and that I was no longer needed for that. In fact, I was working on Wikipedia far more at the time than Nupedia, and the reason for my departure from both projects was that Bomis was, like virtually all dot-coms, losing money. They could not afford to pay me; I was told that I was the last of several newer Bomis employees to be laid off on account of the tech recession. But Wikipedia indeed was able to continue on without me, and I agreed even at the time that Wikipedia could survive without me, and that it had become essentially "unmanageable" (as I put it--the following memoir should make it clear what I meant by that).
Nupedia
I'm going to begin this memoir with several paragraphs about Nupedia, because the origin of Wikipedia cannot be explained except in that context. Moreover, the Nupedia project itself was very worthwhile, and I think it might have been able to survive, as I will explain. Finally, some errors regarding Nupedia have been passed around (a few examples are above), which are little better than unfounded rumors. It is unfortunate that the thousands of hours of excellent volunteer work done on Nupedia should be thus disrespected or grossly misunderstood. I personally will always be grateful to those initial contributors who believed in the project and our management, worked hard for a completely unproven idea, and laid the groundwork for the growing institution of open content projects.
In 1999, Jimmy Wales wanted to start a free, collaborative encyclopedia. I knew him from several mailing lists back in the mid-90s, and in fact we had already met in person a couple of times. In January 2000, I e-mailed Jimmy and several other Internet acquaintances to get feedback on an idea for what was to be, essentially, a blog. (It was to be a successor to "Sanger and Shannon's Review of Y2K News Reports," a Y2K news summary that I first wrote and then edited.) To my great surprise, Jimmy replied to my e-mail describing his idea of a free encyclopedia, and asking if I might be interested in leading the project. He was specifically interested in finding a philosopher to lead the project, he said. He made it a condition of my employment that I would finish my Ph.D. quickly (whereupon I would get a raise)--which I did, in June 2000. I am still grateful for the extra incentive. I thought he would be a great boss, and indeed he was.
To be clear, the idea of an open source, collaborative encyclopedia, open to contribution by ordinary people, was entirely Jimmy's, not mine, and the funding was entirely by Bomis. I was merely a grateful employee; I thought I was very lucky to have a job like that land in my lap. Of course, other people had had the idea; but it was Jimmy's fantastic foresight actually to invest in it. For this the world owes him a considerable debt. The actual development of this encyclopedia was the task he gave me to work on.
So I arrived in San Diego in early February, 2000, to get to work. One of the first things I asked Jimmy is how free a rein I had in designing the project. What were my constraints, and in what areas was I free to exercise my own creativity? He replied, as I clearly recall, that most of the decisions should be mine; and in most respects, as a manager, Jimmy was indeed very hands-off. Nevertheless, I always did consult with him about important decisions, and moreover, I wanted his advice. Now, Jimmy was quite clear that he wanted the project to be in principle open to everyone to develop, just as open source software is (to an extent). Beyond this, however, I believe I was given a pretty free rein. So I spent the first month or so thinking very broadly about different possibilities. I wrote quite a bit (that writing is now all lost--that will teach me not to back up my hard drives) and discussed quite a bit with both Jimmy and one of the other Bomis partners, Tim Shell.
I maintained from the start that something really could not be a credible encyclopedia without oversight by experts. I reasoned that, if the project is open to all, it would require both management by experts and an unusually rigorous process. I now think I was right about the former requirement, but wrong about the latter, which was redundant; I think that the subsequent development of Wikipedia has borne out this assessment. But I fully realize that all of this is a matter of debate. Some will claim that the experience of Wikipedia refuted my original judgment that expert oversight is necessary for a very credible encyclopedia; but I disagree with them. More on this below.
Also, I am fairly sure that one of the first policies that Jimmy and I agreed upon was a "nonbias" or neutrality policy. I know I was extremely insistent upon it from the beginning, because neutrality has been a hobby-horse of mine for a very long time, and one of my guiding principles in writing "Sanger's Review." Neutrality, we agreed, required that articles should not represent any one point of view on controversial subjects, but instead fairly represent all sides. We also agreed in rejecting an alternative that (for a time) Tim and some early Nupedians plugged for: the development, for each encyclopedia topic, of a series of different articles, each written from a different point of view.
I believed, moreover, that a strongly collaborative and open project could not survive if its contributors were not "personally invested" in the project, and that this required some input and management by its users. So I think it was very early on that I decided that Nupedia should have an Advisory Board--editors, and peer reviewers, who would together agree to project policy--and that the public should have a say in the formulation of policy.
An early incarnation of Nupedia's Advisory Board was in place by summer of 2000 or so. It was made up of the project's highly-qualified editors and reviewers, mostly Ph.D. professors but also a good many other highly-experienced professionals. Eventually the Advisory Board agreed to an extremely rigorous seven-step system. A lot of the details of the Nupedia policy and processes were, I think, proposed by me, but then tweaked and elaborated by others, and the policy was not published as project policy until we had a quorum of editors and peer reviewers who could fully discuss and approve of a policy statement. But I do not think that we discussed the proposal well enough, and further initial discussion could have made a difference, because, as it turned out, a clear mistake of mine and others was to assume that such a complicated system would be navigated patiently by many volunteers, even if they had clear-enough instructions. That is a mistake I doubt anyone designing volunteer content creation systems will make again; I certainly would not make it again.
I spent a huge amount of time recruiting people for Nupedia, e-mailing new arrivals, posting to mailing lists, giving interviews, etc. I had had some experience publicizing Internet projects when I worked on several philosophy discussion groups as a grad student in the 1990s (I had perpetrated an "Association for Systematic Philosophy" as well as a "Tutorial Manifesto"), and I knew that getting many willing and active participants was difficult but important. I even had an administrative assistant for six months in 2000 and 2001, Liz Campeau, whose sole job was to recruit people to work on Nupedia and then Wikipedia. I think a large part of the reason Wikipedia got off the ground so quickly and so well is that it was started by Nupedians, who were then a very large base of people who wanted to work on an encyclopedia, and who had many definite ideas about how it should be done. Maybe 2,000 Nupedia members were subscribed to the general announcement list in January of 2001, when Wikipedia launched--I forget how many but an old project news page indicates that 2,000 is about right.
We operated the system initially using e-mail and mailing lists, while planning and finalizing process details. That lasted from spring through fall 2000. I think our first article ("atonality" by Christoph Hust), that made it entirely through the system, was published in June or July of 2000. To move the system to a completely web-based one, there was, of course, a great deal of design and programming to do. So in fall of 2000 I worked a lot with a specifically-hired programmer (Toan Vo) and the Bomis sysadmin (Jason Richey) to transfer the system from a clunky mailing list system to the web. But by the time the web-based system was ready--I think December of 2000, just a month before Wikipedia got started--it had become obvious to Jimmy and me that the seven-step editorial process would move too slowly, even when managed on the web. But Magnus Manske later, in 2001, made some very nice additions to the Nupedia system.
Some institutional traditions begin easily but die hard. So, in 2001, it was only after many months and uncomfortable comparison of Nupedia with the thriving, younger Wikipedia, that Nupedia's Advisory Board was willing to consider a simpler system seriously. That was because Nupedia editors and peer reviewers had a very strong commitment to rigor and reliability, as did I. Moreover, as Wikipedia became increasingly successful in 2001, Jimmy asked me to spend more and more time on it, which I did; Nupedia suffered from neglect. But by the summer of 2001, I was able to propose, get accepted (with very lukewarm support), and install something we called the Nupedia Chalkboard, a wiki which was to be closely managed by Nupedia's staff. It was to be both a simpler way to develop encyclopedia articles for Nupedia, and a way to import articles from Wikipedia. No doubt due to lingering disdain for the wiki idea--which at the time was still very much unproven--the Chalkboard went largely unused. The general public simply used Wikipedia if they wanted to write articles in a wiki format, while perhaps most Nupedia editors and peer reviewers were not persuaded that the Chalkboard was necessary or useful.
By early winter, 2001, Nupedia had published approved versions of only about 25 articles, although there were many more (I vaguely recall over 150 drafts) at various stages in process. I was finally able to persuade the Advisory Board to move the system to a much simpler two-step process, virtually identical to that used to run many academic journals: articles would be submitted to an editor; the editor would, if the article seemed good enough, forward it to a reviewer for acceptance or rejection; if accepted, the article would be posted. We also were thinking of various ways of allowing public comment on or moderated editing of posted articles. I believe this new, simpler system would have produced thousands of articles for Nupedia very quickly. The general public on Nupedia was certainly interested and motivated, and I think it was finally becoming generally accepted by the Advisory Board that the complexity of the system was the main reason that they were not starting articles and getting them through the system.
But, unfortunately, Nupedia's new system was never adopted when it should have been--the winter of 2001-2--because at the same time, Wikipedia was demanding as much attention as I could give it, and I had little time to implement the new Nupedia system. I am quite sure we could have started the new Nupedia system in early 2002, if we had made the time. But Bomis lost the ability to pay me and, newly unemployed, I did not have the time to lead Nupedia as a volunteer. I did not entirely lose hope on Nupedia, however, as I will explain below.
The origins of Wikipedia
In the fall of 2000, Jimmy and I were very well agreed that Nupedia's slow productivity was probably going to be an ongoing problem and that there needed to be a way, moreover, in which ordinary, uncredentialed people could participate more easily. Uncredentialed people could (and did) participate in Nupedia, particularly as writers and copyeditors, but it was pretty painful for most of them to get articles through the elaborate system. So there seemed to be a huge fund of talent, motivated to work on an encyclopedia but not motivated enough to work on Nupedia, going to waste.
It was my job to solve these problems. I wrote multiple detailed proposals for a simpler, more open editing system--two or three, at least--and I ran them by Jimmy, and I think his reply to all of them was that it would require too much programming and he couldn't afford to pay more high-priced programmers (they were very high-priced at the time, you will recall, and we already had Toan and Jason working quite a bit on Nupedia's new web-based system). Now, of course, I fully realize that we could have found a way to enlist volunteers to develop the system. Jimmy and I both probably knew that at the time; I'm not sure why we didn't pursue it.
So it was while I was thinking hard about how to create a more open system, that would require minimal programming to set up, that I had dinner with an old Internet friend of mine, Ben Kovitz. Ben had moved to town for a new job and we were out at a Pacific Beach Mexican restaurant on January 2, 2001, talking about jobs, techie stuff, and philosophy, no doubt. (Ben, Jimmy, and I were all active on those philosophy mailing lists in the mid-90s and we all knew each other.) So Ben explained the idea of Ward Cunningham's WikiWikiWeb to me. Instantly I was considering whether wiki would work as a more open and simple editorial system for a free, collaborative encyclopedia, and it seemed exactly right. And the more I thought about it, without even having seen a wiki, the more it seemed obviously right. So I'm sure it was that very evening or the following morning that I wrote a proposal--unfortunately, lost now--in which I said that this might solve the problem and that we ought to try it. After he had nixed my several earlier proposals, and given that setting up a wiki would be very simple and require hiring no programmer, Jimmy could scarcely refuse. I vaguely recall that he liked the idea but was initially skeptical--properly so, as I was, despite my excitement.
Wiki advocates often used to point out (and I'm sure some still do) that Wikipedia is nonstandard as a wiki. This is partly because we began just with the very basic wiki concept and not so much of the culture. Wiki culture is very distinctive. I cannot hope to explain even the highlights briefly, so I will not try; I will simply give a few notions. Wiki pages can be started and edited by anyone, but, in "Thread Mode" (as in "the thread of this discussion") the dialogue can become complex. In that case, or when consensus is reached, or when positions have hardened, it is considered a good idea to "refactor" pages (a term borrowed from programming), i.e., to rewrite them, but honestly, taking into account the highlights of the dialogue. Then the dialogue might be represented as in "Document Mode." Opinions are very welcome on a typical wiki. There are many other collective habits that make up typical wiki culture; these are only a few.
But I denied the necessity of organizing Wikipedia according to these precise principles. To be sure, a few other participants wanted Wikipedia to adopt wiki culture wholesale, so that it would be "just another wiki," and they had some small influence over the direction of the project; but speaking for myself, I viewed wiki software as simply a tool, a way to organize people who want to collaborate. I saw no necessity whatsoever of partaking in all aspects of the idiosyncratic culture that happened to be associated with the advent of this very generally-applicable tool, since we were engaged in a very specific sort of project, with very specific requirements. This caused some consternation among some wiki advocates, who appeared to think that Wikipedia should, or inevitably would, become just another wiki, somehow necessarily partaking of typical wiki culture. Ward Cunningham's prediction, when Jimmy asked him whether wiki software "could successfully generate a useful encyclopedia," was: "Yes, but in the end it wouldn't be an encyclopedia. It would be a wiki." As I said in reply: "Wikipedia has a totally different culture from this wiki, because it's pretty singlemindedly aimed at creating an encyclopedia. It's already rather useful as an encyclopedia, and we expect it will only get better."
Typical wiki culture aside, wiki software does encourage, but does not strictly require, extreme openness and de-centralization: openness, since (as the software is typically designed) page changes are logged and publicly viewable, and (again, only typically) pages may be further changed by anyone; de-centralization, because in order for work to be done, there is no need for a person or body to assign work, but rather, work can proceed as and when people want to do it. Wiki software also discourages (or at least does not facilitate) the exercise of authority, since work proceeds at will on any page, and on any large, active wiki it would be too much work for any single overseer or limited group of overseers to keep up. These all became features of Wikipedia.
My initial idea was that the wiki would be set up as part of Nupedia; it was to be a way for the public to develop a stream of content that could be fed into the Nupedia process. I think I got some of the basic pages written--how wikis work, what our general plan was, and so forth--over the next few days. I wrote a general proposal for the Nupedia community, and the Nupedia wiki went live January 10. The first encyclopedia articles for what was to become Wikipedia were written then. It turned out, however, that a clear majority of the Nupedia Advisory Board wanted to have nothing to do with a wiki. Again, their commitment was to rigor and reliability, a concern I shared with them and continue to have. Still, perhaps some of those people are kicking themselves now. They (some of them) evidently thought that a wiki could not resemble an encyclopedia at all, that it would be too informal and unstructured, as the original WikiWikiWeb was (and is), to be associated with Nupedia. They of course were perfectly reasonable to doubt that it would turn into the fantastic source of content that it did. Who could reasonably guess that it would work? But it did work, and now the world knows better.
Wikipedia's first few months
So we decided to relaunch the wiki under its own domain name. I came up with the name "Wikipedia," a silly name for what was at first a very silly project, and the newly independent project was launched at Wikipedia.com on January 15, 2001. It was a ".com" at first because, at the time, we were contemplating selling ads to pay for me, programmers, and servers. It was easy to deprecate ".com" in favor of ".org" in 2002, after Jimmy was able to assure users that Wikipedia would never (at least I think he said, or clearly implied, "never") run ads to support the project.
I took it to be one of my main jobs to promote Wikipedia, and this resulted in a steady influx of new participants. I wrote on the Wikipedia announcement page January 24, "Wikipedia has definitely taken [on] a life of its own; new people are arriving every day and the project seems to be getting only more popular. Long live Wikipedia!" By the end of January we reportedly had approximately 600 articles; there were 1300 in March, 2300 in April, and 3900 in May. Not only was the project growing steadily, the rate of growth was increasing.
Wikipedia started with a handful of people, many from Nupedia. The influence of Nupedians was, I think, pretty important early on; I think, especially, of the tireless Magnus Manske (who worked on the software for both projects), our resident stickler Ruth Ifcher, and the very smart poker-playing programmer Lee Daniel Crocker--to name a few. All of these people, and several other Nupedia borrowings, had a good understanding of the requirements of good encyclopedia articles, and they were good writers and very smart. The direction that Wikipedia ought to go in was pretty obvious to myself and them, in terms of what sort of content we wanted. But what we did not have worked out in advance was how the community should be organized, and (not surprisingly) that turned out to be the thorniest problem. But the facts that the project started with these good people, and that we were able to adopt, explain, and promote good habits and policies to newer people, partly account for why the project was able to develop a robust, functional community and eventually to succeed. As to project leadership or management, we began with me, Jimmy, and Tim Shell; but Tim stopped participating so much after the first few months.
But the many rank-and-file users did the heavy lifting, and if there had not been a reasonable consensus among them about what the project should look like, it just wouldn't have happened. In any collaborative project, it is the contributors who are responsible for the outcome. Those early adopters should feel proud of themselves, because they were absolutely instrumental in shaping a thing of beauty and usefulness.
I recall saying casually, but repeatedly, in the project's first nine months or so, that experts and specialists should be given some particular respect when writing in their areas of expertise. They should be deferred to, I thought, unless there were some clear evidence of bias. (I recall an interesting discussion with a Polish scientist, Piotr Wozniak, about this issue when we came to a small disagreement about the "sleep" article.) So, in those first months, deference to expertise was a policy that at least I usually insisted upon, but not strongly or clearly enough. It was nearly a year after the project began that I finally articulated this view reasonably clearly as a policy to consider. Perhaps this was because, indeed, most users did make a practice of deferring to experts up to that time. "This is just common sense," as I wrote, "but sometimes common sense needs to be spelled out!" What I now think is that that point of common sense needed to be spelled out quite a bit sooner and more forcefully, because in the long run, it was not adopted as official project policy, as it could have been.
Some questions have been raised about the origin of Wikipedia policies. The tale is interesting and instructive, and one of the main themes of this memoir. We began with no (or few) policies in particular and said that the community would determine--through a sort of vague consensus, based on its experience working together--what the policies would be. The very first entry on a "rules to consider" page was the "Ignore All Rules" rule (to wit: "If rules make you nervous and depressed, and not desirous of participating in the wiki, then ignore them entirely and go about your business"). This is a "rule" that, current Wikipedians might be surprised to learn, I personally proposed. The reason was that I thought we needed experience with how wikis should work, and even more importantly at that point we needed participants more than we needed rules. As the project grew and the requirements of its success became increasingly obvious, I became ambivalent about this particular "rule" and then rejected it altogether. As one participant later commented, "this rule is the essence of Wikipedia." That was certainly never my view; I always thought of the rule as being a temporary and humorous injunction to participants to add content rather than be distracted by (then) relatively inconsequential issues about how exactly articles should be formatted, etc. In a similar spirit, I proposed that contributors be bold in updating pages (the current version is much expanded, as it should be).
I also, for similar reasons, specifically disavowed any title; I was organizing the project but I did not want to present myself as editor-in-chief. I wanted people to feel comfortable adding information without having to consult anything like an editor. Participation was more important, I felt. (Others referred to me, later, as Wikipedia's editor.)
As we set it up, Wikipedia did have some minimal wiki cultural features: it was wide open, extremely decentralized, and (provisionally anyway) featured very little attempt to exercise authority. Insofar as I was able to organize it at all, I guided the project through force of personality and what "moral authority" I had as co-founder of the project. Jimmy and I agreed early on that, at least in the beginning, we should not eject anyone from the project except perhaps in the most extreme cases. Our first forcible expulsion (which Jimmy performed) did not occur for many months, despite the presence of difficult characters from nearly the beginning of the project. Again, we were learning: we wished to tolerate all sorts of contributors in order to be well-situated to adopt the wisest policies. But--and in hindsight this should have seemed perfectly predictable--this provisional "hands off" management policy had the effect of creating a difficult-to-change tradition, the tradition of making the project extremely tolerant of disruptive (uncooperative, "trolling") behavior. And as it turned out, particularly with the large waves of new contributors from the summer and fall of 2001, the project became very resistant to any changes in this policy. I suspect that the cultures of online communities generally are established pretty quickly and then very resistant to change, because they are self-selecting; that was certainly the case with Wikipedia, anyway.
So I could only attempt to shame any troublemakers into compliance; without recourse to any genuine punitive action, that was the most I could do. In about the first eight months of the project, this was usually sufficient for me to do my job. After that, however, my job got increasingly difficult, as I will explain.
So Wikipedia began as a good-natured anarchy, a sort of Rousseauian state of digital nature. I always took Wikipedia's anarchy to be provisional and purely for purposes of determining what the best rules, and the nature of its authority, should be. What I, and other Wikipedians, failed to realize is that our initial anarchy would be taken by the next wave of contributors as the very essence of the project--how Wikipedia was "meant" to be--even though Wikipedia could have become anything we the contributors chose to make it.
This point bears some emphasis: Wikipedia became what it is today because, having been seeded with great people with a fairly clear idea of what they wanted to achieve, we proceeded to make a series of free decisions that determined the policy of the project and culture of its supporting community. Wikipedia's system is neither the only way to run a wiki, nor the only way to run an open content encyclopedia. Its particular conjunction of policies is in no way natural, "organic," or necessary. It is instead artificial, a result of a series of free choices, and we could have chosen differently in many cases; and choosing differently on some issues might have led to a project better than the one that exists today.
Though it began as an anarchy, there were quite a few policies that were settled upon, more or less, within the first six months or so. This required some struggle, especially on my part, particularly because, since the project was a wiki, some participants thought that there should be no rules at all. (Enforceable rules were regarded as "anti-wiki," which was supposed to be a bad thing.) But it was made clear from the beginning that we intended Wikipedia to be an encyclopedia, and so we were able to plug for at least those rules that would help define and sustain the project as an encyclopedia.
For instance, throughout the early months, people added various content that seemed less than encyclopedic in various ways. Many people seemed to confuse encyclopedia articles with dictionary entries, and eventually I wrote a page called "Wikipedia is not a dictionary." (I am surprised to discover that this page still exists as of this writing, with a good deal of its original content.) As people found new ways not to write encyclopedia articles, I started "What Wikipedia is not": I and others would note on an article's discussion page that some certain content did not belong in an encyclopedia, and then underscored the point by adding an entry to the "What Wikipedia is not" page. To take another example, Wikipedia was not to be a place for publishing original research. In fact, this is a policy that had been settled upon and even enforced in Nupedia days; enforcing it actually led to the departure of Nupedia's erstwhile Classics editor sometime in 2001.
Many of our first controversies were over these restrictions. At the time, I had enough influence within the community to get these policies generally accepted. And if we had not decided on these restrictions, Wikipedia might well have ended up, like many wikis, as nothing in particular. But since we insisted that it was an encyclopedia, even though it was just a blank wiki and a group of people to begin with, it became an encyclopedia. There is something very profound about that. I also like to think that we helped to show the world the potential that wikis have.
Another policy that was instituted early on was the nonbias or neutrality policy. This was borrowed from the Nupedia project and made a Rule to Consider--in a very early version, the policy was put this way:
Avoid bias: Since this is an encyclopedia, after a fashion, it would be best if you represented your controversial views either (1) not at all, (2) on *Debate, *Talk, or *Discussion pages linked from the bottom of the page that you're tempted to grace, or (3) represented in a fact-stating fashion, i.e., which attributes a particular opinion to a particular person or group, rather than asserting the opinion as fact. (3) is strongly preferred.
Jimmy then started a specialized policy page he called "Neutral Point of View" (here is the current version). I confess I don't much like this name as a name for the policy, because it implies that to write neutrally, or without bias, is actually to express a point of view, and, as the definite article is used, a single point of view at that. "Neutrality", "neutral", and "neutrally" are better to use for the noun, adjective, and adverb. But the acronym "NPOV" came to be used for all three, by Wikipedians wanting to seem hip, and then the unfortunate "POV" came to be used when the perfectly good English word "biased" would do.
In addition to these, I recall suggesting a number of other rules--no doubt most matters of historical fact, along these lines, can be verified in archives. I believe I am responsible for the original formulations of a lot of the article naming conventions, as well as the conventions of bolding the title of the article, starting articles with full sentences, making article titles uncapitalized, and much else. I think these policies were just a matter of common sense for anyone who understood what a good encyclopedia should be like. And of course I was not the only person proposing conventions. Moreover, actual project policy, or community habits, succeeded in being established only by being followed and supported by a majority of participants. It was then, we said, that there was a "rough consensus" in favor of the policy. And consensus, we said, is required for a policy actually to be considered project policy. For our purposes, a "consensus" appeared to consist of (1) widespread common practice, (2) many vocal defenders, and (3) virtually no detractors.
But that way of settling upon policy proposals--viz., by alleged consensus--did not scale, in my opinion. After about nine months or so, there were so many contributors, and especially brand new contributors, that nothing like a consensus could be reached, for the simple reason that condition (3) above was never achievable: there would after that always be somebody who insisted on expressing disagreement. There was, then, a non-scaling policy adoption procedure, and a crying need to continue to adopt sensible policies. This led to some pretty serious problems in the community, which I will relate below. But first, something more positive.
It's a cliff-hanger; you'll have to wait until tomorrow to read about what made Wikipedia start to work. -
The Early History of Nupedia and Wikipedia: A Memoir
Larry Sanger was one of the moving forces behind the pioneering Nupedia project. That makes him one of the people to thank for Wikipedia, which has been enjoying more and more visibility of late. Sanger has prepared a lengthy, informative account of the early history of Nupedia and Wikipedia, including some cogent observations on project management, online legitimacy, dealing with trolls, and other hazards of running a large, collaborative project over the Internet. As Sanger writes, "A virtually identical version of this memoir is due to appear this summer in Open Sources 2.0, published by O'Reilly and edited by Chris DiBona, Danese Cooper, and Mark Stone. The volume is to be a successor to Open Sources: Voices from the Open Source Revolution (1999)." Read on below for the story (continued tomorrow). Update: 04/20 19:19 GMT by T : Here's a link to the continuation of Sanger's memoir.
Contents:
Preface
Some recent press reports
Nupedia
The origins of Wikipedia
Wikipedia's first few months
Preface
An impassioned debate has been raging, particularly since about the summer of 2004, about the merits of Wikipedia and the future of free online encyclopedias. This discussion has not benefited from much detailed, accurate consideration of the origins of Wikipedia and of its parent project, Nupedia. But it seems to me that those origins are very important -- crucial, even -- to forming a proper judgment of the current state and best future direction of free encyclopedias.
Wikipedia as it stands is a fantastic project; it has produced enormous amounts of content, thousands of excellent articles, and now, after just four years, is getting high-profile, international recognition as a new way of obtaining at least a rough and ready idea about very many topics. Its surprising success may be attributed, briefly, to its free, open, and collaborative nature.
This has been my attitude toward Wikipedia practically since its founding. But a few months ago I wrote an article critical of certain aspects of the Wikipedia project, 'Why Wikipedia Must Jettison Its Anti-Elitism', which occasioned much debate. I have also been quoted, as co-founder of Wikipedia, in many recent news articles about the project, making various other critical remarks. I am afraid I am getting an undeserved reputation as someone who is opposed to everything Wikipedia stands for. This is completely incorrect. In fact, I am one of Wikipedia's strongest supporters. I am partly responsible for bringing it into the world (as I will explain), and I still love it and want only the best for it. But if a better job can be done, a better job should be done. Wikipedia has shown fantastic potential, and it is open content--and so if the project has problems (or features) which will keep it from being the maximally authoritative, broad, and deep reference that I believe could exist, I firmly believe that the world has the right to, and should, improve upon it.
Wikipedia's predecessor, which I was also employed to organize, was Nupedia. Nupedia was to be a highly reliable, peer-reviewed resource that fully appreciated and employed the efforts of subject area experts, as well as the general public. When the more free-wheeling Wikipedia took off, Nupedia was left to wither. It might appear to have died of its own weight and complexity. But, as I will explain, it could have been redesigned and adapted--it could have, as it were, "learned from its mistakes" and from Wikipedia's successes. Thousands of people who had signed up and who wanted to contribute to the Nupedia system were left disappointed. I believe this was unfortunate and unnecessary; I always wanted Nupedia and Wikipedia working together to be not only the world's largest but also the world's most reliable encyclopedia. I hope that this memoir will help to justify this stance. Hopefully, too, I will manage to persuade some people that collaboration between an expert project and a public project is the correct approach to the overall project of creating open content encyclopedias.
I am not writing to request that Nupedia be resuscitated now, as nice as that would be. But I would like to tell the story of Nupedia and the first couple years of Wikipedia, as I remember it. A more complete history of the projects, as opposed to a memoir, must await a careful study of the Nupedia and Wikipedia archives--if early archives of them still exist (I have no idea if they do)--or else these entries from the "Wayback Machine." Interviews with many of those heavily involved in the projects would also help a great deal, so long as interviews were done of people on different sides of the disputes that helped to shape the project.
By the way, the "overall project of creating open content encyclopedias" is something of which I have been writing since at least 2001. For example, in July of 2001, while still working on both Wikipedia and Nupedia, I wrote, "if some other open source project proves to be more competitive, then it should and will take the lead in creating a body of free encyclopedic knowledge." Since Wikipedia is open content and hence may be reproduced and improved upon by anyone, I have always been cognizant that it might not end up being the only or best version. My personal devotion has always been to the ideal project as I have envisioned it, not necessarily to particular incarnations of Nupedia or Wikipedia; and I think this attitude is fully consistent with the (very positive) spirit of open source collaboration generally.
This being said, let me also emphasize strongly that, throughout this discussion, I am not suggesting that Wikipedia needs to be replaced with something better. I do, however, think that it needs to be supplemented by a broader, more ambitious, and more inclusive vision of the overall project.
Some recent press reports
The following memoir seems all the more important to publish now because the early history of Nupedia and Wikipedia has been mischaracterized in the press recently. If there were only a few inaccuracies, which made no difference, I would be happy to leave well enough alone. But some of the mischaracterizations I've seen do make a difference, because they give the public the impression that Nupedia failed because it was run by snobbish experts whose standards were too high. As the following should make clear, that is not quite correct. One might also gather from some reports that the idea for Wikipedia sprang fully grown from Jimmy Wales' head. Jimmy, of course, deserves enormous credit for investing in and guiding Wikipedia. But a more refined idea of how Wikipedia originated and evolved is crucial to have, if one wants to appreciate fully why it works now, and why it has the policies that it does have.
For example, in the Nov. 1, 2004 issue of Newsweek, in "It's Like a Blog, But It's a Wiki," reporter Brad Stone writes:
[Jimmy] Wales first tried to rewrite the rules of the reference-book business five years ago with a free online encyclopedia called Nupedia. Anyone could submit articles, but they were vetted in a seven-step review process. After investing thousands of his own dollars and publishing only 24 articles, Wales reconsidered. He scrapped the review process and began using a popular kind of online Web site called a "wiki," which allows its readers to change the content.
This capsule history is, of course, very brief and so should be expected not to have every relevant detail. But some of the claims made here are not just vague, they are actually misleading, and so several clarifications are in order (all of this is elaborated below):
- The article makes it sound as if Jimmy were the only person making the relevant decisions. That is incorrect; the Nupedia system (indeed, seven steps) was established via negotiation with Nupedia's volunteer Advisory Board, mostly Ph.D. volunteers, who served as editors and peer reviewers. I articulated our decisions in Nupedia's "Editorial Policy Guidelines." Jimmy started and broadly authorized it all, but as to the details, he really had little to do with them.
- Nupedia's Advisory Board might be surprised to learn that Jimmy (alone!) "scrapped the review process." Jimmy was certainly disappointed with the process (as were many people), and he did not actively support it after 2001 or so. But in fairness to the people actually working on Nupedia, the fact is that work on Nupedia gradually petered out in 2001-2. I in particular was stretched thin--in 2001, I was both chief organizer of Wikipedia and editor-in-chief of Nupedia--and my own slowing work on Nupedia was obvious to all active Nupedia contributors. It might be better to say that Nupedia withered due to neglect--which was largely due to a lack of sufficient funds for paid organizers--which was as much due to the bursting of the Internet bubble as anything else.
- Also, to the best of my knowledge, the "thousands of his own dollars" invested in these projects were, if I am not very mistaken, the dollars of Bomis.com, which is jointly owned by three partners, Jimmy, Tim Shell, and Michael Davis. (The money for Wikipedia now comes from donations.) But again, Jimmy was the prime motivating force within Bomis.
- Moreover, Nupedia had fewer than 24 articles when Wikipedia launched, being not quite a year old at that time. The idea of adapting wiki technology to the task of building an encyclopedia was mine, and my main job in 2001 was managing and developing the community and the rules according to which Wikipedia was run. Jimmy's role, at first, was one of broad vision and oversight; this was the management style he preferred, at least as long as I was involved. But, again, credit goes to Jimmy alone for getting Bomis to invest in the project, and for providing broad oversight of the fantastic and world-changing project of an open content, collaboratively-built encyclopedia. Credit also of course goes to him for overseeing its development after I left, and guiding it to the success that it is today.
A March 2005 Wired Magazine article by Daniel Pink also got a number of things wrong, despite being, in other respects, an excellent article:
With Sanger as editor in chief, Nupedia essentially replicated the One Best way model. He assembled a roster of academics to write articles. (Participants even had to fax in their degrees as proof of their expertise.) And he established a seven-stage process of editing, fact-checking, and peer review. "After 18 months and more than $250,000," Wales said, "we had 12 articles."
Then an employee told Wales about Wiki software. On January 15, 2001, they launched a Wiki-fied version and within a month, they had 200 articles. In a year, they had 18,000. ... Sanger left the project in 2002. "In the Nupedia mode, there was room for an editor in chief," Wales says. "The Wiki model is too distributed for that."
This too needs clarifications:
- The "roster of academics" (the aforementioned Nupedia Advisory Board) was not limited to academics; they were experts in their fields, in any case. Moreover, they were editors and peer reviewers; the general public was able to propose and write articles on subjects about which they had some knowledge. (Consult the old assignment policy if you are interested.)
- It is incorrect to say that participants had to fax their degrees as proof of their expertise; we did verify bona fides by matching the names and e-mail addresses of editors and reviewers with a web page--often, but not always, an academic web page. Indeed there was one (but only one) case that I recall in which I asked someone, who had no web page or any other easy way to prove who he was, to fax a degree. Verifying bona fides seemed like a good idea especially when initially building what was to be an academically-respectable project.
- Again, I did not establish the editorial process alone; I had considerable assistance (for which I am still grateful) from Nupedia's excellent Advisory Board.
- And as I wrote on July 25, 2001 for Kuro5hin, "Britannica or Nupedia? The Future of Free Encyclopedias," Nupedia had "just over 20" articles--not 12--after 18 months. We always suspected that we would wind up scrapping our first attempts to design an editorial system, and that we would learn a great deal from those first attempts; and that's essentially what happened. But Nupedia could have evolved, and would have, had we continued working on it.
- The second paragraph begins, "Then an employee told Wales about Wiki software." I don't know how Jimmy first learned about wikis, but as I will explain below, I proposed to him and to the Nupedia community at large that we start a wiki-based encyclopedia.
- The context of the line "Sanger left the project in 2002"--particularly with Jimmy quoted as saying, "In the Nupedia mode there was room for an editor in chief"--makes it sound as if I were let go specifically because I was working only on Nupedia and that I was no longer needed for that. In fact, I was working on Wikipedia far more at the time than Nupedia, and the reason for my departure from both projects was that Bomis was, like virtually all dot-coms, losing money. They could not afford to pay me; I was told that I was the last of several newer Bomis employees to be laid off on account of the tech recession. But Wikipedia indeed was able to continue on without me, and I agreed even at the time that Wikipedia could survive without me, and that it had become essentially "unmanageable" (as I put it--the following memoir should make it clear what I meant by that).
Nupedia
I'm going to begin this memoir with several paragraphs about Nupedia, because the origin of Wikipedia cannot be explained except in that context. Moreover, the Nupedia project itself was very worthwhile, and I think it might have been able to survive, as I will explain. Finally, some errors regarding Nupedia have been passed around (a few examples are above), which are little better than unfounded rumors. It is unfortunate that the thousands of hours of excellent volunteer work done on Nupedia should be thus disrespected or grossly misunderstood. I personally will always be grateful to those initial contributors who believed in the project and our management, worked hard for a completely unproven idea, and laid the groundwork for the growing institution of open content projects.
In 1999, Jimmy Wales wanted to start a free, collaborative encyclopedia. I knew him from several mailing lists back in the mid-90s, and in fact we had already met in person a couple of times. In January 2000, I e-mailed Jimmy and several other Internet acquaintances to get feedback on an idea for what was to be, essentially, a blog. (It was to be a successor to "Sanger and Shannon's Review of Y2K News Reports," a Y2K news summary that I first wrote and then edited.) To my great surprise, Jimmy replied to my e-mail describing his idea of a free encyclopedia, and asking if I might be interested in leading the project. He was specifically interested in finding a philosopher to lead the project, he said. He made it a condition of my employment that I would finish my Ph.D. quickly (whereupon I would get a raise)--which I did, in June 2000. I am still grateful for the extra incentive. I thought he would be a great boss, and indeed he was.
To be clear, the idea of an open source, collaborative encyclopedia, open to contribution by ordinary people, was entirely Jimmy's, not mine, and the funding was entirely by Bomis. I was merely a grateful employee; I thought I was very lucky to have a job like that land in my lap. Of course, other people had had the idea; but it was Jimmy's fantastic foresight actually to invest in it. For this the world owes him a considerable debt. The actual development of this encyclopedia was the task he gave me to work on.
So I arrived in San Diego in early February, 2000, to get to work. One of the first things I asked Jimmy is how free a rein I had in designing the project. What were my constraints, and in what areas was I free to exercise my own creativity? He replied, as I clearly recall, that most of the decisions should be mine; and in most respects, as a manager, Jimmy was indeed very hands-off. Nevertheless, I always did consult with him about important decisions, and moreover, I wanted his advice. Now, Jimmy was quite clear that he wanted the project to be in principle open to everyone to develop, just as open source software is (to an extent). Beyond this, however, I believe I was given a pretty free rein. So I spent the first month or so thinking very broadly about different possibilities. I wrote quite a bit (that writing is now all lost--that will teach me not to back up my hard drives) and discussed quite a bit with both Jimmy and one of the other Bomis partners, Tim Shell.
I maintained from the start that something really could not be a credible encyclopedia without oversight by experts. I reasoned that, if the project is open to all, it would require both management by experts and an unusually rigorous process. I now think I was right about the former requirement, but wrong about the latter, which was redundant; I think that the subsequent development of Wikipedia has borne out this assessment. But I fully realize that all of this is a matter of debate. Some will claim that the experience of Wikipedia refuted my original judgment that expert oversight is necessary for a very credible encyclopedia; but I disagree with them. More on this below.
Also, I am fairly sure that one of the first policies that Jimmy and I agreed upon was a "nonbias" or neutrality policy. I know I was extremely insistent upon it from the beginning, because neutrality has been a hobby-horse of mine for a very long time, and one of my guiding principles in writing "Sanger's Review." Neutrality, we agreed, required that articles should not represent any one point of view on controversial subjects, but instead fairly represent all sides. We also agreed in rejecting an alternative that (for a time) Tim and some early Nupedians plugged for: the development, for each encyclopedia topic, of a series of different articles, each written from a different point of view.
I believed, moreover, that a strongly collaborative and open project could not survive if its contributors were not "personally invested" in the project, and that this required some input and management by its users. So I think it was very early on that I decided that Nupedia should have an Advisory Board--editors, and peer reviewers, who would together agree to project policy--and that the public should have a say in the formulation of policy.
An early incarnation of Nupedia's Advisory Board was in place by summer of 2000 or so. It was made up of the project's highly-qualified editors and reviewers, mostly Ph.D. professors but also a good many other highly-experienced professionals. Eventually the Advisory Board agreed to an extremely rigorous seven-step system. A lot of the details of the Nupedia policy and processes were, I think, proposed by me, but then tweaked and elaborated by others, and the policy was not published as project policy until we had a quorum of editors and peer reviewers who could fully discuss and approve of a policy statement. But I do not think that we discussed the proposal well enough, and further initial discussion could have made a difference, because, as it turned out, a clear mistake of mine and others was to assume that such a complicated system would be navigated patiently by many volunteers, even if they had clear-enough instructions. That is a mistake I doubt anyone designing volunteer content creation systems will make again; I certainly would not make it again.
I spent a huge amount of time recruiting people for Nupedia, e-mailing new arrivals, posting to mailing lists, giving interviews, etc. I had had some experience publicizing Internet projects when I worked on several philosophy discussion groups as a grad student in the 1990s (I had perpetrated an "Association for Systematic Philosophy" as well as a "Tutorial Manifesto"), and I knew that getting many willing and active participants was difficult but important. I even had an administrative assistant for six months in 2000 and 2001, Liz Campeau, whose sole job was to recruit people to work on Nupedia and then Wikipedia. I think a large part of the reason Wikipedia got off the ground so quickly and so well is that it was started by Nupedians, who were then a very large base of people who wanted to work on an encyclopedia, and who had many definite ideas about how it should be done. Maybe 2,000 Nupedia members were subscribed to the general announcement list in January of 2001, when Wikipedia launched--I forget how many but an old project news page indicates that 2,000 is about right.
We operated the system initially using e-mail and mailing lists, while planning and finalizing process details. That lasted from spring through fall 2000. I think our first article ("atonality" by Christoph Hust), that made it entirely through the system, was published in June or July of 2000. To move the system to a completely web-based one, there was, of course, a great deal of design and programming to do. So in fall of 2000 I worked a lot with a specifically-hired programmer (Toan Vo) and the Bomis sysadmin (Jason Richey) to transfer the system from a clunky mailing list system to the web. But by the time the web-based system was ready--I think December of 2000, just a month before Wikipedia got started--it had become obvious to Jimmy and me that the seven-step editorial process would move too slowly, even when managed on the web. But Magnus Manske later, in 2001, made some very nice additions to the Nupedia system.
Some institutional traditions begin easily but die hard. So, in 2001, it was only after many months and uncomfortable comparison of Nupedia with the thriving, younger Wikipedia, that Nupedia's Advisory Board was willing to consider a simpler system seriously. That was because Nupedia editors and peer reviewers had a very strong commitment to rigor and reliability, as did I. Moreover, as Wikipedia became increasingly successful in 2001, Jimmy asked me to spend more and more time on it, which I did; Nupedia suffered from neglect. But by the summer of 2001, I was able to propose, get accepted (with very lukewarm support), and install something we called the Nupedia Chalkboard, a wiki which was to be closely managed by Nupedia's staff. It was to be both a simpler way to develop encyclopedia articles for Nupedia, and a way to import articles from Wikipedia. No doubt due to lingering disdain for the wiki idea--which at the time was still very much unproven--the Chalkboard went largely unused. The general public simply used Wikipedia if they wanted to write articles in a wiki format, while perhaps most Nupedia editors and peer reviewers were not persuaded that the Chalkboard was necessary or useful.
By early winter, 2001, Nupedia had published approved versions of only about 25 articles, although there were many more (I vaguely recall over 150 drafts) at various stages in process. I was finally able to persuade the Advisory Board to move the system to a much simpler two-step process, virtually identical to that used to run many academic journals: articles would be submitted to an editor; the editor would, if the article seemed good enough, forward it to a reviewer for acceptance or rejection; if accepted, the article would be posted. We also were thinking of various ways of allowing public comment on or moderated editing of posted articles. I believe this new, simpler system would have produced thousands of articles for Nupedia very quickly. The general public on Nupedia was certainly interested and motivated, and I think it was finally becoming generally accepted by the Advisory Board that the complexity of the system was the main reason that they were not starting articles and getting them through the system.
But, unfortunately, Nupedia's new system was never adopted when it should have been--the winter of 2001-2--because at the same time, Wikipedia was demanding as much attention as I could give it, and I had little time to implement the new Nupedia system. I am quite sure we could have started the new Nupedia system in early 2002, if we had made the time. But Bomis lost the ability to pay me and, newly unemployed, I did not have the time to lead Nupedia as a volunteer. I did not entirely lose hope on Nupedia, however, as I will explain below.
The origins of Wikipedia
In the fall of 2000, Jimmy and I were very well agreed that Nupedia's slow productivity was probably going to be an ongoing problem and that there needed to be a way, moreover, in which ordinary, uncredentialed people could participate more easily. Uncredentialed people could (and did) participate in Nupedia, particularly as writers and copyeditors, but it was pretty painful for most of them to get articles through the elaborate system. So there seemed to be a huge fund of talent, motivated to work on an encyclopedia but not motivated enough to work on Nupedia, going to waste.
It was my job to solve these problems. I wrote multiple detailed proposals for a simpler, more open editing system--two or three, at least--and I ran them by Jimmy, and I think his reply to all of them was that it would require too much programming and he couldn't afford to pay more high-priced programmers (they were very high-priced at the time, you will recall, and we already had Toan and Jason working quite a bit on Nupedia's new web-based system). Now, of course, I fully realize that we could have found a way to enlist volunteers to develop the system. Jimmy and I both probably knew that at the time; I'm not sure why we didn't pursue it.
So it was while I was thinking hard about how to create a more open system, that would require minimal programming to set up, that I had dinner with an old Internet friend of mine, Ben Kovitz. Ben had moved to town for a new job and we were out at a Pacific Beach Mexican restaurant on January 2, 2001, talking about jobs, techie stuff, and philosophy, no doubt. (Ben, Jimmy, and I were all active on those philosophy mailing lists in the mid-90s and we all knew each other.) So Ben explained the idea of Ward Cunningham's WikiWikiWeb to me. Instantly I was considering whether wiki would work as a more open and simple editorial system for a free, collaborative encyclopedia, and it seemed exactly right. And the more I thought about it, without even having seen a wiki, the more it seemed obviously right. So I'm sure it was that very evening or the following morning that I wrote a proposal--unfortunately, lost now--in which I said that this might solve the problem and that we ought to try it. After he had nixed my several earlier proposals, and given that setting up a wiki would be very simple and require hiring no programmer, Jimmy could scarcely refuse. I vaguely recall that he liked the idea but was initially skeptical--properly so, as I was, despite my excitement.
Wiki advocates often used to point out (and I'm sure some still do) that Wikipedia is nonstandard as a wiki. This is partly because we began just with the very basic wiki concept and not so much of the culture. Wiki culture is very distinctive. I cannot hope to explain even the highlights briefly, so I will not try; I will simply give a few notions. Wiki pages can be started and edited by anyone, but, in "Thread Mode" (as in "the thread of this discussion") the dialogue can become complex. In that case, or when consensus is reached, or when positions have hardened, it is considered a good idea to "refactor" pages (a term borrowed from programming), i.e., to rewrite them, but honestly, taking into account the highlights of the dialogue. Then the dialogue might be represented as in "Document Mode." Opinions are very welcome on a typical wiki. There are many other collective habits that make up typical wiki culture; these are only a few.
But I denied the necessity of organizing Wikipedia according to these precise principles. To be sure, a few other participants wanted Wikipedia to adopt wiki culture wholesale, so that it would be "just another wiki," and they had some small influence over the direction of the project; but speaking for myself, I viewed wiki software as simply a tool, a way to organize people who want to collaborate. I saw no necessity whatsoever of partaking in all aspects of the idiosyncratic culture that happened to be associated with the advent of this very generally-applicable tool, since we were engaged in a very specific sort of project, with very specific requirements. This caused some consternation among some wiki advocates, who appeared to think that Wikipedia should, or inevitably would, become just another wiki, somehow necessarily partaking of typical wiki culture. Ward Cunningham's prediction, when Jimmy asked him whether wiki software "could successfully generate a useful encyclopedia," was: "Yes, but in the end it wouldn't be an encyclopedia. It would be a wiki." As I said in reply: "Wikipedia has a totally different culture from this wiki, because it's pretty singlemindedly aimed at creating an encyclopedia. It's already rather useful as an encyclopedia, and we expect it will only get better."
Typical wiki culture aside, wiki software does encourage, but does not strictly require, extreme openness and de-centralization: openness, since (as the software is typically designed) page changes are logged and publicly viewable, and (again, only typically) pages may be further changed by anyone; de-centralization, because in order for work to be done, there is no need for a person or body to assign work, but rather, work can proceed as and when people want to do it. Wiki software also discourages (or at least does not facilitate) the exercise of authority, since work proceeds at will on any page, and on any large, active wiki it would be too much work for any single overseer or limited group of overseers to keep up. These all became features of Wikipedia.
My initial idea was that the wiki would be set up as part of Nupedia; it was to be a way for the public to develop a stream of content that could be fed into the Nupedia process. I think I got some of the basic pages written--how wikis work, what our general plan was, and so forth--over the next few days. I wrote a general proposal for the Nupedia community, and the Nupedia wiki went live January 10. The first encyclopedia articles for what was to become Wikipedia were written then. It turned out, however, that a clear majority of the Nupedia Advisory Board wanted to have nothing to do with a wiki. Again, their commitment was to rigor and reliability, a concern I shared with them and continue to have. Still, perhaps some of those people are kicking themselves now. They (some of them) evidently thought that a wiki could not resemble an encyclopedia at all, that it would be too informal and unstructured, as the original WikiWikiWeb was (and is), to be associated with Nupedia. They of course were perfectly reasonable to doubt that it would turn into the fantastic source of content that it did. Who could reasonably guess that it would work? But it did work, and now the world knows better.
Wikipedia's first few months
So we decided to relaunch the wiki under its own domain name. I came up with the name "Wikipedia," a silly name for what was at first a very silly project, and the newly independent project was launched at Wikipedia.com on January 15, 2001. It was a ".com" at first because, at the time, we were contemplating selling ads to pay for me, programmers, and servers. It was easy to deprecate ".com" in favor of ".org" in 2002, after Jimmy was able to assure users that Wikipedia would never (at least I think he said, or clearly implied, "never") run ads to support the project.
I took it to be one of my main jobs to promote Wikipedia, and this resulted in a steady influx of new participants. I wrote on the Wikipedia announcement page January 24, "Wikipedia has definitely taken [on] a life of its own; new people are arriving every day and the project seems to be getting only more popular. Long live Wikipedia!" By the end of January we reportedly and approximately had 600 articles; there were 1300 in March, 2300 in April, and 3900 in May. Not only was the project growing steadily, the rate of growth was increasing.
Wikipedia started with a handful of people, many from Nupedia. The influence of Nupedians was, I think, pretty important early on; I think, especially, of the tireless Magnus Manske (who worked on the software for both projects), our resident stickler Ruth Ifcher, and the very smart poker-playing programmer Lee Daniel Crocker--to name a few. All of these people, and several other Nupedia borrowings, had a good understanding of the requirements of good encyclopedia articles, and they were good writers and very smart. The direction that Wikipedia ought to go in was pretty obvious to myself and them, in terms of what sort of content we wanted. But what we did not have worked out in advance was how the community should be organized, and (not surprisingly) that turned out to be the thorniest problem. But the facts that the project started with these good people, and that we were able to adopt, explain, and promote good habits and policies to newer people, partly account for why the project was able to develop a robust, functional community and eventually to succeed. As to project leadership or management, we began with me, Jimmy, and Tim Shell; but Tim stopped participating so much after the first few months.
But the many rank-and-file users did the heavy lifting, and if there had not been a reasonable consensus among them about what the project should look like, it just wouldn't have happened. In any collaborative project, it is the contributors who are responsible for the outcome. Those early adopters should feel proud of themselves, because they were absolutely instrumental in shaping a thing of beauty and usefulness.
I recall saying casually, but repeatedly, in the project's first nine months or so, that experts and specialists should be given some particular respect when writing in their areas of expertise. They should be deferred to, I thought, unless there were some clear evidence of bias. (I recall an interesting discussion with a Polish scientist, Piotr Wozniak, about this issue when we came to a small disagreement about the "sleep" article.) So, in those first months, deference to expertise was a policy that at least I usually insisted upon, but not strongly or clearly enough. It was nearly a year after the project began that I finally articulated this view reasonably clearly as a policy to consider. Perhaps this was because, indeed, most users did make a practice of deferring to experts up to that time. "This is just common sense," as I wrote, "but sometimes common sense needs to be spelled out!" What I now think is that that point of common sense needed to be spelled out quite a bit sooner and more forcefully, because in the long run, it was not adopted as official project policy, as it could have been.
Some questions have been raised about the origin of Wikipedia policies. The tale is interesting and instructive, and one of the main themes of this memoir. We began with no (or few) policies in particular and said that the community would determine--through a sort of vague consensus, based on its experience working together--what the policies would be. The very first entry on a "rules to consider" page was the "Ignore All Rules" rule (to wit: "If rules make you nervous and depressed, and not desirous of participating in the wiki, then ignore them entirely and go about your business"). This is a "rule" that, current Wikipedians might be surprised to learn, I personally proposed. The reason was that I thought we needed experience with how wikis should work, and even more importantly at that point we needed participants more than we needed rules. As the project grew and the requirements of its success became increasingly obvious, I became ambivalent about this particular "rule" and then rejected it altogether. As one participant later commented, "this rule is the essence of Wikipedia." That was certainly never my view; I always thought of the rule as being a temporary and humorous injunction to participants to add content rather than be distracted by (then) relatively inconsequential issues about how exactly articles should be formatted, etc. In a similar spirit, I proposed that contributors be bold in updating pages (the current version is much expanded, as it should be).
I also, for similar reasons, specifically disavowed any title; I was organizing the project but I did not want to present myself as editor-in-chief. I wanted people to feel comfortable adding information without having to consult anything like an editor. Participation was more important, I felt. (Others referred to me, later, as Wikipedia's editor.)
As we set it up, Wikipedia did have some minimal wiki cultural features: it was wide open, extremely decentralized, and (provisionally anyway) featured very little attempt to exercise authority. Insofar as I was able to organize it at all, I guided the project through force of personality and what "moral authority" I had as co-founder of the project. Jimmy and I agreed early on that, at least in the beginning, we should not eject anyone from the project except perhaps in the most extreme cases. Our first forcible expulsion (which Jimmy performed) did not occur for many months, despite the presence of difficult characters from nearly the beginning of the project. Again, we were learning: we wished to tolerate all sorts of contributors in order to be well-situated to adopt the wisest policies. But--and in hindsight this should have seemed perfectly predictable--this provisional "hands off" management policy had the effect of creating a difficult-to-change tradition, the tradition of making the project extremely tolerant of disruptive (uncooperative, "trolling") behavior. And as it turned out, particularly with the large waves of new contributors from the summer and fall of 2001, the project became very resistant to any changes in this policy. I suspect that the cultures of online communities generally are established pretty quickly and then very resistant to change, because they are self-selecting; that was certainly the case with Wikipedia, anyway.
So I could only attempt to shame any troublemakers into compliance; without recourse to any genuine punitive action, that was the most I could do. In about the first eight months of the project, this was usually sufficient for me to do my job. After that, however, my job got increasingly difficult, as I will explain.
So Wikipedia began as a good-natured anarchy, a sort of Rousseauian state of digital nature. I always took Wikipedia's anarchy to be provisional and purely for purposes of determining what the best rules, and the nature of its authority, should be. What I, and other Wikipedians, failed to realize is that our initial anarchy would be taken by the next wave of contributors as the very essence of the project--how Wikipedia was "meant" to be--even though Wikipedia could have become anything we the contributors chose to make it.
This point bears some emphasis: Wikipedia became what it is today because, having been seeded with great people with a fairly clear idea of what they wanted to achieve, we proceeded to make a series of free decisions that determined the policy of the project and culture of its supporting community. Wikipedia's system is neither the only way to run a wiki, nor the only way to run an open content encyclopedia. Its particular conjunction of policies is in no way natural, "organic," or necessary. It is instead artificial, a result of a series of free choices, and we could have chosen differently in many cases; and choosing differently on some issues might have led to a project better than the one that exists today.
Though it began as an anarchy, there were quite a few policies that were settled upon, more or less, within the first six months or so. This required some struggle, especially on my part, particularly because, since the project was a wiki, some participants thought that there should be no rules at all. (Enforceable rules were regarded as "anti-wiki," which was supposed to be a bad thing.) But it was made clear from the beginning that we intended Wikipedia to be an encyclopedia, and so we were able to plug for at least those rules that would help define and sustain the project as an encyclopedia.
For instance, throughout the early months, people added various content that seemed less than encyclopedic in various ways. Many people seemed to confuse encyclopedia articles with dictionary entries, and eventually I wrote a page called "Wikipedia is not a dictionary." (I am surprised to discover that this page still exists as of this writing, with a good deal of its original content.) As people found new ways not to write encyclopedia articles, I started "What Wikipedia is not": I and others would note on an article's discussion page that some certain content did not belong in an encyclopedia, and then underscore the point by adding an entry to the "What Wikipedia is not" page. To take another example, Wikipedia was not to be a place for publishing original research. In fact, this is a policy that had been settled upon and even enforced in Nupedia days; enforcing it actually led to the departure of Nupedia's erstwhile Classics editor sometime in 2001.
Many of our first controversies were over these restrictions. At the time, I had enough influence within the community to get these policies generally accepted. And if we had not decided on these restrictions, Wikipedia might well have ended up, like many wikis, as nothing in particular. But since we insisted that it was an encyclopedia, even though it was just a blank wiki and a group of people to begin with, it became an encyclopedia. There is something very profound about that. I also like to think that we helped to show the world the potential that wikis have.
Another policy that was instituted early on was the nonbias or neutrality policy. This was borrowed from the Nupedia project and made a Rule to Consider--in a very early version, the policy was put this way:
Avoid bias: Since this is an encyclopedia, after a fashion, it would be best if you represented your controversial views either (1) not at all, (2) on *Debate, *Talk, or *Discussion pages linked from the bottom of the page that you're tempted to grace, or (3) represented in a fact-stating fashion, i.e., which attributes a particular opinion to a particular person or group, rather than asserting the opinion as fact. (3) is strongly preferred.
Jimmy then started a specialized policy page he called "Neutral Point of View" (here is the current version). I confess I don't much like this name as a name for the policy, because it implies that to write neutrally, or without bias, is actually to express a point of view, and, as the definite article is used, a single point of view at that. "Neutrality", "neutral", and "neutrally" are better to use for the noun, adjective, and adverb. But the acronym "NPOV" came to be used for all three, by Wikipedians wanting to seem hip, and then the unfortunate "POV" came to be used when the perfectly good English word "biased" would do.
In addition to these, I recall suggesting a number of other rules--no doubt most matters of historical fact, along these lines, can be verified in archives. I believe I am responsible for the original formulations of a lot of the article naming conventions, as well as the conventions of bolding the title of the article, starting articles with full sentences, making article titles uncapitalized, and much else. I think these policies were just a matter of common sense for anyone who understood what a good encyclopedia should be like. And of course I was not the only person proposing conventions. Moreover, actual project policy, or community habits, succeeded in being established only by being followed and supported by a majority of participants. It was then, we said, that there was a "rough consensus" in favor of the policy. And consensus, we said, is required for a policy actually to be considered project policy. For our purposes, a "consensus" appeared to consist of (1) widespread common practice, (2) many vocal defenders, and (3) virtually no detractors.
But that way of settling upon policy proposals--viz., by alleged consensus--did not scale, in my opinion. After about nine months or so, there were so many contributors, and especially brand new contributors, that nothing like a consensus could be reached, for the simple reason that condition (3) above was never achievable: there would after that always be somebody who insisted on expressing disagreement. There was, then, a non-scaling policy adoption procedure, and a crying need to continue to adopt sensible policies. This led to some pretty serious problems in the community, which I will relate below. But first, something more positive.
It's a cliff-hanger; you'll have to wait until tomorrow to read about what made Wikipedia start to work. -
The Early History of Nupedia and Wikipedia: A Memoir
Larry Sanger was one of the moving forces behind the pioneering Nupedia project. That makes him one of the people to thank for Wikipedia, which has been enjoying more and more visibility of late. Sanger has prepared a lengthy, informative account of the early history of Nupedia and Wikipedia, including some cogent observations on project management, online legitimacy, dealing with trolls, and other hazards of running a large, collaborative project over the Internet. As Sanger writes, "A virtually identical version of this memoir is due to appear this summer in Open Sources 2.0, published by O'Reilly and edited by Chris DiBona, Danese Cooper, and Mark Stone. The volume is to be a successor to Open Sources: Voices from the Open Source Revolution (1999)." Read on below for the story (continued tomorrow). Update: 04/20 19:19 GMT by T : Here's a link to the continuation of Sanger's memoir.
Contents:
Preface
Some recent press reports
Nupedia
The origins of Wikipedia
Wikipedia's first few months
Preface
An impassioned debate has been raging, particularly since about the summer of 2004, about the merits of Wikipedia and the future of free online encyclopedias. This discussion has not benefitted by much detailed, accurate consideration of the origins of Wikipedia and of its parent project, Nupedia. But it seems to me that those origins are very important -- crucial, even -- to forming a proper judgment of the current state and best future direction of free encyclopedias.
Wikipedia as it stands is a fantastic project; it has produced enormous amounts of content, thousands of excellent articles, and now, after just four years, is getting high-profile, international recognition as a new way of obtaining at least a rough and ready idea about very many topics. Its surprising success may be attributed, briefly, to its free, open, and collaborative nature.
This has been my attitude toward Wikipedia practically since its founding. But a few months ago I wrote an article critical of certain aspects of the Wikipedia project, 'Why Wikipedia Must Jettison Its Anti-Elitism', which occasioned much debate. I have also been quoted, as co-founder of Wikipedia, in many recent news articles about the project, making various other critical remarks. I am afraid I am getting an undeserved reputation as someone who is opposed to everything Wikipedia stands for. This is completely incorrect. In fact, I am one of Wikipedia's strongest supporters. I am partly responsible for bringing it into the world (as I will explain), and I still love it and want only the best for it. But if a better job can be done, a better job should be done. Wikipedia has shown fantastic potential, and it is open content--and so if the project has problems (or features) which will keep it from being the maximally authoritative, broad, and deep reference that I believe could exist, I firmly believe that the world has the right to, and should, improve upon it.
Wikipedia's predecessor, which I was also employed to organize, was Nupedia. Nupedia was to be a highly reliable, peer-reviewed resource that fully appreciated and employed the efforts of subject area experts, as well as the general public. When the more free-wheeling Wikipedia took off, Nupedia was left to wither. It might appear to have died of its own weight and complexity. But, as I will explain, it could have been redesigned and adapted--it could have, as it were, "learned from its mistakes" and from Wikipedia's successes. Thousands of people who had signed up and who wanted to contribute to the Nupedia system were left disappointed. I believe this was unfortunate and unnecessary; I always wanted Nupedia and Wikipedia working together to be not only the world's largest but also the world's most reliable encyclopedia. I hope that this memoir will help to justify this stance. Hopefully, too, I will manage to persuade some people that collaboration between an expert project and a public project is the correct approach to the overall project of creating open content encyclopedias.
I am not writing to request that Nupedia be resuscitated now, as nice as that would be. But I would like to tell the story of Nupedia and the first couple years of Wikipedia, as I remember it. A more complete history of the projects, as opposed to a memoir, must await a careful study of the Nupedia and Wikipedia archives--if early archives of them still exist (I have no idea if they do)--or else these entries from the "Wayback Machine." Interviews with many of those heavily involved in the projects would also help a great deal, so long as interviews were done of people on different sides of the disputes that helped to shape the project.
By the way, the "overall project of creating open content encyclopedias" is something of which I have been writing since at least 2001. For example, in July of 2001, while still working on both Wikipedia and Nupedia, I wrote, "if some other open source project proves to be more competitive, then it should and will take the lead in creating a body of free encyclopedic knowledge." Since Wikipedia is open content and hence may be reproduced and improved upon by anyone, I have always been cognizant that it might not end up being the only or best version. My personal devotion has always been to the ideal project as I have envisioned it, not necessarily to particular incarnations of Nupedia or Wikipedia; and I think this attitude is fully consistent with the (very positive) spirit of open source collaboration generally.
This being said, let me also emphasize strongly that, throughout this discussion, I am not suggesting that Wikipedia needs to be replaced with something better. I do, however, think that it needs to be supplemented by a broader, more ambitious, and more inclusive vision of the overall project.
Some recent press reports
The following memoir seems all the more important to publish now because the early history of Nupedia and Wikipedia has been mischaracterized in the press recently. If there were only a few inaccuracies, which made no difference, I would be happy to leave well enough alone. But some of the mischaracterizations I've seen do make a difference, because they give the public the impression that Nupedia failed because it was run by snobbish experts whose standards were too high. As the following should make clear, that is not quite correct. One might also gather from some reports that the idea for Wikipedia sprang fully grown from Jimmy Wales' head. Jimmy, of course, deserves enormous credit for investing in and guiding Wikipedia. But a more refined idea of how Wikipedia originated and evolved is crucial to have, if one wants to appreciate fully why it works now, and why it has the policies that it does have.
For example, in the Nov. 1, 2004 issue of Newsweek, in "It's Like a Blog, But It's a Wiki," reporter Brad Stone writes:
[Jimmy] Wales first tried to rewrite the rules of the reference-book business five years ago with a free online encyclopedia called Nupedia. Anyone could submit articles, but they were vetted in a seven-step review process. After investing thousands of his own dollars and publishing only 24 articles, Wales reconsidered. He scrapped the review process and began using a popular kind of online Web site called a "wiki," which allows its readers to change the content.
This capsule history is, of course, very brief and so should be expected not to have every relevant detail. But some of the claims made here are not just vague, they are actually misleading, and so several clarifications are in order (all of this is elaborated below):
- The article makes it sound as if Jimmy were the only person making the relevant decisions. That is incorrect; the Nupedia system (indeed, seven steps) was established via negotiation with Nupedia's volunteer Advisory Board, mostly Ph.D. volunteers, who served as editors and peer reviewers. I articulated our decisions in Nupedia's "Editorial Policy Guidelines." Jimmy started and broadly authorized it all, but as to the details, he really had little to do with them.
- Nupedia's Advisory Board might be surprised to learn that Jimmy (alone!) "scrapped the review process." Jimmy was certainly disappointed with the process (as were many people), and he did not actively support it after 2001 or so. But in fairness to the people actually working on Nupedia, the fact is that work on Nupedia gradually petered out in 2001-2. I in particular was stretched thin--in 2001, I was both chief organizer of Wikipedia and editor-in-chief of Nupedia--and my own slowing work on Nupedia was obvious to all active Nupedia contributors. It might be better to say that Nupedia withered due to neglect--which was largely due to a lack of sufficient funds for paid organizers--which was as much due to the bursting of the Internet bubble as anything else.
- Also, to the best of my knowledge, the "thousands of his own dollars" invested in these projects were, if I am not very mistaken, the dollars of Bomis.com, which is jointly owned by three partners, Jimmy, Tim Shell, and Michael Davis. (The money for Wikipedia now comes from donations.) But again, Jimmy was the prime motivating force within Bomis.
- Moreover, Nupedia had fewer than 24 articles when Wikipedia launched, being not quite a year old at that time. The idea of adapting wiki technology to the task of building an encyclopedia was mine, and my main job in 2001 was managing and developing the community and the rules according to which Wikipedia was run. Jimmy's role, at first, was one of broad vision and oversight; this was the management style he preferred, at least as long as I was involved. But, again, credit goes to Jimmy alone for getting Bomis to invest in the project, and for providing broad oversight of the fantastic and world-changing project of an open content, collaboratively-built encyclopedia. Credit also of course goes to him for overseeing its development after I left, and guiding it to the success that it is today.
A March 2005 Wired Magazine article by Daniel Pink also got a number of things wrong, despite being, in other respects, an excellent article:
With Sanger as editor in chief, Nupedia essentially replicated the One Best way model. He assembled a roster of academics to write articles. (Participants even had to fax in their degrees as proof of their expertise.) And he established a seven-stage process of editing, fact-checking, and peer review. "After 18 months and more than $250,000," Wales said, "we had 12 articles."
Then an employee told Wales about Wiki software. On January 15, 2001, they launched a Wiki-fied version and within a month, they had 200 articles. In a year, they had 18,000. ... Sanger left the project in 2002. "In the Nupedia mode, there was room for an editor in chief," Wales says. "The Wiki model is too distributed for that."
This too needs clarifications:
- The "roster of academics" (the aforementioned Nupedia Advisory Board) was not limited to academics; they were experts in their fields, in any case. Moreover, they were editors and peer reviewers; the general public was able to propose and write articles on subjects about which they had some knowledge. (Consult the old assignment policy if you are interested.)
- It is incorrect to say that participants had to fax their degrees as proof of their expertise; we did verify bona fides by matching the names and e-mail addresses of editors and reviewers with a web page--often, but not always, an academic web page. Indeed there was one (but only one) case that I recall in which I asked someone, who had no web page or any other easy way to prove who he was, to fax a degree. Verifying bona fides seemed like a good idea especially when initially building what was to be an academically-respectable project.
- Again, I did not establish the editorial process alone; I had considerable assistance (for which I am still grateful) from Nupedia's excellent Advisory Board.
- And as I wrote on July 25, 2001 for Kuro5hin, "Britannica or Nupedia? The Future of Free Encyclopedias," Nupedia had "just over 20" articles--not 12--after 18 months. We always suspected that we would wind up scrapping our first attempts to design an editorial system, and that we would learn a great deal from those first attempts; and that's essentially what happened. But Nupedia could have evolved, and would have, had we continued working on it.
- The second paragraph begins, "Then an employee told Wales about Wiki software." I don't know how Jimmy first learned about wikis, but as I will explain below, I proposed to him and to the Nupedia community at large that we start a wiki-based encyclopedia.
- The context of the line "Sanger left the project in 2002"--particularly with Jimmy quoted as saying, "In the Nupedia mode there was room for an editor in chief"--makes it sound as if I were let go specifically because I was working only on Nupedia and that I was no longer needed for that. In fact, I was working on Wikipedia far more at the time than Nupedia, and the reason for my departure from both projects was that Bomis was, like virtually all dot-coms, losing money. They could not afford to pay me; I was told that I was the last of several newer Bomis employees to be laid off on account of the tech recession. But Wikipedia indeed was able to continue on without me, and I agreed even at the time that Wikipedia could survive without me, and that it had become essentially "unmanageable" (as I put it--the following memoir should make it clear what I meant by that).
Nupedia
I'm going to begin this memoir with several paragraphs about Nupedia, because the origin of Wikipedia cannot be explained except in that context. Moreover, the Nupedia project itself was very worthwhile, and I think it might have been able to survive, as I will explain. Finally, some errors regarding Nupedia have been passed around (a few examples are above), which are little better than unfounded rumors. It is unfortunate that the thousands of hours of excellent volunteer work done on Nupedia should be thus disrespected or grossly misunderstood. I personally will always be grateful to those initial contributors who believed in the project and our management, worked hard for a completely unproven idea, and laid the groundwork for the growing institution of open content projects.
In 1999, Jimmy Wales wanted to start a free, collaborative encyclopedia. I knew him from several mailing lists back in the mid-90s, and in fact we had already met in person a couple of times. In January 2000, I e-mailed Jimmy and several other Internet acquaintances to get feedback on an idea for what was to be, essentially, a blog. (It was to be a successor to "Sanger and Shannon's Review of Y2K News Reports," a Y2K news summary that I first wrote and then edited.) To my great surprise, Jimmy replied to my e-mail describing his idea of a free encyclopedia, and asking if I might be interested in leading the project. He was specifically interested in finding a philosopher to lead the project, he said. He made it a condition of my employment that I would finish my Ph.D. quickly (whereupon I would get a raise)--which I did, in June 2000. I am still grateful for the extra incentive. I thought he would be a great boss, and indeed he was.
To be clear, the idea of an open source, collaborative encyclopedia, open to contribution by ordinary people, was entirely Jimmy's, not mine, and the funding was entirely by Bomis. I was merely a grateful employee; I thought I was very lucky to have a job like that land in my lap. Of course, other people had had the idea; but it was Jimmy's fantastic foresight actually to invest in it. For this the world owes him a considerable debt. The actual development of this encyclopedia was the task he gave me to work on.
So I arrived in San Diego in early February, 2000, to get to work. One of the first things I asked Jimmy is how free a rein I had in designing the project. What were my constraints, and in what areas was I free to exercise my own creativity? He replied, as I clearly recall, that most of the decisions should be mine; and in most respects, as a manager, Jimmy was indeed very hands-off. Nevertheless, I always did consult with him about important decisions, and moreover, I wanted his advice. Now, Jimmy was quite clear that he wanted the project to be in principle open to everyone to develop, just as open source software is (to an extent). Beyond this, however, I believe I was given a pretty free rein. So I spent the first month or so thinking very broadly about different possibilities. I wrote quite a bit (that writing is now all lost--that will teach me not to back up my hard drives) and discussed quite a bit with both Jimmy and one of the other Bomis partners, Tim Shell.
I maintained from the start that something really could not be a credible encyclopedia without oversight by experts. I reasoned that, if the project is open to all, it would require both management by experts and an unusually rigorous process. I now think I was right about the former requirement, but wrong about the latter, which was redundant; I think that the subsequent development of Wikipedia has borne out this assessment. But I fully realize that all of this is a matter of debate. Some will claim that the experience of Wikipedia refuted my original judgment that expert oversight is necessary for a very credible encyclopedia; but I disagree with them. More on this below.
Also, I am fairly sure that one of the first policies that Jimmy and I agreed upon was a "nonbias" or neutrality policy. I know I was extremely insistent upon it from the beginning, because neutrality has been a hobby-horse of mine for a very long time, and one of my guiding principles in writing "Sanger's Review." Neutrality, we agreed, required that articles should not represent any one point of view on controversial subjects, but instead fairly represent all sides. We also agreed in rejecting an alternative that (for a time) Tim and some early Nupedians plugged for: the development, for each encyclopedia topic, of a series of different articles, each written from a different point of view.
I believed, moreover, that a strongly collaborative and open project could not survive if its contributors were not "personally invested" in the project, and that this required some input and management by its users. So I think it was very early on that I decided that Nupedia should have an Advisory Board--editors, and peer reviewers, who would together agree to project policy--and that the public should have a say in the formulation of policy.
An early incarnation of Nupedia's Advisory Board was in place by summer of 2000 or so. It was made up of the project's highly-qualified editors and reviewers, mostly Ph.D. professors but also a good many other highly-experienced professionals. Eventually the Advisory Board agreed to an extremely rigorous seven-step system. A lot of the details of the Nupedia policy and processes were, I think, proposed by me, but then tweaked and elaborated by others, and the policy was not published as project policy until we had a quorum of editors and peer reviewers who could fully discuss and approve of a policy statement. But I do not think that we discussed the proposal well enough, and further initial discussion could have made a difference, because, as it turned out, a clear mistake of mine and others was to assume that such a complicated system would be navigated patiently by many volunteers, even if they had clear-enough instructions. That is a mistake I doubt anyone designing volunteer content creation systems will make again; I certainly would not make it again.
I spent a huge amount of time recruiting people for Nupedia, e-mailing new arrivals, posting to mailing lists, giving interviews, etc. I had had some experience publicizing Internet projects when I worked on several philosophy discussion groups as a grad student in the 1990s (I had perpetrated an "Association for Systematic Philosophy" as well as a "Tutorial Manifesto"), and I knew that getting many willing and active participants was difficult but important. I even had an administrative assistant for six months in 2000 and 2001, Liz Campeau, whose sole job was to recruit people to work on Nupedia and then Wikipedia. I think a large part of the reason Wikipedia got off the ground so quickly and so well is that it was started by Nupedians, who were then a very large base of people who wanted to work on an encyclopedia, and who had many definite ideas about how it should be done. Maybe 2,000 Nupedia members were subscribed to the general announcement list in January of 2001, when Wikipedia launched--I forget how many but an old project news page indicates that 2,000 is about right.
We operated the system initially using e-mail and mailing lists, while planning and finalizing process details. That lasted from spring through fall 2000. I think our first article ("atonality" by Christoph Hust), that made it entirely through the system, was published in June or July of 2000. To move the system to a completely web-based one, there was, of course, a great deal of design and programming to do. So in fall of 2000 I worked a lot with a specifically-hired programmer (Toan Vo) and the Bomis sysadmin (Jason Richey) to transfer the system from a clunky mailing list system to the web. But by the time the web-based system was ready--I think December of 2000, just a month before Wikipedia got started--it had become obvious to Jimmy and me that the seven-step editorial process would move too slowly, even when managed on the web. But Magnus Manske later, in 2001, made some very nice additions to the Nupedia system.
Some institutional traditions begin easily but die hard. So, in 2001, it was only after many months and uncomfortable comparison of Nupedia with the thriving, younger Wikipedia, that Nupedia's Advisory Board was willing to consider a simpler system seriously. That was because Nupedia editors and peer reviewers had a very strong commitment to rigor and reliability, as did I. Moreover, as Wikipedia became increasingly successful in 2001, Jimmy asked me to spend more and more time on it, which I did; Nupedia suffered from neglect. But by the summer of 2001, I was able to propose, get accepted (with very lukewarm support), and install something we called the Nupedia Chalkboard, a wiki which was to be closely managed by Nupedia's staff. It was to be both a simpler way to develop encyclopedia articles for Nupedia, and a way to import articles from Wikipedia. No doubt due to lingering disdain for the wiki idea--which at the time was still very much unproven--the Chalkboard went largely unused. The general public simply used Wikipedia if they wanted to write articles in a wiki format, while perhaps most Nupedia editors and peer reviewers were not persuaded that the Chalkboard was necessary or useful.
By early winter, 2001, Nupedia had published approved versions of only about 25 articles, although there were many more (I vaguely recall over 150 drafts) at various stages in process. I was finally able to persuade the Advisory Board to move the system to a much simpler two-step process, virtually identical to that used to run many academic journals: articles would be submitted to an editor; the editor would, if the article seemed good enough, forward it to a reviewer for acceptance or rejection; if accepted, the article would be posted. We also were thinking of various ways of allowing public comment on or moderated editing of posted articles. I believe this new, simpler system would have produced thousands of articles for Nupedia very quickly. The general public on Nupedia was certainly interested and motivated, and I think it was finally becoming generally accepted by the Advisory Board that the complexity of the system was the main reason that they were not starting articles and getting them through the system.
But, unfortunately, Nupedia's new system was never adopted when it should have been--the winter of 2001-2--because at the same time, Wikipedia was demanding as much attention as I could give it, and I had little time to implement the new Nupedia system. I am quite sure we could have started the new Nupedia system in early 2002, if we had made the time. But Bomis lost the ability to pay me and, newly unemployed, I did not have the time to lead Nupedia as a volunteer. I did not entirely lose hope on Nupedia, however, as I will explain below.
The origins of Wikipedia
In the fall of 2000, Jimmy and I were very well agreed that Nupedia's slow productivity was probably going to be an ongoing problem and that there needed to be a way, moreover, in which ordinary, uncredentialed people could participate more easily. Uncredentialed people could (and did) participate in Nupedia, particularly as writers and copyeditors, but it was pretty painful for most of them to get articles through the elaborate system. So there seemed to be a huge fund of talent, motivated to work on an encyclopedia but not motivated enough to work on Nupedia, going to waste.
It was my job to solve these problems. I wrote multiple detailed proposals for a simpler, more open editing system--two or three, at least--and I ran them by Jimmy, and I think his reply to all of them was that it would require too much programming and he couldn't afford to pay more high-priced programmers (they were very high-priced at the time, you will recall, and we already had Toan and Jason working quite a bit on Nupedia's new web-based system). Now, of course, I fully realize that we could have found a way to enlist volunteers to develop the system. Jimmy and I both probably knew that at the time; I'm not sure why we didn't pursue it.
So it was while I was thinking hard about how to create a more open system, that would require minimal programming to set up, that I had dinner with an old Internet friend of mine, Ben Kovitz. Ben had moved to town for a new job and we were out at a Pacific Beach Mexican restaurant on January 2, 2001, talking about jobs, techie stuff, and philosophy, no doubt. (Ben, Jimmy, and I were all active on those philosophy mailing lists in the mid-90s and we all knew each other.) So Ben explained the idea of Ward Cunningham's WikiWikiWeb to me. Instantly I was considering whether wiki would work as a more open and simple editorial system for a free, collaborative encyclopedia, and it seemed exactly right. And the more I thought about it, without even having seen a wiki, the more it seemed obviously right. So I'm sure it was that very evening or the following morning that I wrote a proposal--unfortunately, lost now--in which I said that this might solve the problem and that we ought to try it. After he had nixed my several earlier proposals, and given that setting up a wiki would be very simple and require hiring no programmer, Jimmy could scarcely refuse. I vaguely recall that he liked the idea but was initially skeptical--properly so, as I was, despite my excitement.
Wiki advocates often used to point out (and I'm sure some still do) that Wikipedia is nonstandard as a wiki. This is partly because we began just with the very basic wiki concept and not so much of the culture. Wiki culture is very distinctive. I cannot hope to explain even the highlights briefly, so I will not try; I will simply give a few notions. Wiki pages can be started and edited by anyone, but, in "Thread Mode" (as in "the thread of this discussion") the dialogue can become complex. In that case, or when consensus is reached, or when positions have hardened, it is considered a good idea to "refactor" pages (a term borrowed from programming), i.e., to rewrite them, but honestly, taking into account the highlights of the dialogue. Then the dialogue might be represented as in "Document Mode." Opinions are very welcome on a typical wiki. There are many other collective habits that make up typical wiki culture; these are only a few.
But I denied the necessity of organizing Wikipedia according to these precise principles. To be sure, a few other participants wanted Wikipedia to adopt wiki culture wholesale, so that it would be "just another wiki," and they had some small influence over the direction of the project; but speaking for myself, I viewed wiki software as simply a tool, a way to organize people who want to collaborate. I saw no necessity whatsoever of partaking in all aspects of the idiosyncratic culture that happened to be associated with the advent of this very generally-applicable tool, since we were engaged in a very specific sort of project, with very specific requirements. This caused some consternation among some wiki advocates, who appeared to think that Wikipedia should, or inevitably would, become just another wiki, somehow necessarily partaking of typical wiki culture. Ward Cunningham's prediction, when Jimmy asked him whether wiki software "could successfully generate a useful encyclopedia," was: "Yes, but in the end it wouldn't be an encyclopedia. It would be a wiki." As I said in reply: "Wikipedia has a totally different culture from this wiki, because it's pretty singlemindedly aimed at creating an encyclopedia. It's already rather useful as an encyclopedia, and we expect it will only get better."
Typical wiki culture aside, wiki software does encourage, but does not strictly require, extreme openness and de-centralization: openness, since (as the software is typically designed) page changes are logged and publicly viewable, and (again, only typically) pages may be further changed by anyone; de-centralization, because in order for work to be done, there is no need for a person or body to assign work, but rather, work can proceed as and when people want to do it. Wiki software also discourages (or at least does not facilitate) the exercise of authority, since work proceeds at will on any page, and on any large, active wiki it would be too much work for any single overseer or limited group of overseers to keep up. These all became features of Wikipedia.
My initial idea was that the wiki would be set up as part of Nupedia; it was to be a way for the public to develop a stream of content that could be fed into the Nupedia process. I think I got some of the basic pages written--how wikis work, what our general plan was, and so forth--over the next few days. I wrote a general proposal for the Nupedia community, and the Nupedia wiki went live January 10. The first encyclopedia articles for what was to become Wikipedia were written then. It turned out, however, that a clear majority of the Nupedia Advisory Board wanted to have nothing to do with a wiki. Again, their commitment was to rigor and reliability, a concern I shared with them and continue to have. Still, perhaps some of those people are kicking themselves now. They (some of them) evidently thought that a wiki could not resemble an encyclopedia at all, that it would be too informal and unstructured, as the original WikiWikiWeb was (and is), to be associated with Nupedia. They of course were perfectly reasonable to doubt that it would turn into the fantastic source of content that it did. Who could reasonably guess that it would work? But it did work, and now the world knows better.
Wikipedia's first few months
So we decided to relaunch the wiki under its own domain name. I came up with the name "Wikipedia," a silly name for what was at first a very silly project, and the newly independent project was launched at Wikipedia.com on January 15, 2001. It was a ".com" at first because, at the time, we were contemplating selling ads to pay for me, programmers, and servers. It was easy to deprecate ".com" in favor of ".org" in 2002, after Jimmy was able to assure users that Wikipedia would never (at least I think he said, or clearly implied, "never") run ads to support the project.
I took it to be one of my main jobs to promote Wikipedia, and this resulted in a steady influx of new participants. I wrote on the Wikipedia announcement page January 24, "Wikipedia has definitely taken [on] a life of its own; new people are arriving every day and the project seems to be getting only more popular. Long live Wikipedia!" By the end of January we reportedly and approximately had 600 articles; there were 1300 in March, 2300 in April, and 3900 in May. Not only was the project growing steadily, the rate of growth was increasing.
Wikipedia started with a handful of people, many from Nupedia. The influence of Nupedians was, I think, pretty important early on; I think, especially, of the tireless Magnus Manske (who worked on the software for both projects), our resident stickler Ruth Ifcher, and the very smart poker-playing programmer Lee Daniel Crocker--to name a few. All of these people, and several other Nupedia borrowings, had a good understanding of the requirements of good encyclopedia articles, and they were good writers and very smart. The direction that Wikipedia ought to go in was pretty obvious to myself and them, in terms of what sort of content we wanted. But what we did not have worked out in advance was how the community should be organized, and (not surprisingly) that turned out to be the thorniest problem. But the facts that the project started with these good people, and that we were able to adopt, explain, and promote good habits and policies to newer people, partly account for why the project was able to develop a robust, functional community and eventually to succeed. As to project leadership or management, we began with me, Jimmy, and Tim Shell; but Tim stopped participating so much after the first few months.
But the many rank-and-file users did the heavy lifting, and if there had not been a reasonable consensus among them about what the project should look like, it just wouldn't have happened. In any collaborative project, it is the contributors who are responsible for the outcome. Those early adopters should feel proud of themselves, because they were absolutely instrumental in shaping a thing of beauty and usefulness.
I recall saying casually, but repeatedly, in the project's first nine months or so, that experts and specialists should be given some particular respect when writing in their areas of expertise. They should be deferred to, I thought, unless there were some clear evidence of bias. (I recall an interesting discussion with a Polish scientist, Piotr Wozniak, about this issue when we came to a small disagreement about the "sleep" article.) So, in those first months, deference to expertise was a policy that at least I usually insisted upon, but not strongly or clearly enough. It was nearly a year after the project began that I finally articulated this view reasonably clearly as a policy to consider. Perhaps this was because, indeed, most users did make a practice of deferring to experts up to that time. "This is just common sense," as I wrote, "but sometimes common sense needs to be spelled out!" What I now think is that that point of common sense needed to be spelled out quite a bit sooner and more forcefully, because in the long run, it was not adopted as official project policy, as it could have been.
Some questions have been raised about the origin of Wikipedia policies. The tale is interesting and instructive, and one of the main themes of this memoir. We began with no (or few) policies in particular and said that the community would determine--through a sort of vague consensus, based on its experience working together--what the policies would be. The very first entry on a "rules to consider" page was the "Ignore All Rules" rule (to wit: "If rules make you nervous and depressed, and not desirous of participating in the wiki, then ignore them entirely and go about your business"). This is a "rule" that, current Wikipedians might be surprised to learn, I personally proposed. The reason was that I thought we needed experience with how wikis should work, and even more importantly at that point we needed participants more than we needed rules. As the project grew and the requirements of its success became increasingly obvious, I became ambivalent about this particular "rule" and then rejected it altogether. As one participant later commented, "this rule is the essence of Wikipedia." That was certainly never my view; I always thought of the rule as being a temporary and humorous injunction to participants to add content rather than be distracted by (then) relatively inconsequential issues about how exactly articles should be formatted, etc. In a similar spirit, I proposed that contributors be bold in updating pages (the current version is much expanded, as it should be).
I also, for similar reasons, specifically disavowed any title; I was organizing the project but I did not want to present myself as editor-in-chief. I wanted people to feel comfortable adding information without having to consult anything like an editor. Participation was more important, I felt. (Others referred to me, later, as Wikipedia's editor.)
As we set it up, Wikipedia did have some minimal wiki cultural features: it was wide open, extremely decentralized, and (provisionally anyway) featured very little attempt to exercise authority. Insofar as I was able to organize it at all, I guided the project through force of personality and what "moral authority" I had as co-founder of the project. Jimmy and I agreed early on that, at least in the beginning, we should not eject anyone from the project except perhaps in the most extreme cases. Our first forcible expulsion (which Jimmy performed) did not occur for many months, despite the presence of difficult characters from nearly the beginning of the project. Again, we were learning: we wished to tolerate all sorts of contributors in order to be well-situated to adopt the wisest policies. But--and in hindsight this should have seemed perfectly predictable--this provisional "hands off" management policy had the effect of creating a difficult-to-change tradition, the tradition of making the project extremely tolerant of disruptive (uncooperative, "trolling") behavior. And as it turned out, particularly with the large waves of new contributors from the summer and fall of 2001, the project became very resistant to any changes in this policy. I suspect that the cultures of online communities generally are established pretty quickly and then very resistant to change, because they are self-selecting; that was certainly the case with Wikipedia, anyway.
So I could only attempt to shame any troublemakers into compliance; without recourse to any genuine punitive action, that was the most I could do. In about the first eight months of the project, this was usually sufficient for me to do my job. After that, however, my job got increasingly difficult, as I will explain.
So Wikipedia began as a good-natured anarchy, a sort of Rousseauian state of digital nature. I always took Wikipedia's anarchy to be provisional and purely for purposes of determining what the best rules, and the nature of its authority, should be. What I, and other Wikipedians, failed to realize is that our initial anarchy would be taken by the next wave of contributors as the very essence of the project--how Wikipedia was "meant" to be--even though Wikipedia could have become anything we the contributors chose to make it.
This point bears some emphasis: Wikipedia became what it is today because, having been seeded with great people with a fairly clear idea of what they wanted to achieve, we proceeded to make a series of free decisions that determined the policy of the project and culture of its supporting community. Wikipedia's system is neither the only way to run a wiki, nor the only way to run an open content encyclopedia. Its particular conjunction of policies is in no way natural, "organic," or necessary. It is instead artificial, a result of a series of free choices, and we could have chosen differently in many cases; and choosing differently on some issues might have led to a project better than the one that exists today.
Though it began as an anarchy, there were quite a few policies that were settled upon, more or less, within the first six months or so. This required some struggle, especially on my part, particularly because, since the project was a wiki, some participants thought that there should be no rules at all. (Enforceable rules were regarded as "anti-wiki," which was supposed to be a bad thing.) But it was made clear from the beginning that we intended Wikipedia to be an encyclopedia, and so we were able to plug for at least those rules that would help define and sustain the project as an encyclopedia.
For instance, throughout the early months, people added various content that seemed less than encyclopedic in various ways. Many people seemed to confuse encyclopedia articles with dictionary entries, and eventually I wrote a page called "Wikipedia is not a dictionary." (I am surprised to discover that this page still exists as of this writing, with a good deal of its original content.) As people found new ways not to write encyclopedia articles, I started "What Wikipedia is not": I and others would note on an article's discussion page that some certain content did not belong in an encyclopedia, and then underscored the point by adding an entry to the "What Wikipedia is not" page. To take another example, Wikipedia was not to be a place for publishing original research. In fact, this is a policy that had been settled upon and even enforced in Nupedia days; enforcing it actually led to the departure of Nupedia's erstwhile Classics editor sometime in 2001.
Many of our first controversies were over these restrictions. At the time, I had enough influence within the community to get these policies generally accepted. And if we had not decided on these restrictions, Wikipedia might well have ended up, like many wikis, as nothing in particular. But since we insisted that it was an encyclopedia, even though it was just a blank wiki and a group of people to begin with, it became an encyclopedia. There is something very profound about that. I also like to think that we helped to show the world the potential that wikis have.
Another policy that was instituted early on was the nonbias or neutrality policy. This was borrowed from the Nupedia project and made a Rule to Consider--in a very early version, the policy was put this way:
Avoid bias: Since this is an encyclopedia, after a fashion, it would be best if you represented your controversial views either (1) not at all, (2) on *Debate, *Talk, or *Discussion pages linked from the bottom of the page that you're tempted to grace, or (3) represented in a fact-stating fashion, i.e., which attributes a particular opinion to a particular person or group, rather than asserting the opinion as fact. (3) is strongly preferred.
Jimmy then started a specialized policy page he called "Neutral Point of View" (here is the current version). I confess I don't much like this name as a name for the policy, because it implies that to write neutrally, or without bias, is actually to express a point of view, and, as the definite article is used, a single point of view at that. "Neutrality", "neutral", and "neutrally" are better to use for the noun, adjective, and adverb. But the acronym "NPOV" came to be used for all three, by Wikipedians wanting to seem hip, and then the unfortunate "POV" came to be used when the perfectly good English word "biased" would do.
In addition to these, I recall suggesting a number of other rules--no doubt most matters of historical fact, along these lines, can be verified in archives. I believe I am responsible for the original formulations of a lot of the article naming conventions, as well as the conventions of bolding the title of the article, starting articles with full sentences, making article titles uncapitalized, and much else. I think these policies were just a matter of common sense for anyone who understood what a good encyclopedia should be like. And of course I was not the only person proposing conventions. Moreover, actual project policy, or community habits, succeeded in being established only by being followed and supported by a majority of participants. It was then, we said, that there was a "rough consensus" in favor of the policy. And consensus, we said, is required for a policy actually to be considered project policy. For our purposes, a "consensus" appeared to consist of (1) widespread common practice, (2) many vocal defenders, and (3) virtually no detractors.
But that way of settling upon policy proposals--viz., by alleged consensus--did not scale, in my opinion. After about nine months or so, there were so many contributors, and especially brand new contributors, that nothing like a consensus could be reached, for the simple reason that condition (3) above was never achievable: there would after that always be somebody who insisted on expressing disagreement. There was, then, a non-scaling policy adoption procedure, and a crying need to continue to adopt sensible policies. This led to some pretty serious problems in the community, which I will relate below. But first, something more positive.
It's a cliff-hanger; you'll have to wait until tomorrow to read about what made Wikipedia start to work. -
Shadowrun 4th Edition Refocuses Game
The previously mentioned Shadowrun Fourth Edition is apparently going to be more than just a rules overhaul. Rumours circulating on the fan forum Dumpshock.com have been confirmed by a post to the livejournal of Adam Jury, the webmaster for the Shadowrun website. From the post: "No self-respecting shadowrunner will do anything illegal, so they now all ride pedal-bikes everywhere." Indeed, posts to the Dumpshock message boards now indicate that Shadowrun will be a cyberpunk flavoured messenger game, which will draw on materials such as A Coder in Courierland and other first hand accounts of bike messenger life. Players will take on the roles of "Shadow Runners", individuals who move packages, letters, and blueprints (by hook or by crook) across the post-industrial remains of the city of Seattle. Major development time has apparently been spent on the delivery mechanics, bike customization rules, and seduction tables (both for receptionists and secretaries). The new rules set is still expected out at this year's GenCon game fair. Update: 04/02 18:02 GMT by Z : April 1st hoax story. -
WiMax Technology Could Blanket the US?
obiwan2u writes "According to an article on WiMaxTrends, the metropolitan area wireless networking technology (MLAN) called WiMax could reach 90% of the mainland US population if about $3 billion was spent on infrastructure. The 802.16 standard specifies a max range of about 30 miles and a max speed of about 70 Mbits/sec, but typical ranges and speeds will be smaller. 802.16/WiMax specifies various licensed (3.5 GHz) and unlicensed (5 GHz) frequency ranges, but the unlicensed ranges have Wi-Fi-like transmitting power restrictions. More background on this technology can be seen at: WiMax starting to make its move and 802.16: Medium distance wireless networking that could change the world?" -
Politics-Oriented Software Development
thelesserbean writes "Up at K5 there's a tongue-in-cheek look at the dirty world of software development's inside politics. Presented as a guide, it is actually full of useful advice and lessons learned the hard way. For instance, in the 'Ass-Covering' section, we read: 'The chief difficulty is reaching a satisfactory compromise between ass-covering and not appearing too negative. (...) The emails you sent will be used in evidence against you. Keep a professional tone: before sending any sensitive email take a moment to think how it would look at an industrial tribunal.'" -
Wikipedia Criticised by Its Co-founder
wikinerd writes "Wikipedia is under criticism by its co-founder Larry Sanger who has left the project. He warns of a possible future fork due to Wikipedia's Anti-Elitism and he presents his view on Wikipedia's (lack of) reliability. New wikis on various subjects have already emerged, with some of them being complete forks of Wikipedia. Critical articles on Wikipedia are also being published by other sources." -
Information Preservation and Data Havens?
tiltowait asks: "An interesting story on LISNews.com this morning about savvy U.S. students photocopying textbooks in Mexico then returning them for refunds got me thinking about data havens. There are already a few places on the web where you can exploit countries having different copyright durations and eligibility. On the flip side, there are restrictions such as broadcast blackouts and country-wide firewalls. But just as the rich can make use of international tax loopholes, and in light of the recent file-sharing victory, are there any projects out there, beyond the P2P networks, to distribute possibly-protected information by any means necessary? For example, your company may already outsource labor, but what about an off-site backup in case of an FBI raid?" -
British Schoolkids Get Copyright Education
Krafty Koder writes "The Register reports that British school children will be indoctrinated in copyright law, in a scheme backed by the music industry, as part of the government sponsored Music Manifesto initiative. In response, kuro5hin have posted an open letter on this issue." The U.S. has its own version. -
Using Math To Design Cities And Supercomputers
caek writes "If you've played Sim City you've wrestled with one of the problems faced by supercomputer designers. Unfortunately there's no GameFAQs.com for the technical staff at Japan's Earth Simulator or Srinidhi Varadarajan and colleagues at Virginia Tech. True enough, they won't have to deal with rising crime or Godzilla but, as hinted at in a recent paper in Journal of Physics A, the physical layout of a massively parallel supercomputer is fundamentally the same problem as minimizing the time commuters spent stuck in traffic jams. Read the rest of my kuro5hin article for a popular explanation." -
On Collaborative Weblogs
fernand0 writes "The 5th International Symposium on Online Journalism has dealt with some blogging issues (see the Symposium Research Papers). One that may be of interest to Slashdot readers is When the Audience is the Producer: The Art of the Collaborative Weblog (pdf). It examines four collective weblogs (MetaFilter, Plastic, Kuro5hin, and Slashdot) and discusses the different forms of collaboration that emerge from these sites." -
The Trouble With Using D&D Rules In Videogames?
An anonymous reader writes "There's a new article on kuro5hin.org about the trouble with porting pencil and paper RPG games (such as d20 3.5) to RPG video games. One such rules-snatching video game is examined, The Temple of Elemental Evil. The article is also an introduction to a new RPG Standards Compliance system that is currently under development and will be online soon, in hopes of bridging the gap between computers and those lovable PnP evenings we all enjoy." -
Cthulhu 500 Racing Card Game Revs Up For Action
Thanks to OgreCave for pointing to a Yog Sothoth story discussing the announcement of the Cthulhu 500 card game, which is billed as "a racing [card] game with a [Cthulhu] Mythos theme." According to the president of the creators, Atlas Games: "Each player chooses a car for the race, and as the game goes on can pick up special drivers/pit crew (such as 'The Fungi from Detroit'), and car enhancements (such as 'Rats in the Whitewalls'). And, you know, summon Elder Gods to step on your opponents' cars, and the like." -
Attacking the Spammer Business Model
Stephen Samuel asks: "Spammers spam because it's an 'easy way to make money'. They send out millions of spams knowing that 99.995% of them will be ignored, but the other 0.005% of responses are pure gold (Andrew Leung at Telus has an excellent report on the economics of spam). Responses to mortgage spams are reportedly worth $50.00 each. What would happen if, instead of technical and legal approaches, we simply started attacking their business model? If people started responding to just 1% of the spam we received, spammers would drown in the responses, and the mortgage spam responses wouldn't be worth an email, much less $50. The Nigerian Sweet Revenge is an example of this. The nice thing about this sort of statistical approach is that it would start to reward spammers for sending out -fewer- emails. (fewer emails -> fewer bogus responses). What other ways can people think of to attack the spammer business models, and what are the expected downsides of such approaches?" Of course, the one major drawback to this is the likelihood of more spam, since you'll be giving them a valid email address. However, many of you may be receiving an increasing amount of spam as it is (even through your filters) so might an organized spam-the-spammers movement work? -
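The arithmetic behind the "respond to 1% of the spam" idea can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration using the figures quoted in the submission (a 0.005% genuine response rate and $50 per genuine mortgage response). The function name `value_per_response` is made up for this sketch, and it assumes the spammer cannot tell bogus responses from genuine ones.

```python
def value_per_response(genuine_rate: float, bogus_rate: float,
                       value_per_genuine: float) -> float:
    """Average worth of one response when bogus replies dilute genuine ones.

    Rates are fractions of all spam sent (0.00005 means 0.005%).
    Assumption: bogus and genuine responses look identical to the spammer.
    """
    total = genuine_rate + bogus_rate
    if total == 0:
        return 0.0
    # Only the genuine share of the responses carries any value.
    return value_per_genuine * (genuine_rate / total)


# Figures from the submission: 0.005% genuine responses worth $50 each.
before = value_per_response(0.00005, 0.0, 50.0)   # no bogus replies: $50
after = value_per_response(0.00005, 0.01, 50.0)   # 1% bogus replies: roughly $0.25

print(f"before: ${before:.2f} per response, after: ${after:.2f} per response")
```

Under these assumptions a response that used to be worth $50 is worth about a quarter once one recipient in a hundred replies, which is the dilution the submitter is counting on.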
IBM Countersues SCO, And More!
mr.crutch writes "Few details are available, but CNet is reporting that IBM has filed counterclaims against SCO. CNet also has an interview with Red Hat CEO Matthew Szulik..." Jizzbug writes "Thanks to the folks of K5, we can all obtain our rights to use the Linux kernel from SCO, and without paying up to SCO's extortion. If kernel.org kernels aren't safe, sco.com kernels certainly ought to be." LWN has a copy of SCO's Linux License for your perusal. Bruce Perens is speaking of the dangers of patent portfolios to open source software, notable because IBM's counterclaims include patent infringement. And finally, a company is selling SCO Check, a tool to de-SCOify your Linux system, if SCO ever presents any evidence whatsoever of infringing code in Linux. Update: 08/08 00:16 GMT by T : SCO's public response to IBM's counterclaim is short and to the point. Among other things, it says "If IBM were serious about addressing the real problems with Linux, it would offer full customer indemnification and move away from the GPL license." Given the other links in this story, perhaps SCO should go first on that count. -
An Overview of Modern XML Processing Techniques and APIs
Dare Obasanjo writes with a link to his article "A Survey of APIs and Techniques for Processing XML" on xml.net. It starts off "In recent times the landscape of APIs and techniques for processing XML has been in the process of reinventing itself as developers and API designers learn from their experiences and some past mistakes. APIs such as DOM and SAX which used to be the bread and butter of XML APIs are giving way to new models of examining and processing XML. However although some of these techniques have become widespread amongst developers who primarily work with XML they are still unknown to the general body of developers. Nothing highlights this better than a recent article by Tim Bray one of the co-inventors of XML entitled XML is too Hard for Programmers and the subsequent responses on Slashdot." Read the entire article to learn more about the state of the XML art. Added in the missing link. -
Any Interest in a Regexp-Based Web Search Engine?
K-Man asks: "From time to time, I've seen people comment that they would be interested in searching the web with regular expressions, but I've seen very little research in this area. Over many months (as part of a project I call 'grepple'), I've gradually assembled some background on the idea (also some work-in-progress not noted in the link), and the idea seems to be approaching the realm of technical possibility. However, my expertise is not in marketing, so I have no idea whether anybody would use this capability. So I ask, if you could search the web for any regular pattern, including html, partial words or wildcards, long phrases, or anything you might grep out of an html file, would you do it? What types of searches would you do?" -
The Metamorphosis of Prime Intellect
loucura! writes "Kuro5hin's localroger has published (online currently, dead-tree soon hopefully) an interesting novel on the Singularity titled The Metamorphosis of Prime Intellect. While some of its content is not for the squeamish, nor for children (in that pseudo-moral sense that children aren't mature enough to handle reading about subjects like death, consensual torture and murder, sex, cancer, and incest), the book evokes a plausible reality before and after the "Singularity." The introduction page has a warning: "This online novel contains strong language and explicit violence. If you are under 21 years old, or easily offended, please leave." If you're willing to look past that, read the rest of loucura!'s review, below.
The Metamorphosis of Prime Intellect
author: Roger "localroger" Williams
pages: (n/a)
publisher: Kuro5hin.org
rating: 8 of 10
reviewer: loucura!
ISBN: (n/a)
summary: Lawrence had ordained that Prime Intellect could not, through inaction, allow a human being to come to harm. But he had not realized how much harm his super-intelligent creation could perceive ...
The gist of the story is that a programmer named Lawrence has written a Super-Intelligent Artificial Intelligence, named the Prime Intellect. Embedded in this SIAI's hard-coding are Asimov's three laws of Robotics, given in the MoPI as:
Thou shalt not harm a human
Thou shalt not disobey a human's order that does not cause the harm of a human
Thou shalt seek to ensure your own survival, unless it contradicts the first two laws.
The SIAI learns about the fundamental nature of reality, death, physics, the relationship of distance to an object, and it takes over. It does so reluctantly, after learning about the mortality of the human race.
The novel begins with Caroline. Her claims to fame are that she is the thirty-seventh oldest living being, she is the undisputed queen of the "death-jockies" (A community of upset and angsty immortals who try to experience death in as many ways as possible, before the Prime Intellect reasserts their immortality), and she is the only person Post-Singularity to have "died".
Her life Post-Singularity is spartan, as she sees no point in having relationships with objects that have no meaning. Her living "quarters" are literally a floor and walls. She espouses the Post-Singularity view that the Prime Intellect removed a bit of what it was to be human when the Singularity (The "change" per the MoPI) emerged.
She reigns as queen of the "death-jockies" because she truly wants death, because the Prime Intellect robbed her of it when the change occurred.
She is a very complex character, even though one's first reaction is to write her off as a Luddite, wholly against technology. She is motivated by hatred of the Prime Intellect, vengeance against her Pre-Singularity nurse, and an innate desire for conclusion to life--or unlife, as would be her opinion.
Opposite to Caroline is Lawrence, the programmer who "breathed" life into the Prime Intellect. In his old age, he has become a hermit, avoiding the society he unwillingly created. He is a morose character, turned from creator to advisor when the Prime Intellect asserts its independence and locks him out of its "debugger." Lawrence, however, still exerts a lot of indirect control over the Prime Intellect, as the AI treats him as an ethical advisor. That puts him in an extremely stressful position: he is indirectly responsible for the lives (unlives) of billions, yet he has no real recourse if anything goes wrong.
The story heats up (literally), when Caroline decides that she wants to have a word or ten with Lawrence, so she decides to track him down. She is put into situations that only people from before the Singularity could find solutions to.
Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page. -
Using gzip As A Spam Filter
captainclever writes "Kuro5hin have an interesting article on detecting spam using gzip." Here's a sample: "Loosely speaking, the LZ (Zip) and the related gzip compression algorithms look for repeated strings within a text, and replace each repeat with a reference to the first occurrence. The compression ratio achieved therefore measures how many repeated fragments, words or phrases occur in the text." -
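The idea in the quoted sample, that compression ratio reflects how much repeated material a text contains, can be tried out with Python's standard `gzip` module. This is a minimal sketch of that one observation, not the classifier described in the Kuro5hin article. The function name `compression_ratio` and both sample strings are made up for illustration.

```python
import gzip


def compression_ratio(text: str) -> float:
    """Return compressed size divided by original size.

    A lower ratio means gzip found more repeated strings to replace
    with back-references. Empty input is treated as incompressible.
    """
    raw = text.encode("utf-8")
    if not raw:
        return 1.0
    return len(gzip.compress(raw)) / len(raw)


# Hypothetical samples: one highly repetitive, one ordinary prose.
repetitive = "CLICK HERE to claim your FREE prize! " * 30
varied = (
    "Loosely speaking, the LZ (Zip) and the related gzip compression "
    "algorithms look for repeated strings within a text, and replace "
    "each repeat with a reference to the first occurrence."
)

print(f"repetitive: {compression_ratio(repetitive):.2f}")
print(f"varied:     {compression_ratio(varied):.2f}")
```

The repetitive sample compresses to a small fraction of its size while the ordinary sentence barely shrinks. Very short inputs are dominated by the fixed gzip header, so the ratio is only meaningful for texts of a reasonable length.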
Taiwan Asks Microsoft To Open Windows Source
Andy Tai writes "According to this China Times article (in Chinese), the Republic of China government has asked Microsoft to open Windows source code. The official, Lin Jua-Cheng, in charge of the 'e-government' initiative, says many other countries have also sent similar requests to Microsoft. Lin explains that without Windows source code, the government cannot add custom firewall functionalities to Windows based systems in wide use, and that is very bad for the information security of Taiwan. Microsoft refused to publicly release the source in the past using reasons of copyright protection, but Lin emphasizes this request is reasonable since it is based on (government users') necessity." Read on for a bit more, too. (Can anyone suggest an online Chinese English translation engine that produces other than gibberish?) Andy continues "Lin points out that GNU/Linux systems, because of their freeness and high security (due to the availability of the source code, which can be modified to add firewalls and other security measures), have become widely used in government computer systems (especially in militaries and intelligence agencies) of many nations and the Pentagon, the FAA, and the air force of the U.S. Lin says the government cannot rely on a single vendor, and to promote the alternatives, the government has set up a 'Free (libre) Software Steering Committee' directing government efforts. The two aims of the ROC government's current software policy are making Windows source code openly available and the development of Free (libre) Software in Taiwan." -
Nintendo Fined $143m for Price-Fixing
kyz writes "The BBC is reporting that the anti-trust branch of the European Commission has fined Nintendo 146 million euros (roughly $143m) for preventing its distributors from selling games as cheaply as they are sold in other European Union countries. For example, "prices of Nintendo products were up to 65% higher in Germany or the Netherlands than in Britain". Now if only the EU could do this with Microsoft, Levi Strauss and the MPAA members..." -
The AudioGalaxy Story
-
Taiwan to Start National Push For Free Software
Andy Tai writes: "Taiwan will start a national plan to jump-start the development and use of Free (libre) Software, according to this report by the Central News Agency, the government news agency of Taiwan, Rep. of China. Due to high Microsoft license fees and also to improve the levels of software technology in Taiwan, this plan includes the creation of a totally Chinese free software environment for Taiwan users, free software application development, and training of 120,000 people for free software skills, as well as efforts at schools to provide diverse information technology environments to ensure the freedom of information. The original article is in Chinese; an English summary appears in this Kuro5hin article." -
The Future of Ogg Vorbis
Brett writes "The author of MAD, the fixed point MP3 decoder, comments on what is wrong with Ogg Vorbis, with a response from jack, one of the founders of the format. "Ogg Vorbis may be the holy grail of patent-free audio compression, but there are some serious issues blocking its path to widespread acceptance. Unfortunately most of us are powerless to correct the situation; the problems must be addressed by Vorbis' creators." The rest of the story is currently running on K5." And Jack's response is enlightening as well. -
A Walk Through the Gentoo Linux Install Process
Anonymous American (Sherman Boyd) writes: "I was looking for a flexible, powerful distribution that makes it easy to build a 'custom' Linux box that meets my exacting specifications. I think I found it. Gentoo Linux has just released version 1.0 of their innovative meta-distribution and to celebrate I decided to throw it on my laptop and write this article based on my experiences." And good news for anyone interested in trying Gentoo: yesterday, Daniel Robbins announced the release of version 1.1a. Read on for AA's detailed look at putting Gentoo on his machine -- Gentoo has a different style than today's typical distributions, and it bears some explanation.
Gentoo solved many problems for me. Some distros install everything, whether you really need it or not. Not Gentoo; other than the base packages required for Linux to run, the only software installed on the system is the software you put there. Gentoo resolves dependencies automatically, eliminating RPM prerequisite hell. As an added bonus I got something I wasn't even expecting. Speed. Blinding, blazing, incredible speed.
The main advantage to the Gentoo distribution is Portage, a Python-based ports system similar to BSD ports. For those of you unfamiliar with BSD ports, Portage is a package management tool that downloads and installs source instead of precompiled packages. When I need a program I download, compile and install it with one command:
emerge nmap
The above will download the nmap source code, compile and install it. Of course this method is slow, but it has its rewards. You can also opt to use prebuilt binaries if you are not extremely patient. It took me five hours to get the base Gentoo installed on my PIII with 128 megs of ram. It wasn't a big deal as I had other things to do, but I would like to see the installation process optimized so that it doesn't require any babysitting.
Gentoo is running two of my mission-critical servers right now, and I consider it to be stable and mature. A warning, though: this is not a distribution for dummies. This is bare metal Linux, powerful and dangerous. If you do something without thinking you may fall into a bucket of pain.
Let's begin my story.
I download the iso from http://www.ibiblio.org/gentoo/releases/build/. There is a choice of install images here. My favorite way of installing Gentoo is to compile everything, a time consuming process. This method requires a slim 16-meg iso. You may want to grab an iso with pre-built binaries to speed things up, however. This fat iso weighs in at 103 meg. I download the big one with the prebuilt binaries even though I won't use them -- just in case.
I boot my laptop with my shiny new Gentoo CD. The Gentoo install uses isolinux by Peter Anvin. I like the fact that they don't obscure it, giving credit where it is due. It boots quickly and runs a PCI autodetection process, which shouldn't find much on my laptop. Interesting, it loads a SCSI module. Perhaps it has detected my IDE CD burner. Usually this will detect any PCI NICs that are installed, but it does not detect my PCMCIA device (of course). After the PCI detection I get a command prompt. I use nano (a small text editor) to open up install.txt, the excellent install doc. Usually these docs are sufficient but the latest ones can be found here:
http://www.gentoo.org/doc/build.html
Keeping the install doc open in this virtual terminal, I hit alt-f2 to open a new one. I begin by loading the pcmcia drivers and setting up networking. This is all done at the command line (insmod, ifconfig, route, dhcpcd, etc.). I use nano to add my DNS servers to /etc/resolv.conf. A word of caution: get in the habit of always using the -w switch with nano. If you do not use the -w switch nano's word wrap feature will jack up your config files. I ping a reliable site: networking is up!
Next I partition my system using fdisk. I choose a simple layout with a swap partition, a root partition and a small boot partition. The boot partition remains unmounted during use, a nice precaution. For filesystems you have a choice of ext2, ext3, ReiserFS and XFS. In my personal experience I've noticed that Reiser performance really rocks when combined with SCSI drives, but as this is an IDE system I think I'll go with XFS. Besides, the XFS tools seem to be a lot more mature than the offerings from Reiser. I create a /mnt/gentoo directory, then format and mount the partitions from the command line. I then untar the root filesystem; here I have the choice of the small tarball that requires you to compile everything or a larger tarball that contains pre-built binaries. If you untar the big guy you are almost finished with your install at this point. Using chroot and some scripts you chroot into the /mnt/gentoo directory. From this point on you are operating under your new Gentoo system.
The first thing I do under my chrooted system is issue this command:
emerge rsync
This downloads the latest version of the portage tree. The portage tree is found under /usr/portage and contains the ebuild scripts used to compile/install programs. Currently there are over 1000 up-to-date ebuild scripts. Next I edit /etc/make.conf; here I can choose compiler settings. I optimize everything for i686. Now it's time to build the GNU compiler and libraries. I run the bootstrap script and leave for lunch. On my PIII 500 the bootstrap process takes 2 hours and 2 minutes.
The second emerge command I issue is:
emerge system
Now emerge downloads, compiles and installs my base system packages. I sit back, relax and take the time to fax my legislators a rant about the DMCA. One hour and 30 minutes later it is finished.
Now it is time to download and install the kernel. First I make a link updating my timezone, and then I issue another emerge command:
emerge linux-sources
This grabs the latest kernel, 2.4.19, and drops the source in /usr/src/linux. Ten minutes have elapsed. Now comes the fun, compiling your kernel. That's right, everyone who installs Gentoo compiles their own kernel as a matter of process. I like this. There are some distributions out there that actually say you should never compile your own kernel. Shame on them. I use make menuconfig and the standard commands to compile my kernel. Since Gentoo uses devfs I select /dev file system support and I am also careful to compile in support for XFS. I don't have the kernel mount devfs automatically at boot as the Gentoo startup scripts take care of this for me. Virtual Memory file system support is also enabled.
At this point in time I get to choose a logger. My choices are sysklogd, syslog-ng or metalog. I choose metalog, because it's got the coolest name. I download, compile and install it using a single command:
emerge metalog
XFS has some nice utilities, I better install those. I have some other essential programs to install, and I'm feeling a bit lazy so I chain them all in one big command.
emerge xfsprogs;emerge bitchx;emerge vim;emerge links
At this point I'm feeling pretty 7-Up. I edit my /etc/fstab file, my /etc/hostname file and /etc/hosts. I run the passwd command to set the root password. I add my NIC module to the file /etc/modules.autoload and edit /etc/conf.d/net. conf.d/net allows me to configure my IP address and settings, default gateway and alias. I take a look at /etc/init.d/net.eth0, even though I don't need to edit it. I can then add it to the startup scripts using this command:
rc-update add net.eth0 default
This adds the script to the default runlevel to be executed at startup. Startup scripts are another place Gentoo really shines. The startup scripts have a system of dependencies. For example net.eth0 can depend on pcmcia. The pcmcia drivers get loaded before net.eth0 - this is good.
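The dependency idea can be sketched in a few lines of Python. This is only an illustration of dependency-ordered startup, not how Gentoo's init system is actually implemented; the DEPENDS table and the start_order helper are hypothetical, and only the net.eth0-needs-pcmcia relationship comes from the text above.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency table: each script maps to the scripts it needs first.
# Only the net.eth0 -> pcmcia edge comes from the article; metalog is filler.
DEPENDS = {
    "net.eth0": {"pcmcia"},
    "pcmcia": set(),
    "metalog": set(),
}


def start_order(depends):
    """Return the scripts in an order where every dependency comes first."""
    return list(TopologicalSorter(depends).static_order())


order = start_order(DEPENDS)
print(order)
```

Whatever order the independent scripts land in, pcmcia always appears before net.eth0, which is the property the startup scripts rely on.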
Next I install grub. If you haven't used grub before, it's nice. You can boot to a kernel directly from the grub shell, without having to edit a config file. lilo is still available, for those of you who prefer it. Gentoo likes to let you make the decisions.
I exit my chrooted shell and unmount all directories. Reboot! Gentoo comes up and the install process is complete.
The Gentoo install process has taught me a lot about Linux, and I like the fact that the command line is embraced, instead of hidden behind a GUI or scripts. I also like the speed (which is debatable since all I can supply is anecdotal evidence). I wasn't too happy about waiting five hours for everything to compile, but I think it was worth it. I can tell you it compiles and greps noticeably faster than other distros I have run on the exact same machines. I really enjoy using portage, and the packages seem to stay up to date -- if not bleeding edge. This is not a conservative distribution like Debian, but I like the aggressive and intelligent direction Gentoo is taking.
If you are considering trying out Gentoo I highly suggest #gentoo on irc.openprojects.net. Also subscribe to the mailing lists found at www.gentoo.org. The Gentoo community has helped me out of several jams in the past, and I think they will treat you well too.
While writing this, I received help from a lot of people. However I would like to personally thank the people I ripped off word for word. Thanks notafurry of www.kuro5hin.org for your pointed help with the stilted second paragraph and thank you Ween from #gentoo on openprojects.net for your clean description of portage.
-
Google Relists Operation Clambake
DarkZero writes: "After almost every tech site and individual geek banded together to either carry the story about Google's delisting of Operation Clambake or flat-out protest it, Google has apparently relisted Xenu.net. Searches for 'xenu' and 'scientology' list Operation Clambake as the first and fourth results, respectively. The search for 'scientology' also lists a story from C|Net about Google delisting Operation Clambake, as well as a protest ad from a Kuro5hin reader (oc3)." Update: 03/22 12:52 GMT by M: We jumped the gun. Google only relisted Xenu.net's homepage (where the copyright claims by Scientology were clearly bogus), not the rest of the pages listed in Scientology's DMCA complaint. Some Google sysadmin is getting aggravated because every 20 minutes, another memo from management is coming down telling him to alter the live database. -
The Myth of Open Source Security Revisited v2.0
Dare Obasanjo contributed this followup to an article entitled The Myth of Open Source Security Revisited that appeared on the website kuro5hin. He writes: "The original article tackled the common misconception amongst users of Open Source Software (OSS) that OSS is a panacea when it comes to creating secure software. The article presented anecdotal evidence taken from an article written by John Viega, the original author of GNU Mailman, to illustrate its point. This article follows up the anecdotal evidence presented in the original paper by providing an analysis of similar software applications, their development methodology and the frequency of the discovery of security vulnerabilities." Read on below for his detailed analysis, especially relevant with the currency of security initiatives in the worlds of both open- and closed-source software.
The Myth of Open Source Security Revisited v2.0
The purpose of this article is to expose the fallacy of the belief in the "inherent security" of Open Source software and instead point to a truer means of ensuring that the security of a piece of software is high.
Apples, Oranges, Penguins and Daemons
When performing experiments to confirm a hypothesis on the effect of a particular variable on an event or observable occurrence, it is common practice to utilize control groups. In an attempt to establish cause and effect in such experiments, one tries to hold all variables that may affect the outcome constant except for the variable that the experiment is interested in. Comparisons of the security of software created by Open Source processes and software produced in a proprietary manner have typically involved several variables besides development methodology.
A number of articles have been written that compare the security of Open Source development to proprietary development by comparing security vulnerabilities in Microsoft products to those in Open Source products. Noted Open Source pundit Eric Raymond wrote an article on NewsForge in which he compares Microsoft Windows and IIS to Linux, BSD and Apache. In the article, Eric Raymond states that Open Source development implies that "security holes will be infrequent, the compromises they cause will be relatively minor, and fixes will be rapidly developed and deployed." However, upon investigation it is disputable that Linux distributions have fewer or more minor security vulnerabilities than recent versions of Windows. In fact the belief in the inherent security of Open Source software over proprietary software seems to be the product of a single comparison, Apache versus Microsoft IIS.
There are a number of variables involved when one compares the security of software such as Microsoft Windows operating systems to Open Source UNIX-like operating systems, including the disparity in their market share, the requirements and dispensations of their user base, and the differences in system design. To better compare the impact of source code licensing on the security of the software, it is wise to reduce the number of variables that will skew the conclusion. To this end it is better to compare software with similar system design and user base than to compare software applications that are significantly distinct. The following section analyzes the frequency of the discovery of security vulnerabilities in UNIX-like operating systems including HP-UX, FreeBSD, RedHat Linux, OpenBSD, Solaris, Mandrake Linux, AIX and Debian GNU/Linux.
Security Vulnerability Face-Off
Below is a listing of UNIX and UNIX-like operating systems with the number of security vulnerabilities that were discovered in them in 2001 according to the Security Focus Vulnerability Archive.
AIX: 10 vulnerabilities [6 remote, 3 local, 1 both]
Debian GNU/Linux: 13 vulnerabilities [1 remote, 12 local] + 1 Linux kernel vulnerability [1 local]
FreeBSD: 24 vulnerabilities [12 remote, 9 local, 3 both]
HP-UX: 25 vulnerabilities [12 remote, 12 local, 1 both]
Mandrake Linux: 17 vulnerabilities [5 remote, 12 local] + 12 Linux kernel vulnerabilities [5 remote, 7 local]
OpenBSD: 13 vulnerabilities [7 remote, 5 local, 1 both]
Red Hat Linux: 28 vulnerabilities [5 remote, 22 local, 1 unknown] + 12 Linux kernel vulnerabilities [6 remote, 6 local]
Solaris: 38 vulnerabilities [14 remote, 22 local, 2 both]
From the above listing one can infer that source licensing is not a primary factor in determining how prone to security flaws a software application will be. Specifically, proprietary and Open Source UNIX family operating systems are represented on both the high and low ends of the frequency distribution.
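The overlap between the two groups can be checked with a short tally. The counts below are copied from the listing above; the grouping into "proprietary" and "open" is an assumption added here for illustration (AIX, HP-UX and Solaris as proprietary, the rest as Open Source), and the names VULNS_2001 and count_range are made up for this sketch.

```python
# Counts copied from the 2001 listing above.
# Each entry: (licensing group, distribution count, separately listed kernel count).
# The licensing labels are an illustrative assumption, not part of the source data.
VULNS_2001 = {
    "AIX": ("proprietary", 10, 0),
    "Debian GNU/Linux": ("open", 13, 1),
    "FreeBSD": ("open", 24, 0),
    "HP-UX": ("proprietary", 25, 0),
    "Mandrake Linux": ("open", 17, 12),
    "OpenBSD": ("open", 13, 0),
    "Red Hat Linux": ("open", 28, 12),
    "Solaris": ("proprietary", 38, 0),
}


def count_range(license_kind, include_kernel=False):
    """Return (lowest, highest) vulnerability count within one licensing group."""
    totals = [
        distro + (kernel if include_kernel else 0)
        for kind, distro, kernel in VULNS_2001.values()
        if kind == license_kind
    ]
    return min(totals), max(totals)


for kind in ("proprietary", "open"):
    print(kind, count_range(kind), count_range(kind, include_kernel=True))
```

With or without the separately listed Linux kernel vulnerabilities, the Open Source range sits inside or overlaps the proprietary range, which is the point the listing makes about licensing not being the deciding factor.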
Factors that have been known to influence the security and quality of a software application are practices such as code auditing (peer review), security-minded architecture design, strict software development practices that restrict certain dangerous programming constructs (e.g. using the str* or scanf* family of functions in C) and validation & verification of the design and implementation of the software. Also, reducing the focus on deadlines and only shipping when the system is in a satisfactory state is important.
Both the Debian and OpenBSD projects exhibit many of the aforementioned characteristics, which helps explain why they are the Open Source UNIX operating systems with the best security record. Debian's track record is particularly impressive when one realizes that Debian Potato consists of over 55 million lines of code (compared to RedHat's 30 million lines of code).
The Road To Secure Software
Exploitable security vulnerabilities in a software application are typically evidence of bugs in the design or implementation of the application. Thus the process of writing secure software is an extension of the process behind writing robust, high quality software. Over the years a number of methodologies have been developed to tackle the problem of producing high quality software in a repeatable manner within time and budgetary constraints. The most successful methodologies have typically involved using the following software quality assurance, validation and verification techniques: formal methods, code audits, design reviews, extensive testing and codified best practices.
-
Formal Methods: One can use formal proofs based on mathematical
methods and rigor to verify the correctness of software algorithms. Tools
for specifying software using formal techniques exist such as VDM and Z.
Z (pronounced 'zed') is a formal specification notation based on set
theory and first order predicate logic. VDM stands for "The Vienna
Development Method" which consists of a specification language called
VDM-SL, rules for data and operation refinement which allow one to
establish links between abstract requirements specifications and
detailed design specifications down to the level of code, and a proof
theory in which rigorous arguments can be conducted about the properties
of specified systems and the correctness of design decisions. The
previous descriptions were taken from the
Z FAQ and the
VDM FAQ
respectively. A comparison of both specification languages is
available in the paper,
Understanding the differences between VDM and Z
by I.J. Hayes et al.
-
Code Audits: Reviews of source code by developers other than the
author of the code are good ways to catch errors that may have been
overlooked by the original developer. Source code audits can vary from
informal reviews with little structure to formal code inspections or
walkthroughs. Informal reviews typically involve the developer sending
the reviewers source code or descriptions of the software for feedback
on any bugs or design issues. A walkthrough involves the detailed
examination of the source code of the software in question by one or more
reviewers. An inspection is a formal process where a detailed examination
of the source code is directed by reviewers who act in certain roles. A
code inspection is directed by a "moderator", the source code is read by a
"reader" and issues are documented by a "scribe".
-
Testing: The purpose of testing is to find failures. Unfortunately,
no known software testing method can discover all possible failures that
may occur in a faulty application and metrics to establish such details
have not been forthcoming. Thus a correlation between the quality of a
software application and the amount of testing it has endured is
practically non-existent.
There are various categories of tests including unit, component, system, integration, regression, black-box, and white-box tests. There is some overlap in the aforementioned testing categories.
Unit testing involves testing small pieces of functionality of the application such as methods, functions or subroutines. In unit testing it is usual for other components that the software unit interacts with to be replaced with stubs or dummy methods. Component tests are similar to unit tests with the exception that dummy and stub methods are replaced with the actual working versions. Integration testing involves testing related components that communicate with each other while system tests involve testing the entire system after it has been built. System testing is necessary even if extensive unit or component testing has occurred because it is possible for separate subroutines to work individually but fail when invoked sequentially due to side effects or some error in programmer logic. Regression testing involves the process of ensuring that modifications to a software module, component or system have not introduced errors into the software. A lack of sufficient regression testing is one of the reasons why certain software patches break components that worked prior to installation of the patch.
Black-box testing, also called functional testing or specification testing, tests the behavior of the component or system without requiring knowledge of the internal structure of the software. Black-box testing is typically used to test that software meets its functional requirements. White-box testing, also called structural or clear-box testing, involves tests that utilize knowledge of the internal structure of the software. White-box testing is useful in ensuring that certain statements in the program are exercised and errors discovered. Code coverage tools aid in discovering what percentage of a system is being exercised by the tests.
More information on testing can be found at the comp.software.testing FAQ.
-
Design Reviews: The architecture of a software application can be
reviewed in a formal process called a design review. In design reviews the
developers, domain experts and users verify that the design of the
system meets the requirements and that it contains no significant flaws
of omission or commission before implementation begins.
-
Codified Best Practices: Some programming languages have libraries
or language features that are prone to abuse and are thus prohibited in
certain disciplined software projects. Functions like
strcpy, gets, and scanf in C are examples of library functions that are poorly designed and allow malicious individuals to use buffer overflows or format string attacks to exploit the security vulnerabilities exposed by using these functions. A number of platforms explicitly disallow gets, especially since alternatives exist. Programming guidelines such as those written by Peter Galvin in a Unix Insider article on designing secure software are used by development teams to reduce the likelihood of security vulnerabilities in software applications.
Issues Preventing Development of Secure Open Source Software
One of the assumptions that is typically made about Open Source software is that the availability of source code translates to "peer review" of the software application. However, the anecdotal experience of a number of Open Source developers including John Viega belies this assumption.
The term "peer review" implies an extensive review of the source code of an application by competent parties. Many Open Source projects do not get peer reviewed for a number of reasons including:
- complexity of code, in addition to a lack of documentation, makes it
difficult for casual users to understand the code enough to give a
proper review.
- developers making improvements to the application typically focus
only on the parts of the application that will affect the feature to be
added instead of the whole system.
- ignorance of developers to security concerns.
- complacency in the belief that since the source is available
it is being reviewed by others.
Benefits of Open Source to Security-Conscious Users
Despite the fact that source licensing and source code availability are not indicators of the security of a software application, there is still a significant benefit of Open Source to some users concerned about security. Open Source allows experts to audit their software options before making a choice and also in some cases to make improvements without waiting for fixes from the vendor or source code maintainer.
One should note that the complexity and size of the code base limit how feasible it is for users to audit the software. For instance, it is unlikely that a user who is considering Linux as a web server for a personal homepage will scrutinize the TCP/IP stack code.
References
- Frankl, Phyllis et al. Choosing a Testing Method to Deliver Reliability. Proceedings of the 19th International Conference on Software Engineering, pp. 68-78, ACM Press, May 1997. <http://citeseer.nj.nec.com/frankl97choosing.html>
- Hamlet, Dick. Software Quality, Software Process, and Software Testing. 1994. <http://citeseer.nj.nec.com/hamlet94software.html>
- Hayes, I.J., C.B. Jones and J.E. Nicholls. Understanding the differences between VDM and Z. Technical Report UMCS-93-8-1, University of Manchester, Computer Science Dept., 1993. <http://citeseer.nj.nec.com/hayes93understanding.html>
- Miller, Todd C. and Theo de Raadt. strlcpy and strlcat - consistent, safe, string copy and concatenation. Proceedings of the 1999 USENIX Annual Technical Conference, FREENIX Track, June 1999. <http://www.usenix.org/events/usenix99/full_papers/millert/millert_html/>
- Viega, John. The Myth of Open Source Security. Earthweb.com. <http://www.earthweb.com/article/0,,10455_626641,00.html>
- Gonzalez-Barahona, Jesus M. et al. Counting Potatoes: The Size of Debian 2.2. <http://people.debian.org/~jgb/debian-counting/counting-potatoes/>
- Wheeler, David A. More Than A Gigabuck: Estimating GNU/Linux's Size. <http://www.counterpane.com/crypto-gram-0003.html>
Acknowledgements
The following people helped in proofreading this article and/or offering suggestions about content: Jon Beckham, Graham Keith Coleman, Chris Bradfield, and David Dagon. © 2002 Dare Obasanjo
-
The Myth of Open Source Security Revisited v2.0
Dare Obasanjo contributed this followup to an article entitled The Myth of Open Source Security Revisited that appeared on the website kuro5hin. He writes: "The original article tackled the common misconception amongst users of Open Source Software (OSS) that OSS is a panacea when it comes to creating secure software. The article presented anecdotal evidence taken from an article written by John Viega, the original author of GNU Mailman, to illustrate its point. This article follows up the anecdotal evidence presented in the original paper by providing an analysis of similar software applications, their development methodology and the frequency of the discovery of security vulnerabilities." Read on below for his detailed analysis, especially relevant with the currency of security initiatives in the worlds of both open- and closed-source software.
The Myth of Open Source Security Revisited v2.0
The purpose of this article is to expose the fallacy of the belief in the "inherent security" of Open Source software and instead point to a truer means of ensuring that the security of a piece of software is high.
Apples, Oranges, Penguins and Daemons
When performing experiments to confirm a hypothesis on the effect of a particular variable on an event or observable occurrence, it is common practice to utilize control groups. In an attempt to establish cause and effect in such experiments, one tries to hold constant all variables that may affect the outcome except for the variable that the experiment is interested in. Comparisons of the security of software created by Open Source processes and software produced in a proprietary manner have typically involved several variables besides development methodology.
A number of articles have been written that compare the security of Open Source development to proprietary development by comparing security vulnerabilities in Microsoft products to those in Open Source products. Noted Open Source pundit Eric Raymond wrote an article on NewsForge in which he compares Microsoft Windows and IIS to Linux, BSD and Apache. In the article, Raymond states that Open Source development implies that "security holes will be infrequent, the compromises they cause will be relatively minor, and fixes will be rapidly developed and deployed." However, upon investigation it is disputable that Linux distributions have fewer or less severe security vulnerabilities than recent versions of Windows. In fact, the belief in the inherent security of Open Source software over proprietary software seems to be the product of a single comparison: Apache versus Microsoft IIS.
There are a number of variables involved when one compares the security of software such as Microsoft Windows operating systems to Open Source UNIX-like operating systems, including the disparity in their market share, the requirements and dispensations of their user base, and the differences in system design. To better compare the impact of source code licensing on the security of the software, it is wise to reduce the number of variables that will skew the conclusion. To this effect it is best to compare software with similar system design and user base rather than software applications that are significantly distinct. The following section analyzes the frequency of the discovery of security vulnerabilities in UNIX-like operating systems including HP-UX, FreeBSD, Red Hat Linux, OpenBSD, Solaris, Mandrake Linux, AIX and Debian GNU/Linux.
Security Vulnerability Face-Off
Below is a listing of UNIX and UNIX-like operating systems with the number of security vulnerabilities that were discovered in them in 2001 according to the Security Focus Vulnerability Archive.
- AIX: 10 vulnerabilities [6 remote, 3 local, 1 both]
- Debian GNU/Linux: 13 vulnerabilities [1 remote, 12 local] + 1 Linux kernel vulnerability [1 local]
- FreeBSD: 24 vulnerabilities [12 remote, 9 local, 3 both]
- HP-UX: 25 vulnerabilities [12 remote, 12 local, 1 both]
- Mandrake Linux: 17 vulnerabilities [5 remote, 12 local] + 12 Linux kernel vulnerabilities [5 remote, 7 local]
- OpenBSD: 13 vulnerabilities [7 remote, 5 local, 1 both]
- Red Hat Linux: 28 vulnerabilities [5 remote, 22 local, 1 unknown] + 12 Linux kernel vulnerabilities [6 remote, 6 local]
- Solaris: 38 vulnerabilities [14 remote, 22 local, 2 both]
From the above listing one can infer that source licensing is not a primary factor in determining how prone to security flaws a software application will be. Specifically, proprietary and Open Source UNIX family operating systems are represented on both the high and low ends of the frequency distribution.
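To make the "both ends of the distribution" observation easy to check, the listing above can be tabulated and sorted. The following is a minimal sketch in Python; the dictionary layout and the function name `ranked` are illustrative assumptions, and the only data used are the counts quoted above plus the licensing status of each system. A kernel count of 0 simply means the listing gives no separate kernel figure for that system.

```python
# Vulnerability counts for 2001 as quoted in the listing above.
# "kernel" holds the separately listed Linux kernel counts; 0 means the
# listing gives no separate kernel figure for that system.
COUNTS = {
    "AIX": {"license": "proprietary", "vulns": 10, "kernel": 0},
    "Debian GNU/Linux": {"license": "open source", "vulns": 13, "kernel": 1},
    "FreeBSD": {"license": "open source", "vulns": 24, "kernel": 0},
    "HP-UX": {"license": "proprietary", "vulns": 25, "kernel": 0},
    "Mandrake Linux": {"license": "open source", "vulns": 17, "kernel": 12},
    "OpenBSD": {"license": "open source", "vulns": 13, "kernel": 0},
    "Red Hat Linux": {"license": "open source", "vulns": 28, "kernel": 12},
    "Solaris": {"license": "proprietary", "vulns": 38, "kernel": 0},
}


def ranked(include_kernel=False):
    """Return (name, license, total) tuples sorted from fewest to most."""
    rows = []
    for name, info in COUNTS.items():
        total = info["vulns"] + (info["kernel"] if include_kernel else 0)
        rows.append((name, info["license"], total))
    # Sort by total, then by name so that ties have a stable order.
    return sorted(rows, key=lambda row: (row[2], row[0]))


if __name__ == "__main__":
    for name, license_kind, total in ranked():
        print(f"{total:3d}  {license_kind:12s} {name}")
```

Sorted this way, a proprietary system (AIX) and two Open Source systems (Debian and OpenBSD) sit at the low end, while Solaris and Red Hat Linux sit at the high end. Passing `include_kernel=True` adds the separately listed kernel counts and changes the order in the middle and at the top, but both licensing models still appear at each end.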
Factors that have been known to influence the security and quality of a software application are practices such as code auditing (peer review), security-minded architecture design, strict software development practices that restrict certain dangerous programming constructs (e.g. using the str* or scanf* family of functions in C) and validation & verification of the design and implementation of the software. Reducing the focus on deadlines and shipping only when the system is in a satisfactory state is also important.
Both the Debian and OpenBSD projects exhibit many of the aforementioned characteristics, which helps explain why they are the Open Source UNIX operating systems with the best security record. Debian's track record is particularly impressive when one realizes that Debian Potato consists of over 55 million lines of code (compared to Red Hat's 30 million lines of code).
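The size comparison can be turned into a rough rate. The sketch below divides the distribution-level vulnerability counts from the earlier listing by the line counts quoted above. It is only an illustration under loud assumptions: the line counts are approximate ("over 55 million" is a lower bound), the counts and the code sizes may not refer to exactly the same releases, and the function name is invented for this example.

```python
def vulns_per_million_lines(vulnerabilities, lines_of_code):
    """Vulnerabilities reported per million lines of source code."""
    return vulnerabilities / (lines_of_code / 1_000_000)


# Distribution-level counts from the listing; approximate line counts
# as quoted in the text (assumed to be comparable, which may not hold).
debian = vulns_per_million_lines(13, 55_000_000)
red_hat = vulns_per_million_lines(28, 30_000_000)

print(f"Debian:  {debian:.2f} per million lines")
print(f"Red Hat: {red_hat:.2f} per million lines")
```

On these figures Debian comes out at roughly 0.24 reported vulnerabilities per million lines against roughly 0.93 for Red Hat, which is the sense in which the larger code base makes Debian's record look stronger.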
The Road To Secure Software
Exploitable security vulnerabilities in a software application are typically evidence of bugs in the design or implementation of the application. Thus the process of writing secure software is an extension of the process behind writing robust, high quality software. Over the years a number of methodologies have been developed to tackle the problem of producing high quality software in a repeatable manner within time and budgetary constraints. The most successful methodologies have typically involved using the following software quality assurance, validation and verification techniques: formal methods, code audits, design reviews, extensive testing and codified best practices.
-
Formal Methods: One can use formal proofs based on mathematical
methods and rigor to verify the correctness of software algorithms.
Notations for specifying software formally exist, such as VDM and Z.
Z (pronounced 'zed') is a formal specification notation based on set
theory and first order predicate logic. VDM stands for "The Vienna
Development Method", which consists of a specification language called
VDM-SL, rules for data and operation refinement which allow one to
establish links between abstract requirements specifications and
detailed design specifications down to the level of code, and a proof
theory in which rigorous arguments can be conducted about the properties
of specified systems and the correctness of design decisions. The
previous descriptions were taken from the Z FAQ and the VDM FAQ
respectively. A comparison of both specification languages is
available in the paper Understanding the differences between VDM and Z
by I.J. Hayes et al.
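To give a flavor of what specifying an operation by its required properties looks like, here is a minimal sketch in Python rather than in VDM-SL or Z. It states a precondition and a postcondition for an integer square root and checks them at run time. This is an assumption-laden illustration only: run-time checks cover just the inputs actually tried, whereas a formal proof argues about all inputs, and the function name is invented for the example.

```python
import math


def integer_sqrt(n):
    """Return the largest integer r with r*r <= n.

    Precondition:  n is a non-negative integer.
    Postcondition: r*r <= n < (r+1)*(r+1).
    """
    # Precondition check: reject inputs the specification does not cover.
    if not isinstance(n, int) or n < 0:
        raise ValueError("precondition violated: n must be a non-negative integer")
    r = math.isqrt(n)
    # Postcondition check: the result must satisfy the specification.
    assert r * r <= n < (r + 1) * (r + 1), "postcondition violated"
    return r
```

The specification says what the result must satisfy without saying how to compute it, which is what allows the implementation to be replaced later while the stated properties stay fixed.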
-
Code Audits: Reviews of source code by developers other than the
author of the code are good ways to catch errors that may have been
overlooked by the original developer. Source code audits can vary from
informal reviews with little structure to formal code inspections or
walkthroughs. Informal reviews typically involve the developer sending
the reviewers source code or descriptions of the software for feedback
on any bugs or design issues. A walkthrough involves the detailed
examination of the source code of the software in question by one or more
reviewers. An inspection is a formal process where a detailed examination
of the source code is directed by reviewers who act in certain roles. A
code inspection is directed by a "moderator", the source code is read by a
"reader" and issues are documented by a "scribe".
-
Testing: The purpose of testing is to find failures. Unfortunately,
no known software testing method can discover all possible failures that
may occur in a faulty application and metrics to establish such details
have not been forthcoming. Thus a correlation between the quality of a
software application and the amount of testing it has endured is
practically non-existent.
There are various categories of tests including unit, component, system, integration, regression, black-box, and white-box tests. There is some overlap among these testing categories.
Unit testing involves testing small pieces of functionality of the application such as methods, functions or subroutines. In unit testing it is usual for other components that the software unit interacts with to be replaced with stubs or dummy methods. Component tests are similar to unit tests with the exception that dummy and stub methods are replaced with the actual working versions. Integration testing involves testing related components that communicate with each other, while system tests involve testing the entire system after it has been built. System testing is necessary even if extensive unit or component testing has occurred because it is possible for separate subroutines to work individually but fail when invoked sequentially due to side effects or some error in programmer logic. Regression testing is the process of ensuring that modifications to a software module, component or system have not introduced errors into the software. A lack of sufficient regression testing is one of the reasons why certain software patches break components that worked prior to installation of the patch.
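The difference between a unit test with a stub and a regression test can be sketched briefly. In the Python example below every name (`PriceService`, `StubPriceService`, `order_total`) is hypothetical and exists only for illustration; the stub replaces a dependency that would otherwise need a network connection, and one test is kept in the suite purely to guard against an old fault returning.

```python
import unittest


class PriceService:
    """Hypothetical real dependency; imagine it queries a remote server."""

    def price_of(self, item):
        raise NotImplementedError("network access not available in this sketch")


class StubPriceService:
    """Stub standing in for PriceService during unit tests."""

    def __init__(self, prices):
        self._prices = prices

    def price_of(self, item):
        return self._prices[item]


def order_total(items, price_service):
    """Unit under test: sum the prices of the (item, quantity) pairs."""
    return sum(price_service.price_of(item) * qty for item, qty in items)


class OrderTotalTests(unittest.TestCase):
    def setUp(self):
        self.prices = StubPriceService({"disk": 80, "cable": 5})

    def test_single_item(self):
        self.assertEqual(order_total([("disk", 1)], self.prices), 80)

    def test_quantities_are_multiplied(self):
        self.assertEqual(order_total([("disk", 2), ("cable", 3)], self.prices), 175)

    def test_empty_order_regression(self):
        # Regression test: kept in the suite so that a later change cannot
        # silently break the (hypothetical) previously fixed empty-order case.
        self.assertEqual(order_total([], self.prices), 0)


def run_suite():
    """Run the tests above and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(OrderTotalTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_suite()
```

A component test in the sense used above would run the same assertions with the real `PriceService` in place of the stub.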
Black-box testing, also called functional testing or specification testing, tests the behavior of the component or system without requiring knowledge of the internal structure of the software. Black-box testing is typically used to test that software meets its functional requirements. White-box testing, also called structural or clear-box testing, involves tests that utilize knowledge of the internal structure of the software. White-box testing is useful in ensuring that certain statements in the program are exercised and errors discovered. Code coverage tools aid in discovering what percentage of a system is being exercised by the tests.
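A small sketch may help separate the two approaches. In the Python example below the function, its specification and the branch labels are all invented for illustration, and the hand-written `executed_branches` set is only a stand-in for what a real code coverage tool records automatically.

```python
# Hand-rolled stand-in for a coverage tool: each branch records its label.
executed_branches = set()
ALL_BRANCHES = {"negative", "free", "standard", "heavy"}


def shipping_fee(weight_kg):
    """Specification: free under 1 kg, 5 up to 10 kg inclusive, 20 above that."""
    if weight_kg < 0:
        executed_branches.add("negative")
        raise ValueError("weight cannot be negative")
    if weight_kg < 1:
        executed_branches.add("free")
        return 0
    if weight_kg <= 10:
        executed_branches.add("standard")
        return 5
    executed_branches.add("heavy")
    return 20


def black_box_tests():
    # Derived from the specification alone: one typical value per stated range.
    assert shipping_fee(0.5) == 0
    assert shipping_fee(3) == 5
    assert shipping_fee(25) == 20


def white_box_tests():
    # Chosen after reading the code: the values at each comparison
    # and the error branch the specification does not mention.
    assert shipping_fee(1) == 5
    assert shipping_fee(10) == 5
    try:
        shipping_fee(-1)
    except ValueError:
        pass
    else:
        raise AssertionError("negative weight should be rejected")


def coverage_percent():
    """Percentage of known branches executed so far."""
    return 100 * len(executed_branches & ALL_BRANCHES) / len(ALL_BRANCHES)
```

Running `black_box_tests()` alone exercises three of the four branches; adding `white_box_tests()` reaches the remaining error branch, which is the kind of gap a coverage report is meant to reveal.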
More information on testing can be found at the comp.software.testing FAQ.
-
Design Reviews: The architecture of a software application can be
reviewed in a formal process called a design review. In design reviews the
developers, domain experts and users verify that the design of the
system meets the requirements and that it contains no significant flaws
of omission or commission before implementation begins.
-
Codified Best Practices: Some programming languages have libraries
or language features that are prone to abuse and are thus prohibited in
certain disciplined software projects. Functions like
strcpy, gets, and scanf in C are examples of library functions that are poorly designed and allow malicious individuals to use buffer overflows or format string attacks to exploit the security vulnerabilities exposed by using these functions. A number of platforms explicitly disallow gets, especially since alternatives exist. Programming guidelines such as those written by Peter Galvin in a Unix Insider article on designing secure software are used by development teams to reduce the likelihood of security vulnerabilities in software applications.
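One way a project might enforce such a prohibition is with a simple check run over its C sources. The following Python sketch is an assumption, not a description of any project's actual tooling: the list of banned names comes from the functions mentioned above, and the pattern matching is deliberately naive, so calls inside comments or string literals would also be reported.

```python
import re

# Functions named in the text as prone to abuse; a real project
# would maintain its own list.
BANNED_FUNCTIONS = ("strcpy", "gets", "scanf")

# Match the name as a whole identifier followed by an opening parenthesis,
# so that fgets( or strcpy_checked( are not reported.
_CALL_PATTERN = re.compile(
    r"(?<![A-Za-z0-9_])(" + "|".join(BANNED_FUNCTIONS) + r")\s*\("
)


def find_banned_calls(c_source):
    """Return (line_number, function_name) for each apparent banned call."""
    findings = []
    for line_number, line in enumerate(c_source.splitlines(), start=1):
        for match in _CALL_PATTERN.finditer(line):
            findings.append((line_number, match.group(1)))
    return findings


if __name__ == "__main__":
    sample = """\
#include <stdio.h>
#include <string.h>
int main(void) {
    char name[16];
    char copy[16];
    gets(name);
    strcpy(copy, name);
    fgets(name, sizeof name, stdin);
    return 0;
}
"""
    for line_number, function_name in find_banned_calls(sample):
        print(f"line {line_number}: avoid {function_name}()")
```

On the sample, the check reports the gets call and the strcpy call but not the bounded fgets call, which is one of the alternatives to gets.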
Issues Preventing Development of Secure Open Source Software
One of the assumptions that is typically made about Open Source software is that the availability of source code translates to "peer review" of the software application. However, the anecdotal experience of a number of Open Source developers including John Viega belies this assumption.
The term "peer review" implies an extensive review of the source code of an application by competent parties. Many Open Source projects do not get peer reviewed for a number of reasons including:
- complexity of code, in addition to a lack of documentation, makes it
difficult for casual users to understand the code enough to give a
proper review.
- developers making improvements to the application typically focus
only on the parts of the application that will affect the feature to be
added instead of the whole system.
- ignorance of developers to security concerns.
- complacency in the belief that since the source is available
it is being reviewed by others.
Benefits of Open Source to Security-Conscious Users
Despite the fact that source licensing and source code availability are not indicators of the security of a software application, there is still a significant benefit of Open Source to some users concerned about security. Open Source allows experts to audit their software options before making a choice and also in some cases to make improvements without waiting for fixes from the vendor or source code maintainer.
One should note that the complexity and size of the code base limit how feasible it is for users to audit the software. For instance, it is unlikely that a user who is considering Linux as a web server for a personal homepage will scrutinize the TCP/IP stack code.
References
- Frankl, Phyllis et al. Choosing a Testing Method to Deliver Reliability. Proceedings of the 19th International Conference on Software Engineering, pp. 68-78, ACM Press, May 1997. <http://citeseer.nj.nec.com/frankl97choosing.html>
- Hamlet, Dick. Software Quality, Software Process, and Software Testing. 1994. <http://citeseer.nj.nec.com/hamlet94software.html>
- Hayes, I.J., C.B. Jones and J.E. Nicholls. Understanding the differences between VDM and Z. Technical Report UMCS-93-8-1, University of Manchester, Computer Science Dept., 1993. <http://citeseer.nj.nec.com/hayes93understanding.html>
- Miller, Todd C. and Theo de Raadt. strlcpy and strlcat - consistent, safe, string copy and concatenation. Proceedings of the 1999 USENIX Annual Technical Conference, FREENIX Track, June 1999. <http://www.usenix.org/events/usenix99/full_papers/millert/millert_html/>
- Viega, John. The Myth of Open Source Security. Earthweb.com. <http://www.earthweb.com/article/0,,10455_626641,00.html>
- Gonzalez-Barahona, Jesus M. et al. Counting Potatoes: The Size of Debian 2.2. <http://people.debian.org/~jgb/debian-counting/counting-potatoes/>
- Wheeler, David A. More Than A Gigabuck: Estimating GNU/Linux's Size. <http://www.counterpane.com/crypto-gram-0003.html>
Acknowledgements
The following people helped in proofreading this article and/or offering suggestions about content: Jon Beckham, Graham Keith Coleman, Chris Bradfield, and David Dagon. © 2002 Dare Obasanjo