Domain: gutenberg.org
Stories and comments across the archive that link to gutenberg.org.
Stories · 15
-
Project Gutenberg Blocks German Users After Outrageous Court Ruling (teleread.org)
Slashdot reader David Rothman writes: The oldest public domain publisher in the world, Project Gutenberg, has blocked German users after an outrageous legal ruling saying this American nonprofit must obey German copyright law... Imagine the technical issues for fragile, cash-strapped public domain organizations -- worrying not only about updated databases covering all the world's countries, but also applying the results to distribution. TeleRead carries two views on the German case involving a Holtzbrinck subsidiary...
Significantly, older books provide just a tiny fraction of the revenue of megaconglomerates like Holtzbrinck but are essential to students of literature and indeed to students in general. What's more, as illustrated by the Sonny Bono Copyright Term Extension Act in the U.S., copyright law in most countries tends to reflect the wishes and power of lobbyists more than it does the commonweal. Ideally the travails of Project Gutenberg will encourage tech companies, students, teachers, librarians and others to step up their efforts against oppressive copyright laws. While writers and publishers deserve fair compensation, let's focus more on the needs of living creators and less on the estates of authors dead for many decades. The three authors involved in the German case are Heinrich Mann (died in 1950), Thomas Mann (1955) and Alfred Döblin (1957).
One solution in the U.S. and elsewhere for modern creators would be national library endowments... Meanwhile, it would be very fitting for Google and other deep-pocketed corporations with an interest in a global Internet and more balanced copyright to help Gutenberg finance its battle. Law schools, other academics, educators and librarians should also offer assistance. -
Text Analyzer Reveals Emotional 'Temperature' of Novels and Fairy Tales
KentuckyFC writes "Stories are a powerful channel for communicating emotions. But while they have been studied in detail by generations of critics, there is little in the way of objective tools for analyzing and comparing their emotional content. That looks set to change thanks to one data mining researcher who has applied the process of sentiment analysis to novels and fairy tales that have been digitized on Project Gutenberg and the Google Books Corpus. The results show the density of emotions in different parts of a story and how the emotional 'temperature' changes throughout the tale. For example, this guy has used the technique to compare the emotional content of the entire collection of the Brothers Grimm fairy tales to reveal that the darkest story is a tale called Gambling Hansel; clearly a lesson to us all." -
25000 Books Proofread By Project Gutenberg Distributed Proofreaders
New submitter fritsd writes "Project Gutenberg Distributed Proofreaders, a volunteer site which helps provide public domain books to Project Gutenberg, announced that their 100 000+ volunteers have reached the milestone of 25 000 books scanned, OCRed, and then meticulously proofread." The 25000th title is The Art and Practice of Silver Printing by Capt. Abney and H. P. Robinson. -
Do E-Readers Spell the Demise Of Traditional Schooling?
Attila Dimedici writes "I came across an article this morning that suggests that the Nook and the Kindle have changed things in such a way that schools are becoming obsolete. His premise is that the ideal way to teach children is by a tutor ..., [and] the Nook and the Kindle have allowed large amounts of written material on many different subjects to become accessible enough that parents can tutor their children at a price that just about everyone can afford." The author is a bit off-base on the nature of public schooling, but easy access to resources like Project Gutenberg and Wikibooks certainly removes some barriers to self-study and the limitations of the 20+ child classroom. -
Michael Hart, Inventor of the E-book, Dead At 64
FeatherBoa writes "Michael Hart, the founder and long time driving force behind Project Gutenberg and 1971 inventor of the electronic book, has died at his home in Urbana, Ill., on Sept. 6th, 2011. Project Gutenberg is recognized as one of the earliest and longest-lasting online literary projects, and has spawned sister projects in Australia, Canada, Germany and other locations to transcribe public domain literature and make it available via the Internet." -
On iPhone, Searching For Kama Sutra = Porn
heychris writes "Eucalyptus, an ebook app for iPhone, has been rejected from the App Store for 'objectionable content.' What's so objectionable? The Kama Sutra, available from Project Gutenberg, which is available on other ebook readers as well. Not only that, but the screenshot shows that you would have to search for Kama Sutra to get it; it's not built in to Eucalyptus. The author is reasonable but frustrated, while Herr Gruber is more succinct." I wonder how good the now-cheap Nokia 810 is as an e-book reader. -
OLPC To Be Distributed To US Students
eldavojohn writes "The One Laptop Per Child Project plans to launch OLPC America in 2008, to distribute the low-cost laptop computers originally intended for developing nations to needy students here in the United States. Nicholas Negroponte is quoted as saying, 'We are doing something patriotic, if you will, after all we are and there are poor children in America. The second thing we're doing is building a critical mass. The numbers are going to go up, people will make more software, it will steer a larger development community.'" -
SGI Arises From the Ashes
eldavojohn writes "Six months ago, Slashdot reported on SGI's filing of Chapter Eleven Bankruptcy. I wondered why Slashdot kept the Silicon Graphics category with them now defunct. But Chapter Eleven means a reorganization — not liquidation. And, surprisingly, SGI has dusted itself off and stood back up. What did they dust off? About $150 million worth of spending a year. Will this reorganization put them back as a player in the graphics game? Maybe but as the article notes, they have some stiff competition that offer comparable services for less money. Is this a phoenix story or the final death throes of the company?" To be honest, no one here suspected a thing. We just keep the old topics around so it's still possible to find old stories related to them. Sometimes (like now!) they even still come in handy. -
A Repository for Multimedia in the Public Domain?
8tim8 asks: "I was looking through my uni's record library yesterday (where they have lots of old jazz records) and it made me wonder, Is there anything like Project Gutenberg for audio or video files? I'm not talking about just a place to download old audio or video files, I'm talking about somewhere that has lots of old broadcasts/movies, and has actually checked to verify that what they have is in the public domain. It seems like there must be lots of stuff in the public domain...is there a place that lets people access it?" -
The Early History of Nupedia and Wikipedia, Part II
Today, read the continuation of Larry Sanger's account of the early history of Nupedia and Wikipedia (below), in which Sanger talks about the difficulties of governance in a large, free-wheeling project, some final attempts to save Nupedia, and how he came to resign from the organization. (And if you missed it, you might want to start with yesterday's installment.)
Contents:
Why Wikipedia started working
A series of controversies
The governance challenge
My resignation and final few months with the project
Some final attempts to save Nupedia
Conclusions
Why Wikipedia started working
This is a good place to explain why Wikipedia actually got started and why it worked (and still does work, at least as well as it does). The explanation involves a combination of quite a few factors, some borrowed from the open source movement, some borrowed from wiki software and culture, and some more idiosyncratic:
- Open content license. We promised contributors that their work would always remain free for others to read. This, as is well known, motivates people to work for the good of the world--and for the many people who would like to teach the whole world, that's a pretty strong motivation.
- Focus on the encyclopedia. We said that we were creating an encyclopedia, not a dictionary, etc., and we encouraged people to stick to creating the encyclopedia and not use the project as a debate forum.
- Openness. Anyone could contribute. Everyone was specifically made to feel welcome. (E.g., we encouraged the habit of writing on new contributors' user pages, "Welcome to Wikipedia!" etc.) There was no sense that someone would be turned away for not being bright enough, or not being a good enough writer, or whatever.
- Ease of editing. Wikis are pretty easy for most people to figure out. In other collaborative systems (like Nupedia), you have to learn all about the system first. Wikipedia had an almost flat learning curve.
- Collaborate radically; don't sign articles. Radical collaboration, in which (in principle) anyone can edit any part of anyone else's work, is one of the great innovations of the open source software movement. On Wikipedia, radical collaboration made it possible for work to move forward on all fronts at the same time, to avoid the big bottleneck that is the individual author, and to burnish articles on popular topics to a fine luster.
- Offer unedited, unapproved content for further development. This is required if one wishes to collaborate radically. We encouraged people to put up their unfinished drafts--as long as they were at least roughly correct--with the idea that they can only improve if there are others collaborating. This is a classic principle of open source software. It helped get Wikipedia started and helped keep it moving. This is why so many original drafts of Wikipedia articles were basically garbage (no offense to anyone--some of my own drafts were sometimes garbage), and also why it is surprising to the uninitiated that many articles have turned out very well indeed.
- Neutrality. A firm neutrality policy made it possible for people of widely divergent opinions to work together, without constantly fighting. It's a way to keep the peace.
- Start with a core of good people. I think it was essential that we began the project with a core group of intelligent good writers who understood what an encyclopedia should look like, and who were basically decent human beings.
- Enjoy the Google effect. We had little to do with this, but had Google not sent us an increasing amount of traffic each time they spidered the growing website, we would not have grown nearly as fast as we did. (See below.)
That's pretty much it. The focus on the encyclopedia provided the task and the open content license provided a natural motivation: people work hard if they believe they are teaching the world stuff. Openness and ease of editing made it easy for new people to join in and get to work. Collaboration helped move work forward quickly and efficiently, and posting unedited drafts made collaboration possible. The fact that we started with a core of good people from Nupedia meant that the project could develop a functional, cooperative community. Neutrality made it easy for people to work together with relatively little conflict. And the Google effect provided a steady supply of "fresh blood"--who in turn supplied increasing amounts of content.
Probably, all or nearly all other project rules were either optional, or straightforward applications of these principles. The project probably would still have succeeded nicely even if it had moderated or tweaked some of the above principles. For instance, radical openness, that is, being open even to those who brazenly flouted and disrespected the project's mission, was surely not necessary; after all, without them, the project would have been more welcoming to the many people who felt they could not work with such difficult people. And if we had required people to sign in, that would not have made very much difference (although it probably would have made some in the beginning; the project wouldn't have grown as fast). Of course we didn't have to use the GNU FDL for the license. Certainly, we did not need to set the community up initially as an anarchy governed by some vague consensus: instead, we could have adopted a charter from the very start. The project could have been managed quite differently; there could have been specially-designated and well-qualified editors. The project could have officially encouraged and deferred to experts. An article approval process could have been adopted without threatening the principle of posting unedited content for collaboration. Certainly, many of the later bells and whistles--the arbitration committee, a three-revert rule, having administrators with the particular configuration of rights they have, etc.--were not absolutely necessary to adopt in the precise forms they took. These differences would not have threatened the basic principles that made the project work, listed above.
So the basic principles that explain why Wikipedia could start working--and still does work--are relatively simple, few in number, and above all general. The more specific principles that Wikipedia wound up with were a matter of historical accident. There was a great deal of "wiggle room." Those intent on studying or replicating the Wikipedia model would do well to bear that in mind.
A series of controversies
So much for the very early history of Wikipedia; the next phase involved rapid growth and some serious internal controversies over policy and authority. If Wikipedia's basic policy was settled upon in the first nine months, its culture was solidified into something closer to its present form in the next nine.
The project continued to grow. We had 6000 articles by July 8; 8000 by August 7; 11,200 by September 9; and 13,000 by October 4. Consulting the website logs, we noted a Google effect: each time Google spidered the website, more pages would be indexed; the greater the number of pages indexed, the more people arrived at the project; the more people involved in the project, the more pages there were to index. In addition to this source of new contributors, Wikipedia was Slashdotted several times, and had large influxes of new users particularly after two articles I wrote for Kuro5hin were posted on Slashdot: "Britannica or Nupedia? The Future of Free Encyclopedias" (July 25, 2001) and "Wikipedia is wide open. Why is it growing so fast? Why isn't it full of nonsense?" (September 24, 2001).
This growth brought difficult challenges, challenges that perhaps I did not sufficiently anticipate and plan for. Some of our earliest contributors were academics and other highly-qualified people, and it seems to me that they were slowly worn down and driven away by having to deal with difficult people on the project. I hope they will not mind that I mention their names, but the two that stick in my mind are J. Hoffman Kemp and Michael Tinkler, a couple of Ph.D. historians. They helped to set what I think was a good precedent for the project in that they wrote about their own areas of expertise, and they contributed under their own, real names. The latter has the salutary effect of making the contributor more serious and more apt to take responsibility for his or her contributions. They are also very nice people, but did not "suffer fools gladly," as the phrase goes. Consequently, they wound up in some pretty silly disputes that would have driven less patient people away instantly. So there was a growing problem: persistent and difficult contributors tend to drive away many better, more valuable contributors; Kemp and Tinkler were only two examples. There were many more who quietly came and quietly left. Short of removing the problem contributors altogether--which we did only in the very worst cases--there was no easy solution, under the system as we had set it up. And I am sorry to have to admit that those aspects of the system that led to this problem were as much my responsibility as anyone else's. Obviously, I would not design the system the same way if given the chance again.
As a result, I grew both more protective of the project and increasingly sensitive to abuse of the system. As I tried to exercise what little authority I claimed, as a corrective to such abuse, many newer arrivals on the scene made great sport of challenging my authority. One of the earliest challenges happened in late summer, 2001. The front page of Wikipedia--then open to anyone to edit, like any other page on the project--was occasionally vandalized with infantile graffiti. Someone then tried to make an archive of the vandalism that had been done to the front page of Wikipedia. I maintained that to make such an archive would be to encourage such vandalism, so I deleted the archive. This occasioned much debate. Then a user made the archive a "subpage" of his own user page--and user pages were generally held to be the bailiwick of the user. Consequently I deleted that subpage, which occasioned a further hue and cry that, perhaps, I was abusing my authority. The vandalism-enshrining user in question proceeded to create a "deleted pages" page, on which the deleted vandalism archives were listed, as if to accuse me of trying to act without public scrutiny; but this was, of course, perfectly acceptable to me. At the time, I thought that this controversy was just as silly as it will sound to most people reading this; I thought that I needed only to "put my foot down" a little harder and, as had happened for the first six months of the project, participants would fall into line. What I did not realize was that this was to be only the first in a long series of controversies, the ultimate upshot of which was to undermine my own moral authority over the project and to make the project as safe as possible for the most abusive and contentious contributors.
Throughout this and other early controversies, much of the debate about project policy was conducted on the wiki itself. Other debates were conducted on mailing lists, Wikipedia-L and then later, for the English language project, WikiEN-L. In addition, people had taken to putting their own essays on Wikipedia, as subpages of their user pages. These too were occasioning debate. It seemed to me, and many other contributors, that this debate was distracting the community from our main goal: to create an encyclopedia. Consequently I proposed that we move the debate to another wiki that was to be created specifically for that purpose--what became known as the "meta wiki." This proposal was very widely supported, so we set it up.
As it happened, the meta-wiki became even more uncontrolled than Wikipedia itself, and for many months was continually infested with contributions by people that can only be called "trolls." That epithet came to be discouraged, however, for reasons soon to be explained. The existence of trolls was a problem we felt we should tolerate--and deal with only verbally, not with harsh penalties--for the sake of encouraging the broadest amount of participation. In the first years, only the worst trolls were ever expelled from the project. I do not know whether this policy has been changed as a result of the operation of the much-later installed Arbitration Committee.
The reasons the meta-wiki became (at least temporarily) more uncontrolled are not far to seek. First, it had no specific purpose, other than to host project debate and essays that do not belong on the main wiki--which was not enough to make anyone care very much about it. Second, because many people did not care what happened on the meta-wiki, they did not do the very necessary weeding that takes place on Wikipedia; besides, as the meta-wiki was a repository of opinion, people felt less comfortable editing or deleting what was, after all, only opinion.
What happened was that project policy discussions moved almost exclusively to the project mailing lists. There is a reason why this was a superior solution to having much debate on an uncontrolled, "unmoderated" wiki. On a wiki, contributions exist in perpetuity, as it were, or until they are deleted or radically changed; consequently, anyone new to a discussion sees the first contribution first. So whoever starts a new page for discussion also, to a great extent, sets the tone and agenda of the discussion. Moreover, nasty, heated exchanges live on forever on a wiki, festering like an open wound, unless deliberately toned down afterwards; if the same exchange takes place on a mailing list, it slips mercifully and quietly into the archives.
At about the same time that we decided to start the meta-wiki, and soon after the vandalism archive affair, I was thinking a great deal about Wikipedia's apparent anarchy, and I wrote an essay titled, "Is Wikipedia an experiment in anarchy?" This and the discussion that ensued tended to ossify positions with regard to the authority issue: I and a few others agreed that Jimmy and I should have special authority within the system, to settle policy issues that needed settling. Jimmy was relatively quiet about this issue; this, I think, was probably because his authority was generally not in question, but mine was, because I was "in the trenches" and continuing to encourage good habits and solidify policy positions.
By November or December of 2001, Wikipedia was growing fast and had become the subject of regular news reporting, even by the likes of The New York Times and MIT's Technology Review; after the two major Slashdottings earlier in the year, we knew that large influxes of members could have a tendency to change the nature of the project, and not necessarily for the better. If there were some major news coverage--an evening news story in the U.S., for example--there might be many new people who would need to be taught about Wikipedia's standards and positive cultural aspects. So I proposed what I thought was a humorously-named "Wikipedia Militia" which would manage new (and very welcome) "invasions" by new contributors. By this time, however, there was a small core group of people who were constantly on the watch for anything that smacked the least bit of authoritarianism; consequently, the name, and various aspects of how the proposal was presented, were vigorously debated. Eventually, we switched to "The Wikipedia Welcoming Committee" and finally, the "Volunteer Fire Department"--which eventually, it seems, fell into disuse.
The governance challenge
After the September Slashdotting, I composed a page originally called "Our Replies to Our Critics" (and now called "Replies to Common Objections"), in which I addressed the problem that "cranks and partisans" might abuse the system:
Moreover--and this is something that you might not be able to understand very well if you haven't actually experienced it--there is a fair bit of (mostly friendly) peer pressure, and community standards are constantly being reinforced. The cranks and partisans, etc., are not simply outgunned. They also receive considerable opprobrium if they abuse the system.
This reflects very well the conception I had in September 2001 of Wikipedia's culture; the reply above was as much hopeful and prescriptive as descriptive. But it turned out to be only partly true. As difficult users began to have more of a "run of the place," in late 2001 and 2002, opprobrium was in fact meted out only piecemeal and inconsistently. It seemed that participation in the community was becoming increasingly a struggle over principles, rather than a shared effort toward shared goals. Any attempt to enforce what should have been set policy--neutrality, no original research, and no wholesale deletion without explanation--was frequently if not usually met with resistance. It was difficult to claim the moral high ground in a dispute, because the basic project principles were constantly coming under attack. Consequently, Wikipedia's environment was not cooperative but instead competitive, and the competition often concerned what sort of community Wikipedia should be: radically anarchical and uncontrolled, or instead more singlemindedly devoted to building an encyclopedia. Sadly, few among those who would love to work on Wikipedia could thrive in such a protean environment.
It is one thing to lack any equivalent to "police" and "courts" that can quickly and effectively eliminate abuse; such enforcement systems were rarely entertained in Wikipedia's early years, because according to the wiki ideal, users can effectively police each other. It is another thing altogether to lack a community ethos that is unified in its commitment to its basic ideals, so that the community's champions could claim a moral high ground. So why was there no such unified community ethos and no uncontroversial "moral high ground"? I think it was a simple consequence of the fact that the community was to be largely self-organizing and to set its own policy by consensus. Any loud minority, even a persistent minority of one person, can remove the appearance of consensus. In fact, I recall that (in October 2002, after I resigned) I felt compelled by ongoing controversies to request that Jimmy declare that certain policies were in fact non-negotiable, which he did. Unfortunately, this declaration was too little, too late, in my opinion.
By late 2001, I had gained both friends and detractors. I think I had become, within the project, a symbol of opposition to anarchism, of the enforcement of standards, and consequently of the exercise of authority in a radically open project. But I was still trying to manage the project as I always had--by force of personality and "moral" authority. So when people arrived who clearly and openly disrespected established policy, I was, in my frustration, very short with them; and when the project continued to try to establish new policies, my role in articulating those policies and actually establishing them (attempting to express a "consensus") was challenged. This undermined what moral authority I had. I felt my job was on the line, and the project continued in turmoil day in and day out; from my point of view, fires were spreading everywhere, and as I had become a somewhat controversial figure, I did not have quite enough allies to help me put them out. Consequently I was rather too peremptory and short with some users. This, however, exacerbated the problem, because the attitude could not be backed up by punishment; harsh words from a leader are empty threats if unenforceable; I thereby handed my anti-authoritarian "wiki-anarchist" opponents an advantage, because--ironically--they were able to portray me as dictatorial, when I was anything but. I came to the view, finally and belatedly, that it would be better to "ignore the trolls." But as it turns out, this is particularly hard to do on a wiki, because, again, unlike on an e-mail list, trollish contributions do not just disappear into the archives; they sit out in the open, as available as the first day they appeared, and "festering." Attempts to delete or radically edit such contributions were often met by reposting the earlier, problem version: the ability to do that is a necessary feature of collaboration. 
Persistent trolls could, thus, be a serious problem, particularly if they were able to draw a sympathetic audience. And there was often an audience of sympathizers: contributors who philosophically were opposed to nearly any exercise of authority, but who were not trolls themselves.
It is surely very ironic that it was I personally who (initially) so strongly supported the lack of any enforceable rules in the community. Some legal theorists would maintain that a community that lacks enforceable rules lacks any law at all. In retrospect it is clear that there was a fundamental problem with my role in the system: to have real authority, I needed both to be able to enforce the rules and, for both fairness and the perception of fairness, there needed to be clear rules from the beginning. But, by my own design, I had very early on rejected the label "editor-in-chief" and much real enforcement authority; a year into the game, it would have been difficult if not impossible to claim enforcement authority over active but problem users. Moreover, I was the author of the "ignore all rules" rule. My early rejection of any enforcement authority, my attempt to portray myself and behave as just another user who happened to have some special moral authority in the project, and my rejection of rules--these were all clearly mistakes on my part. They did, I think, help the project get off the ground; but I really needed a more subtle and forward-looking understanding of how an extremely open, decentralized project might work.
In retrospect, I wish I had taken Teddy Roosevelt's advice: "Speak softly and carry a big stick." Since my "stick" was very small, I suppose I felt compelled to "speak loudly," which I regret. (This was not such a problem, by the way, on Nupedia; partly, that was because there were not nearly as many problem users on Nupedia, but partly it was because there was clear enforcement authority.) As it turns out, it was Jimmy who spoke softly and carried the big stick; he first exercised "enforcement authority." Since he was relatively silent throughout these controversies, he was the "good cop," and I was the "bad cop": that, in fact, is precisely how he (privately) described our relationship. Eventually, I became sick of this arrangement. Because Jimmy had remained relatively toward the background in the early days of the project, and showed that he was willing to exercise enforcement authority upon occasion, he was never so ripe for attack as I was.
Perhaps the root cause of the governance problem was that we did not realize well enough that a community would form, nor did we think carefully about what this entailed. For months I denied that Wikipedia was a community, claiming that it was, instead, only an encyclopedia project, and that there should not be any serious governance problems if people would simply stick to the task of making an encyclopedia. This was strictly wishful thinking. In fact, Wikipedia was from the beginning and is both a community and an encyclopedia project. And for a community attempting to achieve something, to be serious, effective, and fair, a charter seems necessary. In short, a collaborative community would do well to think of itself as a polity with everything that that entails: a representative legislative, a competent and fair judiciary, and an effective executive, all defined in advance by a charter. There are special requirements of nearly every serious community, however, best served by relevant experts; and so I think a prominent role for the relevant experts should be written into the charter. I would recommend all of this to anyone launching a serious online community. But indeed, in January 2001, we were in both "uncharted" and "unchartered" territory. The world, I think, will be able to benefit from this and our other initial mistakes.
But in fairness to ourselves, it was a good idea to allow the community to decide by experience and consensus what article content rules to endorse. This allowed us to generate a very sensible set of article content rules. To be clear, I think it was not such a good idea to apply the same thinking to the organization of the community itself; we should have acknowledged that a community would form, that it would have certain persistent and difficult issues that would need to be solved, and that a lack of any effective founding community charter might result in chaos.
My resignation and final few months with the project
Throughout the governance controversy, I was preparing for my wedding, which happened December 1, 2001. A few days after I arrived back from my honeymoon, I was informed that I should probably start looking for another job, because Bomis was having to lay off most of its workers; they had 10-12 workers at the end of 2000, and by the beginning of 2002 they were back to their original 4-5. My salary was reduced in December and then halved in January. This seemed inevitable because Wikipedia was not bringing in any money at all for Bomis, even if Wikipedia was becoming even more of a publicly-recognized, if still modest, success. Our first anniversary came just before we announced having 20,000 articles, and I was invited to talk about the project at Stanford on January 16 (text here; you might notice that I was still plugging the notion of using Nupedia to vet Wikipedia articles, as an answer to the objection that Wikipedia articles are unreliable).
I was officially laid off at the beginning of February, which I announced a few weeks later. I had continued on as a volunteer; Wikipedia and Nupedia were, after all, volunteer projects. But I was laboring in the aftermath of the governance controversies of the previous fall and winter, which promised to make the job of a volunteer project leader even more difficult. Moreover, I had to look for a real job. So throughout the month of February I considered resigning altogether.
But Jimmy had told me the previous December that Bomis would start trying to sell ads on Wikipedia in order to pay for my job. Even in that horrible market for Internet advertising, there were already enough pageviews on Wikipedia that advertising proceeds might have provided me a very meager living. We knew that this would be extremely controversial, because so many of the people who are involved in open source and open content projects absolutely hate the idea of advertising on the web pages of free projects, even to support project organizers. In fact, when this advertising plan was announced, in late February of 2002, the Spanish Wikipedia was forked (something I urged them not to do).
Bomis was not successful in selling any ads for Wikipedia anyway--you might recall that early 2002 was at about the very bottom of the market for Internet advertising. I also had some hope that we might, finally, set up the project's managing nonprofit, which we had discussed doing for a long time (and which eventually did come into being: Wikimedia). The job of setting up the nonprofit was left to me, but ongoing controversies seemed to eat up any time I had for Wikipedia, and frankly I had no idea where to begin. So, after a month without pay, I announced my general resignation; I completely stayed away from the project for a few months.
Just by the way, Wikipedia's offshoot projects--a dictionary, a textbook project, a quotation project, a public domain book repository, etc.--were all started in 2002 or later, and I cannot claim any credit for them. I did supply the name "Wiktionary" in April 2001, more or less on a whim. I quickly disavowed any responsibility for leading any such project, and it seems the Wiktionary project did not start up for another year and a half (December 12, 2002). My view now is that Webster's and the OED are quite good enough as far as English dictionaries go, and there will always be excellent free dictionaries in every language online. To try to develop a dictionary by collaboration among random Internet users, particularly in a completely uncontrolled wiki format, now strikes me as a nonstarter. I confess I am now puzzled why I didn't think so instantly; it was no doubt because I simply was throwing out ideas as they occurred to me, and also because we had too many dictionary definition-type entries in Wikipedia. (So why not give people a place to put their dictionary definitions?--Perhaps that's what I was thinking, but it hardly seems like a good justification for starting a project.) But Jimmy's first reaction was properly skeptical regarding the use of wikis and Ruth Ifcher made a stronger criticism very nicely. Dictionaries, even more than encyclopedias, must be extremely reliable to be even minimally usable; without direct oversight by linguists, a public dictionary project seems pointless. As to the other projects, they are mostly conducted using wikis and according to some of the basic founding principles of Wikipedia. But other sorts of project--for example, textbook projects, quotation repositories, and archives--necessarily require quite different specifications from those of an encyclopedia. For example, the fact that the wiki format works for encyclopedia development hardly means that it is appropriate for the hosting of public domain books. 
Since the same texts are available in many other places online, such as the wonderful Project Gutenberg, why would anyone choose to read The Iliad on a wiki, which could have been subtly changed by any random passer-by, without any oversight by someone who had access to an authoritative text? There is a fact about the way the text actually reads; so is editing via wiki software more apt to increase or reduce the number of errors over other systems, such as Project Gutenberg's? I do not mean to dismiss any such efforts. I simply think that considerable thought needs to be put into exactly how those other projects should be organized: the wiki format is not a magic pill that somehow makes all problems go away. Wiki is just one software paradigm, which must be adapted, supplemented, changed, or replaced in order to solve the unique set of problems a project faces.
In the spring, a controversy erupted. Caring as I did--and as I still do--about the future of free encyclopedias, I felt compelled to get involved. The controversy featured a troll who was putting up huge numbers of screeds on the "meta-wiki" and on Wikipedia as well. The controversy began with a discussion of what to do about, and how to react to, this particular troll. I maintained that one should not "feed the troll," and that the troll should be "outed" (it was an anonymous user, but it was not hard to use Google to determine the identity of the troll) and shamed.
There resulted a broader controversy about how to treat problem users generally. There were, as I recall, two main schools of thought. One, to which I adhered and still adhere, was that bona fide trolls should be "named and shamed" and, if they were unresponsive to shaming, they should be removed from the project (by a fair process) sooner rather than later. We held that a collaborative project requires commitment to ethical standards which are--as all ethical standards ultimately are--socially established by pointing out violations of those standards. Hence naming and shaming. A second school of thought held that all Wikipedia contributors, even the most difficult, should be treated respectfully and with so-called WikiLove. Hence trolls were not to be identified as such (since "troll" is a term of abuse), and were to be removed from the project only after a long (and painful) public discussion. For the latter school, it seemed to me, the only really egregious faux pas one could commit in the project was to suggest that there were objective standards that could be enforced via "shaming."
I felt at the time that the prevalence of the second school entailed rejection of both objective standards and rules-based authority. It is impossible to explain why one is removing some partisan screeds from the wiki without, in some way, identifying it as a partisan screed, and pointing out that such productions are inconsistent with the neutrality policy. This will necessarily be received as less than respectful and "loving," especially if one must engage the troll himself in a long, drawn-out dispute; in a very long dispute with any trollish type, it is only a matter of time before some epithet gets bandied about, since they are so darned useful (and accurate) when applied to trollish types. More generally, the very application of rules, or laws, entails a moral judgment, or what for its effectiveness must have the force of a moral judgment. I suppose I agree with those legal theorists who say that there is necessarily, in its core, a moral component to the law. Consequently, the new policy of "WikiLove" handed trolls and other difficult users a very effective weapon for purposes of combatting those who attempted to enforce rules. After all, any forthright declaration that a user is doing something that is clearly against established conventions--posting screeds, falsehoods, nonsense, personal opinion, etc.--is nearly always going to appear disrespectful, because such a declaration involves a moral accusation. The only way to avoid such an appearance of disrespect, perhaps, is to step very lightly and use much flattery and qualifications: "Now don't get me wrong, I think you're doing a good job overall, but it seems to me that in this particular case, your contribution is slightly inconsistent with the neutrality policy." Suppose the offender replies: "So what? I disagree with the neutrality policy." Or: "I disagree. What I wrote is perfectly neutral. Who do you think you are, anyway?" It is a very rare person who can practice "WikiLove" in such a case. 
In Wikipedia's developing culture, if anyone reacted out of frustration, or merely attempted to apply the law as a moral instrument, as laws typically are applied, he would become the problem, and a much more serious problem, than mere violations of the neutrality policy, say. The result is that, on pain of becoming persona non grata in the community, one had to treat brazen, self-conscious violators of basic policy with particular respect. It was a perfect coup for the resident wiki-anarchists. I again left the project for several months.
In fall of 2002, I had started teaching at a local community college, and with some extra time on my hands, I started editing Wikipedia a little and engaging in mailing list discussions. I think my first new post to Wikipedia-L, from September 1, 2002, was "Why the free encyclopedia movement needs to be more like the free software movement." In it I argued that the free software movement is led and dominated by highly-qualified programmers, and that the "free encyclopedia movement"--that is, Wikipedia, Nupedia, and other newer projects--needs to move in that direction. I suggested that Nupedia be redesigned to release "approved" versions of Wikipedia articles; Wikipedia itself was not to be touched. This proposal met with a very cool reception. After a few months of discussion, Jimmy himself was "intending to revive Nupedia in the near future" and "thinking very much along the lines of what is being discussed here." Unfortunately, this never happened.
By November or December, I think, I proposed, and Magnus Manske very helpfully coded, an expert-controlled approval process for Wikipedia that was in fact to be independent of both Nupedia and Wikipedia. It would not have affected the Wikipedia editorial process. It would have lived in a separate namespace or domain, as an independent add-on project for Wikipedia. Without explaining the details, expert reviewers, the recruitment of which I would organize, would examine Wikipedia articles and approve or disapprove of particular versions of those articles. We set up a mailing list, Sifter-L (archives no longer online, apparently), which for several weeks discussed policy issues.
There was not a great deal of support for the proposal on Wikipedia-L. There was little or no excitement that the new project might bring into Wikipedia a fresh crop of subject area specialists. But that was fine as far as I was concerned, since the project was to operate independently of Wikipedia. Still, I had the very distinct sense that any specialists arriving on the scene would not necessarily be met with open arms--particularly if before approving an article they wished to make whatever changes to articles that they felt necessary. There were even a few Wikipedians who made it clear that experts should not expect to be treated any differently than anyone else, even when writing about their areas of expertise.
I then considered whether the interaction between Wikipedians and the new reviewers might be a problem after all. Surely, I thought, most specialists would want to edit even very good articles before approving them (in the independent system). This would require that the reviewers interact with Wikipedians. Wikipedia's culture had become such that disrespect of expertise was tolerated, and, again, trolls were merely warned, but very politely (in keeping with the policy of WikiLove), that they please ought to stop their inflammatory behavior. Trolls would certainly find ripe targets in expert reviewers, I thought. I recalled that patient, well-educated Wikipedians like J. Hoffmann Kemp and Michael Tinkler had been driven off the project not only by trolls but by some of the more abrasive and disrespectful regulars. I then considered: could I in good conscience really ask academics, who are very busy, to engage in this activity that would probably annoy most of them and do nothing to contribute to their academic careers? Recruiting for Nupedia was very easy by comparison, and caused me no such pangs of conscience.
I believe it was this problem that finally prompted me, in, I believe, January of 2003, to inform Jimmy as follows (by private e-mail): I was breaking with the project altogether; the only way he could prevent this, I told him, was that he personally crack down on problem users, and make the project more officially welcoming to experts. I also told him that I did not expect this information to change his mind, and that I did not mean to issue an ultimatum. And in fact our exchange did not change his mind. I concluded that we had a fundamental philosophical disagreement about how the project should be run. I respected and still respect his view. That is where matters ended, and it was then that I broke with Wikipedia altogether.
Some final attempts to save Nupedia
Nevertheless, I was interested in pursuing Nupedia's development. It still seemed rescuable to me.
I recall two incidents in which I tried to have Nupedia revived, in 2002 or 2003, but I don't recall exactly. First, I approached Jimmy with the offer to try to find a buyer/managing organization for Nupedia. The suggestion was that, since Bomis did not have enough money to support it, and since Jimmy did not appear to have any specific intentions with the project other than to let it run on the system set up in 2000-1, I might be able to find a university or other organization that would take on the responsibility. I do not recall the details, but we did not pursue this possibility. Second, and later, I offered to buy Nupedia myself--that is, the domain name, the membership list, and whatever other proprietary material Bomis might have controlled. I wanted to start it up again as a simpler, more streamlined, but still fully peer-reviewed project; I thought, moreover, that if I owned it I might be able to give it to a suitable sponsoring educational or nonprofit institution. Jimmy seemed cool to the idea, and did not ask for any specific offers.
Perhaps it is, therefore, not entirely accurate to say that Nupedia died due to the inefficiency of its system. To some extent it was also allowed to die, even after it was clear that its former editor-in-chief expressed an interest in continuing the project under an entirely different system. The result was that, without a leader or organization that could support its mission, Nupedia died a slow death. The server it lived on had some trouble in 2003, and as a result the website went offline. For whatever reason, the website was never brought up again after that.
I obviously cannot speak for Jimmy, but I will say that, if he was worried that Nupedia would essentially fork Wikipedia--again, I don't claim that he had that concern--then it seems to me that such a concern would not have justified letting Nupedia wither untended. The projects, Wikipedia and Nupedia, were naturally complementary parts of a single, symbiotic whole. That at least is how I always regarded them, indeed, from the very founding of Wikipedia. From the founding of Wikipedia, I always thought Wikipedia without Nupedia would have been unreliable, and that Nupedia without Wikipedia would have been unproductive. Together they were to be an "unstoppable high-quality article-creation juggernaut."
It is still disappointing to me that we made plans and promises to thousands of Nupedians, including hundreds of extremely well-qualified people, some of them leaders in their fields. We spent many thousands of person-hours, all told, on the project. I apologize to those people, and I can only hope that they will find some future open content encyclopedia project worthy of their participation, one that will show the world the potential that Nupedia had.
Conclusions
I have some advice for anyone who would like to start new projects on the model of Wikipedia.
You can learn from Wikipedia's success; so, first and most importantly, see above for considerations about why Wikipedia works.
But you can also learn from our mistakes. The following primarily concerns project governance, because governance issues are, in my opinion, the primary failing of Wikipedia. Bear in mind, also, that these are only rough guidelines, for those who are starting projects that have enough resemblance to Wikipedia. These are not perfectly general rules:
- If you intend to create a very large, complex project, establish early on that there will be some non-negotiable policy. Wikis and collaborative projects necessarily build communities, and once a community becomes large enough, it absolutely must have rules to keep order and to keep people at work on the mission of the project. "Force of personality" might be enough to make a small group of people hang together; for better or worse, however, clearly enunciated rules are needed to make larger groups of people hang together.
- Some policy can, with forethought, easily be predicted to be necessary. Articulate this policy as soon as possible. Indeed, consider making a project charter to make it clear from the beginning what the basic principles governing the project will be. This will help the community to run more smoothly and allow participants to self-select correctly.
- Establish any necessary authority early and clearly. Managers should not be afraid to enforce the project charter by removing people from the project; as soon as it becomes necessary, it should be done. Standards that are not enforced in any way do not exist in any robust sense. Do not tolerate deliberate disruption from those who oppose your aims; tell them to start their own project; there's a potentially infinite amount of cyberspace.
- As any disagreements among project managers are apt to be publicly visible in a collaborative project, and as this is apt to undermine the (very important) moral authority of at least one manager, make sure management is on the same page from the beginning--preferably before launch. This requires a great deal of thinking through issues together.
- In knowledge-creation projects, and perhaps many other kinds of projects, make special roles for experts from the very beginning; do not attempt to add those roles later, as an afterthought. Specialists are one of your most important resources, and it is irrational not to use them as much as you can. Preferably, design the charter so that they are included and encouraged. Moreover, make the volunteer project management a meritocracy, and not based on longevity but based on the ability to lead and contribute to the project; that is the only condition under which very many of the best qualified people will want to participate.
Another point needs more in-depth development.
Radical and untried new ideas require constant refinement and adaptation in order to succeed; the first proposal is very rarely the best, and project designers must learn from their mistakes and constantly redesign better projects. Nupedia's Advisory Board failed to admit to inherent flaws in its system, and its delay in admission shut the window of opportunity to its improvement. And it seems to me that the Wikipedia community fell into a mistake by thinking that just one or two features--the wiki feature and the neutrality policy and a few other things--explained Wikipedia's success, and that those features can thus be applied with no significant changes to new projects. But there is no substitute for constant creativity and problem-solving--nor for honesty about what problems need solving. The honesty to recognize problems and creativity in solving them are, after all, what made Wikipedia succeed in the first place.
This is a crucial point: if you use a tool or model from another project, think through very carefully how that tool or model should be adapted. Do not assume that you need to use every feature, or every aspect of the surrounding culture, that you are borrowing. Wikipedia borrowed rather too much from (1) the culture of wikis, (2) unmoderated online discussions, and (3) free-wheeling online culture generally. To be sure, Wikipedia is also a product of those cultures, and works as well as it does largely because of what it borrowed from those cultures. But it also shares some of its more serious current flaws with such cultures. Those planning new projects, or wanting to overhaul old ones, might well bear in mind that a certain cultural context, including the context that has grown up around a tool, just might not be right for that project. Let me elaborate.
(1) Consider first the culture of wikis. On the one hand, I said we wanted to determine the best rules, and experience would help us determine that; so we had no rules to begin with. On the other hand, one might add that another reason we began without rules was that we were partaking in the extremely uncontrolled, free-wheeling nature of "traditional" wikis. I think that's right. But there is an excellent reason why an encyclopedia project should not partake in that extremely uncontrolled nature of wiki culture, and why it should adopt actually enforceable rules: unlike traditional wikis, encyclopedia projects have a very specific aim, with very specific constraints, and efficient work toward that aim, within those constraints, practically requires the adoption of enforceable rules. The mere fact that most wikis, when Wikipedia was created, did not have enforceable rules hardly meant that one could not innovate further, and create one that did have rules.
(2) Moreover, Jimmy and I and most of the first participants on Wikipedia were veterans of unmoderated Internet discussion groups, and hence, naturally, we could appreciate the advantages of letting a virtual community develop in the absence of any real (enforcement) authority. In unmoderated forums there is often found a sense, among some participants, that any attempt to oust a particularly troublesome user amounts to unjustifiable censorship. The result is that the existence of many unmoderated forums online has created a small army of people militantly opposed to the slightest restriction on speech, who feel that they do and should have a right to say whatever they like, wherever they like, online. Any attempt to create and enforce rules for Internet projects, when that small army is ready to cry "censorship," will seem daring or even outrageous in many contexts online. But there is an excellent reason why such anarchy is inappropriate for many projects, including encyclopedia projects, even one that is self-policing like a wiki: there simply must be a way to enforce rules in order for rules to be effective. Given that encyclopedia project development happens almost entirely using words, nearly any rules will also be restrictions on speech. Anyone who advocates many enforceable rules on a collaborative project, in the cultural context of an Internet filled with so many unmoderated discussion groups, can be made to seem reactionary. But this is only a result of that cultural context; in any other context, the existence of rules would be perfectly natural and unobjectionable.
(3) Finally, and generally speaking, the Internet is a great leveller. Since social interaction can proceed among complete strangers who cannot so much as see each other, things that seem to matter in many "meatspace" discussions, such as age, social status, and level of education, are often dismissed as unimportant online. Many Internet forums, chatrooms, and blogs are populated by people who are identified by only a "handle," and any suggestion that communication should be restricted or in any way altered in accordance with "expertise" or "authority" is likely to be met with outrage, in most forums. But there are several excellent and obvious reasons why expertise does need special consideration in an encyclopedia project, and in other collaborative projects. First, there are many subjects that dilettantes cannot write about credibly; I, for example, could not write very credibly about astronomy or speleology, but I have a passing interest in both. If I am working only with other dilettantes, our articles are apt to remain amateurish at best; we can fill in the gaps in each other's knowledge, and do research, but the results will remain problematic until someone with more knowledge of the subject contributes. Second, there are very many specialized subjects about which no one but experts has any significant knowledge at all. Third, it is only the opinions of experts that will be trusted by most of the public as authoritative in determining whether an article is generally reliable or not. Moreover, the standards of public credibility are not likely to be changed by the widespread use of Wikipedia or by online debate about the reliability of Wikipedia. Like them or hate them, those are the facts. But if one points these facts out online, culturally "levelled" as it is, particularly in forums or projects like Wikipedia which go out of their way to ignore individual differences among people, one finds a frosty reception at best.
Consider, if you will, that it was because Wikipedia was started in the context of the ingrained cultures of wikis, of unmoderated discussion forums, and of the levelling, anti-elitist influence of the Internet at large, that it was very difficult for us to exercise the maximal amount of creativity that a maximally successful project would require. In establishing a new cultural context, we were deeply constrained by the old. Now, to be sure, I have said above and many times elsewhere that Wikipedia did not have to adopt the particular conjunction of policies that it did. But it is not surprising that it did adopt its particular conjunction of policies, considering the conjunction of influences on its development. So it would have required much more explanation and persuasion, and indeed, much more struggle, for us to, for example, have persuaded potential participants that some persons, even in a wiki environment, should have special rights that others do not. So powerful is the influence of cultural context that there are quite a few people whose lack of imagination is such that they believe I simply must not understand "why Wikipedia works" if I am willing to suggest that it does not have to work in precisely the way it does work. Constantly-reinforced cultural habits die very hard indeed, and place very strong constraints upon what can be imagined, and what bare possibilities seem even worth thinking about.
But it was our willingness to exercise our creativity and follow our imagination, and create what is, to some extent, a new kind of culture, that led to Wikipedia's success. For the overall project of creating open content encyclopedias--and indeed, for the fantastic collaborative Internet that has yet to be created--to reach its full potential, the process of identifying mistakes honestly and creatively seeking solutions must be ramped up and continued unabated.
Many thanks to Larry Sanger and to O'Reilly for this memoir. -
Low Tech Gutenberg?
Peace Corps Guy asks: "I have a friend who recently left for a two year Peace Corps stint in Mozambique. While there she has limited access to electricity, no technology, and not a lot to do with her 'off' time. She's a big literature fan, and many of us here at home would like to send a care package - but how best to ship pieces of free online text like Project Gutenberg to a developing nation? We can print it (high shipping and printing costs), print it very small and ship her a high quality fresnel lens (awkward), or put it all on a cheap PDA, which would be a high theft risk en route and in situ. High shipping costs on weight and volume are another major limiting factor. What alternative solutions can Slashdot readers suggest for shipping a freely available byte-stream to someone without a computer?" -
Best PDA To Read e-Texts On?
GabrielStrange writes "I've been thinking for a while now that I'd like to own some sort of portable device on which I could read e-Texts. This device should be able to read both simple text files (i.e. Project Gutenberg e-Texts) and more complex formats, like Plucker, Acrobat or Microsoft Reader. It should have a fairly high-res display with a backlight that would be easy on the eyes... but doesn't particularly need to be a color display. I'd like it to work with at least one (if not both) of the machines on my desktop, which run Linux 2.6 and MacOS X Panther... And to use a USB port. And I'd like it to have a built in, rechargeable battery, because I already have enough devices to worry about batteries for. And, of course, I don't want to pay very much for it. Anyone got any recommendations for such a device? It's proving to be almost impossible to even obtain an actual list of devices that have these features." -
Slashback: Hatred, Glass, Identification
Slashback brings you another source for the Unix Haters' Handbook, along with more news on the Caldera v. IBM lawsuit and other updates on topics from XPde to creating a stained-glass computer. Read on below for the details.
Why Yes, you can sell the Free books. ProteusQ writes "Project Gutenberg has released a 'Best Of' CD, April 2003 Edition. The CD compilation is copyrighted and licensed under a Creative Commons license that allows unlimited non-commercial duplication and distribution. You can even sell it, provided that you share 20% of the gross profits with Project Gutenberg. It contains almost 500 books, and the 'Best Of' project itself is based on the Open Source model. All of the work was performed by volunteers (mostly by me, in this case), with the goal of building a volunteer base to create about three editions per year."
Welcome to the American legal system, mind your footing. An anonymous reader submits: "In an e-mail discussion that took place 24 and 25 April, SCO-Caldera Senior Vice President Chris Sontag told MozillaQuest Magazine that there is SCO-owned code in Red Hat and SuSE Linux distributions. He also told MozillaQuest Magazine that the tainted code is not in the Linux kernel that Linus [Torvalds] and others have helped develop. We're talking about what's on the periphery of the Linux kernel."
On this topic, Random BedHead Ed writes "IBM has released its denial of SCO Group's charges that it borrowed proprietary UNIX code in its development of the GNU/Linux system. Story at News.com.com.com.etc. The battle continues.
Also, check out PCLinuxOnline.com for a good summary of the events thus far. They also have a Boycott SCO page if you're interested."
The height of practicality. Jerami Campbell writes "I just saw your article in Slashdot 'Building a stained glass computer case?' I have made several stained glass computer cases, I thought you might be interested in checking them out. You can see all of my cases at lucentrigs.com. I will have a new one finished in a couple of days. It is black glass with a red lava lamp mounted in the front."
Gun buffs have well-adjusted sights. With regard to the MP3-player-in-a-rifle-magazine posted the other day, Mat S. writes "I would be reaaaaally surprised if this fit a standard AK-47, as it is an SVD (Russian infantry rifle, as opposed to the AK, which is in fact a carbine, although called an assault rifle) mag. It accommodates much more powerful ammo, and the cartridges are about 50% longer than the AK's. Thank you for your attention. I still WANT this player. Might be a bit on the heavy side, though. This case is stamped steel, about 3 mm thick :)"
Fair and balanced, naturally. An anonymous reader writes "For those of you who were unable to obtain the Microsoft propaganda about Unix, it's up at MIT."
Note for the humorless: the UHH is not "Microsoft propaganda."
The best Congress money can buy. If you thought Hilary Rosen writing Iraq's copyright law was an isolated incident, don't worry, she's not alone. theodp writes "The RIAA paid $18,000 for the chairman of the House Judiciary Committee to travel to Taiwan and Thailand to make it clear to government officials that the pressure to enforce U.S. laws against pirating of music and movies 'is a unified message coming from all levels of the U.S. government.' Watchdog groups say the trip may have violated House ethics rules, and one is calling for a House Ethics Committee investigation. Rep. Jim Sensenbrenner, R-Wis., said he could have used committee funds to pay for the trip but, 'I thought I would save the taxpayers some money on this.'"
Thanks a bundle.
A considerate way to fool your friends and family. We've mentioned the blink-twice Trompe L'Oeil Windows-looking desktop XPde a few times before; now xexen writes "On April 26th 2003, I received an email. The XPde Team released XPde 0.3.5, a major upgrade to the XPde desktop environment and window manager. Check out the announcement, view the screenshots, or read the detailed ChangeLog."
Build up your frequent flyer miles. A few weeks ago we mentioned that the proceedings of the most recent linux.conf.au (a Linux gathering Down Under) were available as an ISO; hemos, who was on hand at the conference, passes on word that the CDs have been sent out, and points to some more info on the next LCA.
-
Database Nation
We've got a double-headed review of Simson Garfinkel's new book Database Nation: The Death of Privacy at the End of the 21st Century. It's a thought-provoking vision of the future which frankly scares the heck out of me.
Database Nation: The Death of Privacy at the End of the 21st Century
author: Simson Garfinkel
pages: 312
publisher: O'Reilly & Associates
rating: 7/10; 9/
reviewer: Matthias Wenger, Kurt Gray
ISBN: 1-56592-653-6
summary: Thoughtful look at threats to privacy, and appropriate responses
Review 1: Matthias Wenger
Personally, privacy has been a big issue lately -- hearing about DoubleClick and Real Networks customer tracking made the issue a bit of a sore point for me. Then a friend of mine bought a shredder after her credit card fell victim to a Dumpster diver, and I started getting paranoid. Reading Database Nation hasn't helped, but it brings up some possible solutions and provides a good deal to think about as we march blindly on towards Big Brother, Inc.
Database Nation starts out strong, with a hypothetical day in the life of someone with no privacy -- cold-call telemarketing at 6:30 in the morning, surveillance cameras all around, veiled blackmail for a hospital in desperate need of cash and plenty of medical histories, still more cameras at work, etc. This story ends up being a rough outline for the book, which also covers electronic footprints (ATM and credit card records and the like), private databasing a la DoubleClick, identity vs. body, and surprisingly enough, AI and intelligence agents. Each of the major topics covered has at least a full chapter devoted to it -- explaining the specific issues at hand, what sort of data is at risk, who would be interested in such data, and how data can be protected.
The biggest flaw in the book is that it is too ambitious -- how can you cover the sanctity of medical records in 30 pages? It would be difficult to do a better job with such space limitations, certainly, but it does make for a more general view of privacy rather than dealing with specifics. The result is "Privacy in a Nutshell," to steal a turn of phrase from O'Reilly. Given the subject matter, the Nutshell approach might even be preferable, since the theory can be applied in any situation once the awareness is there. Still, each topic felt like it could be expanded much further.
The over-eager breadth of the subject matter is also wonderful. Enough particular concerns are illustrated in each topic that there is an outline of the larger picture of information management even though a good deal remains to be filled in. Covering so many topics makes it easier to see just how much information can be collected about an individual while they remain unaware, and just how much that information can be abused or misused. To illustrate this very point, Garfinkel relates the story of an Internet-based scavenger hunt where the end result was to find out as much as possible about a particular "target," working only with a name. The information collected in 1993 included his place of employment, parents' names, home address, degrees earned, doctoral dissertation, the operating system he used, what his fiancée's name was, and more. I found out five minutes ago, with the help of Google, that he's now married and that he and his wife hyphenated their last names together. That was just the first hit. And that was a very casual search -- if someone were really interested in finding information, what are the limits?
Database Nation is, in a way, the ultimate discussion of information security. Garfinkel covers an amazing range of topics in exploring privacy and personal information today and into the 21st century. This is both a blessing and a curse -- there are so many things to be aware of, so many topics and points of view to consider, yet each one is worthy of more attention. At the opening of the book, Garfinkel expresses hope that Database Nation will do for privacy what Silent Spring did for environmentalism -- if something doesn't do it soon, there won't be any privacy left to save.
Review 2: Kurt Gray
If Simson Garfinkel's name doesn't ring a bell, check the computer section of your local bookstore or library: Garfinkel co-authored the O'Reilly Practical UNIX Security book, the O'Reilly Stopping Spam book, and some six other books. Before I was a Slashdot addict I enjoyed reading Garfinkel's columns in Packet and the Boston Globe, where his talents for technology journalism and futurist projections make informative reading for geeks and lay persons alike.
Just as Upton Sinclair's The Jungle led to sweeping reforms in the meat-packing industry (and probably turned a lot of people to vegetarianism) Garfinkel's latest book, Database Nation, should draw some much-needed attention to the manner in which everyone's personal information is being captured, cataloged and sold as a commodity, and how each aspect of this process detracts from our civil liberties. If you're an American, you certainly know what the IRS is, but have you ever heard of TRW? Equifax? Experian? Or the DMA? Or the MIB, the Medical Information Bureau? Each of these corporate entities keeps records on you that determine your eligibility for bank loans, lines of credit, and medical insurance. Are you allowed to see your own record? Well, it's their data, so it doesn't belong to you -- but maybe if you ask them nicely and have due cause, they'll make an exception. Suppose you discover an error in the records they keep on you; are you allowed to demand corrections? Now you're asking subversive questions so we're putting a CM31 flag on your file ... George Orwell warned that the march of technology could allow a monolithic, tyrannical Big Brother to emerge. Database Nation points out that it's the thousands of unsupervised "kid brothers" that have a far greater potential to disrupt your life, and in ways you never expected.
I find the best way to summarize this book is chapter-by-chapter, so here are my own brief reviews of each chapter:
Chapter 1: Privacy Under Attack: Garfinkel opens with his own futurist vision: a day in the life of a typical working American. This hapless near-future dweller is continuously surrounded by targeted advertising, monitored at home and even in his car, and works in an office where constant politeness is enforced by the company surveillance cameras that are programmed to recognize facial expressions and sound an alarm whenever an employee appears disgruntled. Garfinkel explains that this book is not about Big Brother, but rather how the widespread capture and exchange of our personal information has been eroding our civil liberties already and goes largely unnoticed. Garfinkel makes the positive point that no threat to our privacy that exists today is beyond our control, and that we can develop robust, built-in systems of privacy protection rather than allow them to be only loosely guaranteed by the legal equivalent of patchwork.
Chapter 2: Database Nation: Chapter 2 starts with a historical perspective, answering the question "How did we get here?" In short, via the national census, the Social Security Board (leading to the creation of the National Data Center) and the widespread adoption of the Social Security Number and its inherent flaws (limited data capacity and lack of a checksum digit to avoid clerical errors). Page 26 launches into the disturbing episode of Steve and Nancy Ross, whose lives were shattered when the IRS botched their tax returns in 1983 and put a lien on the Rosses' house for $10,000. That lien was noted in their credit records at TRW and Equifax, which in turn sold this data to 187 other independent credit bureaus. Here Garfinkel makes an interesting observation: the Rosses' bad credit data spread "like a computer virus that kept reinfecting TRW's computer with incorrect information," and it took over seven years for the bulk of their credit problems to subside. Chapter 2 then explains how simple identity theft can be, whether Dumpster diving for credit statements (hint: buy yourself a cross-cut shredder), or using Equifax's quickie credit report service to find chumps with good lines of credit, then applying for new credit cards in the victims' names. Equifax provides such thieves with everything they need: mother's maiden name, previous addresses, SSN -- it's all there. The victim's credit rating is ruined for years while bill collectors harass them day and night, and the credit card company writes off the charges and flags the victim's file. Frequently, the credit thief gets a slap on the wrist if anything at all. Page 33 lists at least 30 government agencies that are hardwired to track you only by your SSN. Chapter 2 definitely had me sitting up and paying attention.
Chapter 3: Absolute Identification: Chapter 3 is about biometrics and unambiguous identification of every member in a society, a seductive idea that has tantalized policymakers for centuries. Garfinkel argues, however, that this idea is fundamentally flawed. Garfinkel again provides historical perspective, pointing out that using biometrics is an old idea that only appears new as the technology matures. Garfinkel reminds us that even DNA testing is flawed. When a person's name is linked to a given DNA profile, for example, how hard would it be to modify that database record and change the name attached to that profile? (And did you know that 99% of DNA from any two people is identical, so DNA tests actually compare only regions of the genome that are nonessential to cell life? Hmmm ...) Garfinkel then lists various other biometric technologies such as face, voice and iris recognition; even your signature can be used as a biometric identifier. Some of these systems are already in use: Have you signed for a UPS delivery lately, or signed for credit-card purchases on an electronic touch pad? Biometrics. So here's a near-future scenario: suppose all children need to have a DNA test shortly after being born "for the baby's health." Then the FBI warehouses the DNA fingerprints of every citizen in the U.S., and sells the data to the insurance industry, which can then compare it to the human genome map to weed out the "at risk" people, then target healthy prospects for profitable health plan solicitations... big ol' cluestick being waved around here.
Chapter 4: What Did You Do Today?: Maybe you went shopping, got some cash from the ATM, racked up some more frequent flier miles? Even the most mundane events in your daily life are recorded and archived somewhere -- from how often you withdraw cash from an ATM, to your entire purchasing history at the neighborhood grocery store, even the movies you rent at the video store. Dramatic developments in data-storage technology make it easier for businesses to keep what Garfinkel calls "hot files" on every customer transaction from day one; he then describes how we are creating the Earth's "datasphere." Nearly every durable product you buy has a serial number. Often that serial number becomes attached to your name and personal information (ever filled out a warranty card?) which can then be sold on the open data market. Garfinkel argues that even seemingly mundane information needs to be treated with respect for privacy.
Chapter 5: The View From Above: Chapter 5 is about surveillance technology and the growing private market for satellite photos and Webcams. Does it bother me that right now someone can buy a grainy aerial photo of my neighborhood taken sometime in 1987? No, sorry, that doesn't bother me. City police departments are installing surveillance cameras in public places. I still don't care. Garfinkel then explains how he set up a QuickCam to time-lapse record his Realtor while allowing prospective buyers to browse through his home without supervision. At this point I can't tell if the chapter is supposed to be a condemnation or an endorsement. I suppose Garfinkel is pointing out that it's technically possible that you are being watched and recorded in places where you assume you're alone. At the very least, it should change your ideas about expectations of privacy.
Chapter 6: To Know Your Future: So who is the MIB? Men in Black, right? No, the MIB referred to here is the Medical Information Bureau, which happens to be the secretive data warehouse of the American medical insurance industry's "customer profiles." Think you have a God-given right to medical coverage? Well, if you like Kafka novels then you'll definitely enjoy the hijinks that erupt around page 139, where Garfinkel tells us of more than a few people who've been refused medical insurance because of clerical errors in their MIB records -- records which they never knew existed. But wait, isn't it illegal in many cases to deny medical coverage to someone with preexisting conditions? Yeah, sure it is, so what's your point? Garfinkel points out that only 23 of the 50 states actually have laws that require citizens be allowed to view their own medical histories. My only complaint with this chapter is that it pursues flaws in existing policies rather than staying with the theme of technology marching faster than prudent policy.
Chapter 7: Buy Now!: The DMA is the Direct Marketing Association. They lobby lawmakers at the state and federal level to further what they consider a God-given right to own and sell any piece of information they can attach to you. One of the nation's largest direct marketing list resellers is Metromail, now owned by the credit bureau giant Experian. Ever apply for a shopping card or magazine subscription, or fill in a product bingo card? Ever fill out a change of address form at the post office? Direct marketers get an automatic notification of your new address from the U.S. Postal Service, which causes your name/address to be copied into a hot prospect list called "New Movers," one of many direct-mailing lists sold by Metromail at the rate of $60 per thousand names. Garfinkel lists some 50 products Experian sells to businesses, like AutoCredit for quickie loan approvals, Bankruptcy candidates, Business Owner Profiles, and Property Link which provides details of a subject's property holdings. He then argues against the opt-out clause the DMA offers to whiners (arguing instead for a more consumer-oriented opt-in approach), and lists preventative steps you can take to keep your name on as few lists as possible. This chapter left me with a question: if you complain to a direct marketing firm about what they've been doing with your personal information and then they flag you as hostile, and that direct marketer happens to be owned by a major credit bureau, what would that do to your credit rating? Food for thought.
Chapter 8: Who Owns Your Information?: Take the case of Ram Avrahami, who tried to sue a magazine publisher for selling his name, which was in their list of subscribers, to other magazine publishers. Mr. Avrahami argued that Virginia law states that his name and his image are his property which cannot be used in advertising or trade without his consent, and guess what the courts told him? "Sorry Charlie, or Ram, whatever your name is." Information is basically owned by those who gather the information and personal information is a commodity. Medical information is also a commodity owned by medical insurance providers. But can all this medical information be abused? Or let me ask it like this: are we evolved enough to not attach genetic defects to say, a person's ethnicity? Garfinkel excerpts an ad he found in the New York Times: "Ashkenazi Jewish Families Are Needed to Help Scientists Understand the Biological Basis for Schizophrenia and Bipolar Disorder" -- a 1998 Johns Hopkins University study, right here in America in 1998. Certainly, some medical disorders are confined to certain populations; the question is, what if someone wants to abuse such links? So do you own the books you read or the software you use? No, thanks to copyright laws. Garfinkel makes the point that you can't use the concept of ownership to protect your privacy, because you don't own data about you; however, I'm not convinced. Maybe I can't force you to take my name out of your address book, because you own your address book, but I think I do have the right to demand that you not send me mail or sell my address to other businesses without my consent.
Chapter 9: Kooks and Terrorists: This chapter argues that individual terrorists deploying low-tech explosives and biological contaminants have spooked us into accepting ever more surveillance of our everyday activities. True to his style, Garfinkel dismisses some well-known urban terrorist acts as amateur-night material, then describes two fairly effective methods of introducing anthrax into an unsuspecting office building. Further pages show how terrorists might gain access to nuclear and biochemical devices. Garfinkel's point here is that constant surveillance cannot save us from a determined kook. The chapter then moves into the Big Brother question: what constitutes thoughtcrime? Didn't our benevolent government intern over 100,000 Japanese-Americans at the start of World War II? Didn't J. Edgar Hoover's FBI spend much of the 1950s investigating "Communists" and "homosexuals"? So could our government be trusted with "brain wiretapping" technology? Sounds far-fetched? We're already using polygraphs and experiments involving fast successive MRI scans. Garfinkel makes the point that if we are truly concerned about public safety, we should track dangerous materials rather than try to identify potentially dangerous people.
Chapter 10: Excuse Me, But Are You Human? Imagine you're on an electronic mailing list, and you strike up an e-mail dialog with another member of the list. He tells you some things about himself and you share something about yourself in return. Turns out "he" was actually an AI conversationalist programmed by a marketing agency to gather personal information to be sold in the form of marketing lists. Garfinkel then describes various intelligent agents that can parse natural language. But how is this useful for marketing? It is technically feasible for a marketer to scan the entire datasphere for everything that can be found about you in order to create a predictive model of your behavior: When will you be buying a new car? When will you be on vacation? Valuable stuff for direct marketers to know. Might it be possible in 50 years to create a complete AI behavioral copy of you, and test various marketing schemes against it? Garfinkel actually argues that avatars should be afforded the same privacy rights as humans.
Chapter 11: Privacy Now!: Is technology neutral in the war on privacy? Garfinkel's answer is no, technology permits the greater cataloging and measuring of the world around us, and therefore technology is inherently intrusive. He argues that for the cost of around $5 million added to the annual budget, a Federal oversight agency could be created to monitor and regulate the flow of personal information through government and business data channels. Further, he proposes a list of reasonable amendments to the Fair Credit Reporting Act of 1970, such as giving consumers the ability to sue for damages resulting from the addition of erroneous information to their credit reports. Garfinkel argues that better laws and policies will be more effective than cryptography in protecting one's privacy, and warns that when some have their privacy violated, you can expect retaliation such as deliberate pollution of -- and disruption to -- the datasphere. Overall, Garfinkel concludes that we need laws and policies that respect our personal information, not just a technological picket fence.
Before reading Database Nation, I had the typical "nothing-to-hide" attitude regarding my own privacy. I didn't care if some government agency or large corporation was able to read my academic records, my medical records, my magazine subscriptions, my credit-card purchases, my phone bill. "Let them read it all for all I care," I thought, "I'm sure it would bore them to tears." After reading this book, I realize it's not so much about Big Brother, it's about how the spread of your personal information can bite you in the ass someday.
My assessment: Garfinkel jam-packed this book with information every American ought to be aware of -- enough to think about to make your head spin. Thankfully his tone is not hopeless gloom-and-doom; he does remind you that 30 years ago the Cuyahoga River was an environmental disaster, but today it's safe to eat fish caught there. Overall, it's a great book. Yet another reason for me to give a favorable review to anything Simson Garfinkel writes.
Purchase this book at ThinkGeek.