The question of abuse is obviously one that we are taking very seriously in thinking about design issues. My belief is that the key to solving this thorny question is hinted at by the success of wikis and the wiki model: the key is to put tools in the hands of the community that allow for broad oversight and control by the community in a process of open dialogue and discussion. This is very different from approaches that allow only for atomistic participation by a "community" which is never allowed to really become a community due to excessive reliance on algorithmic voting systems and similar.
One of the first lines of defense in the early days will be use of a community (wiki) generated whitelist of sites to crawl. We will want to work outward from there, but basically the first thing is for us to assess "look, what are the most important must-have sites on the net" and crawl them.
One thing that the mainstream media never seems to report very well, mostly because I think they don't get why it is important, is that we are doing everything here under free licenses. The software is under the GPL, the data we generate is under free licenses, etc. The aim here is not just to create a good search engine, but to create it and *give it all away* in a way that I think has a chance to restructure the entire search industry. Well, maybe not, maybe so, but what the hell, it'll be fun to see. :-)
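To make the whitelist-first idea above concrete, here is a minimal sketch in Python of how a crawler might restrict itself to a community-maintained list of domains. The function names and the choice to match subdomains are assumptions made for illustration; nothing here describes how the actual project is implemented.

```python
from urllib.parse import urlsplit


def host_of(url):
    """Return the lowercased hostname of a URL, or '' if it has none."""
    return (urlsplit(url).hostname or "").lower()


def is_whitelisted(url, whitelist):
    """True if the URL's host is a whitelisted domain or a subdomain of one."""
    host = host_of(url)
    return any(host == d or host.endswith("." + d) for d in whitelist)


def filter_frontier(urls, whitelist):
    """Keep only the URLs a whitelist-first crawler would be allowed to fetch."""
    return [u for u in urls if is_whitelisted(u, whitelist)]
```

Working "outward from there" would then amount to the community adding domains to the whitelist, rather than the crawler following every link it finds.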
Hi! I actually have no idea why you think that. I had to search in my emails to even find out who "Goma" is.
The problem with your rant, Pete, is that I have told the absolute truth at every point here. We are not pursuing a search engine to rival Google et al. This grant is not about that type of project, and that type of project would be - quite frankly - ludicrous to attempt on a $250,000 grant.
Discovery at Wikipedia is awful; this is universally understood and acknowledged. This grant is the beginning of an exploration of how to improve it.
The bullshit - and it is bullshit, and I have said it before and will say it again, that this is some kind of google competitor or was ever conceived to be - is a fantasy based on absolutely no facts of any kind, and a very very very skewed and aggressive reading of a preliminary document.
I thought this worthy of just popping in to comment even before the real interview because the question is so ludicrously misinformed.
I am a strong supporter of personal privacy and freedom of speech. Based on everything that I have seen so far, Edward Snowden will go down in history as a hero. I have been reading lots about him, including his youthful posts to Ars Technica. I think it is really interesting to think about the process by which the young man who made those posts became the man we see before us today, facing down all the might of the US intelligence services based on a strong belief that mass surveillance is wrong and illegal.
My actions at Wikipedia around this were perfectly honorable and noble and did not violate any rules of any kind. I invited a discussion of information that is already completely public - the user accounts that he used at Ars Technica have been widely reported. I was curious (and am still curious) to find more of his past writings. I am working through various connections to try to talk to him - I had hoped to do so in person when I visit Hong Kong in August, but obviously he's gone from there now.
I think he needs strong support from people well positioned to provide that support. I think that what he did was illegal - quite clearly so. I highly recommend the book "Concerning Dissent and Civil Disobedience" by former US Supreme Court Justice Abe Fortas for a very interesting analysis of the ethics around breaking the law deliberately in the interests of justice.
The knee-jerk reaction by some in the Internet community has been, as usual, annoying. They call it anonymous "coward" for a reason - it's easy to sling mud and pretend to have the high moral ground if you feel completely and utterly unconcerned about the facts of reality.
I actually don't take any salary at all. Nor expenses. It's a fun joke, but it would be funnier if actually true.
Learn to think in the wiki way.
Rather than make it hard for users to do what they want to do, on the (very valid) assumption that some of them will do bad things, or things they don't really want to do, it is better to make it easy for users to recover from those mistakes, and for others to recover easily from any side effects of those mistakes.
This is not always possible. But it usually is.
Jimmy Wales - Wikipedia.org
Since the objective is to recover disk space, the smallest couple of million files are unlikely to do very much for you at all. It's the big files that are the issue in most situations.
Compile a list of all your files, sorted by size. The ones that are the same size and the same name are probably the same file. If you're paranoid about duplicate file names and sizes (entirely plausible in some situations), then crc32 or byte-wise comparison can be done for reasonable or absolute certainty. Presumably at that point, to maintain integrity of any links to these files, you'll want to replace the files with hard links (not soft links!) so that you can later manually delete any of the "copies" without hurting all the other "copies". (There won't be separate copies, just hard links to one copy.)
If you give up after a week, or even a day, at least you will have made progress on the most important stuff.
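The steps above can be sketched in Python roughly as follows. This is only an illustration under some assumptions: it groups files by size alone (not by name), uses a SHA-256 digest in place of crc32 or a byte-wise comparison, and assumes all the files live on one filesystem, since hard links cannot cross filesystems. The function names are made up for the example.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return groups of identical files under root, largest files first."""
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    groups = []
    for size in sorted(by_size, reverse=True):
        paths = by_size[size]
        if size == 0 or len(paths) < 2:
            continue  # a unique size cannot be a duplicate
        by_digest = defaultdict(list)
        for path in paths:
            by_digest[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_digest.values() if len(g) > 1)
    return groups


def replace_with_hard_links(group):
    """Keep the first file in the group and hard-link the rest to it."""
    keeper = group[0]
    for dup in group[1:]:
        if os.path.samefile(keeper, dup):
            continue  # already linked to the same data
        tmp = dup + ".dedup-tmp"
        os.link(keeper, tmp)   # create the link under a temporary name first
        os.replace(tmp, dup)   # then swap it into place
```

Because the groups come back largest first, you can stop at any point and still have reclaimed the most space for the time spent. It would be prudent to print the groups and review them before calling `replace_with_hard_links` on anything you care about.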
Theresa May has not said "NO" and indeed has not responded at all. The report quotes a press release that was issued before my petition was even launched. There has been no response to me at all so far.
Every signature counts as they are clearly feeling the pressure.
Jimmy Wales
My response? That you are misleading people.
There are a huge number of sites in the interwiki link map:
http://meta.wikimedia.org/wiki/Interwiki_map
Including for example, uhm, slashdot. And Citizendium. And Merriam-Webster.
And finally, I have nothing to do with the list. I've never edited it, never asked anyone to edit it, and I have no input into what goes on it.
I am sure you will apologize for spreading this misinformation. Right?
Completely different. :) For one thing, we are doing everything completely freely licensed. Mahalo is proprietary.
For another thing, Mahalo is "human edited" search results for the top queries, which is not a bad idea of course, but it is not intended to be a full search engine. Mahalo have indicated an interest in replacing their google search backup with our open source alternative, if we get to be good enough, which is obviously a far from foregone conclusion.
"You operate under the sham of an open community, yet exclude those outside a very narrow political agenda. Your a fraud, using open source principals as a smokescreen that presents your personal world-view set as fact to the world."
Actually, no. Wikipedia can be criticized on a lot of grounds, some of them even valid :-); but that it presents my personal world-view or that we exclude people outside a narrow political agenda is just... not grounded in fact.
Perhaps you'd like to come to my talk page at Wikipedia and tell me what you're upset about.
Again, it would be hard for this to be a response to Knol, since I announced it and have been working on it for a year. :-)
And, if you read the linked article, you would know that *zero* donations from Wikipedia have anything at all to do with this: Wikia is a completely separate organization.
Also don't make the classic mistake of thinking that "open source" automatically means "volunteer coders". It generally does not, and the classic FUD from the proprietary world fails to describe reality for precisely this reason.
And finally, one of the most important concepts here is that of a broad, deep whitelist, which is something that I think can be done reliably and well with appropriate tools in the hands of the end users. The entire problem of bot-driven spam comes from a lack of reliable quantities of human oversight in the process. All you have to do to massively spam Google is fool a computer. (Well, even then, Google does a pretty damned good job of preventing massive spam, though of course there are always some problems.) It is pretty hard to get that nonsense past a properly organized community effort.
(But of course, the design of a community which can move things forward quickly without a lot of useless work is nontrivial.)
No, it is no response to Knol. I have been working on this for a year. The press has talked about it endlessly. :-)
It'd be sort of cool if we could create a search engine in a week or two to respond to Knol, but actually it takes a bit longer. :)
I see Larry and Sergey socially from time to time. I spoke about the search project at Google Zeitgeist a few months ago. Going to a Google party next month. The media loves a "fight" but really, that's just a nice story arc the press makes up. (Notice: Google is not in the search business, Google is in the advertising matching business. This search engine doesn't hurt that business at all; indeed, it probably makes it marginally less likely we will see the emergence of a proprietary competitor to topple them.)
It is actually possible for people to just enjoy doing cool stuff without being bastards about it. People forget this sometimes, maybe due to the reputation of a certain dominant software provider. :)
This story is demented and broken on so many levels, it is quite difficult to know where to begin, even.
Here we have an excellent Wikipedia administrator who has been victimized by lunatic conspiracy theorists, a private person who has absolutely no relation to the wild stories that this article promulgates.
Slashdot, you have been trolled.
You're right, I had not noticed that.
Account creation was unblocked when the IP address was unblocked. The admin who originally did the block apologized immediately when his error was pointed out, and the IP address was unblocked. Everything at Wikipedia is done by volunteers who monitor everything constantly. All sorts of things go on; IP numbers are blocked and unblocked all the time. Sometimes mistakes are made and then corrected.
My point is that a headline of "Wikipedia blocks Qatar" is inflammatory and gives people entirely the wrong idea.
I don't know what else to say about it. Wikipedia is not blocking Qatar. An IP number was blocked for about 12 hours. There was an admin discussion about the issue. The IP number was unblocked.
Move along, nothing to see.
--Jimbo Wales
TVTome is an excellent example of why free licensing matters. When a community has free licensing as the social framework to allow for forking, the infrastructure providers are forced to continue to provide good service, to prevent the community from forking and leaving.
Even Dmoz, for which I have great fondness and respect, has been crippled for years by a non-free license that allowed AOL to run it into the dirt. (See the recent 6 week server outage, for which there is simply no excuse.) (The Dmoz license is not the worst possible, mind you, but it is still problematic in a number of important ways.) And their software is totally non-free.
I don't really know what else to say about it.
We are introducing some changes, yes. The changes are specifically designed to make us MORE of a wiki than before.
We used to have to protect articles. We didn't like that, so we moved to what we call semi-protection. We still don't like that, so we are moving to non-vandalized-version flagging.
Each of these steps was specifically designed to make Wikipedia MORE of a wiki.
Sheesh.
--Jimbo
It would be grand to see Slashdot promote my correction to the New York Times story, which is totally wrong on the facts. I don't expect the New York Times to issue a correction, of course.
The facts are that the policy changes that the New York Times writes about were NOT a tightening of editorial policy, were NOT a closing of some articles, but the REMOVAL of certain overtight restrictions, and the OPENING of some articles. Bah, why can't they get it right?
I can tell you that the reporter understood this fully, fought with her editors over it, and apparently lost. Fine. The Internet can get the story right, even if the NYT can't.
Here is my correction
This is not a major policy change. It is not what is being reported here or elsewhere. It is one of many very minor changes to the software to allow better management of the site by the community. It is my opinion that this particular status is not likely to be used very much at all because the other changes to the software will be more wiki-like and more powerful.
It is a very unfortunate thing that Wikipedia has gotten so popular that random internal bits of discussion in the community about all kinds of different things are so badly reported as 'news' when they are not. I advise the world to relax a notch or two. :-)
--Jimbo Wales
+3 insightful?
How about -5 completely lacking in any relation to reality?
eBay has never given money to Wikipedia. 1 board member works a regular 9-5 job, the rest don't.
Wikipedia hereby formally announces tighter editorial controls on Reuters and Slashdot... ;-)
I spoke in English to many journalists yesterday and the day before (90 journalists registered to cover Wikimania). I spoke to one journalist about our longstanding discussions of how to create a "stable version" or "Wikipedia 1.0". This would not involve substantial changes to how we do our usual work, but rather a new process for identifying our best work.
I spoke in English, and this was translated to German. Then the German was translated back to English, and then translated again into the Slashdot story.
There was no "announcement". We are constantly reviewing our policies and looking for ways to improve, but we have not "announced" anything. We don't even really work that way... if you know how Wikipedia works, it's through a long process of community discussion and consensus building, not through a process of top-down announcements.
The entire Wikipedia model depends on trust and goodwill. If you vandalize Wikipedia, then someone will clean up after you. But it's still rude, even for an "experiment".
A Wikipedian put it this way the other day: In my neighborhood, people make a habit of picking up the trash. Please don't come and litter just to see if someone will pick it up.
So you know, like, be cool, huh?
WikiLove,
Jimbo Wales
Yes, absolutely, a Federal Judge should have this much power. It's one of the best checks against the possibility of tyranny.
Since the Executive and Legislative branches of government routinely ignore the U.S. Constitution, it is extremely important that we can count on the check of the Judiciary.