There's some level of obvious invalidity past which it can become illegal, if it's coupled with monetary demands (doesn't seem to be the case here). If the sender of the C&D knew or should have known that the claim was frivolous, and demanded monetary settlement, at least one case has held that to constitute criminal extortion.
Stylometrics is essentially a correlational field. It's not that people inherently must write in unique styles that are identifiable from a few measurable features: there is no strong genetic causation for handwriting or anything like that, which would mean that a handwriting style really does identify an individual or a narrow set of individuals. Rather, all else being equal, people in practice do tend to write in a way that lets stylometric features distinguish them. But when all else isn't equal, and people are actively trying to thwart that sort of analysis, they are, unsurprisingly, often able to do so.
I suspect that a lot of forensic analysis runs into this problem: it takes some fact that empirically is true among the general population, but only because the general population is not actively trying to thwart you. The set of robust empirical truths about people, that hold up even when the person is aware that you're trying to use it against them and actively trying to keep you from doing so, is much smaller.
That's no more fair use than, say, sampling a record to produce a new track is fair use.
I wouldn't use sampling as a benchmark for anything: the caselaw on sampling is an extreme example of courts coming down against fair use in a particular area. Most other areas, fortunately, have somewhat better caselaw, which doesn't get it nearly as wrong as Bridgeport did.
Beyond the discussion of landlines, my favorite part of the article was actually at the end:
With broadband networks, the role of the state has less to do with limiting handouts than increasing choice. Fibre-optic networks can be run like any other public infrastructure: government, municipalities or utilities lay the cables and let private firms compete to offer services, just as public roadways are used by private logistics firms. In Stockholm, a pioneer of this system, it takes 30 minutes to change your broadband provider. Australia's new $30 billion all-fibre network will use a similar model.
Even if this is unquestionably a patent violation, the Supreme Court has already held, in eBay Inc. v. MercExchange, L.L.C. (2006), that an injunction prohibiting sale of the infringing product is not necessarily the appropriate remedy in all cases. Rather, the traditional four factors for issuing an injunction must be balanced: 1) that the injury is irreparable; 2) that there are inadequate alternate remedies to compensate for the injury; 3) that the balance of hardships favors the plaintiff; and 4) that an injunction does not harm the public interest. Microsoft has at least a plausible argument that those factors are not satisfied in this case, and that alternate remedies (perhaps money damages) would be better than an injunction against sale, even if Word is indeed infringing.
IMHO, we really need to start talking about taking away cable and in some places fiber monopolies.
The Economist, a pro-free-market newsmagazine, proposed something like that recently:
With broadband networks, the role of the state has less to do with limiting handouts than increasing choice. Fibre-optic networks can be run like any other public infrastructure: government, municipalities or utilities lay the cables and let private firms compete to offer services, just as public roadways are used by private logistics firms. In Stockholm, a pioneer of this system, it takes 30 minutes to change your broadband provider.
Unfortunately, I doubt there are very good prospects for this: the business model of the telecom firms depends inherently on rent-seeking enabled by lack of competition.
Although amusing to ponder, I don't think there's any real question. The deletionist controversy has only ever been over edge cases, some of them high profile, but always swamped by the huge numbers of new articles that nobody's attempted to delete. Even if deletionists won on some really major class of article---delete all Pokemon characters, maybe---it'd at best be only a blip in the time v. # of articles graph.
I can see it as being reasonable in some cases. One of the most common is with contests and giveaways, which essentially means, "if contests with cash prizes such as this one aren't allowed where you live, then you can't enter this contest, obviously".
It's particularly weird when you think about the fact that Amazon can afford to pay 4-10% royalties to people who refer sales to them. Surely the author should get a bigger cut than the guy who linked to the book on his blog?
On the other hand, that's also true of any money you take from anyone, ever. I've seen plenty of corporations accept money from other corporations and have costly consequences years down the road.
That's the job of physics journals. If you want to observe physical properties, derive explanations for them, and support them with experiments that can be replicated and independently verified, what you want to do is physics research. That is not the job of someone writing a physics encyclopedia article, which is to summarize existing physics research.
That's simply not the role of an encyclopedia (any encyclopedia), at least in the modern era. When Encyclopedia Britannica writes something about physics, it damn well had better be something the article author read in a physics journal, well-respected physics book, or other such source. It definitely shouldn't be something the author personally "saw for himself" using a pulley and lever or something.
That's the intended outcome, though. Wikipedia used to be just a collection of information put together by random people, but the goal is increasingly to build a well-referenced collection of information put together by random people. If you can't cite at least a halfway-decent source for an addition, it doesn't belong in a Wikipedia article, because there would be no way for readers to verify for themselves that the information wasn't just made up.
The fact that Wikipedia didn't do this often enough, and was to a large extent a collection of unreliable information put together by people with no credentials, with no way to verify any of it was accurate, was one of the most frequent and strongest criticisms in the early years (and still persists to some extent). So I'd say it's a definite shift in the right direction to require sources more stringently.
The web itself has changed too, for reasons other than SEO (though it's sometimes hard to tell which is which). PageRank isn't a universal law of nature, with the "best" result to any particular query being related to how many incoming links a particular site has. Rather, it's a heuristic based on something that often happened to be true--- the most useful information was located on pages at sites that were frequently linked to. It's possible that correlation is no longer as strong as it used to be.
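To make the "heuristic, not a law of nature" point concrete, here's a minimal sketch of the textbook PageRank power iteration on a made-up four-page web. The page names, the 0.85 damping factor, and the iteration count are my own assumptions for illustration; this is not a claim about how any search engine actually ranks pages.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    links: dict mapping each page to the list of pages it links to.
    The damping factor and iteration count are illustrative assumptions.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current rank among the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical toy web: "hub" is linked to by everyone, "orphan" by no one.
toy_web = {
    "hub": ["a", "b"],
    "a": ["hub"],
    "b": ["hub"],
    "orphan": ["hub"],
}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{page}: {score:.3f}")
```

Notice that nothing in there looks at what a page says. "orphan" ends up at the bottom purely because nobody links to it, so the ranking is only as good as the assumption that links track usefulness.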
The problem with Krugman is that he is relentlessly political and that is a bad trait for an economist.
Do you apply that to all politically outspoken economists, across the spectrum? So you would say, for example, that Milton Friedman was a bad economist?
I read that as a pretty open attack on Greenspan rather than any sort of advocacy. He's accusing Greenspan of trying to create a housing bubble to replace the Nasdaq bubble, as a way of trying to produce some fake normalcy by "curing" one bubble with a new one. And that's pretty much what happened.
but remember, you were meant to be being "open" by using the GPL
Who "meant" that? The reason I'd use the GPL isn't for some generic notion of openness, but more specifically as a way of allowing multiple copyright/reuse models to exist in a mutually agreed fashion. If you agree to license your code under the GPL, you can have mine under the GPL too. If you prefer a standard copyright/licensing model for your own software, then fine, let's do that for mine too. I'd consider adding more options of that sort too if any seemed particularly compelling. What I don't see is why I should license you my code under terms that you aren't willing to reciprocate.
the attitude of "for-profit companies shouldn't get something for nothing" isn't very endearing
I don't really see what's wrong with that. Isn't the normal way economic transactions work in a market economy with copyright laws something like: if you want to use a part of my copyrighted work in your own, you have to get a license from me, usually involving payment?
I see using the GPL as sticking with that as the default, but making a special exception that if you give blanket permission to the general public to use and distribute your own code, royalty-free, in both original and modified versions, then you may use my code royalty-free under the same terms. But if you want to stick to the normal copyright model, then I will also, and we can agree on terms in the usual manner.
Basically I don't see how someone who uses the normal approach to copyright licensing in their own products could possibly object to me asking them to negotiate a license in order to use my code as part of their product.
Yeah, I agree with that. I probably shouldn't have used a pejorative-sounding word like "cheat", even in scare-quotes. I meant just that it lets the researcher get for free some of the things they'd usually have to argue for. From a researcher's perspective, this is a real win: there are many technically solid papers that get rejected from conferences because the reviewers thought the problem wasn't interesting enough ("maybe I believe you solved this, but why?"), or the metric wasn't the right one. Nobody's going to reject your Netflix-prize-related paper for those reasons.
It does even provide a good jumping-off point for questioning those assumptions, so I agree it's not a bad thing in any way for Netflix to be proposing goals like this. There have been papers, for example, that accept the basic ratings-prediction goal of the competition, but argue that the specific performance metric used doesn't capture the high-level goal that well.
It allows the researchers to "cheat" a bit too via an argument by authority, which is not always good, but does at least make the researcher's job easier. A big issue in data mining is that it isn't purely a technical field, but one with both conceptual and technical issues. The over-arching goal is something like, "get useful and/or interesting information out of data". But what is "useful", what is "interesting", and how do we measure when we've gotten it or not? Usually you have to defend why your problem is the right one, why your metric is the right way to measure success on it, etc. Working on the Netflix competition lets you sidestep all that, because Netflix has already decreed exactly what the goal is, and what performance metric will be used to judge success at that goal, leaving only the technical problems.
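For context that isn't spelled out above: the metric Netflix fixed for the prize was root-mean-square error (RMSE) between predicted and actual ratings on a held-out set. A minimal sketch of that metric, with made-up ratings purely for illustration:

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical held-out ratings (1-5 stars) and one model's predictions.
actual_ratings = [4, 3, 5, 1]
predicted_ratings = [3.5, 3.0, 4.0, 2.0]
score = rmse(predicted_ratings, actual_ratings)
print(score)
```

Once the metric is a single number like this, "is my method better?" becomes a purely technical question, which is exactly the shortcut described above. Whether a lower RMSE means better recommendations is the conceptual question that gets sidestepped.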
Do you really need tens of millions of dollars? Sure, there are games that cost that, but plenty that don't. And we're talking indie games here, so presumably more of the latter. World of Goo is one recent example of a very successful game made for peanuts (supposedly $10k, plus free labor from the two founders). This year's Independent Games Festival winner, Blueberry Garden, presumably didn't cost tens of millions either.
There are pros and cons to that, I think. There are downsides to a central administrator like Amazon, because they can corner the market, anything that sucks now sucks uniformly, etc. But there are also upsides: you don't have every random publisher and individual you purchase something from processing your credit-card number, you don't have to individually route disputes through each of them, etc.
Well, at the time I posted my reply, it was rated "5, Informative", so I figured there was a chance people were actually misinformed.
Lest anyone be confused:
1. WikiWikiWeb was founded by Ward Cunningham, not Jimmy Wales; and focused on cataloguing software patterns, not Simpsons episodes.
2. The direct precursor to Wikipedia was MeatballWiki, a wiki based on a new wiki engine, UseModWiki (which Wikipedia would adopt for its initial period), and focused on online culture.
3. Wikipedia was formed as a side project of Nupedia, an attempt to produce an open-content encyclopedia along more traditional lines (get volunteer writers, editors, a review process, have professors submit draft manuscripts, attach author names, usually a single author, to articles, etc.). The idea was that Wikipedia could be used as a work space where people collected and organized the information, making it easier to write Nupedia articles. It never really worked out that way, as the workspace itself quickly became a much better encyclopedia than Nupedia ever was.
Ah, the sheep, a stellar piece of industrial machinery.