If you think big money can be made for the school and the creators by selling copies of this work (which is often NOT the case), then my recommendation has three parts. First, have the school be the copyright owner (it will be by default, given that its employees are doing a work-for-hire). Second, have the school sign a royalty-sharing contract with the employees (meaning that, e.g., if the school sells the rights for printing a book, it gives a certain fraction of the royalties to the author). Third, have the school release the work under a Creative Commons Attribution-NonCommercial (CC-BY-NC) license.
If, on the contrary, you believe no one will make much money by selling this work, then you can have the author release the work under a Creative Commons Attribution (CC-BY) license, indicating that attribution should be given to both the creator and the school. This will allow wide dissemination of the work, will allow others to build upon it, will prevent others from profiting at your expense (someone can print and sell copies, but so can anyone else, so any copies sold would go for near-cost prices), and will make sure the employee and the school get due credit.
Wikipedia:Credentials outlines the proposal. It comes from an idea suggested by Jimbo in 2005 and again in 2007, after the Essjay controversy.
The proposal is that "Wikipedia develops a system for verifying editors' credentials, so as to encourage greater accountability for users who claim expertise in certain fields".
[T]he descriptions you hear all the time about how one gram can kill a bazillion people assumes that each person gets exactly a lethal dose
This is equivalent to the fact that a healthy man can, in theory, get over 20 million women pregnant with 1 ml of semen. The problem is not the quantity of sperm; the problem is transportation and logistics :-)
... in the case of China, I believe that you need to trust that the first node is legit
It doesn't matter if the first node is not legit. First, you can deny that you originated the traffic, as you could be relaying packets for other Tor nodes. Second, the route changes every 10 minutes.
China's internet censorship works at several levels. It includes content-based filtering (banned terms in the text of what you are sending, including "human rights", "democracy" and "Dalai Lama"), so any attempt to bypass the filtering has to be encrypted. It also includes DNS-based filtering, so some DNS lookups return the wrong IP addresses, and of course it also includes IP-based filtering that prevents Chinese users from accessing the BBC or Wikipedia, for instance.
Tor can be very effective at bypassing most of these protections, and you can choose to run it on port 443 (https) to avoid port-based filtering. Also, you can limit the amount of bandwidth you want to donate to other nodes, and the default outgoing policy prevents connections to port 25 so you can't use a Tor node for sending spam.
On the client side, SwitchProxy for Firefox is helpful for maintaining a list of proxies, including a local Tor instance (which works as a SOCKS proxy) and a list of open proxies (SwitchProxy can automatically change proxy every X seconds).
I think what was lost when IBM PCs became popular was that you no longer started inside a programming language interpreter. On an old ZX80/Atari/Commodore, after booting you had just the prompt:
READY
The computer was inviting you to type something. Nowadays the computer invites you to explore what others have done, not to create your own stuff to make it work. And that's a huge difference.
A version of an article can be validated through the article validation feature (now in beta stage). Validation consists of users voting on several criteria, including how neutral, complete, accurate, etc. an article is (see proposal for the interface). This is useful, for instance, for burning certain versions of Wikipedia articles onto a CD-ROM, to be used where internet access is expensive or nonexistent.
There are measures for protecting a page, and it has been done before. Vandalism has always been kept to a much lower level than alarmists think. Please see replies to common objections.
Most important of all, the Wikimania Conference is an ongoing event: (from the article) "Es gibt mehrere Möglichkeiten, die wir hier in Frankfurt besprechen": "There are several possibilities that we are discussing here in Frankfurt". I think the outcome will continue along the lines of protecting some pages when it is strictly necessary and validating some versions of the articles.
It will never be something limiting the freedom for anyone to edit pages, which is at the core of Wikipedia's success, and very deeply established in the community of Wikipedians.
From Free Culture by Lawrence Lessig: "File sharers share different kinds of content. We can divide these different kinds into four types.
A. There are some who use sharing networks as substitutes for purchasing content. Thus, when a new Madonna CD is released, rather than buying the CD, these users simply take it [...]
B. There are some who use sharing networks to sample music before purchasing it [...] The net effect of this sharing could increase the quantity of music purchased.
C. There are many who use sharing networks to get access to copyrighted content that is no longer sold or that they would not have purchased because the transaction costs off the Net are too high [...]
D. Finally, there are many who use sharing networks to get access to content that is not copyrighted or that the copyright owner wants to give away.
How do these different types of sharing balance out? [...] From the perspective of the law, only type D sharing is clearly legal. From the perspective of economics, only type A sharing is clearly harmful. Type B sharing is illegal but plainly beneficial. Type C sharing is illegal, yet good for society [...]
The "net harm" to the industry as a whole is the amount by which type A sharing exceeds type B."
"Never leave people in peace, because when they are in peace, you are a nobody. They do not need you; your very purpose is not there. They need you when there is danger. Create danger. If there is no real danger, at least create the climate of a false danger." --attributed to Adolf Hitler
Linking to bad pages harms your reputation (on "Google TrustRank")
This is the opposite idea: instead of rewarding pages that are linked from trusted Web sites, pages linking to bad Web sites are punished.
See: Web Spam, Propaganda and Trust.
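As a rough illustration of "punishing pages that link to bad sites", here is a minimal sketch. It is not Google's method nor the one in the linked article: the function name `distrust_scores`, the `damping` factor and the averaging rule are all assumptions made up for this example. Known-bad pages get a distrust of 1.0, and every other page inherits a damped average of the distrust of the pages it links to.

```python
def distrust_scores(links, bad_pages, damping=0.5, rounds=10):
    """Hypothetical distrust propagation (illustrative only).

    links: dict mapping a page to the list of pages it links to.
    bad_pages: set of pages known to be bad (distrust fixed at 1.0).
    """
    pages = set(links) | set(bad_pages)
    for targets in links.values():
        pages.update(targets)

    score = {p: (1.0 if p in bad_pages else 0.0) for p in pages}
    for _ in range(rounds):
        updated = {}
        for p in pages:
            targets = links.get(p, [])
            if p in bad_pages:
                updated[p] = 1.0          # seed pages stay fully distrusted
            elif not targets:
                updated[p] = 0.0          # no outgoing links, nothing to inherit
            else:
                # damped average distrust of the pages this page links to
                updated[p] = damping * sum(score[t] for t in targets) / len(targets)
        score = updated
    return score


LINKS = {
    "blog": ["spam", "news"],   # links directly to a bad page
    "news": ["wiki"],
    "fan": ["blog"],            # links to a page that links to a bad page
    "wiki": [],
}
SCORES = distrust_scores(LINKS, bad_pages={"spam"})
```

In this toy graph the penalty fades with distance: "blog" is punished more than "fan", while "news" and "wiki" are not punished at all.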
But the common-sense recommendation is to do it one eye at a time, with 1-2 weeks between them so you can see the results. Start with your worse eye, just in case.
This is very difficult to do. Most keyword-based search systems use an inverted index for searches.
The first step involves a hash table that converts keywords into term-ids. Then the inverted index is used: it is a table that holds, for each term-id, a sorted list of document-ids that contain the term. The search process is almost instantaneous, as it only involves operations on sorted lists.
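The two steps just described can be sketched in a few lines of Python. This is a toy, in-memory version under my own assumptions (whitespace tokenization, AND queries only); the names `build_index`, `intersect` and `search` are made up for the example and do not come from any real search engine.

```python
def build_index(docs):
    """Build the keyword -> term-id hash table and the inverted index."""
    term_ids = {}   # hash table: keyword -> term-id
    postings = []   # postings[term_id] = sorted list of document-ids
    for doc_id, text in enumerate(docs):
        for word in set(text.lower().split()):
            term_id = term_ids.setdefault(word, len(term_ids))
            if term_id == len(postings):      # first time this term is seen
                postings.append([])
            # doc_ids arrive in increasing order, so each list stays sorted
            postings[term_id].append(doc_id)
    return term_ids, postings


def intersect(a, b):
    """Intersect two sorted lists with a single merge-style pass."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def search(query, term_ids, postings):
    """Return the ids of the documents that contain every word of the query."""
    result = None
    for word in query.lower().split():
        term_id = term_ids.get(word)
        if term_id is None:                   # unknown keyword: no document matches
            return []
        plist = postings[term_id]
        result = plist if result is None else intersect(result, plist)
    return result or []


DOCS = ["the quick brown fox", "the lazy dog", "quick brown dogs and the fox"]
TERM_IDS, POSTINGS = build_index(DOCS)
```

With these three documents, `search("quick fox", TERM_IDS, POSTINGS)` walks two short sorted lists and returns `[0, 2]`; no document text is scanned at query time.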
To use regexps, the search engine must convert the regexp into the set of words that match it (a very large set, potentially infinite) and then look them all up in the inverted index. This is very slow and, as most users never use the advanced search function, very unlikely to be added to popular search engines until some competitive data structure is discovered.
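To make the cost concrete, here is a minimal sketch of the expansion step. It assumes the expansion is restricted to words that actually occur in the index (which keeps the set finite); `build_postings` and `regexp_search` are hypothetical names for this example. Note that every term in the vocabulary has to be tested against the pattern, which is exactly the part that does not scale.

```python
import re


def build_postings(docs):
    """Toy inverted index: word -> sorted list of document-ids."""
    postings = {}
    for doc_id, text in enumerate(docs):
        for word in set(text.lower().split()):
            postings.setdefault(word, []).append(doc_id)
    return postings


def regexp_search(pattern, postings):
    """Expand a regexp into matching index terms, then merge their postings."""
    compiled = re.compile(pattern)
    # Linear scan over the whole vocabulary: the slow step.
    matching_terms = [term for term in postings if compiled.fullmatch(term)]
    doc_ids = set()
    for term in matching_terms:
        doc_ids.update(postings[term])
    return sorted(matching_terms), sorted(doc_ids)


DOCS = ["the quick brown fox", "the lazy dog", "quick brown dogs and the fox"]
POSTINGS = build_postings(DOCS)
TERMS, HITS = regexp_search(r"dog.*", POSTINGS)
```

Here the pattern `dog.*` expands to the terms `dog` and `dogs`. A keyword lookup touches one hash-table entry, while this touches every distinct word in the collection.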
An idea: the method of 'shingles' (fingerprints of n-grams of lines/words/characters) could be used for creating a big, shared repository of copyrighted code -- without the code. This could avoid this kind of claim in the future, without the need to manually check every line of code contributed to open source projects.
A 'client' program is run by people who have access to copyrighted code. The program generates the fingerprints, which are uploaded to the repository (including information about the copyright holder, software name, version, filename, line numbers, fingerprint). Whenever anyone wants to check if a piece of code is copyrighted, s/he can generate the fingerprints and compare them against the repository.
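A minimal sketch of what such a client could look like, assuming shingles of 3 consecutive non-blank lines, whitespace normalization, and a plain dictionary standing in for the shared repository. The function names, the record fields and the sample "Example Corp" data are all made up for illustration.

```python
import hashlib


def normalize(line):
    """Collapse whitespace so that re-indented code still matches."""
    return " ".join(line.split())


def shingles(source, n=3):
    """Return (starting line number, MD5 hex digest) for each window of n non-blank lines."""
    lines = [(num, normalize(text))
             for num, text in enumerate(source.splitlines(), start=1)]
    lines = [(num, text) for num, text in lines if text]
    result = []
    for i in range(len(lines) - n + 1):
        window = lines[i:i + n]
        joined = "\n".join(text for _, text in window)
        result.append((window[0][0], hashlib.md5(joined.encode("utf-8")).hexdigest()))
    return result


def publish(repository, source, holder, software, version, filename, n=3):
    """Upload only fingerprints and metadata; the code itself never leaves the client."""
    for linenum, fingerprint in shingles(source, n):
        repository.setdefault(fingerprint, []).append({
            "holder": holder, "software": software, "version": version,
            "filename": filename, "linenum": linenum,
        })


def check(repository, source, n=3):
    """Return (line number in the checked code, matching repository record) pairs."""
    hits = []
    for linenum, fingerprint in shingles(source, n):
        for record in repository.get(fingerprint, []):
            hits.append((linenum, record))
    return hits


PROPRIETARY = ("int add(int a, int b) {\n    return a + b;\n}\n"
               "int sub(int a, int b) {\n    return a - b;\n}\n")
SUSPECT = "/* contributed */\nint add(int a, int b) {\n\treturn a + b;\n}\n"

REPOSITORY = {}
publish(REPOSITORY, PROPRIETARY, "Example Corp", "examplelib", "1.0", "math.c")
HITS = check(REPOSITORY, SUSPECT)
```

The suspect file matches at its line 2 even though it was re-indented with tabs, and the repository only ever stores digests plus metadata.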
False positives? MD5 checksums of different inputs in general don't collide. Poisoning? A person could probably upload huge amounts of fake MD5 checksums. That's why some redundancy is necessary: an MD5 checksum is considered valid only if it has been uploaded by at least X people.
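The "at least X people" rule could be sketched as follows. The class name `FingerprintRepository` and the parameter `min_uploaders` (the X above) are hypothetical, and the sketch assumes that uploader identities cannot be created cheaply; otherwise one person could simply pose as X uploaders.

```python
from collections import defaultdict


class FingerprintRepository:
    """Toy repository that trusts a fingerprint only after enough distinct uploads."""

    def __init__(self, min_uploaders):
        self.min_uploaders = min_uploaders      # the "X" threshold
        self._uploaders = defaultdict(set)      # fingerprint -> set of uploader ids

    def upload(self, uploader, fingerprint):
        # A set is used, so repeated uploads by the same person count only once.
        self._uploaders[fingerprint].add(uploader)

    def is_valid(self, fingerprint):
        return len(self._uploaders.get(fingerprint, ())) >= self.min_uploaders


REPO = FingerprintRepository(min_uploaders=2)
REPO.upload("alice", "fingerprint-of-real-code")
REPO.upload("bob", "fingerprint-of-real-code")
for _ in range(1000):
    REPO.upload("mallory", "fingerprint-of-fake-code")   # flooding does not help
```

A single uploader can flood the repository, but none of those fingerprints becomes valid until someone else independently uploads the same one.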
Swivel offers a similar service. One of the best things about Swivel is that datasets are usually shared by users under a Creative Commons license.
(((:~{>
Muhammad wearing sunglasses
(((B~{>
Muhammad as a pirate
(((P~{>
Muhammad on a bad turban day
))):~{>
Muhammad with sand in his eye
(((;~{>
Muhammad with a bomb in his turban
*-O(:~{>
Muhammad sees a Danish cartoonist
((((8~{o>
Muhammad after going quail hunting with Dick Cheney
(:(:(:((8~{>:::::::::::::
Source: Transterrestrial Musings
... or they will ask next for the logs of the Google Web Accelerator.
In 2003 the W3C proposed a number of Solutions for the Inaccessibility of Visually-Oriented Anti-Robot Tests, including logic puzzles, audio CAPTCHAs, credit card validation, etc. Interestingly, they also show how a federated identity system can help users with disabilities.
It was also proven years ago that Internet Explorer displays pages served by Microsoft IIS faster, by tinkering with TCP/IP.