What will happen is an increasing distance between laws that say what people shouldn't do and what people actually do. Any "firewall" that allows some form of two-way traffic can be circumvented. Any content filter can be circumvented by encryption. Any IP address filter (geographic or otherwise) can be circumvented by intermediaries (e.g., proxy servers). If you are really determined to filter, then you get an escalation of filter and circumvention complexity, but again, any two-way communication can be exploited. Anybody involved with circumvention had better watch out where they travel to.
Cyc is a knowledge base with a lot of facts in it. It is a product of the 80s when expert systems were going to solve all of AI's problems. The hype factor of Cyc is so large that it is almost impossible to determine what Cyc's capabilities are. No well-defined problem is being solved. All we ever see are these cute question-answer examples. If anyone can point to any coherent evaluation of Cyc (as opposed to all the anecdotes that fill the newspaper stories), we would all appreciate it.
I suppose the best thing you could say about this is that it is a lesson in licensing. Because the professor's header file is GPLed and the professor did not license it any other way, the students' work is indeed a derivative.
The next best thing is that the GPL restricts the professor, too. If a student comes up with great software, the professor can't steal the code (at least not legally).
Did you actually read what you linked to? Justin Rye is very sympathetic to "spelling reform", but he realizes it is utopian:
The flaws of the standard orthography are indefensible - but it has an extensive Installed User Base, and can thus afford to ignore criticism in exactly the same manner as Fahrenheit thermometers, QWERTY keyboards, and certain software packages, which can all rely on conformism, short-termism, and sheer laziness for their continued survival.
Phonetic writing is one of the greatest inventions of mankind. All a speaker needs to be literate is to learn the mapping between sounds and letters. Could anything be easier?
But like companies who still maintain their legacy software written in Cobol and who knows what else, countries and cultures hold onto their legacy alphabets, despite all their disadvantages, and despite all the moaning and groaning about education, literacy, and how hard it is to type 10,000 characters on a 100-key keyboard.
I agree there is a serious problem of understanding texts written in the "old way". There is a simple solution here, too, i.e., we just translate what's most important to the "new way" and let scholars work on the texts that don't get translated. Before anyone gets too hot here, the situation is not that much different than translating literature from one language to another. It is too much work to translate everything that is written in English into French, so one focuses on the texts that are important enough for translation.
Also, English has a lot of problems here: it is mostly phonetic, but a large percentage of it is not, large enough to make learning English a lot more difficult than, say, learning Spanish.
I realize this is way too utopian. We Americans can't even move to metric, much less anything more "radical". I just needed to respond to the whining.
This is a misunderstanding of machine learning. Most ML algorithms attempt to find a function f(x) from a sampling of x and f(x) values. Usually, the sample is too small and the set of possible functions is too large to learn the correct function with absolute certainty. This is a good reason for not trusting ML algorithms.
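To make the "sample too small, function set too large" point concrete, here is a rough sketch of my own (the 3-bit inputs and the sample values are made up for illustration). It counts how many Boolean functions agree with a handful of observed (x, f(x)) pairs.

```python
from itertools import product

def consistent_functions(n_inputs, sample):
    """Count the Boolean functions on n_inputs bits that agree with every
    observed (x, f(x)) pair in sample (a dict mapping input tuples to 0/1)."""
    all_inputs = list(product([0, 1], repeat=n_inputs))
    unseen = [x for x in all_inputs if x not in sample]
    # Each unseen input can be mapped to 0 or 1 independently,
    # so the sample says nothing about which of these functions is right.
    return 2 ** len(unseen)

# Hypothetical sample: 3 of the 8 possible 3-bit inputs have been observed.
sample = {(0, 0, 0): 0, (0, 1, 0): 1, (1, 1, 1): 0}
print(consistent_functions(3, sample))
```

Even in this toy setting, dozens of functions fit the sample perfectly and disagree with each other on the unseen inputs. A learner has to pick one of them using some bias, and nothing in the data certifies the choice.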
On the other hand, xmanix's problem seems to be that ML algorithms don't produce correct explanations. This is misguided. For example, I think you would prefer your airplane pilot to be able to fly the plane than to be able to explain aerodynamics. Of course, you might prefer both, but when you have neither, you probably want your ML algorithm to learn how to fly the plane and worry about explanations later, especially if you are on the plane.
This is a good point. In terms of functionality, there is no difference between a library call and a system call. They both get the job done. Library calls are probably more efficient, but even there the difference might not be large. If you have to make a lot of calls, you can simply spawn one process and sysin/sysout one line for each call.
So how about this? I turn the GPLed library into a standalone program. In my non-GPLed program, I spawn a process running the library. To make a call, I send one line to the library program and get one line back as the answer. Ok or not?
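For the mechanics only (this says nothing about whether the arrangement is OK under the GPL), a minimal sketch of the one-line-in, one-line-out scheme might look like the following. The worker program here is a made-up stand-in that just uppercases its input, and the names LineWorker and WORKER_SOURCE are my own, not from any real library.

```python
import subprocess
import sys

# Hypothetical stand-in for "the library turned into a standalone program":
# it reads one request per line on stdin and writes one answer per line on stdout.
WORKER_SOURCE = r"""
import sys
for line in sys.stdin:
    print(line.strip().upper(), flush=True)
"""

class LineWorker:
    """Spawn the worker once, then make each 'call' by exchanging one line."""

    def __init__(self):
        self.proc = subprocess.Popen(
            [sys.executable, "-c", WORKER_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,  # line-buffered pipes in text mode
        )

    def call(self, request):
        # Send one line, flush so the worker sees it, then read one line back.
        self.proc.stdin.write(request + "\n")
        self.proc.stdin.flush()
        return self.proc.stdout.readline().rstrip("\n")

    def close(self):
        # Closing stdin ends the worker's read loop; then wait for it to exit.
        self.proc.stdin.close()
        self.proc.wait()
        self.proc.stdout.close()

if __name__ == "__main__":
    worker = LineWorker()
    print(worker.call("hello"))
    worker.close()
```

The process is spawned once and reused, so the per-call cost is a pipe round trip rather than a process launch.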
All you need is a DNA disassembler · on The DNA Bomb · Score: 3
It is not unreasonable to believe that we can create any DNA sequence we want. The problem is figuring out what the DNA sequence does.
Once we can debug DNA reliably, these science fiction fantasies will become real. Isn't that what molecular biology is doing? I see lots of reasons to be fearful of the future. Is it realistic? Currently, it appears that no one knows, but lots of people are working on the problem, which is scary.
First, you create a training set by having people categorize a lot of web pages as porn, mature, PG-13, or whatever. By a lot, I would guess that around 1,000 to 10,000 web pages would be sufficient. Then you make a list of the words from each web page in the training set, maybe also keeping track of how many times each word appears (this is called the "bag of words" representation in the literature). Now, you train your learning algorithm (neural nets in this case) to correctly categorize the training set. You use standard experimental procedures to tune your learning algorithm and to confirm its accuracy.
The whole basis of this technique is that certain combinations of keywords are more likely in porn web pages than in, say, safe sex web pages. One reason why this approach should continue to work is that web pages intentionally put various keywords into the web pages so that you can find them using any standard search engine. If they try to fool this technique, they also risk fooling anybody trying to search for these pages.
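The steps above can be sketched roughly as follows. This is a toy of my own, not a real filter: the four "pages" are invented, and the learner is a single-layer perceptron (about the simplest neural net there is) standing in for whatever network you would actually tune. A real training set would be the thousands of hand-labeled pages described above.

```python
from collections import Counter

def bag_of_words(text):
    """Map a page's text to word counts, ignoring word order."""
    return Counter(text.lower().split())

def score(weights, bias, bag):
    # Counter returns 0 for words never seen in training.
    return bias + sum(weights[word] * count for word, count in bag.items())

def train_perceptron(examples, epochs=20):
    """examples: list of (text, label) with label +1 (block) or -1 (allow)."""
    weights, bias = Counter(), 0
    for _ in range(epochs):
        for text, label in examples:
            bag = bag_of_words(text)
            if label * score(weights, bias, bag) <= 0:  # misclassified
                for word, count in bag.items():
                    weights[word] += label * count
                bias += label
    return weights, bias

def predict(model, text):
    weights, bias = model
    return 1 if score(weights, bias, bag_of_words(text)) > 0 else -1

# Invented training pages, hand-labeled: +1 = block, -1 = allow.
examples = [
    ("hot explicit pics free explicit videos", 1),
    ("explicit adult videos members only", 1),
    ("safe sex education health clinic advice", -1),
    ("breast cancer screening health advice", -1),
]
model = train_perceptron(examples)
print(predict(model, "free explicit videos"))
print(predict(model, "health advice on breast cancer"))
```

Note that the decision comes from the weighted combination of all the words on the page, not from any single keyword, which is the difference from simple keyword blocking.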
... until the monstrosity called Common Lisp was created. More features does not make a better language. Also, it has many silly compromises because of differences between Xerox and Symbolics Lisp machines. Anybody remember those?
This is almost as entertaining as the WWF. Shredder would be a great WWF name. Maybe someone already uses it.
Anyway, giving Kramnik a copy of the program is very unfair, especially if the copy includes the database of computer openings. Kramnik can then try to tilt the games towards openings that he knows in advance that the program is not good at. As I understand it, chess players spend a lot of time preparing surprises, and the rules of this computer vs. human match eliminate that for the computer. Also, it will look very bad for Kramnik if he loses with this kind of advantage.
Maybe hire a lawyer and make them nervous · on Sean In The Middle · Score: 2
Standard disclaimer: IANAL
If you can afford it, you might consider hiring a lawyer to interact with the school officials. You don't need to file a lawsuit. Just have your lawyer start asking questions about the school's policy on "physical and verbal abuse" and "harassment" of kids by kids. It might not help your kid very much, but at least it will get school officials thinking about it for the other kids.
The problem you might have, though, is how much the school officials knew about the harassment. If Sean has kept silent the whole time, and if the school officials did not observe (or do not admit to observing) any harassment, then you are probably SOL.
Students would be able to view previous examinations, learn exactly what questions professors ask, and learn only those questions. This will lead to focused studying instead of the broad studying necessary for a real education.
How is this different from studying from old exams? Anyway, there are many hidden fallacies in this assertion, in particular, that students study (much less broadly) and that students get a "real" education now.
Professors will have extra work to do in keeping the web page up-to-date.
Keeping the web site up-to-date is the TA's job.
Students would grow mad at professors who do not keep their site up-to-date, leading to lawsuits pertaining to fair education, etc.
Just like students do now to professors who don't keep their courses up-to-date. Didn't I tell you that keeping the web site up-to-date is not the professor's job?
Students with computers at home (i.e., financially stable students) will have access at all times, while others (minorities, etc) will not, leading to an even bigger gap between upper- and middle-class.
Somehow, I think any student who manages to find the money to go to MIT will find some way to get a computer.
Microsoft has a reputation for not playing well with others, both for having closed networking/internet protocols and for making incompatible versions of open protocols. Do you think Microsoft deserves this reputation? What is Microsoft's position on open and compatible protocols? What is Microsoft's position on reverse engineering efforts of its closed protocols?
If there is a flaw in Sunstein's arguments, it is that the information winnowing he decries has become more and more necessary due to the sheer volume of data beamed at individual users.
The real flaw is that Sunstein wants somebody to decide what everybody needs to know, somebody to tell us what to believe. Sorry, the "cure" is worse than the "disease".
The real problem is our schools, parents, and peers focus on telling us what to believe instead of teaching us how to determine what to believe.
IANAL, much less one on military issues, but I would guess that the military can ignore the GPL and other IP issues if the code was sufficiently important to national security.
I can be careful, I can take every precaution, I can turn off JavaScript, and it doesn't matter. If my neighbor isn't diligent and I send him an e-mail, I'm still vulnerable.
I don't quite understand this comment. Can't you protect yourself by just deleting the JavaScript in your reply? Is this nontrivial to do in these HTML mail programs?
Your neighbor could, of course, copy your message into another message with the JavaScript.
Re:Don't take it so literally. · on eWeek on Linux · Score: 2
Try IBM if you need someone big and expensive to hold your hand as you go to the Linux side of the street.
Watching over their shoulders 24/7 · on Clever Girl Bess · Score: 1
Filtering software legitimizes censorship and invasion of privacy. Many parents buy filtering programs that permit them to re-trace the websites their children have visited. They aren't teaching kids morality but Orwellian intrusions of privacy, dignity, and, yes - morality itself.
Blocking software is an illusory technology. It permits the abdication of moral responsibility - especially that of teachers and parents - to supervise their children and provide moral direction.
So I am supposed to neither censor my children nor invade their privacy, yet I am supposed to supervise my children and provide moral direction, presumably 24/7. I'm sorry, but no parent can (or should) watch over their children's shoulders all the time. This is not an abdication of responsibility, but rather a decision of how much freedom one's children can handle. If the parents do not think the children can handle an unfiltered internet, then it seems to me that internet with censorware is better than no internet at all. Jon seems to have this idea that either freedom is unlimited or else the shackles are on.
Now I agree that current censorware does not censor very well, censoring many sites incorrectly. I think, however, that censorware can be greatly improved, and can give installers some flexibility over false negative and false positive ratios.
And the data gathering seems innocuous, as long as anonymity is carefully preserved. This is another setup; Jon complains about inaccurate censorware, and Jon also complains about gathering any data that could be used to improve the censorware. You can't have it both ways. Make your choice and stick with it, please.
But not everyone has the computer science background to understand why [it] cannot work.
And even fewer have the machine learning background to understand why it can work most of the time. There are a substantial number of papers that demonstrate that programs can learn to categorize text, including web pages, based on "combinations" of keywords. The programs are not perfect, but 90% accuracy and better is very achievable. If you want some filtering with false positives at around 1%, then you can do that, too.
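The false positive vs. false negative flexibility usually comes from a threshold on the classifier's score. Here is a small sketch of the idea; the scores and labels are invented for illustration and the function name is my own.

```python
def error_rates(scored_pages, threshold):
    """scored_pages: list of (score, is_objectionable) pairs.
    A page is blocked when its score is at or above the threshold.
    Returns (false positive rate, false negative rate)."""
    false_pos = sum(1 for s, bad in scored_pages if s >= threshold and not bad)
    false_neg = sum(1 for s, bad in scored_pages if s < threshold and bad)
    n_ok = sum(1 for _, bad in scored_pages if not bad)
    n_bad = sum(1 for _, bad in scored_pages if bad)
    return false_pos / n_ok, false_neg / n_bad

# Hypothetical classifier scores for six hand-labeled pages.
scored_pages = [
    (0.95, True), (0.80, True), (0.60, False),
    (0.55, True), (0.30, False), (0.10, False),
]

for threshold in (0.5, 0.7, 0.9):
    fp, fn = error_rates(scored_pages, threshold)
    print(f"threshold={threshold}: false positives={fp:.2f}, false negatives={fn:.2f}")
```

Raising the threshold blocks fewer legitimate pages at the price of letting more objectionable ones through, so whoever installs the filter can pick the trade-off they can live with.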
There's a slight difference between a spell-checker not having a specific word in it, and a 'filtering' program that blocks thousands of legitimate sites.
I agree that current filtering programs are laughable, but that does not imply that a reasonable filter cannot be created. The first chess-playing programs played pretty bad chess, but now they play with the best.
There are some filtering programs that work on blocking specific sites. Pretty bloody useless, too, as the main class they're trying to block is pr0n, which change their names on a daily basis.
I find that hard to believe. Changing your site name on a daily basis doesn't seem like a good way to attract and keep any paying customers. In any case, I find it hard to have much fear in sites that change their name daily. A list of those porn sites with stable names would be a reasonable start on a filter in my opinion.
The second class, which is being pushed far more, blocks based on 'key' words (or if other words contain these words). So forget going to a site on breast cancer, any site on sexual diseases, ... the list is huge.
Yes, simple keyword filtering is laughable. However, I'll wager that any decent machine learning algorithm could achieve 90% accuracy or better. Does anyone have a nicely differentiated list of sites?
It costs them no more for me to browse Penthouse.com online than it does for me to browse Bookfinder.com.
Yes, it could cost them. Taxpayers will not want to fund libraries if people are using library computers for browsing penthouse.com and the like.
If they block access to a legal product (Penthouse), then that is active censorship, unconstitutional, and illegal.
Finally, we get to the central argument. I am queasy about this kind of censorship, too, but I don't want my taxes funding porn-viewing.
I wonder if you can get issues of Penthouse via interlibrary loan. If libraries allow this, then to be consistent, they have to allow penthouse.com. However, if libraries are allowed to deny interlibrary loan of porn, they should be allowed to deny Internet access to porn.
The next best thing is that the GPL restricts the professor, too. If a student comes up with great software, the professor can't steal the code (at least not legally).
Otherwise, it seems like a sneaky thing to do.
2. we can say programs aloud.
3. programs express instructions to perform some task, similar to any how-to book or cookbook.
4. a community (computer scientists) uses programs to communicate and disseminate ideas.
I'm glad I was smart enough to think of these reasons :). This should be a no-brainer.
If so, then all you need to do is have the user download the file during installation.
No, this is what happens when the space key stops working on your keyboard.
All you need to do is practice in your head.