How long will real newspapers exist if readers want everything for free online? And when will they notice that all these interesting stories, ranging from PRISM to Syria, were actually written by paid journalists? When will the blindingly obvious implication hit them: that a world without investigative journalism is a dictatorship?
And now someone like Henry Blodget is trying to say that newspapers need stuff that can't be found elsewhere to survive, which basically means becoming the local gossip outlet? He should be ashamed of himself.
What to do when science reporting fails even on Slashdot? The effect found in the study relates to performance on priming tasks. The abstract explicitly says: "has yet to investigate consequences for linguistic performance". Recognition tasks usually require the subject to hit the right button when recognizing a string of characters as a word or a non-word. A naming task requires the subject to point at or pronounce the proper name for an image, which is also influenced by preceding images or words. Performance is expressed either in error rates or in the (average) time it takes, and a difference of 100 ms is considered a pretty large effect. Anything larger is a bit suspect.
The classical priming task is showing people two words in a row, which are either related (bakery - bread) or unrelated (spider - bread). It turns out people recognize the second word faster when the first word is related. This effect is old, and pretty stable across studies and languages, and the same holds for naming. The effect also goes by the name of facilitation, and the opposite by interference or distraction. Now, it's pretty easy to consider showing a Chinese icon as just an example of interference. It can be considered to relate more to Chinese and therefore to "prime" Chinese language recognition and consequently interfere with English language recognition. That would explain the result in line with other priming experiments without implying anything about immersion, as immersion involves a lot more than an icon or a face, and as the interference effect decays over time. The effects of language acquisition in immersion or in your own ethnic group can be easily ascribed to the frequency of use, which has a much larger and self-sustaining effect.
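For anyone who hasn't seen how such an effect is quantified: it is usually just a difference between mean reaction times. A minimal sketch, in which the reaction times are made up for illustration (they come from no study) and `priming_effect` is a hypothetical helper name:

```python
from statistics import mean

# Hypothetical reaction times in milliseconds, invented for illustration only.
related_rt = [520, 545, 510, 530]      # second word after a related prime (bakery - bread)
unrelated_rt = [580, 610, 575, 595]    # second word after an unrelated prime (spider - bread)

def priming_effect(related, unrelated):
    """Mean RT after unrelated primes minus mean RT after related primes.

    A positive value means the related prime sped up recognition (facilitation);
    a negative value would indicate interference.
    """
    return mean(unrelated) - mean(related)

effect_ms = priming_effect(related_rt, unrelated_rt)
print(f"priming effect: {effect_ms:.2f} ms")
```

With real data you would compute this per subject and per condition before averaging, but the quantity being compared is the same.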
To my joy, I notice that no-one actually tries to support or refute the claims from the OP. And that's a good thing. It is talk from someone who considers himself visionary because he says something very 2.0 based upon acronyms and projects he doesn't understand. The kind of tech in OpenNLP has been around for 20 years now, and adding a few components that can brokerage and leverage and whateverage unstructured data is not going to improve it.
What topic are you interested in? Cogsci (which is more oriented towards how the brain does it) or more linguistics (more towards a proper description)? Words, syntax, semantics? Or more natural language processing (how to get a computer to understand language)?
He may be highly respected, but I don't buy into the stacking of assumption on assumption on assumption, without ever touching something verifiable. I've had my share of run-ins with linguists (in 20 years of cognitive psychology, specializing in syntactic analysis), and much of linguistics is arm-chair philosophy, or reverse engineering dressed up as science. Some theories describe language behavior well up to a certain level, but there is very little evidence supporting them, and reconstructing word relations based on fantasy isn't going to help that.
I don't know why people even bother to publish this kind of research. Sure, it's fun to make a tree of relations between words, but the result doesn't mean a thing. The analysis is built upon 200 entries from an etymological dictionary, which is in itself a big bag of assumptions, and they managed to exclude 10% of those, including some very high-frequency words (and, in, when, where, with).
Take this one with a grain of salt...
This sounds a bit off to me. Statistical NLP needs large amounts of data. How many data points do they have that can reliably be labelled "precursor of genocide" vs "no precursor of genocide"? There haven't been that many genocides, have there? And as the article says: "hate speech isn't in short supply"...
Of course it's PR. If you check the experience of the team, you'll see that they are mostly trained in PR, not in making rockets. There is one person in the team who knows a bit about space vehicles. The rest is design and social media. They're going to tweet their way to Mars.
I already said it when Google launched Chrome: they are trying to tie the users in. Sooner or later, they're going to offer a product that is exclusively available in Chrome. They're going to do better gaming in Chrome (JavaScript is too slow; think how nice Farmville can look!). That time seems to have come. And once accepted, there's no way back, and the masses will be logged into their Google account forever.
No. The language was, and possibly still is, a badly defined mess. It's fine to write while (<file1>), right? It's fine to write while (a && b), right? But it's not fine to use two handles with a logical operator in a while test. Why? Just because. And there's a lot more like that. The language design was sloppy. Placing parentheses around an expression shouldn't alter its value, right? Except in Perl, when the context asks for a scalar, if I remember properly. And the parameter mechanism: another obnoxious hack. And all these $, %, # and whatever else the ASCII table offers.
True dat. But in those cases, Perl should be used like awk: only for short scripts, preferably things that fit on the command line. As soon as a Perl (or awk) script exceeds 99 lines, it should probably be rewritten...
Thanks for the link. It's great. And how on earth could C++ lose to Java? Seems a certain standard template library sucks.
Ok, that should have been: while (<file1> && <file2>)
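For context on why that line fails: Perl only special-cases a lone readline in a while condition, turning while (<file1>) into an assignment to $_ plus a definedness test. Combined with &&, both lines are read, tested for truth and thrown away. A language without that magic forces you to spell out the lockstep read. A minimal sketch in Python (not Perl), where the in-memory "files" and the helper name `differing_lines` are assumptions for illustration:

```python
import io
from itertools import zip_longest

# In-memory stand-ins for the two files (hypothetical contents).
file1 = io.StringIO("a\nb\nc\n")
file2 = io.StringIO("a\nx\n")

def differing_lines(f1, f2):
    """Read both files in lockstep and return (line_number, left, right)
    for every position where they differ. A missing line shows up as None."""
    diffs = []
    # zip_longest keeps going until BOTH files are exhausted,
    # padding the shorter one with None.
    for n, (left, right) in enumerate(zip_longest(f1, f2), start=1):
        if left != right:
            diffs.append((n, left, right))
    return diffs

result = differing_lines(file1, file2)
print(result)
```

Using plain `zip` instead would stop at the end of the shorter file, which is closer to what the && version seems to have been aiming for.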
About the manual, I agree. But the "interface"? It's so much cleaner than Perl (which is the topic of this thread), and for some tasks pretty elegant, such as inspecting log files, processing dictionaries, tricky find/replace, etc. I wouldn't use it for anything more complex, but I certainly won't be using Perl for that either. I remember writing a Perl script that had to do some kind of diff between two files and wondering why the following line failed:
while (<file1> && <file2>)
Horrible.
With a bit of luck, awk will outlive Perl.
The relations represent an analysis of fMRI scans. Something like: if the subjects all have the same pattern of activation for object A and object B, then these objects must be related. While I don't deny that semantic relations in our brains must almost certainly have some physical correlate, the reverse doesn't hold: e.g., a "voxel", the smallest unit being measured, easily contains 10,000 neurons, so a lot of different patterns of processing cannot be distinguished. Also, fMRI measurements are very noisy, and with just 5 subjects that noise can easily look like correlation. Most likely, these patterns are artefacts of the visual learning process.
I do see a use for this kind of method though: if you've got reliable activity patterns for a large group of people, you can try to make sense of the patterns in another experiment, or in a subject that wasn't classified yet, and it may help tackle other problems.
In psycho-linguistics, it has always been understood that parsing is a subconscious, automatic process. Parsing sentences consciously is extremely slow, as every second-language learner knows, whereas automatic parsing runs at about 4 words per second without any problem. But the experiments as described in the extract do not warrant the conclusions. Effects of lexical priming have been known for a looooong time (since the 1930s, I think), and it remains to be seen whether the results can be attributed to some kind of information other than precise computation of the arithmetic problem or perfect understanding of the sentence.
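The point about 5 subjects and noise is easy to check with a toy simulation: correlate pairs of pure-noise vectors and see how large the biggest correlation gets. This is only a sketch under stated assumptions, not a model of fMRI data: the Gaussian noise, the 200 pairs and the function names are all picked for illustration.

```python
import random
from statistics import mean

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def max_abs_noise_correlation(n_subjects, n_pairs, rng):
    """Largest |r| found among n_pairs pairs of independent noise vectors,
    each holding one value per 'subject'. Any correlation found is spurious."""
    best = 0.0
    for _ in range(n_pairs):
        a = [rng.gauss(0, 1) for _ in range(n_subjects)]
        b = [rng.gauss(0, 1) for _ in range(n_subjects)]
        best = max(best, abs(pearson(a, b)))
    return best

rng = random.Random(0)  # fixed seed so the run is repeatable
small = max_abs_noise_correlation(5, 200, rng)    # 5 subjects
large = max_abs_noise_correlation(500, 200, rng)  # 500 subjects
print(f"max |r| with 5 subjects:   {small:.2f}")
print(f"max |r| with 500 subjects: {large:.2f}")
```

With only 5 values per vector, some pair of unrelated noise signals ends up strongly correlated just by chance; with 500 values that no longer happens.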
I can only agree. There are people claiming LaTeX is a good example. Or -2. Let's hope they were all being ironic. If not, we're in for another decade of UI setbacks.
Apart from the usual nit-picking over audiophiles and their slightly fetish-like obsessions with materials, this device is going to pick up environmental sounds. A good plate reverb must be well shielded. This one isn't, and his apartment is pretty noisy, judging by the video.
And wrt convolution: the normal convolution reverbs are simple FIR convolvers, which is a highly idealized model. In reality, processes are never finite (but we don't care, if the sampled response is long enough), but more importantly they can be non-linear. Saturation effects are very much liked by musicians, but cannot be treated as a normal convolution process. There have been attempts at modelling non-linear processes with convolution, but not to great success.
Anyway, it is a great project.
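To make the linearity point about convolution concrete: an FIR convolver scales its output exactly when you scale the input, and a saturating stage does not, which is why a single impulse response cannot capture it. A minimal sketch, assuming a toy three-tap impulse response and tanh as a stand-in for saturation (neither comes from the device discussed):

```python
import math

def fir_convolve(signal, impulse_response):
    """Direct-form FIR convolution: each input sample adds a scaled copy
    of the impulse response to the output."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def saturate(signal):
    """Toy non-linear stage: soft clipping with tanh (an assumption, standing
    in for whatever saturation a real plate or amplifier exhibits)."""
    return [math.tanh(s) for s in signal]

def scaled(signal, gain):
    return [gain * s for s in signal]

def close(a, b, tol=1e-9):
    return len(a) == len(b) and all(abs(p - q) <= tol for p, q in zip(a, b))

ir = [1.0, 0.5, 0.25]        # toy impulse response
x = [0.2, -0.4, 0.9, 0.1]    # toy input signal
gain = 4.0

# Linear system: processing a louder input equals making the output louder.
fir_is_homogeneous = close(fir_convolve(scaled(x, gain), ir),
                           scaled(fir_convolve(x, ir), gain))
# Saturation: a louder input is squashed, so the same check fails.
sat_is_homogeneous = close(saturate(scaled(x, gain)),
                           scaled(saturate(x), gain))

print("FIR convolution scales with input:", fir_is_homogeneous)
print("Saturation scales with input:", sat_is_homogeneous)
```

A sampled impulse response describes the system at one input level only; once the response depends on level, one response is not enough.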
Perhaps 'game changers' is an exaggeration, but only papers that make minor extensions to the existing literature are accepted on first submission. In my (ex) field, papers that challenge a certain view get their share of flak from the reviewers. I've seen papers being shot down (see what I did there?) because the reviewers belonged to a different school. It's of course not always the case, but it does happen too often. One of the reasons is that such papers usually get reviewed by at least one of the opponents, or someone closely involved. Consequently, when such papers get accepted, they generate replies, and thus citations, in contrast to the papers that are in line with the main view.
I think the conclusion is that the GP has a good point, and that the conclusion "peer review works" cannot be drawn from this.
It's not only silly, as the other responder points out, it's also something that's not common knowledge among about 70% of the world population.
I worked with systems that process LPR (travel times, congestion, etc.), and they have big gaps in their data. Bad weather usually freaks out the cameras, and a bit of mud on a license plate also seems to be enough to fool them.
But storing that data for 6 years: what do they want with it? For historical comparison, they could anonymize the data.
Although Reason looks particularly annoying to me (never used it), imitating physical equipment does have a function. Usually, the controls on physical equipment are presented in a logical way, grouping by function, emphasizing important controls, etc. You can call it conventional (in the sense of "based on or in accordance with what is generally done or believed"). If the software mimics (copies, imitates) this, it's because the makers don't have a better idea, so they just copy the old and proven design. This can lead to skeuolositimorphicologition (I made that one up), but it is not so by itself.
And as the GP points out: a lot of it is beautification (making things look nice). In Logic, you can switch your plugins to a view with just the controls. Now that's functional, but very, very annoying.
12 years ago, I worked at a place where they categorized online content and looked for answers to common questions. All the "editors" had their own subset of the content assigned. So, naturally, a couple of them did sex. At first they probably thought it was great (I never asked this), but after some time it got boring and depressing, and they didn't want to do it any longer. That was well within a year. So how come Google doesn't know this? Because if looking for mainstream porn is wearing you out, the stuff mentioned has got to be killing.
Do we really have to look for animals that photosynthesize? Why not stick with plants?