If the passengers of the Mayflower had thought the same, where would America have been now?
In the same place it's always been, you moron.
You show your true nature, that of a white supremacist, by comparing pre-Columbian America to "a hump of red rock". There were millions of people living in America before the Europeans started their genocide and slavemongering. They had some of the more advanced civilizations in history, too-- go to my beloved Peru some day.
The Echelon member countries are not all anglophone by coincidence. Echelon was conceived as an alliance of signal intelligence agencies in anglophone countries from the very start.
I mean, I thought it was the race of Locke and Jefferson and Churchill we were supposed to thank for being the torchbearers and guards of individual liberties.
Bah, you believe crap like that? You've been brainwashed. Think about the evil represented by the British Empire; the genocide of the Native Americans by the USA; the similar fate of the real Australians; and those people from New Zealand whose name I can't recall.
Who will save us from our attack on our own liberty? The Chinese?
Freedom is such a thing that we can't sit and wait for someone to come free us. We've got to do it ourselves. And we need to start in our immediate environment. This is the principle of Direct Action.
Of course, we should all be using the brahmi script to write. No errors in pronunciation could be made then...
...as long as the language doesn't change. Now, what is the lesson from Ling 101 that you should recall immediately?;-)
Here is where ideographic systems trounce phonological ones. Since the relation between signs and phonology becomes absolutely arbitrary, language change doesn't degrade the correspondence between your orthography and the phonology.
The Japanese are the smartest people in this regard: they have 3 writing systems, one ideographic, one syllabic, and one phonological.
Got any references here to back up your claim that statistical techniques will lead us to more useful NLP systems?
Well, there are examples of "conventional" systems that, when augmented with some statistical constraints derived from corpora, exhibit an improvement in accuracy in things like picking the right parse for a syntactically ambiguous sentence. I saw a talk on this last week.
However, how is Spanish a "phonetic" language? Because it has no homophones? First, I am sure it has at least near-homophones.
It has homophones. "ha/a", for example. If you're not a Spaniard, "casar/cazar" are homophones.
Secondly, the only way it's phonetic is in its spelling. And even then, it's only near-phonetic spelling. [...] The phonetic you spoke of had to do with how accurately the alphabet represents the spoken language in writing.
No language is *really* written as it's pronounced, i.e. phonetics is much more complicated than orthography. A better term would be a "phonologically faithful writing system" (Remember your phonology chapter from Ling 101? ;-). Spanish is fairly faithful from the reading viewpoint-- just from the way it's written, you can figure out exactly how to pronounce it. From the writing point of view, it's more difficult-- there are some phonemes that can be written with more than one letter, and which word uses which letter to represent it is arbitrary. Thus, Spanish speakers can make a lot of orthographic errors, despite the orthography being "phonetic".
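To make the reading/writing asymmetry concrete, here's a rough Python sketch. The three-entry table is my own toy simplification (silent "h", "v" sounding like "b", "z" sounding like "s" for non-Spaniards), not a real description of Spanish phonology, and the function names are made up for illustration.

```python
import string
from collections import defaultdict

# Toy letter-to-sound table: an assumption for illustration only.
# "h" is silent, "v" sounds like "b", "z" sounds like "s" (non-Spaniard dialects).
TO_SOUND = {"h": "", "v": "b", "z": "s"}


def pronounce(word):
    """Reading direction: every letter maps to exactly one sound."""
    return "".join(TO_SOUND.get(letter, letter) for letter in word.lower())


# Invert the table: for each sound, which letters could have written it?
WAYS_TO_WRITE = defaultdict(list)
for letter in string.ascii_lowercase:
    sound = TO_SOUND.get(letter, letter)
    if sound:  # silent letters write no sound at all
        WAYS_TO_WRITE[sound].append(letter)


def count_spellings(sounds):
    """Writing direction: how many letter sequences this toy table allows.

    Silent "h" is ignored here, since it could be inserted anywhere.
    """
    total = 1
    for sound in sounds:
        total *= len(WAYS_TO_WRITE.get(sound, []))
    return total
```

Under these assumptions, `pronounce("ha")` and `pronounce("a")` come out the same, and so do `pronounce("cazar")` and `pronounce("casar")`, while `count_spellings("casar")` is more than 1. Going from letters to sounds is a function; going back is a guess you have to memorize word by word.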
I just didn't understand this "4 case markers with a single noun" thingy.
I can't remember which language it is that lets you have 4 of them (my books are in my office, hehe).
The way it works is like when you have sentences like "The dog in the blue cage bit the child". These languages do not have prepositions or adjectives; nouns and verbs serve as modifiers. Here you'd have something of the sort "dog-1 AUX bit cage-2-1 child-3 blue-2-1". "1" would be the case that subjects get; "2" would be the case that locative modifiers get; 3 the case of objects (I can't remember the names of the cases here). In "cage-2-1", the "2" tells you it's a locative modifier, and the "1" tells you that it modifies the subject. In "blue-2-1", here we have a modifier of "cage", so it must agree in number, gender and case with it. (Actually, since there's no difference between nouns and adjectives in this sort of language, "cage-2-1 blue-2-1" could be "blue cage" as well as "cagey blue", I've been told by somebody who studies these languages; I'm not sure I understood what was meant by this, though.)
So there's Australian languages that have this sort of thing, called "case stacking". Some of them are friendlier to it than others-- AFAIK only one language is known that allows you to go 4 levels deep in one special case, but several allow 2 or 3. I can't construct from memory an example with 3 cases, sorry.
So while in English or Russian you know what modifies what because modifiers appear next to the head of the phrase, in a language like Warlpiri you know it because modifiers, apart from having a case that indicates the type of modification, have additional case that agrees with what is modified. There is no need for adjacency.
I've simplified the above since I only put case-- there's also agreement (person, number) morphemes. So you'd actually have something like "cage-AGR2-CASE2-AGR1-CASE1".
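For the programmers in the audience, the stacking idea can be sketched in a few lines of Python. This only handles the made-up numeric markers from my example (no agreement morphemes), and the rule "a phrase with stack (c, rest...) modifies the phrase whose stack is rest" is my simplified reading of it, not a claim about any actual grammar. All names are hypothetical.

```python
from collections import defaultdict


def parse_token(token):
    """Split 'cage-2-1' into ('cage', ('2', '1')); unmarked words get an empty stack."""
    stem, *cases = token.split("-")
    return stem, tuple(cases)


def group_phrases(sentence):
    """Group words carrying the same case stack, wherever they sit in the sentence."""
    phrases = defaultdict(list)
    for token in sentence.split():
        stem, cases = parse_token(token)
        if cases:  # skip unmarked words such as the auxiliary and the verb
            phrases[cases].append(stem)
    return dict(phrases)


def attachments(phrases):
    """Toy rule: the outer cases of a stack name the phrase being modified."""
    return {stack: stack[1:] for stack in phrases if len(stack) > 1}


phrases = group_phrases("dog-1 AUX bit cage-2-1 child-3 blue-2-1")
links = attachments(phrases)
```

Here "cage" and "blue" end up in one phrase because they share the stack 2-1, and that phrase attaches to the case-1 phrase ("dog"). Note that the sketch can't tell "blue cage" from "cagey blue" either, which fits what I was told about these languages.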
English or Russian must be equally mindboggling to these people as their languages are to us...
Not all information is on the net, you know; there are those pesky little things, whatchamacallem, ah, books.
The one book I've read on this, Rachel Nordlinger's Ph.D. dissertation, Constructive Case: Evidence from Australian Languages, is available on Amazon, and luckily in a friendly university library. Also, her M.A. thesis is a description of the language Wambaya.
Ken Hale has done *tons* of research in Australian languages, especially in Warlpiri.
Jane Simpson is also an important scholar in these languages.
Off the top of my head, I can't get you any more references, but these should be enough for you to find the relevant literature.
Of course, this is all academic linguistics work, so if you haven't done linguistics at all, you may simply not understand a thing...
Anyway, the main idea is that languages can identify grammatical relations (subject, object) in two main ways: configurationally, by the arrangement of words and phrases, or nonconfigurationally, by using inflectional morphemes. Nearly all European languages rely heavily on configuration to this effect, though they may use morphology a bit. A large number of Australian languages, however, rely mostly on morphology. This requires them to have a more complicated morphology, since the syntax is no help in deciding what word modifies what, or what is the subject, and such.
For example, in languages like this, a noun and its modifier do not have to be adjacent. A sentence like "The rabid dog has bitten the children" could be said something like "Rabid-1 has bitten children-2 dog-1", where 1 and 2 are different case morphemes; by the case you know that "rabid" goes with "dog". In that sentence, as long as "has" is the second word, any word order is ok.
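If you want to see the word-order freedom spelled out, here's a small Python sketch. It takes my toy sentence at face value: the only ordering rule is that "has" is the second word, and words are grouped purely by their case number. The function names are made up.

```python
from itertools import permutations

# The toy sentence from above; "-1" and "-2" are the made-up case markers.
WORDS = ["Rabid-1", "has", "bitten", "children-2", "dog-1"]


def valid_orders(words, aux="has"):
    """All orderings in which the auxiliary is the second word."""
    return [order for order in permutations(words) if order[1] == aux]


def case_groups(order):
    """Group case-marked words by marker, ignoring where they appear."""
    groups = {}
    for word in order:
        stem, sep, case = word.partition("-")
        if sep:  # only words that actually carry a marker
            groups.setdefault(case, set()).add(stem.lower())
    return groups


orders = valid_orders(WORDS)
```

Every one of the orderings in `orders` yields the same grouping, with "rabid" and "dog" together under case 1, so the listener recovers the same phrase no matter how the words are scattered.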
What do you think the US, Australia, the UK and New Zealand have in common?
They are the member countries of Echelon.
This is obviously no coincidence. What we are seeing is the declassification of part of the Echelon system, without admitting its previous existence.
The fact that this is happening all at the same time in these very same countries is simply explained by the fact that we are witnessing a joint action, coordinated centrally, to make official a system of spying on civil populations.
Why Echelon wants to make public part of what it already does beats me, frankly. Any ideas?
Giving a computer access to such a knowledge base is the goal of Doug Lenat's Cyc Project
*buzz* Failed try at (+1, Informative) (== "has a URL").
Cyc is just a database of real world knowledge. You need a lot more than that to deal generally with the kinds of ambiguity resolution that the poster was talking about. You need to capture several kinds of inference-- logical entailments, presuppositions, and implicatures, topics for which the theories available are not wholly satisfactory (e.g. there are hardly any formal theories of conversational implicature).
Also, many ambiguities are not resolved by encyclopedic knowledge, but by conversational context.
And yet, there is the whole project of how to interface all this to a grammar. Conceptually, you can imagine a grammar giving you all possible parses for an input, and then a separate semantic module picking out the most likely ones, but this will most surely lead to inefficient implementations. You need to integrate the interpretation process so as to make use of partial information from many sources at all stages to guide it. This is not simple.
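A back-of-the-envelope Python sketch of why "enumerate everything, then filter" scales badly: it counts the binary bracketings of an n-word string. That is a worst-case stand-in I'm assuming for illustration; a real grammar licenses far fewer parses, but heavily ambiguous constructions can grow in a similar way. The function name is made up.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_bracketings(n_words):
    """Number of distinct binary bracketings over a string of n_words words."""
    if n_words <= 1:
        return 1
    # Choose where the top-level split falls, then bracket each side independently.
    return sum(
        count_bracketings(left) * count_bracketings(n_words - left)
        for left in range(1, n_words)
    )


for n in (3, 5, 10):
    print(n, count_bracketings(n))
```

Five words already give 14 bracketings and ten give 4862, so a semantic module that waits for the complete list is doing a lot of wasted work. Pruning with partial information as you go is the only way to keep this manageable.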
Over the past 10 years or so, a new field, statistical natural language processing (SNLP), has shown a _lot_ of promise.
Oh puh-leeze. There are plenty of cognitive problems that no amount of statistics will ever get you around.
Right now, if you throw an SNLP system a bunch of parse trees, it's able to induce a grammar - even in sufficiently complicated languages. (For simple languages, you can even induce a reasonable grammar just by giving it syntactically correct strings. Impressive!)
You still need to know what is a good set of morphological and syntactic categories. Statistics doesn't give you that.
The next stage after inducing syntax from training examples with tagged syntax is to induce semantics from training examples with tagged semantics.
Again the same problem. You need prior knowledge of what are the concepts involved in the semantics of natural language. In simpler terms, what are you supposed to tag, and what is your tagset? And what if many relevant factors in actual language use simply can't be captured in corpora? ("corpora" == databases of naturally occurring speech or writing.)
SNLP is not the panacea many people are pushing it to be. Believe me, to advance NL understanding, the sort of knowledge we need is what "conventional" linguistics studies.
PS I personally know a few very prominent SNLP researchers, and although they believe statistical methods are very important for understanding language, they don't lose sight of the rest of the issue.
Well, just as humans can know what "that" is referring to by the context in which "that" appears, so could computers.
But nobody is even remotely close to knowing how humans can know what "that" is in context in enough detail to make a workable implementation.
We have dynamic semantics, which embodies the notion of "context" used to resolve pronoun references, but "that" is inherently deictic, that is, it "points out" some salient object in the context. The problem is that the notion of "salient", which is psychological, is simply not well understood.
No European language is even *close* to being "the ultimate in inflection".
The ultimate in inflection would mostly be nonconfigurational languages such as Dyirbal, Wambaya or Warlpiri (all three Australian Aboriginal languages). Hell, in some of these, you can even get 4 case markers in a single noun.
What's your problem? And who asked about your karma? Karma, schmarma. Do you really think everybody lives to write damn posts on /. and earn karma?
/. and karma mean nothing. Me, I don't give a damn about your goddamn karma.
You remind me of a song by Plume called La Marde ("We all have our shit to eat, in liiiiife..."). You sure picked your mouthful of shit badly, didn't you?
Well, I can't devote that much time to you. So, get moving...
Oh, nothing. I guess all the times I've had to run because they were going to beat me up and they've thrown bottles at me is because there's something wrong with me.
Young, white, male, upwardly mobile English speakers are the majority in the net. Everybody else is a minority.
The internet is blind.
BS. US culture is overwhelmingly predominant. If you are clueless about it, it will show, and people will notice.
Nobody knows I'm an oppressed Lapp Ethnic Minority. Nobody cares that MyPeoples lost their timber rights to the Scandinavian lumber barons. We've been oppressed for centuries!
You should communicate with people with similar problems who have risen against them, and share your insights. I recommend that you consider the Inuits.
Despite your attempt to sound righteous by mentioning the role of the FBI in domestic surveillance, you pass over the largest of their crimes: their spying and disruption of the Puerto Rican independence movement. The surveillance to which the FBI submitted the nation of Puerto Rico was proportionately far higher than that which the US ever had to deal with, up to USSR levels.
Learn fully about the crimes of your country before speaking about them.
What is it precisely that you disagree with in the content of my post, but lack arguments against, that you label it, in true slashbot fashion, a troll?
The UK has a long history of oppressing minorities within their territory, like the Irish. Now they surely want to go after the many Arabs and Africans that have gone to their country.
This only further gives the lie to the slashbot g**ks' idealistic crap about "freedom of speech" in the internet. Bullshit. The net is not a political body; it cannot grant rights. There is no such thing as "freedom of speech in the net"; there is only freedom of speech in legitimate political bodies.
It's obvious what happens when you take out the top four intersections in a city: you get gridlock. Yes, you can go around, but the side streets can't take the load. I don't see how routers are that different, so what they talk about seems to be the obvious bit.
The difference is, of course, that streets are public, but routers private. Thus, those who own the top 4 intersections have a huge amount of power over who can go where.
Actually, I don't think it would be such a bad thing if that 4% of top nodes were to go down. You would make people depend more on their local communities, which in many places, the net is destroying.
The people who run that 4% of the nodes are expanding their power and concentrating it into fewer and fewer hands. Then they will have an unprecedented amount of control over the net. They will be able to monitor our communications, and impose their rules on the rest of the net ("You host non-political-mainstream sites? You can't connect to us.").
This is a real problem, and it is only getting worse every day. The "geek"'s image of a "free" Internet is vanishing fast with the massification and profitability of the net.
In my experience, FreeBSD 4.0-RELEASE was actually easier to install than Debian Slink -OR- Potato.
As a Debian user, I have to concur that the Debian install is needlessly painful. Especially dselect-- ugh.
On a slightly offtopic note, IMHO, the ports tree is superior to ALL linux packaging systems, even apt/dpkg.
Why? The only extra thing I can think of is the ability to make from source. Debian does not automate this, but you *can* download source packages with apt-get source. It would be trivial to automate this to download, build and install in a single step; I suspect it is only because of lack of interest that no such code is there by default.
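Something like the following Python sketch is what I have in mind. It assumes an apt-get that supports `build-dep` and `source --compile`, that you run it as root in an empty scratch directory, and that installing every `.deb` the build leaves there is what you want. The function names are invented, and I haven't run this against a real system.

```python
import glob
import subprocess


def build_commands(package):
    """Commands that fetch build dependencies, then download and compile the source."""
    return [
        ["apt-get", "build-dep", "-y", package],
        ["apt-get", "source", "--compile", package],
    ]


def install_from_source(package, run=subprocess.check_call):
    """Download, build and install a package in one step.

    `run` is injectable so the sequence can be inspected without executing it.
    """
    for command in build_commands(package):
        run(command)
    # Assumption: the working directory holds only the freshly built packages.
    built = sorted(glob.glob("*.deb"))
    if built:
        run(["dpkg", "-i", *built])
```

Calling `install_from_source("hello", run=print)` just prints the commands it would run, which is a cheap way to check the sequence before letting it loose.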
OTOH, I think the BSD approach of separating the core system from the ports is superior to Debian's mess...
Really funny, actually -- I was laughing pretty hard, mainly at the subject line
Well, it was not a joke. It's dead serious. I expect the advertising industry to be placing ads in the tombs in cemeteries eventually. They want to bombard you with images every second of your life.
A thought? They will start by digitally manipulating televised news reports on the death of famous people to include "banners" in places where there are none.
Anyway, Foundations of Statistical Natural Language Processing is a (the?) standard textbook here.
Where do you get these ideas from? Since nothing I've said remotely suggests this, it must be *your* idea. Tells us a lot about you, doesn't it?
I don't suppose the skinheads meet in chat rooms in order to go out and kick some minority ass in the streets.
If all of us were so innocent, this world might be a better place.
Reality. Have you noticed how many skinheads there are in London? And how little is done against them?
Yeah, this post has no content. Sorry. Parent had no content, so that's all there is to say.