Slashdot Mirror


Using The Web For Linguistic Research

prostoalex writes "The Economist says linguists are gradually adopting the World Wide Web as a useful corpus for linguistic research. Google is used, among other resources, to research how the written language evolves and how some non-standard examples of usage become more or less acceptable (The Economist quotes the phrase 'He far from succeeded,' where 'far from' is used as an adverb). LanguageLog is a resource linked in the article, where linguists discuss current peculiarities of the English language."

205 comments

  1. They should probably avoid Slashdot by Peter+Cooper · · Score: 4, Funny

    It's probably a good thing that they steer away from Slashdot as a corpus of English usage. Or, should I say, in SOVIET RUSSIA it's best Slashdot stays away from THEM! Or is it that only old people use the Internet as a corpus of the English language while pouring hot grits down a naked and petrified Natalie Portman's pants?

    1. Re:They should probably avoid Slashdot by UserGoogol · · Score: 1
      --
      "Never attribute to malice that which can be adequately explained by stupidity." -- Hanlon's Razor
    2. Re:They should probably avoid Slashdot by Frogbert · · Score: 1

      Thats true but only in japan.

    3. Re:They should probably avoid Slashdot by mizhi · · Score: 2, Interesting

      Hopefully, they'll harvest well written webpages for data and not those of 13-year old girls drooling over Orlando Bloom, AOL users, or porn sites.

      Actually, I take that back.

      It could actually be very interesting from a lexical or morphological point of view. The phenomenon of abbreviating words, such as "u" for "you" or "ur" for "you're" or "ru" for "are you." Language teachers in classrooms have been seeing it crop up in actual homework assignments. While reading such language may be like having glass wiped across the eyes of people educated before computers came into wide-spread use, it's interesting how it's affecting younger people.

      There's a collision between the high tech world children grew up with today and the way language is taught in schools in a similar way to the situation with how students speak on the street versus how they are expected to speak in the classroom or the professional world. Remember when it was proposed that ebonics be considered a valid dialect for using in the classroom?

      What would be even more interesting to study is how keyboards affect the structure of languages. It seems that people are under the assumption that languages are static and don't change, but this is incorrect.

      Because the keyboard is still the main way of inputting information into the computer, people take short cuts and I would be surprised if that didn't start to affect their use of language in other contexts.

      I'm just rambling, but such studies would be akin to socialogical studies that look at the influence of technology on social organization.

      --
      Humorless sig goes here.
    4. Re:They should probably avoid Slashdot by Joe+Tie. · · Score: 2, Interesting

      Because the keyboard is still the main way of inputting information into the computer, people take short cuts

      One thing that's always been at the front of my mind, why aren't these kids learning how to type? Or at least to type with any reasonable amount of skill. The only computer I had as a child was a Commodore 64, and I was still faster than most of today's youth even with their abbreviations. I was somewhat lucky in that our schools somehow foresaw the advent of the home computer and made sure we knew how to type, but I'd certainly hope that held even more true in today's schools!

      --
      Everything will be taken away from you.
    5. Re:They should probably avoid Slashdot by Anonymous Coward · · Score: 1, Insightful

      One thing that's always been at the front of my mind, why aren't these kids learning how to type?

      Because, unlike the parent's assumption, the phenomenon isn't related to computers. It's related to text messaging. It might be just as fast to type "you" instead of "u" with a keyboard, but it's noticeably slower on mobile phones, especially before predictive text became popular.

      Furthermore, there is a limit on how many characters you can send in a single message. Most service providers automatically split long messages into multiple parts, but in the case where you are just scraping the limit, it might actually cost twice as much to send a text message that says "you are" instead of "u r".

      I'm not excusing it, I hate reading it myself, it makes people look illiterate and, sadly, in many cases people really aren't able to express themselves in normal English. I know people in their mid twenties who type "his" when they mean "he is", and, to use an example I received recently through email, "gess how i sore." when they meant "guess who I saw?". No, I didn't make it up, and no the person wasn't joking.

    6. Re:They should probably avoid Slashdot by Moderatbastard · · Score: 0
      It seems that people are under the assumption that languages are static and don't change, but this is incorrect.
      I for one am not under that misconception. However I disagree with many /.ers who argue that the fluidity of language means that making it up as you go along is acceptable. "It's" is not a posessive, and you can't loose your shoe if the laces are lose.
      would be akin to socialogical studies
      "sociological".
      --
      1/3 of jokes get modded OT. If you get the joke, mod 1 in 3 insightful/interesting/underrated to restore karma balance.
    7. Re:They should probably avoid Slashdot by jez9999 · · Score: 1

      What do you think Natalie Portman would do if she actually viewed Slashdot some time?

    8. Re:They should probably avoid Slashdot by Anonymous Coward · · Score: 0

      She's naked with pants on?

    9. Re:They should probably avoid Slashdot by ggvaidya · · Score: 1

      Do you mean as an actress seeing a (rather weird) subset of fan, or as a psychiatrist?

    10. Re:They should probably avoid Slashdot by mizhi · · Score: 1
      I for one am not under that misconception. However I disagree with many /.ers who argue that the fluidity of language means that making it up as you go along is acceptable. "It's" is not a posessive, and you can't loose your shoe if the laces are lose.


      Right now, yes. But in a generation or two, perhaps they'll lose the distinction.

      "sociological"


      Whoops. :-)
      --
      Humorless sig goes here.
    11. Re:They should probably avoid Slashdot by mizhi · · Score: 1
      Because, unlike the parent's assumption, the phenomenon isn't related to computers. It's related to text messaging. It might be just as fast to type "you" instead of "u" with a keyboard, but it's noticeably slower on mobile phones, especially before predictive text became popular.


      I'd still advance the argument that extensive use of computers by a larger portion of the population has contributed to the phenomenon. I remember seeing those abbreviations before cell-phone use became almost ubiquitous.

      It's also not just little things like modifying the spelling of words, but also syntax and morphological changes.
      --
      Humorless sig goes here.
    12. Re:They should probably avoid Slashdot by Hognoxious · · Score: 1

      TFAYLT states that she has a degree in psychology. That does not make her a psychiatrist.

      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
    13. Re:They should probably avoid Slashdot by ggvaidya · · Score: 1

      Oops, my bad. Should have said psychology. Thanks.

    14. Re:They should probably avoid Slashdot by bwalling · · Score: 1

      TFAYLT states that she has a degree in psychology.

      WTFDTRLASF?



      What the f**k does that really long acronym stand for?

    15. Re:They should probably avoid Slashdot by Hognoxious · · Score: 1

      TFA is fairly standard and stands for "The F... Article". I confess that YLT is a hoggism : "You Linked To".

      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
    16. Re:They should probably avoid Slashdot by Anonymous Coward · · Score: 0

      I saw these abbreviations on student papers six years ago, before the advent of the cell phone with text messaging as an affordable umbilical cord to one's f4l and bf/gf. IM started the metaphorical ball rolling; SMS just gave it another push on its way down, to the dismay of those who like Sisyphus try to tell children that u c4n't rit lik dat n c00l3g3 or 4t w0rk n stuph.

      I agree with the grandparent on typing speed: I learned to type on a TI-99/4a programming BASIC. Children nowadays, however, don't type for extended periods of time, as instant messages last only a line or two. They don't naturally develop touch typing skills as a defense against fatigue or out of sheer time spent typing as we did. Perhaps those that learn to code will type faster and then realize that it takes very little time to write in English rather than in 1337.

      They also are addicted to the mouse, eschewing keyboard sequences and shortcuts that require hitting keys far apart or simultaneously.

    17. Re:They should probably avoid Slashdot by Sj0 · · Score: 1

      omg lol wtf r u tlaking about!!!!111 slshadot englsh is gret i dont know hwo you guys cud nock it!!!!11

      Yeah...

      Linux is a community made up of mostly literate folks who generally understand the language well enough to be understood. Zealots may complain about the use of apostrophe 's' as a possessive, but you can easily do worse than Slashdot in terms of grammar.

      That said, only pedants can claim perfect spelling and grammar at all times, ever.

      --
      It's been a long time.
    18. Re:They should probably avoid Slashdot by gibson042 · · Score: 0

      You forgot Poland.

    19. Re:They should probably avoid Slashdot by Anonymous Coward · · Score: 0
      It's also not just little things like modifying the spelling of words, but also syntax and morphological changes.

      True dat.

    20. Re:They should probably avoid Slashdot by Anonymous Coward · · Score: 0

      I saw these abbreviations on student papers six years ago, before the advent of the cell phone with text messaging

      Hello Mr. American. The rest of the world is so far ahead of you you don't even know. Everybody outside the wonderful U.S. of A. had text messaging about a decade ago.

    21. Re:They should probably avoid Slashdot by Anonymous Coward · · Score: 0

      That said, only pedants can claim perfect spelling and grammar at all times, ever.

      My favorite is when a pedant tries to rip someone a new one for some minor grammatical infraction, and then totally makes their own glaring error. It's the funny.

    22. Re:They should probably avoid Slashdot by Effugas · · Score: 1

      Ebonics was never proposed as a valid dialect for use in the classroom. If you've got a bunch of students that can only speak French, and you want to teach them English -- it helps to know enough French to understand what they're trying to say.

      Same concept. Whole thing got hijacked by politics.

      --Dan

  2. Indeed by Pan+T.+Hose · · Score: 4, Funny

    Indeed what their sayin is true. U can learn English very well, especially grammer readin /. frist psots. Teh intarweb seems to certainly kick arse for that sorta research. Very 1337 articel. Thx d00dz.

    --
    Sincerely,
    Pan Tarhei Hosé, PhD.
    "Homo sum et cogito ergo odi profanum vulgus et libido."
    1. Re:Indeed by Anonymous Coward · · Score: 0

      If you wnat to raed an exmaple of "goolge lingiustics" on this spleling isseu, you can raed the sutdy I potsed a few dasy ago (in Frecnh) :

      http://aixtal.blogspot.com/2005/01/lexique-omnubils-par-linfractus.html

    2. Re:Indeed by Anonymous Coward · · Score: 0

      Slashdot is the infinite source of correct English, not... And I won't fuck the fucking fuck off. Their there they're, now all together, repeat after me...

  3. I rue the day... by sandstorming · · Score: 3, Funny

    When we might actually say words like 'lol' out aloud. Imagine a deal going down between two mining companies and the CEO of one company with a straight face, and deadly serious demeanour saying to the cameras: "Despite many thinking we pwned them in the deal, we believe it came out leet for every1"

    1. Re:I rue the day... by Peter+Cooper · · Score: 3, Interesting

      When we might actually say words like 'lol' out aloud.

      I've heard it done. I've also heard 'roffle' (an attempt at pronouncing ROTFL I guess). Bizarre, really, since those terms are attempts to turn physical real-life actions into a verbal-only form.

    2. Re:I rue the day... by dapyx · · Score: 1

      I heard 'lol' actually being said a few times. :-)

      --
      I'm sorry, the number you have dialed is an imaginary number. Please rotate your phone 90 degrees and dial again.
    3. Re:I rue the day... by Rie+Beam · · Score: 1

      I guess I'm sorta immune to that stuff, then - maybe I'm the only one, but when I see something like "lol" or "rofl", it translates in my head to more of an idea than an actual sound - in essence, a loud "heh" rather than its own word.

    4. Re:I rue the day... by initsix · · Score: 1

      Yes that day is here, here is one (lame) example.
      http://home.planet.nl/~cruij087/vin3.mp3

    5. Re:I rue the day... by Moderatbastard · · Score: 0
      I've heard it done. I've also heard 'roffle' (an attempt at pronouncing ROTFL I guess). Bizarre, really, since those terms are attempts to turn physical real-life actions into a verbal-only form.
      Huh? It's an acronym - Rolling On The Floor Laughing. First letters of the words. Nothing to do with representing physical actions, just words.
      --
      1/3 of jokes get modded OT. If you get the joke, mod 1 in 3 insightful/interesting/underrated to restore karma balance.
    6. Re:I rue the day... by JustKidding · · Score: 2, Informative

      You may be unaware that "lol" actually is a correct word in the dutch language, meaning (having) fun.

      lol (de ~) 1 [inf.] plezier
      (taken from, www.vandale.nl, an authoritive dutch dictionary)

    7. Re:I rue the day... by UserGoogol · · Score: 1

      Ell-oh-ell or lawl?

      --
      "Never attribute to malice that which can be adequately explained by stupidity." -- Hanlon's Razor
    8. Re:I rue the day... by Anonymous Coward · · Score: 0

      Huh? It's an acronym - Rolling On The Floor Laughing. First letters of the words. Nothing to do with representing physical actions, just words.

      "Rolling on the floor laughing" is a physical action. As is "laughing out loud". If you're talking to me face-to-face you shouldn't be saying "LOL", you should just be laughing.

    9. Re:I rue the day... by Moderatbastard · · Score: 0
      "Rolling on the floor laughing" is a physical action.
      Yes it is, Mr State-the-Obvious, but ROTFL relates to the action itself how? Oh that's right, it doesn't. It relates to the arbitrary linguistic units commonly used to describe said actions - specifically, in modern English. If it related directly to the action it would be the same in other languages e.g. German, not "KADBL" if memory serves well.

      If you're talking to me face-to-face you shouldn't be saying "LOL", you should just be laughing.
      And I probably would be.

      In contrast, this :-) directly represents a physical action. No word is used as an intermediary. A Chinese baby sees the same meaning as a Harvard professor. Understand the difference now?

      --
      1/3 of jokes get modded OT. If you get the joke, mod 1 in 3 insightful/interesting/underrated to restore karma balance.
    10. Re:I rue the day... by Peter+Cooper · · Score: 1

      but ROTFL relates to the action itself how? Oh that's right, it doesn't. It relates to the arbitrary linguistic units commonly used to describe said actions - specifically, in modern English.

      Good point. I'd disagree with your comment that it 'describes said actions', though. I'd say that Rolling On The Floor Laughing is more an idiom, since it's very rare anyone ever actually rolls on the floor, and isn't really describing that process ever happening. Anyway, I see the difference you're trying to pick out.

    11. Re:I rue the day... by Hognoxious · · Score: 1
      I'd say that Rolling On The Floor Laughing is more an idiom, since it's very rare anyone ever actually rolls on the floor, and isn't really describing that process ever happening.
      My brother used to do it as a kid. A bizarre side effect was that he became immune to pain during his laughing fit - I could boot the shite out of him and he just didn't feel it (which made it fairly pointless to do it). I'm certain he's not wired up properly, though.
      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
    12. Re:I rue the day... by Anonymous Coward · · Score: 0

      That's "authoritative", kaaskop.

    13. Re:I rue the day... by moonbender · · Score: 1

      The latter. I do it all the time. It's fairly embarrasing. :)

      --
      Switch back to Slashdot's D1 system.
    14. Re:I rue the day... by meringuoid · · Score: 1

      I've said 'lol' occasionally. Not so often, though, since most of the time I'll just laugh; 'lol' is redundant in meatspace. Much more often, however, I'll say 'brb'...

      --
      Real Daleks don't climb stairs - they level the building.
    15. Re:I rue the day... by koko775 · · Score: 1

      Roffle is an attempt at pronouncing ROFL, not ROTFL. The t isn't capitalized or considered part of the acronym in ROFL. Old people.

    16. Re:I rue the day... by Fear+the+Clam · · Score: 1

      when I see something like "lol" or "rofl", it translates in my head to more of an idea than an actual sound

      Unfortunately, the idea conveyed is "I'm a fucking idiot."

    17. Re:I rue the day... by Anonymous Coward · · Score: 0

      And besides... ROFLMAO (roffle-mau) has such a... I don't know. Some sort of Je ne s'ais quois.

    18. Re:I rue the day... by dapyx · · Score: 1

      "lawl".

      --
      I'm sorry, the number you have dialed is an imaginary number. Please rotate your phone 90 degrees and dial again.
  4. inner city teens by Anonymous Coward · · Score: 1

    more than just web users are adjusting to this shift in language. i countinously question my co-workers (social workers) in telling the youth what is propper and not. if a launguage does not evolve then it dies. using words, moslty slang and rap song lyrics, is becoming more than just the normal, it is becoming the standard.

    1. Re:inner city teens by Kafir · · Score: 3, Insightful

      i countinously question my co-workers (social workers) in telling the youth what is propper and not.

      I'm glad they're telling the youth what is proper; you're clearly incompetent to do so.

      using words... is becoming more than just the normal, it is becoming the standard.

      Is that right? Using words is "becoming more than just the normal"? I've been using words for years now; I'm glad to hear that's becoming the standard. Your post is a perfect example of why people should learn to write in something approaching standard English. Your meaning is barely intelligible, and you sound like an idiot.

    2. Re:inner city teens by Anonymous Coward · · Score: 0

      if a launguage does not evolve then it dies.

      That makes no sense. If a language stops being used then it dies. It doesn't have to "evolve".

      Oh, and whether you like it or not, there are loads of people who noticed your atrocious English (it sticks out like a sore thumb) and automatically decided you were an idiot before they'd even considered your point. I think it's irresponsible to let kids suffer the same fate just because you don't value the ability to express yourself clearly.

      I couldn't even decipher what you meant by "using words" because your English skills are that bad!

    3. Re:inner city teens by brpr · · Score: 1

      Is that right? Using words is "becoming more than just the normal"? I've been using words for years now; I'm glad to hear that's becoming the standard.

      He said "using words, moslty slang and rap song lyrics", you dolt. You can make anyone look stupid by eliding half of what they say and refusing to countenance anything but a strict literal interpretation of what's left.

      Your post is a perfect example of why people should learn to write in something approaching standard English. Your meaning is barely intelligible, and you sound like an idiot.

      His meaning is perfectly intelligible, but some language snobs (very few of whom are actually linguists and know anything much about language) pretend not to be able to understand certain accent/dialects in order to feel superior.

      --
      Freedom is not increased by mere diminuation of government. Anarchy is freedom for the strong and slavery for the weak.
    4. Re:inner city teens by moonbender · · Score: 1

      That makes no sense. If a language stops being used then it dies. It doesn't have to "evolve".

      It's an observation. If a language has stopped changing, that's typically due to the fact that nobody uses it anymore and it is in a phase of decline.

      --
      Switch back to Slashdot's D1 system.
    5. Re:inner city teens by Anonymous Coward · · Score: 0

      Yes, but he used that observation to claim the causation goes the other way - that the language dies because it doesn't evolve, instead of the much more rational theory you presented.

    6. Re:inner city teens by Snowsphere · · Score: 1

      trollt

    7. Re:inner city teens by The+FooMiester · · Score: 1

      There's a difference between accent and dialect, first of all, and I don't think you understand them.

      Accent is a drift in the pronunciation.

      Dialect tends to bend the meanings of words.

      Dialect cannot be considered proper english due to the fact that it isn't widespread(usu. local) and confusion is likely to exist due to conflicting meanings. If I'd told you to "Rush the growler" would you have any idea what I was talking about?

      Would you

      Pet the dog?
      Fill the pail with beer?
      Get a sandwich?

      Yes, the answer is fully accessible on Google. But do you have Google at your fingertips at every conversation? Would you even think to look at Google if someone had posed that to you in an email?

      Also note that dialect is first a regional thing, secondly a class thing. It isn't an all-encompassing concept; not everything that doesn't fall into standard english is a dialect, there's also the concepts of "slang" and "bad".

      Why do I get the feeling that you're speaking of people who refuse to understand "bad" english and take it as what it means rather than what you think it means (the phrase "don't need no" comes to mind), rather than what the OP was saying.

      BTW, I found the OP's message a bit cumbersome. I think I grokked it after reading it several times.

      --
      The previous has been a secret message to my comrades.
    8. Re:inner city teens by chialea · · Score: 2, Interesting

      >His meaning is perfectly intelligible, but some language snobs (very few of whom are actually linguists and know anything much about language) pretend not to be able to understand certain accent/dialects in order to feel superior.

      Incomprehension often has very little to do with that. A friend of mine moved to MA from NC at the same time as I moved from CA. She could not understand most people there, most people there could not understand her. I could, on the other hand, understand both of them. I've been at at least one conference in which two non-native speakers of English could not understand each other at all, and required a native speaker to translate.

      There are simply certain grammatical patterns that I don't understand well, if at all. It has nothing to do with snobbery; I simply can't understand, most likely because I haven't been exposed to it all that much.

      When using media of international exchange, I would certainly try to make myself comprehensible. I spend quite a lot of time trying to do this in my research papers and communication. Writing in unambiguous, grammatically correct English (or something approaching it) is the first step towards sharing ideas with a wide audience. People limit their communication and opportunities by the language they use.

      Lea

    9. Re:inner city teens by siphi · · Score: 0

      People limit their communication and opportunities by the language they use
      No they don't. I would find it the opposite: other people (who don't know certain slang) are limited by not understanding those who speak it. This is just my point of view. I for one like learning new slang. You're just stating that people are either unwilling or incapable of learning.

      --
      Sig (appended to the end of comments you post, 120 chars)
    10. Re:inner city teens by Anonymous Coward · · Score: 0

      That's just idiotic. It's akin to saying that a C compiler sucks because it doesn't understand every non standard GCC extension, or vice versa.

      If you want to be understood by a wide range of people, then it's up to you to ensure that the language you use is comprehensible, not up to your intended readers to first learn your idiom.
      You should write for your audience, or no one will pay any attention to what you say.

      There's nothing wrong with learning new slang and obtaining a greater understanding of how people in different regions use the language, but you should take care to ensure that your use of non standard English is appropriate.

    11. Re:inner city teens by brpr · · Score: 1

      There's a difference between accent and dialect, first of all, and I don't think you understand them.

      Yes I do. I'm doing a degree in linguistics FWIW.

      Dialect cannot be considered proper english due to the fact that it isn't widespread(usu. local) and confusion is likely to exist due to conflicting meanings.

      It is of course non-standard (insofar as there is a definition of standard English), but there's no reason to use terms like "improper" or "incorrect". Other dialects are not in any way inferior.

      Also note that dialect is first a regional thing, secondly a class thing. It isn't an all-encompassing concept; not everything that doesn't fall into standard english is a dialect, there's also the concepts of "slang" and "bad".

      These are political issues. Your distinction between a dialect and "bad" language has no linguistic basis. It's just snobbery.

      Why do I get the feeling that you're speaking of people who refuse to understand "bad" english and take it as what it means rather than what you think it means (the phrase "don't need no" comes to mind), rather than what the OP was saying.

      Yes, people who think that double negatives are in some way illogical are very irritating.

      BTW, I found the OP's message a bit cumbersome. I think I grokked it after reading it several times.

      Apart from his spelling and punctuation, he was using pretty much standard English.

      My point is that there are of course good reasons to speak standard English sometimes, but this doesn't mean it's OK to sneer at people who speak other varieties for being stupid or uneducated or whatever. This kind of prejudice shouldn't be more acceptable than any other kind. The standard response of, "well that's just how the world works, if you don't speak standard English you must be an idiot" just serves to uphold a system of linguistic snobbery which is completely unjustifiable. We could teach schoolchildren standard English without making them feel inferior for speaking their own variety.

      --
      Freedom is not increased by mere diminuation of government. Anarchy is freedom for the strong and slavery for the weak.
    12. Re:inner city teens by brpr · · Score: 1

      There are simply certain grammatical patterns that I don't understand well, if at all. It has nothing to do with snobbery; I simply can't understand, most likely because I haven't been exposed to it all that much.

      Sure.

      When using media of international exchange, I would certainly try to make myself comprehensible. I spend quite a lot of time trying to do this in my research papers and communication. Writing in unambiguous, grammatically correct English (or something approaching it) is the first step towards sharing ideas with a wide audience.

      Here's the problem. Standard English, though it's a useful standard, is not more "correct" or "grammatical" than any other variety. I've no problem with people learning standard English, but what bugs me is the pomposity some people display towards those who don't. Speaking a non-standard dialect might sometimes make you hard to understand (though not very often in my experience, especially in writing) but it doesn't make you stupid or uneducated or incorrect or a sign of slipping standards, etc. etc. Standard English should be treated as a standard in the same sense that certain sizes of washer are standard. It's an arbitrary convention that can sometimes aid collaboration, but it's not "correct", just standard.

      People limit their communication and opportunities by the language they use.

      They also limit their communication opportunities by the language they choose to listen to.

      --
      Freedom is not increased by mere diminuation of government. Anarchy is freedom for the strong and slavery for the weak.
    13. Re:inner city teens by The+FooMiester · · Score: 1

      Capital response, indeed. I commend you for sticking to the issue at hand and forming an intelligent reply. Now I'll reply to the message completely out of order.

      Yes, people who think that double negatives are in some way illogical are very irritating.

      I don't think they're not logical in the least! Double negatives are not always an ungood thing. People just need to learn to not use them when they don't mean to.

      good reasons to speak standard English sometimes, but this doesn't mean it's OK to sneer at people who speak other varieties for being stupid or uneducated or whatever. This kind of prejudice shouldn't be more acceptable than any other kind.

      I think "prejudice" is a word that is shifting in meaning and leaving a hole in the language as it does. How else does one describe their default feelings towards something? Prejudice is not always a bad thing.

      I reserve the right to be prejudiced against anyone who has to repeat their point 3 times for me to understand it. I further reserve the right to be prejudiced towards people with pleasant attitudes.


      The standard response of, "well that's just how the world works, if you don't speak standard English you must be an idiot"


      That's not the case, but I think some people harbor inner feelings like that (many the same people who think that "prejudice" is a bad word). If you don't speak standard English, don't be upset when you have to rephrase things. Language does evolve, and what's nonstandard today becomes tomorrow's standard.

      People usually have plenty of other ways to prove their stupidity to me. Some of them even speak perfect English!

      On the topic of Dialect


      It is of course non-standard (insofar as there is a definition of standard English), but there's no reason to use terms like "improper" or "incorrect". Other dialects are not in any way inferior.


      I never said the word "improper" as far as dialect was concerned. I merely stated "Dialect cannot be considered proper english". Perhaps I should have said "Dialect cannot be considered Proper English". I also did not favor one dialect over another, hence the word "other" in your reply leaves some question, as in "other than which".

      On the difference between dialect, slang, and bad English

      These are political issues. Your distinction between a dialect and "bad" language has no linguistic basis. It's just snobbery.

      Perhaps if the world were perfect your statement would be correct. I just want to know what dialect "You want no come in?" would fall under. I think that people, even linguists, recognize that some people fail to speak the language with any sort of pattern that can be seen in others, or even charted at all. I would consider such English bad, or broken.

      Slang has always been, and always will be, the common tongue of the lower class. That is its definition. Once a word is picked up by the upper class it is no longer slang. Again, the language evolves. Note that "lower class" does not always mean poor. Many criminals are lower class yet have quite a bit of money. Class is in how one presents oneself.


      . . . serves to uphold a system of linguistic snobbery which is completely unjustifiable. We could teach schoolchildren standard English without making them feel inferior for speaking their own variety.


      When I'm up in a crane and radioing the operator, I want to be able to communicate effectively with him or her with ease. When I'm explaining an electrical circuit to someone, or where holes need to be dug or concrete needs to be poured, I don't want there to be a barrier of "not being made inferior" between us. You cannot communicate with people whose language you do not speak. Language is more than an identity of who someone is. Language is a tool that betters humanity. Because of that, children need to be taught Standard English.

      --
      The previous has been a secret message to my comrades.
    14. Re:inner city teens by brpr · · Score: 1

      I don't think they're not logical in the least! Double negatives are not always an ungood thing. People just need to learn to not use them when they don't mean to.

      What I meant is the following. It's just not the case that "I didn't see nothing" is somehow less logical than "I didn't see anything". no/any are both just elements which agree with the negative auxiliary. Some dialects use any, some dialects use no. There is no question of logic involved; it's an arbitrary choice of functional vocabulary. In many nonstandard dialects, bound variables within a negative scope are required to agree with the negative element. Standard English is in fact somewhat unusual for not having this feature. If, as your example weakly suggests, you think these constructions are somehow parallel to sentences such as "John is not unhappy", you need to take an introductory syntax course. "I didn't see [any/no]thing" involves quantification, whereas "John is not unhappy" does not; the distinction is crucial.

      I think "prejudice" is a word that is shifting in meaning and leaving a hole in the language as it does. How else does one describe their default feelings towards something? Prejudice is not always a bad thing.

      I don't really get what you're talking about here. I used the word prejudice to mean an unjustified opinion, which is a pretty common usage. Of course one cannot avoid having "default feelings", but there is nothing wrong with subjecting these feelings to examination. When people resist doing this, they are holding prejudices in the worst sense.

      That's not the case, but I think some people harbor inner feelings like that (many of the same people who think that "prejudice" is a bad word). If you don't speak standard English, don't be upset when you have to rephrase things. Language does evolve, and what's nonstandard today becomes tomorrow's standard.

      I don't think anybody's upset about having to rephrase things. The point is that there is nothing wrong with speaking a nonstandard dialect, and people shouldn't be criticised for it, or told that their dialect is somehow bad or wrong.

      I never said the word "improper" as far as dialect was concerned. I merely stated "Dialect cannot be considered proper english". Perhaps I should have said "Dialect cannot be considered Proper English".

      I presume that's meant to be a subtle and clever distinction of some kind. From someone who is so keen for people to be careful about the language they speak, I would suggest that if you don't think dialects are improper English, you shouldn't say that they're not proper English! Who says standard English aids clarity of communication...

      I also did not favor one dialect over another, hence the word "other" in your reply leaves some question, as in "other than which".

      Nonsense. It's obvious that I meant "other than Standard English". You only need to read the sentence prior to that containing "other dialects" in order to see this. I assume that you're trying to be pedantic, but in fact you're just being silly.

      Perhaps if the world were perfect your statement would be correct. I just want to know what dialect "You want no come in?" would fall under. I think that people, even linguists, recognize that some people fail to speak the language with any sort of pattern that can be seen in others, or even charted at all. I would consider such English bad, or broken.

      No, linguists do not think that (I am one so I should know). Apart from people who have genuine mental disabilities, everyone's language is intricately patterned. You may not be able to

      --
      Freedom is not increased by mere diminuation of government. Anarchy is freedom for the strong and slavery for the weak.
  5. Epiphany by phaln · · Score: 2, Funny

    It came to me that the English language was in deep trouble when people started saying "rotfl" and "lol" in person. There seems to be kind of a backlash brewing though, with improved email composition styles dictated by employers, and such.

    --
    SNACKS ARE AWESOME
    1. Re:Epiphany by Anonymous Coward · · Score: 0

      You've been quoted in the linked LanguageLog site.

    2. Re:Epiphany by nametaken · · Score: 1


      Oh I hope you're right. If I ever hear someone actually vocalize "lol" or "rofl", I'll punch them in the face.

  6. Google does it again by vladd_rom · · Score: 3, Interesting

    This is not the first time Google (and search engines in general) has changed how we do things.

    Nowadays copyrighters use Google to search for potential violations of their intellectual property. Plagiarism is easy to detect nowadays thanks to Google as well. Instead of using rather expensive systems in order to search for duplicate work, teachers are now one search away from distinguishing original work from the rest.

    1. Re:Google does it again by BReflection · · Score: 1

      While Google made it easy to search for copyright violations, Google also made it a helluva' lot easier to violate copyright.

      --
      python -c "x='python -c %sx=%s; print x%%(chr(34),repr(x),chr(34))%s'; print x%(chr(34),repr(x),chr(34))"
    2. Re:Google does it again by Anonymous Coward · · Score: 0

      Nowadays copyrighters use Google to search for potential violations of their intellectual property.

      Forgive me if you were being ironic, but did you actually mean copyrighter? Or did you mean copy writer? I have never seen "copyrighter" used as a synonym for "copyright holder", and your sentence makes sense using "copy writer", but I get the feeling you actually meant "copyright holder".

    3. Re:Google does it again by Anonymous Coward · · Score: 0

      Do you mean copyrights, trademarks or patents? If you do then say so, don't hide behind the "IP".

  7. Hey by Anonymous Coward · · Score: 0

    Does He far from succeeded, sound totally fuckin retarded to anybody else?? Like something an idiot would say to try and seem intelligent?

    1. Re:Hey by muzzmac · · Score: 1

      Something like pluralising virus to Virii?

    2. Re:Hey by l3v1 · · Score: 1

      "It does ? ... It does." [Partridge@Equilibrium]

      So, yes, it does seem a fracking uselessly mistyped/misknown/miswrote/misthought way to express oneself. But that is changed now, because some people with too much time on their hands think it is a new form of expression and this is the way the English language is changing. So now we are supposed to treat these insentient ideas as the new ways ? Bahh, get lost.

      --
      I am putting myself to the fullest possible use, which is all I can think that any conscious entity can ever hope to do.
    3. Re:Hey by Anonymous Coward · · Score: 0

      Yup! "He was far from success.", or possibly "He was far from succeeding", sounds much better to me.

  8. Interesting by Anonymous Coward · · Score: 0

    This begs the question of how much "incorrect" use of a phrase is necessary for it to become the "correct" use of a phrase.

    NB: I'm being ironic.

  9. *BSD be dyin' by Anonymous Coward · · Score: 2, Funny
    It be now official. Netcraft gots confirmed, dig dis: *BSD be dyin'

    One mo'e cripplin' bombshell hit da damn already beleaguered *BSD community when IDC confirmed dat *BSD market share gots dropped yet again, now waaay down t'less dan some fracshun uh 1 puh'cent uh all servers. Comin' on de heels uh a recent Netcraft survey which plainly states dat *BSD gots lost mo'e market share, dis news serves t'reinfo'ce whut we've knode all along. What it is, Mama! *BSD is collapsin' in complete disarray, as fittin'ly 'esemplified by failin' wasted last in de recent Sys Admin comprehensive netwo'kin' test. Man!

    You's duzn't need t'be de Amazing Kreskin t'predict *BSD's future. De hand writin' be on de wall, dig dis: *BSD faces a bleak future. In fact dere won't be any future at all fo' *BSD a'cuz *BSD be dyin'. Doodads is lookin' real baaaad fo' *BSD. As many of us is already aware, *BSD continues t'lose market share. Red ink flows likes some riva' of blood.

    FreeBSD be de most endangered uh dem all, havin' lost 93% uh its co'e developuh's. De sudden and unpleasant departures uh long time FreeBSD developuh's Jo'dan Hubbard and Mike Smid only serve t'undersco'e da damn point mo'e clearly. Slap mah fro! Dere kin no longa' be any doubt, dig dis: FreeBSD be dyin'.

    Let's keep t'de facts and look at da damn numbers.

    OpenBSD leada' Deo states dat dere are 7000 users uh OpenBSD. How many users uh NetBSD is dere? Let's see. De numba' of OpenBSD versus NetBSD posts on Usenet be roughly in ratio uh 5 to 1. Derefo'e dere is about 7000/5 = 1400 NetBSD users. BSD/OS posts on Usenet is about half uh de volume uh NetBSD posts. Derefo'e dere are about 700 users uh BSD/OS. A recent article put FreeBSD at about 80 puh'cent uh de *BSD market. Man! Derefo'e dere is (7000+1400+700)*4 = 36400 FreeBSD users. Dis be consistent wid de numba' of FreeBSD Usenet posts.

    Due t'de troubles uh Walnut Creek, abysmal sales and so's on, FreeBSD went out uh business and wuz snatchn upside by BSDI who sell anoda' troubled OS. Now BSDI be also wasted, its co'pse turned ova' to yet anoda' charnel crib.

    All majo' surveys show dat *BSD gots steadily declined in market share. *BSD be very sick and its long term survival prospects is very dim. WORD! If *BSD be to survive at all it gots'ta be among OS dilettante dabblers. *BSD continues t'decay. Slap mah fro! Nodin' sho't uh a miracle could save it at dis point in time. Fo' all practical purposes, *BSD be wasted.

    Fact, dig dis: *BSD be dyin'

    1. Re: *BSD be dyin' by Anonymous Coward · · Score: 0

      It's easy to tell the difference between someone who knows ebonics or has used it in real life and someone who has read +1 funny websites about it.

  10. HAMMER REVOLUTION --; by clubhouse · · Score: 1
    1. Re:HAMMER REVOLUTION --; by courseB · · Score: 1

      person 1: like
      person 2: like
      person 1: --;
      person 2: yea

    2. Re:HAMMER REVOLUTION --; by c0dedude · · Score: 1

      Oh christ, not you whackjobs again. You've infested our forum. Sure, --; is a neat emoticon, but when could one ever use it? On a separate note, anyone cringe when reading "He far from succeeded."? On a completely separate note, anyone notice how programmers write with slightly different grammar? Extra punctuation always goes outside the ", never inside, as above.

      --
      Since when has this country used intellectual elite as a pejorative term?
    3. Re:HAMMER REVOLUTION --; by Hognoxious · · Score: 1
      On a completely separate note, anyone notice how programmers write with slightly different grammar? Extra punctuation always goes outside the ", never inside, as above.
      Are you referring to "logical quoting"?.
      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
    4. Re:HAMMER REVOLUTION --; by Daengbo · · Score: 1

      This is a major difference between American English and British English. American English tends to put the punctuation inside the quotes always, but Brits will put it inside if it's related to the quoted material, and outside if it's not. The subject is always hotly debated, though.
      I never look to "The Programmers' English Corpus" for style points...

  11. Be carefull thought... by Anonymous Coward · · Score: 3, Interesting

    There are more non native speakers on the web then
    native speakers.
    In the European community the native English
    speaking persons are by far a minority. That way
    French expressions are poring into the language
    in an unstoppable way. Those expressions are then
    used by native speaking politicians and are
    broadcasted by television. That way they enter the
    mainstream of the English language.

    Regards

    1. Re:Be carefull thought... by Anonymous Coward · · Score: 1, Insightful
      There are more non native speakers on the web then
      native speakers.

      Of course, non-native speakers have generally less trouble distinguishing "then" from "than" than the so-called "native" speakers do. You might speak it natively, but remember, you don't write it natively.

    2. Re:Be carefull thought... by Anonymous Coward · · Score: 1, Funny

      Ahhh run for the hills the French are coming!!!

    3. Re:Be carefull thought... by Anonymous Coward · · Score: 1, Funny

      Nobody runs away from the French. Not even the Eyeties.

    4. Re:Be carefull thought... by Spy+Hunter · · Score: 1
      Who needs to be careful? Hopefully the Internet *will* cause languages to merge. It could be like the Tower of Babel in reverse. Wouldn't it be great if there was a unified global language?

      Now I know some people would be quite upset at the horrible "loss" of cultural diversity implied by a single global language. But we can be just as diverse in many other ways that don't cause us to be unable to communicate with each other on a basic level. And IMHO, being able to communicate is much more important than some academic's ideal of "cultural identity".

      --
      main(c,r){for(r=32;r;) printf(++c>31?c=!r--,"\n":c<r?" ":~c&r?" `":" #");}
    5. Re:Be carefull thought... by Haeleth · · Score: 1

      Wouldn't it be great if there was a unified global language?
      Now I know some people would be quite upset at the horrible "loss" of cultural diversity implied by a single global language. But we can be just as diverse in many other ways that don't cause us to be unable to communicate with each other on a basic level. And IMHO, being able to communicate is much more important than some academic's ideal of "cultural identity".


      Okay... how about the complete loss of the ability to read any of the world's literature without special training? It's bad enough at the moment, when most people can read only the literature in their native language. If the current languages were no longer spoken natively by anyone, the vast majority of people would no longer know any great literature except through the lossy process of translation. We're not talking about losing cultural diversity. We're talking about losing culture itself!

      Not to mention that there is nothing academic about the link between cultural identity and language. Bloody wars have been fought over it. The recognition of a minority's language is often one of their deepest desires, and the suppression of a minority language is a common tool of oppression - see Welsh and Gaelic in English-occupied Wales and Ireland, Catalan and Basque in Spain, Chinese and Korean under the Japanese occupations, Kurdish in Turkey... the list goes on. If linguistic diversity is something that only academics care about, why do ordinary people all over the world get so upset about it?

      Finally, what's so great about a world language, anyway? I don't suffer at all in my daily life from the inability to chat with Chinese or Spaniards; I did feel the need to be able to communicate with the French and the Japanese, so I did them the simple courtesy of learning their languages. Those who need to communicate in more languages than they can learn are generally politicians or businessmen who can afford interpreters.

    6. Re:Be carefull thought... by Spy+Hunter · · Score: 2, Insightful
      You're overdramatizing. This is a process that will take hundreds if not thousands of years, even with technology helping to accelerate it. It's not like we'll wake up 10 years from now with a unified language and forget how to read today's literature!

      By the time we have a unified language, we'll have a whole new set of literature to go along with it. Today's literature will be like Ancient Greek literature, and yes, it will only be readable by people with special training. It will need to be translated, just like Ancient Greek is today. What's the big deal? The biggest difference is that only one translation would be needed, and therefore all the translation work could be focused on that.

      Furthermore, nobody will be forced to adopt a unified language. It will simply evolve. Words will travel from one language to another. Phrases will creep in from other languages. Languages will become closer, and eventually merge. You can see it happening today; at least the beginnings. It will only continue even faster, as the Internet is here to stay and the growth of the global marketplace shows no signs of slowing.

      Academics care about linguistic diversity in an abstract sense, but normal people really don't. People care about it, but in a much more practical sense of everyday communication. People will accept gradual, evolutionary changes to their language, as long as they can express themselves in a way they like. Academics often fight against change, because their theories were all developed to explain the old ways of doing things. They will fight against language unification; luckily I believe they will not be able to prevent it, or even slow it very much. [Note: this is a gross generalization about "academics", please remember that all generalizations are false.]

      You ask what's so great about a global language? The removal of all language barriers from everything! Duh!

      Maybe you don't personally notice any language barriers right now, but that doesn't mean you couldn't benefit from their removal. Maybe there are some really cool people in China right now doing brilliant work in your field that you just don't know about because it's all in Chinese. Maybe you would benefit from the increased efficiency of a global economy without language barriers. I think it's an indisputable fact that removing language barriers is a great thing.

      --
      main(c,r){for(r=32;r;) printf(++c>31?c=!r--,"\n":c<r?" ":~c&r?" `":" #");}
    7. Re:Be carefull thought... by violajack · · Score: 1

      It's a nice idea and all, but I just don't think it would really happen. We don't learn our language from the internet; it just influences slang among the young-uns. Language is learned in infancy by listening to adults, so the only real way to get a global language is to change the way all of the adults talk to their babies and then wait for the babies to grow up.

      Even if we could get that to happen, it wouldn't be long before dialects cropped up and veered away from one another. I'll never forget the time I was in Italy (as a scared little American trying to live there for a month as part of a festival) and I needed some help figuring out the bus system in Pisa. Imagine my non-Italian-speaking relief when I heard a group chatting away in English. I was going to ask them for help, when suddenly, it all turned into unintelligible babble. I did an aural double-take. I listened carefully to see if my ears had been playing tricks on me, and I started picking familiar words out of a VERY British accent. It took a lot of thought to understand what they were saying, and we were supposedly speaking the same language. All it takes is an ocean and a few generations, and all that same-language-speaking-goodness goes out the window.

    8. Re:Be carefull thought... by rxmd · · Score: 1

      Such as carefull, poring, broadcasted? ;)

      --
      As a state gets corrupt, its laws multiply; the most corrupt states have the most numerous laws. (Tacitus, Annales 3:27)
    9. Re:Be carefull thought... by monecky · · Score: 2, Interesting

      > Academics care about linguistic diversity in an abstract sense, but normal people really don't.

      I think you're a bit wrong on this. There are around 6,800 languages. Most languages have developed their own culture. Do you really think millions of people around the globe would be willing to lose their identity?

      For example, after the collapse of the Soviet Union, Uzbeks started replacing Russian loan-words with the original Uzbek words.

      Paul Rodrigues

      --
      http://jones.ling.indiana.edu/~prrodrig
    10. Re:Be carefull thought... by rob_squared · · Score: 2, Funny

      Don't worry, according to the French we're doing far greater damage to their language and culture.

      --
      I don't get it.
    11. Re:Be carefull thought... by dave1g · · Score: 1

      While I agree it would be a good thing to have a unified language, I don't see it happening to the fullest extent. This is because of the different alphabets used in some languages. I can easily see the European languages merging (German and Latin based), possibly including the Slavic ones but perhaps they would form another set. And there might be some Middle Eastern languages that remerge. And the Asian languages of China, Japan, Korea, Viet. etc..

      But I can't imagine those super groups ever merging
      Of course at that point we could just teach kids "The 5 languages of Earth" and either that will be fine, or given enough generations with everyone knowing how to speak all the languages semi-fluently they might start to merge these languages too.

      Still like you said this process would occur over thousands of years.

      Being a language story, I decided to actually run my post through a spell/grammar check. :-)

    12. Re:Be carefull thought... by Spy+Hunter · · Score: 1

      Those millions of people won't lose their identity, because they will die long before the transition is complete. Remember we're talking about a process that takes a *long* time. The children of those people will grow up using a language that is slightly changed, and so on.

      --
      main(c,r){for(r=32;r;) printf(++c>31?c=!r--,"\n":c<r?" ":~c&r?" `":" #");}
    13. Re:Be carefull thought... by famebait · · Score: 1

      There are more non native speakers on the web then native speakers.

      Good. Less to worry about whenever they get restless.

      That way French expressions are poring into the language in an unstoppable way.

      Ah, but when you pore into the language, the language also pores back into you.

      --
      sudo ergo sum
    14. Re:Be carefull thought... by babbage · · Score: 1

      Of course, if we all move to speaking some kind of ubermetalanguage, that implies that the languages spoken today would be lost. That would be a sad thing.

      Language is a reflection of culture, and culture, to date, is a deeply regional thing. The standard example is that the Inuit people of Alaska and Canada have dozens of words for snow; while this seems to be not entirely accurate, the general point stands that different groups have richer or poorer ways of expressing concepts based on their collective experiences. This is both a reflection of and an amplification to regional cultural distinctions.

      For an example I'm a bit more confident about, Russian has no word for fun. It's a concept that can more or less be expressed with a string of other words, just as "tsunami" can be expressed in English as a string of other words, but the term itself, and so the concept it represents, isn't directly represented by a Russian word. Funny, eh?

      Language is full of these little oddities. One of the great things about English is the tendency to gladly pick up terms from other languages when we can't express something already -- cf. "tsunami". If everyone were to speak one doubleplusgooduberlanguage, then this ability to cross-pollinate will presumably go away. That would be doubleplusungood.

      But I don't think we're in any danger of that. English may be becoming a "world language", sure, but look at all the regional variations: it can be a real stretch to assume that the Hindi inflected English in India, the Chinese inflected language in Hong Kong & Singapore, the hodgepodge of influences on dialects in the Caribbean, and the diverse variants spoken in the UK, USA, and Australia are really all the "same" language. In a lot of ways, these dialects of English are only growing farther apart, just as the French spoken in places like Haiti and Côte d'Ivoire is much different from the language in France, and the German spoken in Switzerland is much different from the Hochdeutsch in Germany.

      You could argue that the Internet may bring all these streams of language closer together, but you could just as credibly argue that it will only serve to churn the already turbulent patterns that are driving languages apart.

      Personally, my hunch is that while some kind of pidgin English may come to be widely understood around the world, the predominant trend is going to be increased, not decreased, diversity in global language patterns. And I see that as nothing but a good thing.

    15. Re:Be carefull thought... by Spy+Hunter · · Score: 1
      Firstly, the languages wouldn't be lost, they simply would be known by far fewer people. I don't see that as a sad thing, because I don't see the point in making people speak different languages when a single rich language with regional variation would be just as culturally rich and much more practically useful. Of course regional variation would endure, and it might even become more prominent than it is today in English, as you suggest. But the base language will be a common language, and people from all over will be able to understand each other without too much difficulty. That's the main advance I'm hoping for.

      The ability to pick up words from other languages is moot, because in the process of creating this meta-language (remember: thousands of years), nearly all the useful, distinct concepts from the world's popular languages will be incorporated. What is left can be covered by inventing new words, as is also done in English today. (ex: blog) This process is just as interesting as cross-pollination, if not more so.

      I see language as a means to an end (understanding), not an end in itself. I shed no tears for lost languages; only lost concepts and lost understanding. I believe a single language evolved in the way I describe could provide all the concepts and understanding from all of today's languages, with far fewer language barriers between cultures. That's my perspective.

      --
      main(c,r){for(r=32;r;) printf(++c>31?c=!r--,"\n":c<r?" ":~c&r?" `":" #");}
  12. I've used the web for corpus linguistics research by Anonymous Coward · · Score: 2, Informative

    I've used the web for corpus linguistics research. My last big project was to look at a lot of web pages with Mexican and Chilean slang Spanish, and see if there was a difference in vocabulary usage. There was a significant difference; I could, 70% of the time, tell if a given passage was Chilean or Mexican Spanish.

    I could have gotten a higher accuracy rate, but this was just a simple undergraduate project.
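
    For anyone wondering how vocabulary alone can separate two varieties, here is a rough sketch of one common approach, a naive Bayes classifier with add-one smoothing. It is not the method used in the project above, which isn't described; the training snippets, labels, and function names are made-up illustrations, and the classes are assumed to be equally likely.

```python
import math
from collections import Counter

def train(labeled_docs):
    """Count word occurrences per label from (label, text) pairs."""
    counts = {}
    for label, text in labeled_docs:
        counts.setdefault(label, Counter()).update(text.lower().split())
    return counts

def classify(counts, text):
    """Return the label whose smoothed word probabilities best fit the text."""
    vocab = set().union(*counts.values())
    best_label, best_score = None, -math.inf
    for label, words in counts.items():
        total = sum(words.values())
        # Add-one smoothing so unseen words do not zero out a label.
        score = sum(
            math.log((words[w] + 1) / (total + len(vocab)))
            for w in text.lower().split()
        )
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy, hypothetical training data; a real study would use thousands of pages.
docs = [
    ("chile", "cachai po weon la micro"),
    ("chile", "al tiro po cachai"),
    ("mexico", "órale güey qué chido"),
    ("mexico", "no manches güey el camión"),
]
model = train(docs)
print(classify(model, "cachai po"))
print(classify(model, "güey chido"))
```

    With real data, accuracy would be measured on held-out pages rather than on the training snippets.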

  13. print "$badgram{vocab}" by pinball667 · · Score: 1, Funny

    Without RTFA my fist instint is to say why post anything related to natural language on slashdot? But the truth is, as a sysadming/webmaster/anything that plugs into an outlet for a small credit union I am appalled at the way people want to write on the web. It's hard to describe, but see (for the moment) this for a crippled example (yeah, a work site published externally, FSCK'ing horrible - more where that came from). Anyhow, it seems the second people publish shit one the web they give up on grammer/puncuation etc - in the included link originally draft had every link capitolized. No bold, color or anything - fuck it, aparently it's OK to throw proper grammer to the wind if it's on the web, even if the purpose is to manage peoples retirement. ARGH.

    side note - my bad grammer/spelling is OK only because I'm a FUCKING CODER. I don't want to hear from the grammer/spelling Nazis on the text of this post.

    anyhow - slight possibility of feedback on a complelty offsubject page I'm working on, here. Break it, fuck with it whatever. Jon.

    1. Re:print "$badgram{vocab}" by Milton+Waddams · · Score: 1

      I think you and most of the people in these comments are missing the whole point of what Linguists do. They study language. Language is what people speak/write, not what they should speak/write.

      I'm studying Computational Linguistics and when I'm doing a project that needs some kind of corpus, I usually scan a few thousand words of English from Google searches and put them into a corpus.

      Computational Linguists want to write/use programs that can somehow interact with natural human language. What's the point in training said program on "unnatural" language (language that ought to be spoken but only makes up a fraction of actual written/spoken language)? It probably won't be able to interact as efficiently with natural language as a program that's trained on raw language sources.
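
      A minimal sketch of what "putting text into a corpus" can look like, assuming the page text has already been fetched: it just tokenizes raw strings and keeps a frequency table, nonstandard forms included. The sample strings and the `build_corpus` name are invented for illustration.

```python
import re
from collections import Counter

# Lowercase letters and apostrophes only; a deliberately crude tokenizer.
TOKEN = re.compile(r"[a-z']+")

def build_corpus(texts):
    """Return a word-frequency table over raw, unfiltered text samples."""
    freq = Counter()
    for text in texts:
        freq.update(TOKEN.findall(text.lower()))
    return freq

# Hypothetical samples standing in for scraped web text.
samples = ["lol u r so right", "You are so right, lol!"]
freq = build_corpus(samples)
print(freq.most_common(3))
```

      Note that "u" and "you" stay separate entries, which is the point: the table records what people actually wrote.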

    2. Re:print "$badgram{vocab}" by The_Wilschon · · Score: 1

      side note - my bad grammer/spelling is OK only because I'm a FUCKING CODER. I don't want to hear from the grammer/spelling Nazis on the text of this post.

      Actually, I may be the odd one out in this case, but I generally attribute my tendency to correct spelling mistakes immediately (rather than either correcting them later or just leaving them in) to the fact that I code. I know this may not seem to be an obvious connection, but when coding, if you misspell something (identifier, reserved word, anything) then the compiler generally barfs. So, I find my work to be much much more efficient if I correct as I go. As a consequence, when I'm using AIM or posting on Slashdot or whatever, I tend to have better spelling than most of my friends (including my gf, who is, incidentally, a linguistics major :-p).

      --
      SIGSEGV caught, terminating

      wait... not that kind of sig.
  14. Compression Prize by Baldrson · · Score: 1

    There needs to be an annual prize for the highest compression ratio using random pages from the web as the corpus. This would probably do more for real advancement of artificial intelligence than the Turing competitions.

    1. Re:Compression Prize by The+Real+Joe+Faith · · Score: 1

      eh? please explain

    2. Re:Compression Prize by Baldrson · · Score: 1

      Intelligence can be seen as the ability to take a sample of some space and generalize it to predict things about the space from which the sample was drawn. The smaller the sample and the more accurate the prediction, the greater the intelligence. This is also a short description of what a compression algorithm does.
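
      The link between prediction and compression can be illustrated with a minimal sketch. It assumes Python's standard zlib module as a stand-in compressor (not anything the proposed prize would actually use) and two made-up inputs: text with an obvious pattern compresses well because the next bytes are predictable, while random bytes barely compress at all.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Return original size divided by compressed size (higher = more compressible)."""
    return len(data) / len(zlib.compress(data, 9))


# Hypothetical sample inputs, chosen only for illustration.
predictable = b"the cat sat on the mat. " * 200   # strong, easily learned pattern
random_noise = os.urandom(len(predictable))       # no pattern to exploit

print("predictable:", round(compression_ratio(predictable), 1))
print("random:     ", round(compression_ratio(random_noise), 2))
```

      A compressor that models English better than zlib would score a higher ratio on real web pages, which is the sense in which the ratio could act as a rough score for the generalization described above.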

  15. Non-official English by Anonymous Coward · · Score: 2, Informative

    Unlike French and Italian, there is no official institution that defines 'correct' English. Essentially, the English-speaking world just 'makes it up' as it goes. Thus when I see the adverb 'really' butchered into 'real' I must try not to get annoyed. e.g. It's real hard to use your mother tongue. vs. It's really hard to use your mother tongue. Please help me here - is the misuse/non-use of 'really' something that's taught in school?

    1. Re:Non-official English by Kafir · · Score: 2, Insightful
      From Merriam-Webster Online:
      real (3, adverb): VERY (he was real cool -- H. M. McLuhan)
      usage Most handbooks consider the adverb real to be informal and more suitable to speech than writing. Our evidence shows these observations to be true in the main, but real is becoming more common in writing of an informal, conversational style. It is used as an intensifier only and is not interchangeable with really except in that use.

      I'd say you're fighting a losing battle on this one. I'm not too bothered by it, either; the English language has other words that function both as adjectives and as adverbs, despite the existence of a distinct adverb form - near dead and nearly dead are both standard, for instance.
    2. Re:Non-official English by Anonymous Coward · · Score: 0

      the English language has other words that function both as adjectives and as adverbs, despite the existence of a distinct adverb form - near dead and nearly dead are both standard, for instance.

      That's a special case: "near" was originally the comparative form of "nigh", i.e. "nigh-er", with "nearly" being a back-formation; you'd expect a word like that to have a few idiosyncrasies.

    3. Re:Non-official English by Anonymous Coward · · Score: 0

      real-really-reallest, didn't you learn that in school?
      /funny

    4. Re:Non-official English by JoeBuck · · Score: 1

      These things can change over time. After all, in German there is in most cases no distinction between adverbs and adjectives, no "ly" suffix (adjectives get suffixes to agree with the gender and case of the nouns that they modify, but in some forms there is no suffix). It is possible that "ly" could disappear over time.

  16. Language Lives! by theguywhosaid · · Score: 1

    Good! Natural language is a moving target. The web is an excellent communication medium and ignoring it would be quite a silly move. The example reminds me of "To boldly go", which was not proper, but its elegance is hard to argue against.

    1. Re:Language Lives! by wildBoar · · Score: 1

      In fact there are some arguments about the "To boldly go" example.

      Apparently written English grammar varies so much from how it is often spoken because the rules were written down by a scholar in Latin who firmly believed that English should conform to the same rules, even though it doesn't.

      A careful poke at this book (17th century or thereabouts, and the one that set the standard for modern grammar) shows that even Shakespeare wrote bad grammar, and he isn't the only one.

      So in fact correct grammar isn't so correct at all and should be taken with a pinch of salt.

    2. Re:Language Lives! by Hognoxious · · Score: 1
      the rules were written down by a scholar in Latin who firmly believed that English should conform to the same rules
      My understanding is that the banning of split infinitives was never a hard and fast rule, even among good writers; Orwell certainly dissented. Infinitives in Latin (and French, German, Italian and Spanish[1]) can't be split anyway, as they are one word.
      So in fact correct grammar isn't so correct at all and should be taken with a pinch of salt.
      There's a middle ground. How many moles of NaCl do you need for s to be always preceded by an apostrophe?

      [1] Well, German has separable verbs and Italian has reflexive pronouns that detach[2] and go all over the place, but nothing like the bizarre "to" that English has. I've always been intrigued about its origin.

      [2] Ignore that - they're attached in the infinitive: anrufen, lavarsi.

      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
  17. Same here by Estanislao+Martínez · · Score: 1

    Though I've done it at a higher level of the educational system (while doing a Ph.D. in Linguistics). The big, big advantage of using search engines is the sheer size and variety of the content available on the web. For a number of things, there is simply no other way to get enough examples, because the phenomenon you're interested in is just too rare. The downsides are repetitiveness (it's often the case that you get the same document a lot of times at many different URLs; for example, song lyrics), typos, unreliable language-detection algorithms in search engines (search for weird stuff in Spanish in Google, and you'll often get back some Portuguese results), unreliable numbers, etc.

    1. Re:Same here by Anonymous Coward · · Score: 0
      The downsides are...

      Quite. And it's a hell of a lot worse for English because of the wider adoption as a second language.

      What about bad translations into English of corporate copy originally written in another language, Babelfish caches, common or garden typos, etc, etc?

      Linguists usually, and quite rightly, worry about prescriptivism vs descriptivism - becoming the story rather than just reporting it - but in this case they're potentially exercising a disproportionate influence on the development of the language by drawing attention to phenomena derived from a skewed set of sources.

      Timeo empiricos, et data ferentes...

    2. Re:Same here by Anonymous Coward · · Score: 0

      The thing about English is that, since it is such a widespread second language for international communication (Esperanto should have won, but because of the laziness of people 100 years ago, it didn't), there is more standardization of English than there is of Spanish.

      Even the cuss words are mostly the same across dialects of English, and it's the cuss words that change the most quickly when dialects change.

      If studying linguistic variation, Spanish is far richer than English. English is mainly interesting when looking at errors that L2 learners of English make.

    3. Re:Same here by Hognoxious · · Score: 1
      Esperanto should have won,
      It's a joke. Latin wi' t' grammar took out.
      but because of the laziness of people 100 years ago, it didn't
      How I envy them, working a 12 hour day down the mine and getting rickets or diphtheria. Aye, they had it easy in them days. Luxury!
      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
  18. roflcopter by Kentsusai · · Score: 0

    roflcopter....

  19. 'Language' == spoken || written? by adam31 · · Score: 2, Insightful
    How do you even pronounce 'pwn3d' ? Google is not a tool to study speech patterns, and there's nothing to say that speech even resembles written text.

    The article addresses this in a weird way, where it first draws attention to the distinction, but once it reaches its crux, where google is used as a tool, the distinction is ignored entirely; instead it opts to focus on stranger things.

    1. Re:'Language' == spoken || written? by Twisted64 · · Score: 1

      I'm going to go with "pawned," a la trading something in for money. "owned" seems better, but it doesn't get across the spelling of the word as well as "pawned" :)

      --
      Consciousness is a myth. Trust me.
    2. Re:'Language' == spoken || written? by jez9999 · · Score: 1

      'Pooned' is another variant.

    3. Re:'Language' == spoken || written? by Anonymous Coward · · Score: 0

      I've heard people try to pronounce it as powned... That is, 'owned' with a 'p' in front of it. Sounds odd :P

    4. Re:'Language' == spoken || written? by woah · · Score: 1

      No, it's more like "pnud".

    5. Re:'Language' == spoken || written? by Hognoxious · · Score: 1

      It's a straightforward finger fumble, P is next to O, at least on an English keyboard. The 3 is just leetspeak for E.

      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
    6. Re:'Language' == spoken || written? by monecky · · Score: 1

      IAAL (I am a linguist). I study computational linguistics, which uses computers to study every aspect of human language. We use statistics, ANNs, or other tools, to learn something about how people learn, interpret, and produce language.

      Linguists don't just study speech. As far as classical linguistics goes, syntax and semantics are both studied using text. Even some of the sound branches, such as phonology, use abstract representations of the sound, and not the real sound. (So your alphabet will look like this.)

      Internet-only strings like "pwn3d," "LOL," or ":)" are statistically significant to some experiments, but it isn't just newsgroups and blogs that are creating words. For example, if you look at a newspaper such as the New York Times, many new words are added to the English language every month. Verbs such as "pwn3d," verb phrases such as "lol," and emotion cues such as ":)" can be analyzed just like new verbs heard in speech (e.g. "to google").

      Paul Rodrigues

      --
      http://jones.ling.indiana.edu/~prrodrig
    7. Re:'Language' == spoken || written? by Geoffreyerffoeg · · Score: 1

      How do you even pronounce 'pwn3d'?

      "3", of course, is "e". Since "pwned" came from "owned", the most sensible pronunciation is to rhyme the two: "powned." I suppose "pooned" is acceptable, since the only two English-accepted words that use "w" as a vowel, "cwm" and "crwth", are pronounced with an "oo" vowel.

      But "pwned" is obviously not a Welsh word. So "powned" may be preferable.

    8. Re:'Language' == spoken || written? by pinchhazard · · Score: 0
      I'm going to go with "pawned,"

      You guys are all idiots, including parent and grandparent posters. GUYS YOU JUST SAY "OWNED."

      It's not hard you morons.

      --
      Do you love freedom??? Do you love freedom!!! DO YOU LOVE FREEDOM!!!!!!!!
    9. Re:'Language' == spoken || written? by Twisted64 · · Score: 1

      What? You madman, OWNED doesn't cut it. Would you say "I am THE ownz0r?" I think not. I am TEH 0wnz0r!

      --
      Consciousness is a myth. Trust me.
    10. Re:'Language' == spoken || written? by 42forty-two42 · · Score: 1
      How do you even pronounce 'pwn3d' ?

      "Powned".
  20. OMG! by Frogbert · · Score: 1

    I woulda thght such a thng was unpossible.

    1. Re:OMG! by Anonymous Coward · · Score: 0

      You misunderestimate the power of teh intar-web.

    2. Re:OMG! by Anonymous Coward · · Score: 0

      It's intar-webs, you 'tard! What use would one be on its own?

  21. internet messaging data by Anonymous Coward · · Score: 0

    Scouring the net for written material, prose or otherwise, and studying, analyzing, tabulating it is a cool and grand idea. Lots to be learned I'm sure.

    However ... What about researching and analyzing vernacular data that is not publicly available on google, news sites, public message boards, usenet, etc? What similarities and differences can be found in what is considered to be personal or private communication?

    I'm almost sure someone has thought of this before but the obvious problem is: how is one able to collect ample data categorized as private or personal communications? After all, it isn't possible to just google or grep ICQ or AIM logs from thousands of people...or is it?

  22. Popular usage != wanted usage by KiloByte · · Score: 2, Informative

    Yes, we can record the errors made by the uneducated public (and even those done by, uhm, me). The question is: should we do that or not?

    I was pretty taken aback when a council of linguists in Poland suddenly declared some widely-chastised and not even very popular errors to be valid usage. I've been brought up in circles of people who not only put a lot of stress on the language you use, but also cruelly point out every incorrect word or phrase you use -- and this made me quite intolerant of bad speech.

    Being but a dirty foreigner, I know that my English can sound bad in the ears of native English speakers -- that's why I sometimes ask people to correct me if they spot errors.

    In other words: some people find careless speech repulsive. Thus, we should do whatever we can to promote correct usage as opposed to legalising incorrect uses.

    --
    The creatures outside looked from Alt-Right to Antifa; but already it was impossible to say which was which.
    1. Re:Popular usage != wanted usage by Anonymous Coward · · Score: 0

      some people find careless speech repulsive. Thus, we should do whatever we can to promote correct usage as opposed to legalising incorrect uses.

      I don't think that the fact some people find it repulsive matters much. That makes it sound like people who can't express themselves well are just annoying some hard-liners.

      The fact is, when you can't use your native language properly, people (including many people you wouldn't consider "hard-liners") judge you for it. Consciously or subconsciously, you appear less educated and less intelligent to them.

      If we change the rules, all this means is that some people are going to think that it's okay to be lax with language. It's not going to stop people thinking "what a moron" when somebody emails them with something like "ru up4 the meeting tomorrow?!?!?"

    2. Re:Popular usage != wanted usage by brpr · · Score: 1

      Yes, we can record the errors made by the uneducated public (and even those done by, uhm, me). The question is: should we do that or not?

      It's a moot point, because linguists doing corpus research aren't interested in tracking "errors". They prefer studying language structure/change to berating people for speaking in their native dialect.

      I was pretty taken aback when a council of linguists in Poland suddenly declared some widely-chastised and not even very popular errors to be valid usage.

      Why? The social stigma attached to such "errors" generally has no linguistic basis. It's just prejudice against certain unprivileged accents/dialects. It's no more justifiable than racism or any other kind of prejudice.

      In other words: some people find careless speech repulsive. Thus, we should do whatever we can to promote correct usage as opposed to legalising incorrect uses.

      Surely you can see that the argument "some people find X repulsive; therefore we should promote not(X)" is pretty flawed.

      --
      Freedom is not increased by mere diminution of government. Anarchy is freedom for the strong and slavery for the weak.
    3. Re:Popular usage != wanted usage by phiwum · · Score: 1

      In other words: some people find careless speech repulsive. Thus, we should do whatever we can to promote correct usage as opposed to legalising incorrect uses.

      But linguists don't legislate use. They're only interested in existing usage. Look to other sources for normative judgments.

      And believe me: I hate what I consider linguistic abuses as much as you do. I wrote a little Gnus function to black out those goddamn smilies so I don't see them in Usenet posts.

      But neither my personal tastes nor more authoritative pronouncements about what is good language have much to do with what linguists study.

      --
      Phiwum's law: anyone that names an obvious law after himself and then puts it in his own sig is just pathetic.
    4. Re:Popular usage != wanted usage by happyhangone · · Score: 1

      Maybe it is not the ideal situation, but languages are made up by their use. The people who get obsessed with the "correctness" of a phrase, a sentence or the language in general don't get that the rules they are using were improper usage in the past. Language evolves, get over it!

      P.S. And if the people on the other side understand you, does it really matter?

  23. using google as a spell checker by tinkerton · · Score: 1

    when you doubt between two spellings of a word, check the search results count in google. I've used that trick.

    Then again, my idea of fun is to use Google counts to find the words that get misspelt the most often (those where the misspelling reaches a Google ratio of 5%).

    I thought compatable was common, but I only get a 1% ratio there. Maybe there should be a category 'non native'.

    Is conneXion considered an error? I like it much better than connection.

    Just now I found out that there are lists, e.g. at most commonly misspelled words.
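
    The ratio mentioned above could be computed along these lines. This is only a sketch: it assumes "ratio" means the misspelling's share of all hits for the two spellings, the function name is made up, and the counts in the example are hypothetical rather than real Google results.

```python
def misspelling_share(misspelled_hits: int, correct_hits: int) -> float:
    """Share of all hits (both spellings combined) that use the misspelled form."""
    total = misspelled_hits + correct_hits
    if total == 0:
        raise ValueError("no hits for either spelling")
    return misspelled_hits / total


# Hypothetical hit counts, for illustration only.
print(misspelling_share(1_000, 99_000))   # a 1% misspelling
print(misspelling_share(5_000, 95_000))   # a 5% misspelling
```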

    1. Re:using google as a spell checker by Anonymous Coward · · Score: 0

      when you doubt between two spellings of a word, check the search results count in google. I've used that trick.

      Why? If it's an incorrect spelling, it tells you and suggests the correct spelling. If it's a correct spelling, it's usually got a link to the dictionary definition in the top right.

    2. Re:using google as a spell checker by Kafir · · Score: 1

      Is conneXion considered an error? I like it much better than connection.

      It's correct, but British. Just like colour/color, or theatre/theater. Or foetus/fetus, though that doesn't seem to come up so often.

      connexion
      Pronunciation: k&-'nek-sh&n
      chiefly British variant of CONNECTION

      Did it never occur to you to check an actual online dictionary? I use google to see if my usage of a word or phrase is acceptable (or at least common), but a dictionary is probably a better bet for spelling.

    3. Re:using google as a spell checker by tinkerton · · Score: 1

      Did it never occur to you to check an actual online dictionary?

      To be perfectly honest, yes. But I don't want people to think I'm a sissy.

  24. Three types of language by Dracos · · Score: 3, Interesting

    I think that for most of the 20th century, English, and most languages in the industrialized world, was largely static, dominated by the written word which was dominated by proper grammar. Since WWII, popular culture and faster communications have increasingly exposed us to local vernaculars, mostly through radio and television. The written word lagged behind in its cultural evolution.

    Thanks to the internet (initially email, BBS's and IRC, but more widely known on the Web), we now have a hybrid of the spoken and written word: the "typed word". This form of language evolves at the same rate as the spoken word, and injects its own vernacular as a side effect of the medium: acronym and abbreviation "words" (rofl, how r u), along with common misspellings (pwned), and mixing letters with numbers or punctuation (133t, n00b). All of these serve at least one purpose, whether as a form of super shorthand, insult, the appearance of being "cool", or are merely the result of laziness on the part of the author. Most typed-word terms don't transfer well when spoken.

    One of my hobbies is studying (European) languages and how they are related. Sometimes I worry about the damage the typed word is causing to the spoken and written word (and any proper linguist should at least be interested in the phenomenon). Luckily, most typed word expressions aren't pronounceable, and the ones that are sound absurd, because they are removed from their original context when spoken, and everyone recognizes gibberish when they hear it. How the typed word affects the written word remains to be seen. Yes both are typed now, but only the written word has a chance of going through an editorial process. I think it will take a very long time for the formal lexicon and rules of grammar to embrace, however reluctantly if ever, the typed vernacular.

    1. Re:Three types of language by grahamlee · · Score: 1
      I think that for most of the 20th century, English, and most languages in the industrialized world, was largely static, dominated by the written word which was dominated by proper grammar. Since WWII, popular culture and faster communications have increasingly exposed us to local vernaculars, mostly through radio and television. The written word lagged behind in its cultural evolution.

      You do realise that most of the 20th century happened after the second world war, don't you? A condition that became false after the events of 1945 cannot be considered true for most of the period 1901-2000.

    2. Re:Three types of language by iabervon · · Score: 1

      There has always been a distinction between conversational speech and formal speech. Someone talking on the radio or giving a lecture will use different grammar from what someone interactively talking to a few people would use. They will use expressions which are considered slang or inappropriate, and will mangle the sentence structure for purposes such as getting the sentence over with as soon as its meaning has been conveyed. Diction is often traded for speed in set phrases (e.g., saying "how are you?" will confuse people, because they expect something like "oweryu?" and enunciating the words makes it sound like you think they're sick or something). Furthermore, someone in a conversation will convey meaning through non-linguistic actions, such as laughing, nodding, rolling their eyes, and so forth.

      The typed word is, therefore, a conversational form of writing. It needs a set of expressions to correspond to non-linguistic activities, which is where "rofl" and ":)" come from. Common expressions get shortened. There is also a body of slang where odd spellings are used to mean special things ("pwned" doesn't mean the same thing as "owned"; "That's the fastest computer I've ever owned" is a very different sentiment than "That's the fastest computer I've ever pwned", and, in fact, the latter would indicate that the speaker does not own the computer).

      The typed word is no more likely to damage (or even modify) formal writing than conversational speech has damaged formal speech, and is no more of a problem than earlier conversational writing ("Pls type these Thx, D" as a sticky note for a secretary is as old as sticky notes). (There is a separate issue, which is that students sometimes fail to realize that formal writing is needed in some situations, but that's different from not realizing that formal writing is different.)

      There is a wonderful piece of writing I've read (in an interactive fiction game, Narcolepsy by Adam Cadre, et al) in which the narrator encounters a person using typed-word expressions in speech and describes them as he sees and hears them ("Roffle!" for instance).

    3. Re:Three types of language by Rie+Beam · · Score: 1

      I comma for number one comma fail to see your point period.

  25. My own linguistinc research by Anonymous Coward · · Score: 0

    My favourite piece of linguistic research using Google is to search for "attention to detial".

    I then have a laugh at all the hits...

    http://www.google.com/search?hl=en&q=%22attention+to+detial%22&btnG=Google+Search

  26. Google as a grammar checker by Hal+XP · · Score: 2, Interesting

    I've had the chance to use Google as a grammar or style checker in my day job as a glorified copy editor. I type two nearly identical expressions X and Y in the search box. If expression X gets 10,100 hits and expression Y only 500 hits, I use expression X.

    For example, as a non-native speaker, I found myself waffling between the expression (A) "run for mayor of" and the expression (B) "run as mayor of." Letting Google arbitrate, I found 14,900 hits for (A) and only 200 hits for (B). I chose (A).

    I discovered there's practically a dead heat between the expressions "a new lease on life" (which, if I'm not mistaken, is the expression favored by American usage) and "a new lease of life," with the latter nosing out the former 144,000 hits to 140,000. In this instance I let my own usage arbitrate. Since I'm more exposed to American than to British English, I chose "on."
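
    The decision rule described here could be sketched as follows. The function name and the 5% tie margin are assumptions made for illustration; the hit counts are the ones quoted in this post, typed in by hand rather than fetched from Google.

```python
def prefer_by_hits(counts, tie_margin=0.05):
    """Return the expression with the most hits, or None if the top two are
    within tie_margin of each other (a "dead heat" left to the writer)."""
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    (best, best_hits), (_, second_hits) = ranked[0], ranked[1]
    if best_hits == 0 or (best_hits - second_hits) / best_hits <= tie_margin:
        return None
    return best


# Counts as reported in the post (the second mayor count read as 200).
mayor = {"run for mayor of": 14_900, "run as mayor of": 200}
lease = {"a new lease of life": 144_000, "a new lease on life": 140_000}

print(prefer_by_hits(mayor))   # clear winner
print(prefer_by_hits(lease))   # too close to call, so None
```

    Returning None for a near tie mirrors the post: when the counts are that close, the writer's own usage decides.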

    --
    I'm a sci-fi vegan: I don't want the aliens to think we have as much right to live as the fried chickens we eat.
    1. Re:Google as a grammar checker by thelenm · · Score: 1

      Googlefight is great for stuff like this (saves you a few clicks and some manual comparison of results). I've used it in the exact same way.

      --
      Use Ctrl-C instead of ESC in Vim!
  27. Tongue Gymnastics by Indy+Media+Watch · · Score: 1
    Linguists are gradually adopting the World Wide Web as a useful corpus for linguistic research.

    I love a bit of cunning linguistics.

    --

    Indy Media Watch-Proctologist of the Internet

  28. Reminds me of "Meme Tree"... by Slur · · Score: 3, Informative

    ...which was this little program I wrote around the nascence of the internet. It took any sentence as input and kept a record of which words preceded each word, and which words followed each unique word. The idea was to build up a simple map of which words could precede or follow others completely without context. From this you could follow paths that made sentences or paths that looped forever, or paths that made no sense, and some interesting paths that made unintended sense.
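
    A minimal sketch of that kind of record, assuming simple whitespace tokenization and lower-casing. The original program is not shown, so the function and variable names here are made up and this is only one way it might have worked.

```python
from collections import defaultdict


def build_neighbor_maps(sentences):
    """Map each word to the set of words seen directly before and after it."""
    preceded_by = defaultdict(set)
    followed_by = defaultdict(set)
    for sentence in sentences:
        words = sentence.lower().split()
        # Walk over adjacent word pairs: (left, right).
        for left, right in zip(words, words[1:]):
            followed_by[left].add(right)
            preceded_by[right].add(left)
    return preceded_by, followed_by


# Hypothetical input sentences, for illustration only.
preceded_by, followed_by = build_neighbor_maps(
    ["the cat sat on the mat", "the dog sat on the cat"]
)
print(sorted(followed_by["the"]))   # words that can follow "the"
print(sorted(preceded_by["sat"]))   # words that can precede "sat"
```

    Following `followed_by` from word to word gives the paths described above, including loops such as "the cat sat on the cat sat on ...".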

    Why a tree? Language and genealogy seem to have a common thread. Meaning is like genetics. Language is expressive. Information is a kind of tree whose branches grow as reality elaborates and past events accumulate. New terms need to be invented for the dynamics we perceive in reality, just as new names are given to individuals as they emerge into the world. Patterns, continuity, periodicity. Such things lie at the heart of material existence and provide the hooks for consciousness itself. Information theory is the next great frontier, along with particle physics. Already they have converged and diverged and converged again. And playing with artificial trees turns out to be a lot of fun.

    As for the "Meme Tree" program ... The next iteration built up a more discrete map by scoring proximity of unique words in sentences and inclusion in sentences together. Again, the idea was to build a simple statistical map free of any context, simply to get a sense of pure lexical association.

    The theory is that the internal consistency of these various lexical maps should roughly reflect many aspects of associative meaning. You could think of the statistical map as a Godelian bubble whose "truth" - if you will - is imposed by the laws governing the statistical associations. We don't derive the laws of language and meaning from these exercises, but we create an internally-complete map that reflects something about the nature of meaning.

    There is a practical aim as well. If you can derive the strength of equivalence and the various levels and colors of associative meaning you could in theory build a "Truth Machine" capable of answering any question with a high degree of accuracy. The result of any question could be computed as any other information retrieval problem would be.

    I never got around to having my little Meme Tree programs scrape the internet for random sentences. However, this should be a very simple thing to do. Google has had programming contests in the past - programs that use the Google database in interesting ways. Statistical analysis of language is basically what they do. Research projects on their data could provide stunning insights into the nature of information itself, its relation to language and to reality, and likely into our very nature as linguistic beings.

    --
    -- thinkyhead software and media
    1. Re:Reminds me of "Meme Tree"... by PostItNote · · Score: 1

      Your second version, with the probabilities, is a Markov Chain - http://mathworld.wolfram.com/MarkovChain.html
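
      For illustration, a first-order Markov chain over words can be sketched like this. It assumes transition probabilities estimated from adjacent-word counts, which is not necessarily how the parent's program scored things; the names and sample sentences are made up.

```python
import random
from collections import Counter, defaultdict


def transition_probabilities(sentences):
    """Estimate P(next word | current word) from adjacent-word counts."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for left, right in zip(words, words[1:]):
            counts[left][right] += 1
    return {
        word: {nxt: c / sum(followers.values()) for nxt, c in followers.items()}
        for word, followers in counts.items()
    }


def walk(probs, start, steps, rng):
    """Random walk through the chain; stops early at a word with no followers."""
    path = [start]
    for _ in range(steps):
        options = probs.get(path[-1])
        if not options:
            break
        words = list(options)
        weights = [options[w] for w in words]
        path.append(rng.choices(words, weights=weights)[0])
    return path


# Hypothetical training sentences, for illustration only.
probs = transition_probabilities(
    ["the cat sat on the mat", "the dog sat on the cat"]
)
path = walk(probs, "the", 5, random.Random(0))
print(probs["the"])
print(" ".join(path))
```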

  29. BBC voices by matt+me · · Score: 2, Informative

    Link on front page of bbc.co.uk - bbc.co.uk/voices/ - their attempt at tracking accents and dialects across the UK.

  30. Another use of Google in Linguistics by Anonymous Coward · · Score: 1, Informative

    Just a month ago I finished a paper exploring using Google counts in great detail for language analysis and other forms of meaning extraction.
    "Automatic Meaning Discovery Using Google":http://arxiv.org/abs/cs.CL/0412098/

    Comments welcome, -Rudi.
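
    As a rough illustration of working from page counts alone, here is a sketch of the normalized Google distance, which is the measure that paper is generally known for. That this is the measure meant here is an assumption, and the counts in the example are invented, not real Google results.

```python
import math


def ngd(fx: float, fy: float, fxy: float, n: float) -> float:
    """Normalized Google distance from page counts.

    fx, fy : pages containing term x, term y
    fxy    : pages containing both terms
    n      : total number of pages indexed (a normalizing constant)
    """
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(n) - min(lx, ly))


# Invented counts, for illustration only.
print(ngd(1000, 1000, 1000, 1_000_000))   # terms that always co-occur: 0.0
print(ngd(100, 100, 10, 10_000))          # terms that rarely co-occur
```

    Smaller values mean the two terms tend to appear on the same pages; terms that never co-occur would need separate handling, since log(0) is undefined.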

  31. Exactly by sakahna · · Score: 0

    English isn't my first language, so I often use Google to verify the use of an expression by comparing the number of hits I get for various forms, or as a "spell-checker" by using Google "Did you mean" suggestions to correct my spelling mistakes.
    Lately, I find that some mistakes have become so "popular" that I can't do this anymore, because Google now recognizes the mistake as a "valid" search word.

    1. Re:Exactly by Anonymous Coward · · Score: 0

      I can't do this anymore

      I'm not entirely certain *, but I believe the word "anymore" is particular to American English and is considered incorrect in other dialects. The two words "any more" are considered correct everywhere as far as I know.

      * Well, I immediately recognised it as an error, but I checked and it appeared in an American English dictionary, which was surprising. I am English, by the way.

  32. Don't trademark that! by perhj · · Score: 1

    Please refrain from trademarking your 'unique' spelling of intellectual. Thank you.

  33. Done: nous sommes desolés que notre president by new500 · · Score: 3, Insightful

    . . .

    Those expressions are then
    used by native speaking politicians and are
    broadcasted by television.


    Dude, it's worse, the French have already infiltrated as far as the advertising business and are using covert channels to spread some dangerous crack i heard was called La Liberte :

    http://french.about.com/b/a/081281.htm

    Slightly more seriously :

    Apart from pointing out that your use of the word native is rather presumptive of geographic origin in this big wide internet thing, I wonder if this linguistic adoption is more one way towards English since the internet. OK the French got Le Weekend, and tons of anglicised nouns, tried to ban them all and didn't manage. But I read Friday that a British pilot training firm lost a contract to a French one. The reason cited by the Asian airline was that, whilst the training had to be in English, the French trainers spoke better, clearer, more intelligible English than did the English. I can't argue with that. Sadly.

  34. lol by Anonymous Coward · · Score: 0

    lol @ anonymous cowards

    lol lol lol

  35. how close are we to self forming dictionaries? by Vnimam · · Score: 1
    Using Google Groups, it is pretty close to using a thesaurus. Personally, it is one of the most fruitful advances I've ever seen from the net. Being of AI-mind...

    My question to all -- so how far are we, I ask you master linguists + computer scientists, before we will have self forming dictionaries based strictly on cached google data?

    two years?

    -o- Geoff Peters

    1. Re:how close are we to self forming dictionaries? by Anonymous Coward · · Score: 0

      We are very close: My paper at http://www.cwi.nl/~paulv/papers/amdug.pdf shows a method to create a self-forming English-Spanish translation dictionary without using search results at all, just Google page counts and math. Comments welcome. -Rudi

    2. Re:how close are we to self forming dictionaries? by matt+me · · Score: 1

      Computers can learn the meaning of words simply by plugging into Google. The finding could bring forward the day that true artificial intelligence is developed.

      NewScientist (29 January 2005) - Google's search for meaning

      http://www.newscientist.com/article.ns?id=mg18524846.100

    3. Re:how close are we to self forming dictionaries? by Vnimam · · Score: 1

      *Perfect* This is exactly what I was intending (i.e. using google/web as an input device to symbolically build relation constructs to evolve dictionary definitions -- more than simply searching for preformed definitions)

      If I may, anyone out there have ideas on what other linguistic constructs can be self formed based on the massive web corpus as a simple input device? We already have:

      Sets (via Google's Sets)

      Dictionary (via earlier link)

      Thesaurus (via Google's Sets + tweaks)

      Grammar

      Self forming Encyclopedia

      Self forming identity databases (e.g. Googlezon hype)

      and thus,

      Consciousness?

      -o- Geoff Peters

  36. Lameness by Pan+T.+Hose · · Score: 1

    Indeed what their sayin is true. U can learn English very well, especially grammer readin /. frist psots. Teh intarweb seems to certainly kick arse for that sorta research. Very 1337 articel. Thx d00dz.

    I have just read the above and I must admit it: I am teh lame, amn't I?

    --
    Sincerely,
    Pan Tarhei Hosé, PhD.
    "Homo sum et cogito ergo odi profanum vulgus et libido."
    1. Re:Lameness by Anonymous Coward · · Score: 0

      What ? Doing self-reflections on slashdot and being honest ?

      Go away, you don't belong here... :)

  37. Writing in Japanese by minairia · · Score: 3, Insightful

    I am American but have to write in Japanese for work. No matter how much you learn in school, when you write in a foreign language, you'll hit a point of wondering if what you wrote is how native speakers say something or is even understandable. Whenever I hit a point like that, I put the sentence in question (or key fragments thereof) into a Google search. If nothing comes up, I know I have to rewrite. If only a few links come up, I know what I wrote might be a little weird, but is at least understandable. If I get pages and pages of links, I'm golden.

    1. Re:Writing in Japanese by Anonymous Coward · · Score: 0

      Yeah. I'm french and have to deal with english quite often. I use Google in the exact same way.

      Beside that's right, when I read natives I always come to think: "gosh, no matter how much effort I put in line, I'll keep making mistakes and will never sound truly natural". Depressing.

    2. Re:Writing in Japanese by Anonymous Coward · · Score: 0

      Same here. When I get a bunch of results for a single word translation in Jim Breen's excellent WWWJDIC, I go to Google to check out which one of the alternatives is most common (and therefore most "correct" or understandable). Google is an invaluable way to gauge a word's usage.

    3. Re:Writing in Japanese by rossz · · Score: 1

      My wife is a linguist (but I am not); she would NEVER use google hits as proof that her translation is correct. In English, especially, there are far too many grammatical and spelling errors that have come into common usage (think "their", "there", and "they're" or "it's" and "its").

      A high number of google hits could mean the translation is correct, but it could also mean there are a lot of idiots using the internet.

      --
      -- Will program for bandwidth
  38. Linguistics 101 by DingerX · · Score: 2, Insightful

    I use search engines all the time for linguistics research: when I'm reading or translating from one language to another, and I run into an odd usage, I just type the phrase in the magic box and *poof*, I get hundreds of contextual examples. Likewise, if I'm writing in a foreign language, and I need to know if a preposition or a construction is correct (and not simply words), again all I have to do is type it in and see what comes out.

    Measuring how the internet changes world languages is only a small part of what the 'net offers those interested in linguistics and linguistic usage. Most of the web data archived on google does not consist of ROTFLMAOs and pwn3ds; it consists of everyday usage, and a good deal of that is from the last decade. Much of linguistics deals precisely with that: how the language is used on a daily basis. That's also how dictionaries come about: they're descriptive accounts of usage (which is why the high school journalistic trick of beginning an article with "Webster's defines fistula as..." doesn't work. Dictionaries don't lay down the law, they describe it).

    Of course, some people have been arguing that this gives room for errors and abuses. Of course it does! just 'cos something doesn't play by the rules doesn't mean it's not in common usage. And just because people don't follow rules of orthography, grammar and style doesn't excuse us from teaching these things, or trying to follow them. After all, language is about communication, and these corruptions hinder our ability to communicate, especially communicating complex thoughts.

    So yeah, "to impact" is to make an active verb out of a passive participle, and "to impinge" should be used instead. There are plenty of uses of "bonified" out there. Google finds about 20,000 such occurrences. That doesn't make it correct. Nor does that make Google's suggested correction "bonafide" correct either (306,000 occurrences). The correct spelling is bona fide (1,050,000 occurrences).
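To put those three counts side by side, here is a small sketch that turns them into usage shares. The figures are the ones quoted above; the variable names are just illustrative. The shares describe how often each spelling turns up, which, as the comment says, is a different question from which spelling is correct.

```python
# Hit counts as quoted in the comment above (a snapshot, not current figures).
hits = {
    "bonified": 20_000,
    "bonafide": 306_000,
    "bona fide": 1_050_000,
}

total = sum(hits.values())
shares = {variant: count / total for variant, count in hits.items()}

# List the variants from most to least frequent with their share of all hits.
for variant, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{variant!r:12} {share:6.1%}")

most_common = max(hits, key=hits.get)
```

On these numbers the standard spelling accounts for roughly three quarters of the hits and "bonified" for under two percent.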

    And don't worry too much about purely textual forms appearing in speech. LOL is just this decade's SOB. A spoken "I R0XX0R, J00 5UXX0R" shouldn't alarm us too much when we consider all those medical shows where doctors run around yelling "Get me a boron enema STAT!", pompous academics actually say "such economic perturbations may affect the governance of a certain cryptodictatorship, VIZ the United States", and we all drop down to the pharmacist to "Fill an RX", all spoken forms of what are written Latin abbreviations (statim -- immediately, videlicet -- that is, Rx -- Respondeo, although some classicists may insist it's the symbol for Jupiter).

    One linguistic area that is interesting is the gradual adoption of worldwide slang. We hear Americans these days using terms like "Bog Standard" and "Arsed".

    What's the point of this rant? Teh intardnet is a great resource for linguistic usage, beyond the navel-reflection of IT professionals. Disciplines like linguistics deal in examples of usage, and the internet is a great stockpile of everyday language. Descriptive grammar and descriptive dictionaries are not an excuse for ignoring arbitrary rules. Most of the linguistic phenomena we see with internet usage are not new.

  39. Alot by BabyJaysus · · Score: 0

    Alot, alot, alot, alot...

    I really, really hope that 'alot' will never become accepted usage. But its use seems to be growing... a lot.

  40. KK Phonics by muchawi · · Score: 1

    Slightly off-topic, but does anyone know where I can download (or buy) a font that uses the letters for KK Phonics?

    The IPA phonic set is widespread and available from many sources, but I'm having a hard time finding one for KK Phonics.

    Most dictionaries show pronunciation keys in both, but IPA seems to be more popular currently.

  41. mod parent up by Anonymous Coward · · Score: 0

    Excellent points. Linguists study language as it is used not as it is prescribed.

  42. Using Google as a tagged linguistical data store by saddino · · Score: 1
    My personal interest has been in using Google to return pages related to some search query and then data mining the text on the referenced pages (my company develops a product called theConcept for OS X). For example, doing keyphrase analysis on the first 100 pages returned in the results from the Google search "linus torvalds" returns key pairs such as:

    • operating system
      linux kernel
      free software

    And citations linked to those pairs such as:

    • Linus torvalds as the moving force behind the operating system that is reshaping the computing industry.

      Andrew tanenbaum has been derided for his heavy hand and misjudgements of the linux kernel such a reaction to tanenbaum is unfair.

      Respect for richard stallman's contributions to the free software movement and consider him the real pioneer in the field but I believe...linus who has turned that dream into the beginning of a reality by bringing...next level.


    IMHO, as client-end data analysis gets more sophisticated (and increased broadband use allows for quicker web data mining), linguistical tools on the desktop can leverage the raw data on the web to do some pretty interesting things.

    Really, the web is the largest corpus out there. Using Google is just a great way to get it down to a manageable size.
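A rough sketch of the kind of key-pair counting described above: count adjacent word pairs across a set of page texts and drop pairs that contain a stopword. This is only an assumption about how such an analysis might work, not how theConcept actually does it. The stopword list, the `key_pairs` function, and the sample snippets are all made up for illustration.

```python
import re
from collections import Counter

# Small, hand-picked stopword list; a real tool would use a much larger one.
STOPWORDS = {"the", "a", "an", "of", "for", "and", "to", "in", "is", "as",
             "that", "his", "has", "been", "such", "but", "i", "him"}

def key_pairs(texts, top=3):
    """Count adjacent word pairs that contain no stopwords."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        for first, second in zip(words, words[1:]):
            if first not in STOPWORDS and second not in STOPWORDS:
                counts[(first, second)] += 1
    return counts.most_common(top)

# Stand-in snippets; a real run would use text fetched from the result pages.
pages = [
    "Linus Torvalds wrote the Linux kernel as a free operating system.",
    "The Linux kernel is the core of the operating system.",
    "Free software advocates praise the Linux kernel.",
]
for pair, count in key_pairs(pages):
    print(" ".join(pair), count)
```

Linking each pair back to the sentences it came from, as in the citations above, would be one more dictionary keyed on the pair.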
  43. Programmer grammar by cbr2702 · · Score: 2, Insightful

    Adding or changing characters in a literal string seems like misquoting. Traditionally in handwritten work the comma went almost directly under the quotation mark. When people shifted to typewriters and then computers, an arbitrary choice was made to put the comma first. Most programmers I meet seem to have reversed that choice.

    --


    This post written under Gentoo-linux with an SCO IP license.
    1. Re:Programmer grammar by jasonjacks0n · · Score: 1
      Adding or changing characters in a literal string seems like misquoting.

      Exactly! The comma's not part of the string literal, it's part of the surrounding sentence's punctuation. =)

      On the other hand, an exclamation mark usually belongs inside of the quotation, as it usually means the person speaking was excited. But not always; if, say, you're reporting with surprise what someone said, it belongs outside. Or maybe you need both.. For example, there's a (small, but real) semantic difference between these three:

      When he saw me, he said "hey".
      When he saw me, he said "hey!".
      When he saw me, he said "hey!"!

      I feel like the period should be there in the second version, although my past English teachers would disagree.. I could live without it I guess. They might also think the repeated exclamation mark in the third version was incorrect, although it does provide a way to communicate who was surprised (both of us).

      Does thinking about these things (and actually forming an opinion about them) make me anal? Sure, probably .. but that's what years of talking to computers in extremely tightly constrained grammars will do to a person. ;-)

      I read somewhere that the punctuation-inside-the-quotes thing actually came about because of issues with line-wrapping in early typesetting systems, not for any real semantic reason. And also that in UK grammar, the punctuation usually goes outside the quotes -- maybe they had better typesetters?

      --
      This space intentionally left blank.
  44. Tooting own horn ... by minairia · · Score: 1

    Let me pat myself on the back here .... you know how the Breen site has a link that points to Google images? It was me who suggested that to Prof. Breen a few years back. I hated finding words in Japanese on the site that meant what I wanted to say but either turned out to be obscure and never used, or actually have different meanings entirely (think of English, how lie and lay sound the same, but are very different in meaning.) If no images come up, the word is something no-one ever uses and if the images are all wrong for the meaning I want, I try another word till the images are right. It was me who also got the katakana for Lucy Liu's name listed on the site. I'm still waiting for her to be so impressed by my effort on her part that she turns up at my door to be mine ... LOL, I think it'll be a long wait ...

  45. Ethnologue by suso · · Score: 1

    There is also http://www.ethnologue.com/, which keeps track of over 6000 human languages.

  46. Re:Done: nous sommes desolés que notre presid by Anonymous Coward · · Score: 0

    I don't think there's anything unbelievable about it. I'm a non-native English speaker and I often find other non-native speakers easier to understand than native speakers. Native speakers speak faster, and they often employ more subtle distinctions between different sounds, which non-natives have difficulty hearing or reproducing accurately. Then there are dialect issues: some native speakers are near-incomprehensible even when they attempt to mimic 'standard' BBC English. Another factor is that a non-native speaker may have a more limited vocabulary. Of course, in all this I assume that the non-native speaker is close to fluent, but even a strong foreign accent can be surprisingly easy to decipher for another foreigner.

  47. LanguageLog is not limited to English by belmolis · · Score: 1
    LanguageLog is a resource linked in the article, where linguists discuss current peculiarities of the English language.

    This is misleading in suggesting that LanguageLog is limited to English. Actually, it deals with all sorts of linguistic topics and languages.

  48. Re:Using Google as a tagged linguistical data stor by Anonymous Coward · · Score: 0
    1. The Web is not tagged.
    2. Google supports only a few regular expressions -- not enough to do more than trivial searches.
  49. Types of morphology by tepples · · Score: 1

    [Esperanto is] a joke. Latin wi' t' grammar took out.

    There is no language with the "grammar took out". Every language has a grammar. Some have "fusional" morphology like Latin and Greek, with multiple meanings in a given affix; some have "agglutinative" morphology like Turkish and Esperanto, with simpler affixes stacked in a word; and some have more "isolating" morphology like Toki Pona, Chinese, and (to an extent) English, with each word being an independent unit to a large extent. Over time, isolating languages become agglutinative, agglutinative languages become fusional, and fusional languages become isolating.

    1. Re:Types of morphology by Anonymous Coward · · Score: 0

      If you were half as clever as you try to appear (which is twice as clever as you are) you'd be able to recognise irony.

    2. Re:Types of morphology by tepples · · Score: 1

      AC: If you can't tolerate people who try to dispel a poor attempt at irony that would be easily misinterpreted as anti-Esperanto FUD, then could you please give me some anti-Asperger pills?

  50. Several Things to Consider by Anonymous Coward · · Score: 0

    I have an undergrad degree in Linguistics (U of MN back when they actually had a Linguistics Department), and would like to point out a few things that probably need to be considered when doing this kind of research (in no particular order).
    1. Dialect/Sociolect/Idiolect - What may be acceptable for a black Kentucky high school girl to say to her peers may never be uttered by a white 50 something banker from Seattle. And people have various individual language-use "foibles" that can throw off a study (this is called an idiolect). So, I think this can tend to ignore just how acceptable saying a thing may be. Just because *someone* *somewhere* utters a phrase does not endow it with meaning or acceptability.
    2. Medium of communication - I agree with others here that the Internet is somewhat unique as a form of communicating. It is very interesting to me to observe that we seem to be seeing the rise of "written only" phrases. Obviously, written language evolved from spoken, but this would seem to be a different creature. Will "LOL" or "ROFL" ever really come into spoken usage? (I, for one hope not, but that's another topic). If those phrases do come into common usage, is this a new phenomenon? I can't think of other examples of this off the top of my head. Are there other ways in which this medium influences communication?
    3. Knowledge of the speaker - The Internet provides this "body of data" but how much data does it provide about the speakers (writers)? I suspect that it generally does not provide much. Look at the average Slashdot (or other) posting. You may or may not have information about the poster (I always post as AC), and that information may or may not be factual. How can you know that you're studying the writing of someone from America or Britain or India?
    4. Orthography - It is very common for orthography to be ignored by phonologists. Historical Linguists often rely on orthography to trace a word's history. How valuable is orthography? This kind of searching gives it ultimate value, which could be dangerous. It can also leave out attempts at simplification. If I search for "laughing out loud" but don't search for "lol" am I really getting everything I am looking for? How do I know every phrase to search for? I think it's interesting to see how ppl simplify orthography for the sake of rapid (or easier?) communication. And it seems the internet goes for a more phonological (and intuitive) orthography with usages like "r u" for "Are you", etc.

    I'm sure there are more aspects to consider, but this is a start. This is an interesting direction for research, but I think it is more thorny than the article lets on.
    --Jonathan

  51. Re:Using Google as a tagged linguistical data stor by ewp123 · · Score: 1

    CBS News Sunday Morning ran a piece today on BuzzMetrics http://www.buzzmetrics.com/, a data mining company that uses Google, among other tools surely, to dig through blogs, forums, etc. to find out what people are saying about particular companies or products. It was interesting that their analysts' job included not only the data mining but helping their customers make sense of the way they said it for incorporation into their marketing campaigns.

  52. a proper linguist's response: by Anonymous Coward · · Score: 1, Insightful

    A user signing as phaln on Slashdot today remarks, apropos of a comment exchange about using the entire web as a corpus (the way we often do here at Language Log Plaza), which led to some comments on the sort of random slangy stuff on the web that might make that a bad idea for grammarians seeking information about English:

    It came to me that the English language was in deep trouble when people started saying "rotfl" and "lol" in person.

    Now, the user is being humorous, of course. But it is remarkable how often people say this sort of thing. It reaches newspaper columns and magazines as well as everyday conversations about language ("Oh, you're a linguist? What do you think about the way Internet slang is changing the language?"). I've heard a half-hour radio discussion about it on the BBC World Service (in the middle of the night; it was a real yawn, a perfect fix for my insomnia). It seems likely that at least some people really do think English might be altered radically by the intrusion of email abbreviations for phrases like "[I'm] rolling on the floor laughing" or "[I'm] laughing out loud" into regular spoken English.

    Don't worry. Nothing radical or even slightly significant will happen. Suppose, say, "rotfl" (pronounced "rotfull") became quite common in speech (which seems unlikely, since if your interlocutor falls down and rolls on the floor laughing it generally needs no comment; but maybe as a metaphor, or on the phone). What would have changed? One interjection (a word grammatically like "ouch") added. Total effect on language: utterly trivial. Not even noise level. Interjections are so unimportant to the fabric of the language that they are almost completely ignored in grammars. There's almost nothing to say. They have no syntactic properties at all -- you pop one in when the spirit moves you. And their basic meaning is simply expressive of a transitory mental state ("Ouch!" means something like "That hurt!"). Don't worry about English. It will do fine. Not even floods of email-originated phrases entering the lexicon would change it in any significant way. If phaln were to suggest such a thing seriously I would be LOL.

    From: http://itre.cis.upenn.edu/~myl/languagelog/archives/001829.html#more

    Also, for anyone interested, Pullman's crusade against Dan Brown is simply delightful. A good entry about it (Pullman posts about Dan Brown all the time):

    http://itre.cis.upenn.edu/~myl/languagelog/archives/001628.html

    1. Re:a proper linguist's response: by Anonymous Coward · · Score: 0

      Pullman?

      Pullum. :)

  53. It looks like no one read the article by JoeBuck · · Score: 2, Insightful

    It's troubling to read so many comments that worry that the linguistic researchers will find "bad language", and worse, that people have moderated such comments up. It reflects a misunderstanding of what linguists do: they want to get a description of the language as it is used, and as it changes, and historically speaking, usages that start in the gossip of teenage girls often become mainstream a couple of generations later. They need it all, and they probably need the crappy stuff most of all, because it is closer to spoken English.

    1. Re:It looks like no one read the article by mizhi · · Score: 1

      Looks like someone didn't read my entire comment.

      I specifically stated that I was retracting my initial comments, but I kept them in there as a tongue-in-cheek statement. I never said anything about "bad language." I'm well aware of the difference between how language is considered by linguists and English teachers.

      As to the relation of written language on the web to spoken language, I don't think that's been established. I know that in dialogue systems, which deal with spoken language as opposed to written language, the productions of people speaking a language are vastly different than those that they write. Therefore, I would err on that side of the argument.

      --
      Humorless sig goes here.
  54. Eschmiphany by Anonymous Coward · · Score: 0

    Backlash you say...

    But employers have always been the driving force behind stiff formality, which is the antithesis of useful productivity. They are the people that have so many of us trussed up in those idiotic suits and ties all day--and what good has that ever done anyone, outside of the textile industry?

    It's similar but even worse in Japan, where employers train all their new incoming young workers in an entire dialect. That's the "keigo" or polite speech forms, a swampland of gratuitous linguistic complexity that annoys native and foreigner alike. Only the moldiest old hyperconservative codgers keep it alive--because they have the money. Everyone in their right mind shuns keigo, until faced with the job hunt.

    Standardizing languages somewhat does make sense. But letting the [money-minded but otherwise mindless PHB] employers drive the process is hardly likely to yield an improved product for the rest of us to live with. The "back" in that "backlash" is more like the "back" in "backwards".

  55. Adverbs versus adjectives by Anonymous Coward · · Score: 0

    I can help you there, since I do understand the difference. But I'd really rather not...

    Methinks you aren't just annoyed by the really-versus-real dichotomy. There are loads of word-pairs like that in English, differentiated only by the fact that the adverbial form ends in -ly while the adjectival form does not.

    But when you start learning other languages, you find that not all of them have this Nazi attitude [am I comparing you to Hitler already? :-)] about rigidly differentiating their adverbs from their adjectives. I mean heck, all you are trying to do is modify the next word; who needs a whole separate form if that next word happens to be a verb versus a noun?

    Didn't some programming languages learn from this problem a while back? Let's hear it for operator overloading.

    Features like that -ly fascism add needless complexity to a language. For example, think of that craziness about associating gender with all nouns in the Romance languages. [Who isn't annoyed by that??] Features that maintain needless complexity are the first things to be discarded when language reform or evolution comes along--and we all say good riddance.

  56. Too much time? by Anonymous Coward · · Score: 0

    Umm, don't you think it is more likely that the "people with too much time on their hands" are the ones who have time to look up what is proper usage?

    "Fracking", you say. Yeah, I sometimes don't have time to look up the proper spelling of a word either.

    Cheers :-)

  57. Interesting; thanks by Anonymous Coward · · Score: 0

    Glanced over your paper, nice work; thanks for making your implementation available, I'll have to try this sometime.

  58. Re: Who should probably avoid Slashdot ??? by hel+shwarts · · Score: 1

    Very interesting conversation! Really don't want to criticize anybody's comment...Well say if xxv century person would read our messages now, what do you think he could say about the language we use?! He would say we speak like morons & have no respect to the proper form of word, hahaha. Now ask a modern british man what he thinks about the way americans speak, hahaha, ask american what he thinks about the way irish speak...Geographical, political/economical & time factor play the key role. Besides, the main tendency of any language is the tendency of simplifying itself in its development. Internet does the greatest job in this sense, cause it globalizes every new thing we get. It mirrors everything. So makes sense to use it for linguistic analysis. Also I think it's a little bit too late to be worried about preserving "the pure inglish language". Things got way too far. There's millions of people who learnED english & in english speaking society has much stronger position than many "native english speakers". They also speak chinese, japanese, hawaiian, russian, german, french, spanish italian... They grow up & "launch a Google" . And their children'll grow up bilingual as well.& No matter what you are saying about the pure great english...they'll keep chatting during the expensive american school classes & use all them : "U, cuz, gotta, thanx, ru, lol, etc..." & form the modern language. But this is not like writing, by a native speaker :" GESS HOW I SORE" (from one of the comments here, ahaha) Do you feel it?! They are the future of the world. & Many of them start their english with slashdot. Good for them. Good for slashdot. Sooner or later English will be strongly influenced by other languages, as well as will get simplified in use.

  59. Re: Who should probably avoid Slashdot ??? by mizhi · · Score: 1
    Besides, the main tendency of any language is the tendency of simplifying itself in its development.


    I don't think this has been established one way or another. Linguistic complexity would be an incredibly difficult term to objectively quantify.
    --
    Humorless sig goes here.
  60. Literal Minded by AWG · · Score: 1

    For those that are interested, another great linguistics/language (English, mostly) site is Literal Minded. He gets into backformation and all sorts of weird language phenomena. The author often uses google to justify (or disprove) his theories. And no, I'm not him... just an avid reader!

  61. Re: the tendency :) by hel+shwarts · · Score: 1

    In my linguistic school I was told many times that this tendency does take place. And that's why often we pronounce one way & write the other. The written form is more fixed. Oral speech changes constantly & mirrors all the factors which influence its development. Next the written form gets transformed, catching up after the oral. Say gorgeous. // splendidly or showily brilliant or magnificent// From gorge, gorget. Etymology: Middle English gorgayse, from Middle French gorgias elegant, from gorgias wimple. Now let's see... middle english gorgayse & modern gorge. What looks simpler ?

  62. Re: the tendency :) by mizhi · · Score: 1
    gorgeous. // splendidly or showily brilliant or magnificent// From gorge, gorget. Etymology: Middle English gorgayse, from Middle French gorgias elegant, from gorgias wimple.


    gorgeous
    Etymology: Middle English gorgayse, from Middle French gorgias elegant, from gorgias wimple, from gorge gorget
    : splendidly or showily brilliant or magnificent

    Shouldn't you be comparing gorgeous (modern) and gorgayse (middle English). It seems that, earliest to most modern it went

    gorget -> gorgias -> gorgayse -> gorgeous

    No?

    I remember from my linguistics classes that it's not really possible to say which language is more complicated than another.
    --
    Humorless sig goes here.
  63. Compression is a stricter test for AI than Turing by Baldrson · · Score: 1

    Text Compression as a Test for Artificial Intelligence, 1999 AAAI Proceedings. Matt Mahoney shows that text prediction or compression is a stricter test for AI than the Turing test. (1 page poster, compressed Postscript).

  64. Re: the tendency :) by hel+shwarts · · Score: 1

    ok gorgayse came from gorget & turned into gorgeous, which is being pronounced differently than spelled. (base -gorge ; same as fame-famous) So this base got simplified as the time went by & got the modernized ending - ous. P.S. : synthetic languages are for sure more complicated. I'd rather die than learn one.

  65. Re: the tendency :) by mizhi · · Score: 1

    Well, English has always had a problem with the spelling not being quite consistent with the actual pronunciation. I'm not sure this proves the point one way or the other though.

    I agree with you on the synthetic languages. On the other hand, it could be argued that the complexity in syntax is just moved into the morphological level.

    --
    Humorless sig goes here.