Slashdot Mirror


Mining Neologisms from Wikipedia

holy_calamity writes "Natural Language Processing researchers have developed a tool called Zeitgeist that can discover the meaning of new words for itself using Wikipedia. It looks for entries for words not in the WordNet database and works out their meaning by looking for known words linked to them. Development of the tool is focusing on using it to understand what bloggers (using slang and neologisms) are saying about companies' products."
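
The approach described in the summary can be illustrated with a toy sketch. This is not the researchers' actual code: `KNOWN_WORDS` is a hypothetical stand-in for the WordNet lexicon, `ARTICLES` is made-up sample data standing in for Wikipedia's link graph, and all function names are invented for illustration. It only shows the general idea of flagging titles the lexicon lacks and collecting the known words their articles link to.

```python
# Toy sketch only: KNOWN_WORDS stands in for WordNet and ARTICLES for
# Wikipedia's link graph; both are hypothetical sample data.
KNOWN_WORDS = {"web", "log", "diary", "traffic", "server"}

ARTICLES = {
    "blog": ["web", "log", "diary", "blogosphere"],
    "blogosphere": ["blog", "web"],
    "server": ["traffic", "web"],
}


def find_unknown_titles(articles, known_words):
    """Return article titles that the lexicon does not contain."""
    return sorted(t for t in articles if t not in known_words)


def known_neighbours(title, articles, known_words):
    """Return the linked titles that the lexicon already knows."""
    return [w for w in articles.get(title, []) if w in known_words]


def describe_unknown_words(articles, known_words):
    """Map each unknown title to the known words its article links to."""
    return {
        title: known_neighbours(title, articles, known_words)
        for title in find_unknown_titles(articles, known_words)
    }


if __name__ == "__main__":
    for word, hints in describe_unknown_words(ARTICLES, KNOWN_WORDS).items():
        print(word, "->", hints)
```

A real system would presumably need to weigh the linked words rather than just list them, since a bare list of neighbours says nothing about how the new word relates to each one.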

20 of 93 comments

  1. I'd love to see what it came up with... by mdhoover · · Score: 5, Funny

    if they pointed it at slashdot...
    "ass-hat" and "tard" could take on a whole new meaning

    1. Re:I'd love to see what it came up with... by Harmonious+Botch · · Score: 4, Funny

      You're assuming that it would be spelled correctly.

    2. Re:I'd love to see what it came up with... by truthsearch · · Score: 2, Funny

      You're right. Now that a comment with those words can be attributed to your username, mdhoover will forever be synonymous with ass-hat and tard.

      Damn! Now so will truthsearch! Son of a...

  2. slashdotting (n., neolog.) by ettlz · · Score: 2, Informative

    The Slashdot, Digg, or Fark effect is the term given to the phenomenon of a popular website linking to a smaller site, causing the smaller site to slow down or even temporarily close due to the increased traffic. The name comes from the huge influx of web traffic that often results from sites being mentioned on Slashdot, Digg, or Fark.com, popular user submitted news and information sites. Typically, less robust sites are unable to cope with the huge increase in traffic and become unavailable - either their bandwidth is consumed or their servers fail to cope with the high number of requests.

    1. Re:slashdotting (n., neolog.) by Alex+P+Keaton+in+da · · Score: 2, Funny

      Slashdotting is slashdotting. It is irritating that they are trying to rename it "the Digg or Fark effect." If Digg or Fark causes a site to get hammered, it should still be called slashdotting. Why? Because we (this community) are the originals, and still, in my opinion, the best.
      By the way, I have an odd problem with the word neology. Why? Because in my 7th grade Latin class, one of our assignments was to be a neologist, using Latin roots to make up a new word. So the word neology makes me think of 7th grade. And 7th grade (well, much like now) was a time when I was especially awkward...

      --
      And All I Ask is a Tall Ship And a Star to Steer Her By
  3. Just imagine... by packetmon · · Score: 3, Funny

    Imagine the chaos and reboots as the program analyzes a George W. Bush speech

  4. Marketing research on the net by perkr · · Score: 4, Insightful

    Figuring out what people on the net say about your products is the "new" thing, apparently. IBM has its own engine for the task too. Kind of makes you wonder how much power the net community will in fact have in day-to-day decision making in corporate headquarters' marketing strategy departments.

  5. say hello to dictionary bombing by brunascle · · Score: 4, Funny

    George W. Bush
    n.
    1. 43rd president of the United States.
    2. miserable failure.

  6. But Wikipedia seeks to avoid Neologisms! by sbaker · · Score: 4, Informative

    The trouble is that Wikipedia has a policy of not writing about (or using) Neologisms:

        http://en.wikipedia.org/wiki/WP:Neologism

    Many articles about neologisms *do* get created in violation of this policy - but they are generally put up for deletion via the Wikipedia process for deleting inappropriate material - so they only exist briefly.

    So, for example, the article entitled "Windows Rot" is being debated today. Although it looks like this one will be merged into an existing article, it won't survive as the name of an article - so Zeitgeist presumably won't be able to find it.

    It may be that enough of these kinds of articles slip through the system to be useful to Zeitgeist but that is not by design - so coverage will be patchy at best.

    A further consequence of this is that the articles that Zeitgeist does find will most likely be so new that only one person will have worked on them - which will make for poor quality.

    Also, it is very common for people such as bloggers who come up with what they consider to be clever new words to try to wedge them into common usage by writing about the word in Wikipedia. This 'vanity word' problem is one of the main reasons that Wikipedia seeks to avoid articles on neologisms.

    --
    www.sjbaker.org
  7. For slang, it is useless without a context by aadvancedGIR · · Score: 2, Informative

    For example, in French slang, the same person could use the word "batard" as either an insult or a display of respect, and neither of these meanings is related to the target's father.

    I wish them good luck...

    1. Re:For slang, it is useless without a context by RubberBaron · · Score: 2, Funny

      Yeah, you gotta admit, it's a wicked idea...

  8. omg it reads L33t? by bombastinator · · Score: 2, Funny

    31g 3r0+her iz wa+ch1ng U!

  9. What if it went into a loop by clickclickdrone · · Score: 5, Funny

    and started creating its own gazornaplatting words that no-one but the program itself could middlybundy? It could eat up bibblys of disk space as all the new words chimmdudlied in a grawn.

    --
    I want a list of atrocities done in your name - Recoil
    1. Re:What if it went into a loop by sbaker · · Score: 2, Funny

      started creating its own gazornaplatting words

      Gazomplat. Wow! I remember that word from the mid 1970's. Bear with me a moment...

      When I was learning to program in FORTRAN in my high school math class, our teacher (who didn't know how to program either) was trying to teach us by the age-old process of reading the book one chapter ahead of the class she was teaching. As a consequence, she was no better at it than the rest of us and we ended up debugging her code about as often as she helped with debugging ours.

      Anyway, she was trying to write a program to sort words into alphabetical order - and something went horribly wrong and the program spat out a series of nonsense words made up by chopping up and reordering the words it was given. Most of them were unpronounceable garbage but a couple sounded like real words.

      Gazomplat was one of them. It's such a nice sounding word that its usage spread through the math class and beyond - since it had no meaning, it could be stuck into conversation at any convenient point. So its use as a verb, "Gazomplatting", is entirely appropriate.

      --
      www.sjbaker.org
  10. chance by Jon+Luckey · · Score: 2, Funny

    Sounds like an excellent chance to inject some new perfectly cromulent words into wide use.

    --
    -- 3 events that reshaped the world in the 20th century: WW1, WW2, and WWW
    1. Re:chance by shotgunsaint · · Score: 2, Funny

      You've embiggened the Wikipedia with your cromulent entry.

      --
      The future isn't here until I can type "car keys" into Google and have it say "You left them in your pants last night."
  11. Step One is Complete by Hoplite3 · · Score: 4, Funny

    Time for step two: deliver a mild electric shock to neologism users. Then I won't have to hear "blogosphere" ever again.

    --
    Use the Firehose to mod down Second Life stories!
  12. Santorum! by mr_stinky_britches · · Score: 4, Funny

    One of my personal favorites is the word Santorum.

    --
    Censorship is obscene. Patriotism is bigotry. Faith is a vice. Slashdot 2.0 sucks.
  13. Hello? by MarkusQ · · Score: 4, Interesting

    Development of the tool is focusing on using it to understand what bloggers (using slang and neologisms) are saying about companies' products.

    You do not need a fancy program to do this. I can do it for you, without even reading the blogs in question.

    Watch.

    They are saying your products suck, and that your customer support is worthless.

    See how easy that was? Now, you might be wondering how I know this. Simple. They don't use made up words to say good things about you. I'm not sure why (maybe they aren't worried about being sued for saying good things?), but the pattern is very consistent. If somebody goes to the trouble of writing about you in their blog using made up words, they don't like you or the horse you rode in on.

    Likewise, if you are a journalist, they call you funny names (Steno Sue, Laura Dildo, Kneepads Miller, "Dollar a Word" Armstrong, etc.) because they've noticed that you consistently write to favour a certain party, position, politician, company, or lifestyle, even when this requires ignoring a pile of facts the size of Paraguay, any one of which would shred your position.

    And if you're a politician, it means that someone noticed that what you say in speeches is so unconnected to what you do with the office you hold that the only link between them is the way in which they combine to mollify your nominal constituents while maximizing the benefit to your corporate sponsors.

    If you are an industry association, they are saying they hate you, period, and that you are evil incarnate.

    See how easy this is? If you still don't get it, I am willing to come out of retirement as a consultant to explain it to you, provided the price is right.

    --MarkusQ

    1. Re:Hello? by Anonymous Coward · · Score: 3, Funny

      Mod parent doubleplusspiffy.