Tracking The (English) Words We Use
Zugok writes "Wordcount.org has an interactive presentation of the 86,800 most frequently used English words. In addition they have Query Count, which is a dynamic database of the most queried words on Wword Count. Then there is the conspiracy corner, where certain words seem to end up in some sort of eerie order. Cowboy comes 14834 and Neal comes 18928. Bebop comes 70673."
fuck is number 5598
Actually, I expected this to be higher since I watched Goodfellas last night.
Cowboy comes 14834 and Neal comes 18928
Yet Google and Slashdot haven't even been indexed yet... hmm, tin foil anyone?
That has to be the coolest use of Flash I have ever seen that wasn't simply an animation. I guess I won't adblock it.
Fu*k = 5598 (unbelievably)
Cu*t = 18636
Bush = 2629
Microsoft = 4304
Hm...I would have thought things like "the", "and" or "or" would have beat out "dog" "pussy" "sex".
Hi there
At least love @384 ranks above hate @3107
I think the world isn't so bad...
All the worlds indeed a
Goatse is not found in the list!
I mean, I guess I should've known, but I didn't expect the font size to be so damned *large*!
(Not, of course that anyone would waste work time by reading
when the word "money" makes place 227 while "love" is at 384. Or maybe I am just turning into some sort of postmodern hippie. ;-)
is the last place word, as you can see, because I'm sure this Flash database will survive a /.ing.
This is so bad. No one should make their information to depend on non-free software. I will not install flash to see this.
Bloody hell, I wonder what other words are _not_ so frequently used then.
but Windows ranks at a disturbing 1169. ;-)
It would be nice if the list were available in plaintext form, instead of this slow and miserable Flash presentation.
This is a prime example of Flash being misused. It's not needed at all, and only serves to slow things down. It also makes it impossible to use the data for anything useful.
This space intentionally left blank.
"WordCount was designed with a minimalist aesthetic, to let the information speak for itself."
Which explains their logical use of Flash.
Do you have Flash 6?
Are you using the click to play extension? Click it quickly.
Having looked at the top ranking content in the query counter, I am willing to hazard that it has been subjected to a campaign of repetitive queries from a group of Tourette's syndrome sufferers. Or possibly just bored students. Or both.
1) que
2) centre
3) colour
4) dialogue
5) program
6) pyjamas
Why yes, I am american
"If you think you have things under control, you're not going fast enough." --Mario Andretti
I know there are already types of compression that take the most common letters of a document and then build a binary dictionary from them, to create the most efficient way of storing the data. Perhaps this database could be used as a static dictionary, so that compressing documents could be even better, though the db queries might slow it down.
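To make the static-dictionary idea concrete, here is a minimal sketch in Python. It is not Huffman coding and not anything WordCount offers: it simply replaces each known word with its rank, stored as a variable-length integer so that common words take fewer bytes. The ten-word RANKED_WORDS list is a hypothetical stand-in for the real 86,800-word list, and the escape scheme for unknown words is an assumption made for illustration. Whitespace is normalized to single spaces, so the round trip is only exact for text that is already single-spaced.

```python
# Hypothetical stand-in for a ranked frequency list (most common word first).
RANKED_WORDS = ["the", "of", "and", "to", "a", "in", "that", "it", "is", "was"]
RANK = {word: i for i, word in enumerate(RANKED_WORDS)}


def encode_varint(n):
    """Encode a non-negative integer using 7 bits per byte, low bits first."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data, pos):
    """Decode one varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def compress(text):
    out = bytearray()
    for word in text.split():
        if word in RANK:
            # Code 0 is reserved as the escape marker, so store rank + 1.
            out += encode_varint(RANK[word] + 1)
        else:
            # Unknown word: escape marker, byte length, then the raw bytes.
            raw = word.encode("utf-8")
            out += encode_varint(0)
            out += encode_varint(len(raw))
            out += raw
    return bytes(out)


def decompress(data):
    words = []
    pos = 0
    while pos < len(data):
        code, pos = decode_varint(data, pos)
        if code == 0:
            length, pos = decode_varint(data, pos)
            words.append(data[pos:pos + length].decode("utf-8"))
            pos += length
        else:
            words.append(RANKED_WORDS[code - 1])
    return " ".join(words)
```

With a full 86,800-entry list held in memory there would be no database query per word, and under this encoding any ranked word would fit in at most three bytes. Whether that beats a general-purpose compressor on real text is an open question this sketch does not answer.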
je suis parce que j'aime
... you know, the information visualization/presentation guy, might like their display.
"Grok is not currently in the archive"
I wonder what the counts are for "slash" and "dot" ever since this bloody site came online.
what the ranking is for the word /.?
Nice concept for a web site, but the gratuitous use of technology gets in the way.
So, how often do people look up the word, 'Wword'?
"Slashdot" and "effect" are located at?
we're driven:
"work" is #103
"play" is #443
and imaginative: "what if" (45&46)
77-81 seem to portend humanoids or clones "may these new also people"
"Win treats sysadmins better than users. Mac treats users better than sysadmins. Linux treats everyone like sysadmins."
...teh, noob, and haxor... I have trouble believing they aren't ranked heheh
"I'll waste 'em with my crossbow!" ~Bob Herzog, Power Gamer
1337: becomes. I 'becomes 1337'?
31337-31338: Redeeming brothels.
666: easy workers?
Founder of Mirror Moon - Tsukihime Game Trans
I really am sick of sites that require flash to get actual information. It should be part of the usability guidelines of the web that information be required to be in at least format.
Take these two sites for example. I work in the healthcare profession and we don't run our machines as administrators, and Flash isn't installed by default on Win2k. When you go to Ochsner's Health Plan website, you can't do anything unless we, as administrators, log in and install Flash for them from the ActiveX control, just to log in as a provider.
Also, Houston RoadRunner is the exact same.
I hate Flash, a lot, and it annoys me because you can't manipulate fonts, you can't use the scroll wheel most of the time, and all the control is taken AWAY from the user. I love Flash when used for hilarious web cartoons, but using it for content is ridiculous.
Chris
Perhaps sites like this will encourage the creation of word flashmobs. A group of people would conspire to overuse some obscure word to boost its rating. Bombing the word within blogs, web pages, and postings might help the word spread into wider use and rise in the rankings. It could even be a competitive sport -- two teams pick two words of adjacent rank and the team whose word rises the most wins.
Two wrongs don't make a right, but three lefts do.
this is a list of 86,800 words, so naturally this includes words used by people with larger vocabularies than yourself.. otherwise it would be a list of about 100 words, if that..
when the words "to slashdot", "slashdotting", and "slashbot" will be included in the English dictionaries...
How about we post a front-page story to Merriam-Webster's and Oxford's?
Words 29350-29352.
Support the First Amendment. Read at -1
It seems there are a little over 40,000 words (excluding proper nouns etc.) in use in the sample text (whole English-written web?). I'm making that estimate based on the completely unscientific observation that after that point, most of the words seem to be place names etc.
I know there are proper nouns before that point, but they're presumably balanced by the non-proper noun words after that point.
Struggling to find a day everyone can make? WhenShallWe.com
I half expected this wordcount thing to, well, count real English words. OMG ranks at 43712.
:/
P.S. WTF Did not rank
The archive bills itself as "...an interactive presentation of the 86,800 most frequently used English words."
Last I checked, "Linux" is not a word in the English language.
For the same reason, you're not going to find "Slashdot", "jSyncManager", or "iPod", regardless of how many times they're used online.
Yaz.
Computer is #705, immediately followed by security at #706....
"Yes, Alex, I'll take 'Curious Coincidences' for $200."
Rule #1 -- Politics always trumps technology.
As a test I looked up the word "revile" but noticed that it is flanked by some words that I've never seen in my life. Before it comes "herbes" and two after it comes "karkason". Neither of these appears in dictionary.com. Some of its other rather colourful neighbours include "travertine" and "lucchese". ;-)
Call me cynical but I'm pretty sure that they just made these up to make up the numbers
Motherfucker is #76086! Congrats to all who keep it alive!
I have to learn German. I need the 86,000 most-commonly used German words. This would give me a nice target of words to get to know in the process of learning it ...
; -- the corruption of government starts with its secrets. a truly free people keep no secrets. --
1941-1945:
faith establish facts requires membership
Tom Cruise hacked their website!
I'll do it for cheesy poofs.
$ GET 'http://www.wordcount.org/dbquery.php?toFind=slashdot&method=SEARCH%5FBY%5FNAME' | perl -pe 's/&/\n/g;s/=/\t/g' | less
..or has someone hacked the query count part?
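For anyone without the LWP tools, the same idea can be sketched in Python. The URL and query parameters are the ones from the one-liner above. The response format is an assumption inferred from the perl substitutions: an &-separated list of key=value pairs. The field names in the sample body are made up for illustration. The sketch builds the URL and reformats a response body without touching the network, since the server is not answering anyway.

```python
from urllib.parse import parse_qsl, urlencode

# Endpoint taken from the one-liner above; not verified to still exist.
BASE_URL = "http://www.wordcount.org/dbquery.php"


def build_query_url(word):
    """Build the lookup URL for one word (same parameters as the GET command)."""
    params = {"toFind": word, "method": "SEARCH_BY_NAME"}
    return BASE_URL + "?" + urlencode(params)


def format_response(body):
    """Turn 'k1=v1&k2=v2' into tab-separated 'key<TAB>value' lines."""
    pairs = parse_qsl(body, keep_blank_values=True)
    return "\n".join(key + "\t" + value for key, value in pairs)


if __name__ == "__main__":
    print(build_query_url("slashdot"))
    # Hypothetical response body; the real field names are unknown.
    print(format_response("word=slashdot&rank=0"))
```

Fetching the page for real would be a matter of passing the built URL to urllib.request.urlopen and feeding the decoded body to format_response.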
Book comes in at 357, Television comes in at 1022 and TV comes in at 1577.
Ah, now I know what's wrong with it... It's "Artistic" so it doesn't have to mean anything. I mean, nobody would find it useful if the number of occurrences of a word was given.
Here's the bit that would make you choke on your cornflakes... Tell me, what was the award trophy? A chocolate tea pot?
Where's the Kaboom?
There's supposed to be an Earth-shattering Kaboom.
God = 376
Satan = 12864
encoraging !
That has to be the coolest use of Flash
It is a cool idea and it has been implemented with Flash.
I'd like to see it implemented without Flash. What is cool would then be more accessible and available faster. That would be more compelling.
.. 'Microsoft' is at a disturbing 4304 which puts this word ahead of 'Fuck' at 5598!
This means that either:
1) People at large think more about Microsoft than copulating. (Unlikely)
2) They used a bunch of /. readers as a basis for working out their word collection.
Only to idiots, are orders laws.
-- Henning von Tresckow
From the web site: "WordCount data currently comes from the British National Corpus, a 100 million word collection of samples of written and spoken language from a wide range of sources, designed to represent an accurate cross-section of current English usage."
The BNC gives British English usage, not English usage. It does not reflect U.S. English usage. Britain has only 59 million people. The United States has 294 million people. Britain has a small percentage of the total number of English speakers.
This is just an ad agency attempting to advertise to prospective clients. The agency is trying to impress people who aren't very knowledgeable about advertising. A knowledgeable person would be very negative about the agency's misuse of facts.
--
24 wars since WW2: Creating fear so rich people can profit.
What happened to the days when only California teenagers/surfers used 'like' for every second word? I really noticed this when I went back to university recently. It's really, like, annoying to listen to, like, the kids today, like, use the word 'like', like, five times in one, like, sentence.
... the word porn is not in the archive. I mean, how the fuck did that happen? Is this the vice that dare not speak its name?
Tubal-Cain smokes the white owl.
Encoraging = 0
Reassuring.
#SickNotWeak
This shit is broken or it's so 1337 that only other geeks losers and nerds can use it.
The word is
More info on Just the words (Monty Python)
Apparently, the 'word' NaN is used a lot! :-)
NaNNaNNaNNaNNaN
Slashdotted?
Sig Nature
It was rejected.
"Word Count Tuesday August 03, @06:04AM Rejected "
992-995 america ensure oil opportunity
3046-3051 iraq winner, fucking smooth, nick votes
I don't speak english you insensitive clod.
In the immortal words of the Butthole Surfers:
"This here a song, is about John W Smoke Junior
It's about bein' in love and lovin' the love that's hatin' the love
the love and the love and the hate that's lovin with all
It's around the love that's hate that's the hate that's the love
And the love is the love that is the hate that's hatin' the love,
it's lovin' the hate
It's about John W Smoke's mom, it's with his mom
it's about his mom it's about his mom it's about lovin his mom
and bein' without his mom and lovin' the hate that's hatin' the love
and his mom and all the time they're there
Hatin' the hate that's lovin' the hate it's love it's the love that's hate
And it goes somethin' about like this"
(Score: -1, Stupid)
Just wanted to say it, that is all.
Langenscheidt, Oxford-Duden, and Berlitz (among others) publish small paperback deutsche Wörterbücher. The Langenscheidt version on Amazon.com is described as containing "55,000 references" while the Oxford-Duden has around 70,000 entries.
I was an exchange student in Germany *mumble* years ago and the Langenscheidt dictionary was extremely useful.
I want to drag this out as long as possible. Bring me my protractor.
The BNC gives British English usage, not English usage.
English:- " of or relating to or characteristic of England or its culture".
Use of terms such as British English is a tautology!!.
If you're going to differentiate between the English language and that variation spoken in the US [and in many places across the globe], feel free, but at least be honest about it and refer to your [linguistic] deviations as American "English".
Hi everybody!
Jon Harris here, the creator of WordCount. The server got slammed by SlashDot (thanks guys!) and my server is down. There is a mirror of WordCount up at: http://www.fabrica.it/wordcount
Enjoy!
Best,
Jon
looks like they got out-of-words lately
"
Visitors
We are sorry but this site is experiencing difficulties at this time.
Please return shortly!
Thank you for your patience.
Webmaster - please contact support as soon as possible.
"
"There is nothing more frightful than ignorance in action." Johann Wolfgang von Goethe
Nearly everyone uses conversational particles that serve the purpose of inserting pauses into the flow of words. For the most part, they exist to allow a moment to think about wording. With a few exceptions, they aren't words and have no meaning in themselves.
One of the problems with "like" is that its meaning serves to dilute the meaning of the words that follow it. Consider the difference between saying that your coworker is an expert in XML vs. saying that he is like an expert in XML. Most of the time, "like" is simply verbal whitespace. Unfortunately, it can be used as a weasel-word to avoid committing oneself to a definite statement. And the choice between those two options can be made after the fact.
How can they have something called Conspiracy Corner in a site called number27.org? 7-2=5, but still, it's ungainly :(
I didn't think the house band in Hell would play this badly.
Already been done
Visitors
We are sorry but this site is experiencing difficulties at this time.
Please return shortly!
Thank you for your patience.
Webmaster - please contact support as soon as possible.
Offtopic, Inflammatory, Inappropriate, Illegal, or Offensive comments might be moderated up.
on TV (according to George Carlin) are: shit, piss, fuck, cunt, cocksucker, motherfucker, and tits.
Insanity: doing the same thing over and over again and expecting different results. Albert Einstein
you might want to look at one of the phrase books, usually available in the pocket dictionary form factor. It sounds like you have learned some of the Grammatik already, so you will actually understand the sentences once you get the Vokabeln down. The phrase books I have seen/used have sections for travel, restaurants, etc. - everyday situations one might face while in-country. I believe the Berlitz phrase books (vice their dictionaries) are for people who don't speak the language at all, and may do more harm than good if the translations aren't precise.
I haven't studied German for quite a few years, and I find that the grammar still makes sense but I have forgotten a lot of the vocabulary. D'oh.
Viel Glueck and viel Spass!
We are sorry but this site is experiencing difficulties at this time.
Please return shortly!
Thank you for your patience. Webmaster - please contact support as soon as possible.
LMAO!
Any Mirror?
Why does yahoo do this
This list only has ENGLISH words, and as we all know, "grok" is Martian.
"If at first you don't succeed, lower your standards."
The BNC only goes up to 1990, as well. Linux wasn't a word then. Microsoft ranks 5293 on the list I've got, occurring 1704 times in 100 million words
To fight keyword stuffing, I believe keeping track of the word use distribution in an email would help us judge the spam potential.
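One way to picture this is a rough Python sketch. It is not a real spam filter: the COMMON_WORDS set is a tiny hypothetical stand-in for the top of a frequency list like WordCount's, and the 0.2 threshold is an arbitrary assumption. Ordinary prose is full of very common function words, so a message where almost none of the tokens are common words has an unusual distribution and may have been stuffed with keywords.

```python
import re

# Hypothetical stand-in for the most frequent entries of a ranked word list.
COMMON_WORDS = {
    "the", "of", "and", "to", "a", "in", "that", "it",
    "is", "was", "i", "for", "you", "on", "with",
}


def common_word_share(text):
    """Fraction of the tokens in text that are very common words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in COMMON_WORDS)
    return hits / len(tokens)


def looks_stuffed(text, threshold=0.2):
    """Flag text whose share of common words falls below the threshold."""
    return common_word_share(text) < threshold


if __name__ == "__main__":
    normal = "I sent the report to you and it is in the shared folder for the team"
    stuffed = "cheap pills mortgage casino viagra lottery winner refinance"
    print(looks_stuffed(normal), looks_stuffed(stuffed))
```

A real filter would presumably compare the whole distribution against a reference corpus rather than one ratio, and spammers could pad their mail with common words to get past this particular check.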
Information: "I want to be anthropomorphized"
Although I am American, I often catch myself using the British spellings. When I was in high school, we didn't have WordPerfect or MS Word on our computer; we had some silly bargain bin $5 "Easy working word processor", and its spell check preferred the British spellings. I blame it for my poor grades in English. My teacher would not accept "colour" or "centre", or "programme" or any other British spellings.
Well.. maybe. Or Maybe not. But Definitely not sort of.
..means 'the act of estimating as worthless.'
-To you and me, it means calling something shit.
(teehee. finally found a way to post that one)
Except 'fuck' is often found *before* 'Microsoft' on Slashdot.
500GB of disk, 5TB of transfer, $5.95/mo
Even so, I kinda agree with what you say, that the site is close to misrepresenting itself. But the greater dishonesty is surely that the bloody thing is just grandstanding with public data -- it's almost useless, presumably by design, for practical purposes. So, yes, I too would rather the authors had been clear about their American background.
Here's some stats ...
It looks to me as if the sums work like this:
375m (1st language)
375m (2nd language)
750m (learned English as a foreign language)
-----
1500m
http://www.britishcouncil.org/english/engfaqs.htm
Actually, there are at least nine priors.
I've found that my posts don't format quite right w/o a sig.
Also, it's not as if you are "correct" and the American "incorrect." Languages are fluid. Languages evolve, including English. Brits (I include Canadians here, having severed ties only quite recently) have really screwed up the proper German you were taught ~1500 years ago too. And the Norwegian you were taught ~1200 years ago. And the French you were taught 968 years ago. As such, would you consider the entire English language "incorrect?" Many words had various spellings in the 1600s when English was brought to America. As such, it's not accurate to claim that the American spelling is incorrect, when we simply chose one of the accepted spellings at the time and the Brits chose the other. It might be different if the English language had an established spelling for a certain word by 1500 and Americans changed, but this is not the case. For all the pedantic spelling and grammar correction, many Brits (and Canadians) seem to be ignorant of the history of their own language.
One might also suggest that you not engage in such displays of self-superiority - "When in Rome..." one might say. You seem to share the attitude of tourists in foreign lands who expect to have waiters (for example) speak their own native language and become irate when the waiter can't or won't. Admittedly, Americans are one of the major contributors to the image of the self-righteous tourist, and I find that disgusting too. Ultimately, one can adapt to one's host nation - even if it's simply over the phone - or one can maintain self-righteousness and deal with the inevitable inaccuracies. What does one gain from this exchange, anyway?
As for the Americans in Canada you cite, their mistakes are born of ignorance rather than self-righteousness. The difference born of ignorance is correctable. I would politely, without condescension, inform them that the letter they refer to as "zee" is called "zed" by the rest of the English speaking world. If they insist on maintaining their behavior, then your ire would be well-placed - if you didn't insist on doing the same, that is.
All in all, there's really no need for this "whose language is correct" debate. Language is a tool. If you can communicate effectively with the other party, you have no problem. Your problem is you intentionally choose not to, simply due to ego, which I find baffling.
-Looking for a job as a materials chemist or multivariat
"All your word are belong to us!"
-- Microsoft
Is NaN actually a word? It seems to be the most and least common
I wonder how often those two words are used, especially in that particular order.
But the popups still work... Hmmm....
These people looked deep into my soul and assigned me a number based on the order in which I joined.
Please don't present an argument about technical issues based on how you 'hate' a technology. We have to examine technologies and their implementation on their own merits, not based on emotion.
Drill baby drill - on Mars
Guybrush Threepwood doesn't even feature in the list, so I bet you have a lot of mod points left...
Oh look, I mentioned him.
I have been a user for about 10 years. This ends Feb 2014. The site's been ruined. I'm off. Dice, FU
Is anyone else getting this strange response for any inquiry? This happens when I click on the occurrence graph; when I type in a field, nothing happens. Is it just that my browser is out of date? Or is it slashdotted to errors now?
All I know is that if /. is not in there yet, it will be after [they] realize it was /. who F'd their servers.
Holy shit, and you thought the guy down the hall was crazy!
...Slashdotted.
"Destroy science and religion. Science would re-emerge exactly the same; but not religion." - Penn Jillette, paraphrased
Frank's frank.
I really am sick of sites that require flash to get actual information. It should be part of the usability guidelines of the web that information be required to be in at least format.
I'm really sick of people bitching about other people's websites.
"It should be required", hah!
...disables the back button in IE. I have a few words I could use to describe how I feel about that.
Why do they do that? Do they really think that I'm going to hang out on their site just because the back button won't work?
two words "obstinate" and "obscure" and got the result that they are not in the archive. What has happened to the language?
everybody calls it goatse...
Actually that's in fact not true. English is placed in the Germanic language group by linguists. Of course it has a strong French influence (as indicated), but it's primarily a Germanic language. Having studied German (and obviously speaking English), there are many similarities between the languages. And this was even more the case for (old?) English, particularly before the French invasion in 1066.
As for the ignorance; you're right that it's correctable, but it's ironic that we know enough to explain the differences to the ignorant individuals. I honestly attribute that to the permeation of US culture in the rest of the English-speaking world. We understand the differences between US culture and our own far more than Americans understand the differences between their culture and others'...
There are various reasons for this. First, 90% of Canada's population lies less than 100 miles from an American border; the same cannot be said for America. As such, it is understandable that the average Canadian comes into contact with Americans more than the other way around. Also, for right or wrong, America is large and rather powerful, with many large businesses. Canada is a major trade partner with the US, but I believe we're more a part of your trade than you are ours. Again, not a right or wrong argument, but the way it is. Since we're bigger, and dabble in world politics more, it's unavoidable that we're more on your radar.
The only things some Americans probably know about Canada are 1) beer, 2) hockey and 3) eh!
See, we do know everything about Canada. ;) You're probably right, though. Although, Americans who live near Canada know your legal drinking age is lower than ours. So that's four things.
And I pity the poor soul who hasn't had Canadian beer :)
I keep hearing about good Canadian beer, but have never had it (All we get is Molson - I'm not impressed). As a major brew-o-phile, I'm certainly interested in better Canadian beers if they exist. I've never had good experience with Canadian beers, but I expect the Canadian beers I have available are like the American beers you have available: i.e., mass-produced shit. For what it's worth, America's starting (in the last 5-10 years) to make some GREAT beer. One of my favorites is Stone Brewing Co. I realize American beer gets a bad rap - deservedly so - but it's changing dramatically, and I'd say most of the innovation in beers today is in America - ironically enough, because we have little beer tradition to maintain.
Bender's Top Ten most frequently uttered words (and their comparison to Wordcount):
10. Chump (#60954)
9. Chumpette (not found)
8. Yours (#2376)
7. Up (#56)
6. Pimpmobile (not found)
5. Bite (#5922)
4. My (#69)
3. Shiny (#8590)
2. Daffodil (#27591)
1. Ass (#15036)
I looked up the most serious and well known obscenity in English, the 4-letter one beginning with "f", and it was not present, but that might be a choice they made. However, the most-defined word in the English language, run is not present. Two major and significant words and neither are found. So that means the system is garbage as far as I'm concerned.
The lessons of history teach us - if they teach us anything - that nobody learns the lessons that history teaches us.
Some years ago, I used to be involved in the Camarilla roleplaying organisation (I got better). Their Mind's Eye Theatre system used disciplines based on what were reasonably obscure words - such as obfuscate, protean, auspex, diablerie, etc. We used to refer to the system as teaching adverbs to Americans.
Sara
Designer, Gamer, Macgrrl in an XP World
And then the U.S. Congress went and tried to replace "tits" with "asshole" in its most recent list of seven (HR 3687).
I'm really sick of people bitching about other people's websites.
Then vote out the legislators who created your jurisdiction's counterpart to section 508 of the Rehabilitation Act, which mandates web accessibility for businesses that do business with the United States Government.
Flash, when implemented properly
What I've seen of Flash, other than satirical or otherwise comical animation pieces (such as what is seen on web sites such as newgrounds, killfrog, jibjab, joecartoon, etc), is seldom implemented properly.
It is a lot more efficient than re-loading a whole page of HTML
But is it accessible to users with disabilities?
a quick google search reveals: http://www.fabrica.it/wordcount/main.php
I don't know why you were modded down for that, it's a perfectly reasonable comment which I happen to agree with.
It's already been done.
DNA is a Turing machine. You, however, being dynamic and emergent, are not.
It is a foreign language. However, the issue I take is that you Americans insist that you are speaking English - you are not, you are speaking American. When I'm talking to a non-English speaker, if they try speaking English and they make a mistake I might correct them (assuming it's a situation where it's polite to do so). Similarly, if you, as an American, insist that you are speaking English I will correct you when you make a mistake. However, tell me that you are speaking American and I'll treat it with the respect due any foreign language.
There are plenty of examples where two languages are very close but have different names e.g. Norwegian and Danish. I think that this is a good thing since it prevents the feeling that foreigners are coming in and telling you how to speak in your own country, which is something that is clearly ticking you off.
31337 = redeeming
86800 (least used word) = conquistador
As I mentioned, I suggest learning the history of your own language before engaging in any more of your pedantic rants. American English is no less correct than British English. Alternatively, if ours is incorrect because it doesn't resemble yours, then both are incorrect because neither really resembles late Middle/early Modern English. Read Shakespeare sometime, and realize that was the state of the language when America was founded (the first colony in 1607 was contemporaneous with Shakespeare's work). Both American English and British English have changed substantially since then, particularly with spellings. Again, had English been a non-fluid language before that time, you would have a case. As it is, you don't - I know of very few differences between American and British English that can't be found in common, accepted use in English as of 1600.
English is a language that has, in computer science terms, forked. That doesn't mean either branch is "right." Insisting otherwise really makes one come off as arrogant and condescending.
I understand that, as the British "Empire" has collapsed, you have very few opportunities to condescend toward America these days. However, if the differences in your spellings are all you have to cling to, I might abandon the idea entirely.
Or put it this way - I'll claim to be speaking "American" when you claim to speak "British." Until then, you might educate yourself on linguistics and learn the difference between a dialect and a language.