'Reading Level' Filter Added To Google Search
entotre writes "A new feature has been added to the advanced Google search: reading level. From the blog post: 'The feature lets you filter or annotate the search results by reading level. The reading levels include basic, intermediate and advanced. You can either have Google label or annotate the results with those labels, only show basic results, only show intermediate results or only show advanced results.' At the time of writing, Slashdot is 1 % advanced, 64 % intermediate and 34 % basic."
How am I supposed to choose the correct filter when I don't know what the word "intermediate" means?!
http://www.google.com/search?hl=en&lr=&safe=images&tbs=rl%3A1&q=site%3Asimple.wikipedia.org&aq=f&aqi=&aql=&oq=&gs_rfai= Basic 28% Intermediate 55% Advanced 16% I think someone didn't live up to his claims!
I think this service drastically overestimates the reading level of the average Google user, specifically with regard to the comprehension of words like "intermediate."
99% advanced. On the other hand, Wikipedia is quite evenly distributed.
Comment forecast: Bits of genius surrounded by a sea of mediocrity.
Farther proof that Google and tehir world tubes is help making us all geniusii!
I thought /. would be 0% advanced, 0% intermediate, 0% basic, and 100% kindergarten...
/me ducks
Palm trees and 8
Everyone sound smart!
Derrida began speaking and writing publicly at a time when the French intellectual scene was experiencing an increasing rift between what could broadly be called "phenomenological" and "structural" approaches to understanding individual and collective life. For those with a more phenomenological bent the goal was to understand experience by comprehending and describing its genesis, the process of its emergence from an origin or event. For the structuralists, this was a problematic and misleading avenue of interrogation, and the "depth" and originality of experience could in fact only be an effect of structures which are not themselves experiential. It is in this context that in 1959 Derrida asks the question: Must not structure have a genesis, and must not the origin, the point of genesis, be already structured, in order to be the genesis of something?
(source: http://en.wikipedia.org/wiki/Deconstructionism#Theory)
I live in constant fear of the Coming of the Red Spiders.
The Reading Level for site:simple.wikipedia.org is currently ranked 29% Basic, 52% Intermediate, 17% Advanced, implying that Slashdot is easier to read than the version of Wikipedia specifically tasked with being approachable to those with only basic English language comprehension. Google's filter fails here, though I suspect Wikipedia is failing to a small degree too.
Use my userscript to add story images to Slashdot. There's no going back.
I have a feeling most sites I frequent are going to fall into the "intermediate" category, though from an SEO perspective you typically want to keep your site content basic and easy to understand. Obviously a site dedicated to molecular physics would require pages that should probably be classified as "advanced" but not every page on the site would, so unless Google is planning on adding more site links to each domain they show in search results, I don't see how this will result in accurate listings or ultimately even add any benefit to search in general. But kudos for thinking outside the box and testing it on the masses.
Ave Molech Setting
It's not useless. It's a guideline.
The Kruger Dunning explains most post on
Finally, I can just set Google to "filter everything below a third grade level" and never have to see 'Yahoo! Answers' spam cluttering up my search results!
-aliterate
Slashdot editors can search the internet and actually understand the results! :p
28% advanced for middle school math, and 16% advanced for college math. So... math somehow gets less 'advanced' from middle school to college?
In a shock to nobody, Googling for 'Kanye West' clocks in with 94% basic and 1% advanced. Beat that, slashdot!
Think Liberals are the learned elite and Conservatives are intellectually bankrupt? Think again:
FoxNews.com:
Basic: 23%
Intermediate: 73%
Advanced: 2%
MSNBC.com:
Basic: 43%
Intermediate: 55%
Advanced: 1%
Win = conservatives.
My quest for advanced level porn brought me here: http://en.wikipedia.org/wiki/Progressive_outer_retinal_necrosis :(
That's great and all, but what would be *really* cool, is if Google provided some way to search for pages that contain a specific word or phrase. Yeah, that would be cool. Some kind of search engine where I type in words and the search engine returns only pages that contain those words. Can Google work on that next?
So I think you are in part correct that the simple site isn't living up to its name--it takes a lot of effort to dumb stuff down. However, when you look at the "advanced" pages you start to realize how certain material gets categorized that way: scientific words and pages with primarily people or place names.
The other problem is that it's doing it based on volume of pages. The simple site actually has relatively few pages in total, which makes the "advanced" pages count for a larger share.
Finally, just to be clear, it doesn't seem to be computing the percentage of content, but rather what percentage of pages (in total) fall into one or the other category.
So it's a "brightness control" that allows you to turn down the intelligence?
Honesty. Loyalty. Kindness. Laughter. Generosity. Magic!
Will it restrict the type of porn I find?
I'm not sure I'm into the advanced stuff, but I certainly do not want to get stuck in the basics. Missionary style for 10 years while married is enough for me.
Yeah my son is eight years old and reading long novels now, but when he was at pre-school age he would take DVDs he liked (say Ben-10) and type the titles one letter at a time into google to get the youtube related videos list. Then he would be set for hours. Most of it was above his reading level but all he needed to know was that B on the title matches B on the keyboard.
And once they get the hang of reading they fly past the "levels".
http://michaelsmith.id.au
Somebody didn't read at the right grade level or higher!
They already have that option, but it's labeled Images.
Good point. :)
Is that for results that all start with the same sound?
This space intentionally left blank.
Fox News
23% Basic
73% Intermediate
2% Advanced
aliterate/litrit/
Noun: An aliterate person.
Adjective: Unwilling to read, although able to do so
I believe he meant illiterate though which is unable to read rather than unwilling to.
I'm fed up reading about feet, inches and other body parts as measures, but temperature and derived units (like "mpg") are the most annoying.
Google! Do something! (to be read in a certain villain voice)
Please! Onegai shimasu!
Maybe soon Google can cater to the truly stupid and illiterate and just replace all known words with representative pictures like they do on McDonalds cash registers now.
After all, instead of learning to read at a better level you should totally cater to their level so they don't have to learn anything.
If the only way you can accept an assertion is by faith, then you are conceding that it can't be taken on its own merits
Define irony? Maybe not, maybe it's to help you avoid sites that are overly simplistic?
I do not play in the middle of the road
Democratic National Committee: 21% Basic, 77% Intermediate, less than 1% Advanced
Republican National Committee: 11, 87, less than 1 (DNC has .org site and RNC has .com? Weird)
Whitehouse: 6, 87, 5
Or Wikileaks: 1, 42, 56
Of course the epicenter of stupid, Sarah Palin's Facebook page, 64, 33, 1
A few Slashdot worthy ones:
Microsoft: 12, 77, 9
Apple: 48, 49, 2 (anyone surprised here?)
Linux: 4, 91, 3
a coming of hipsters who flaunt around their consistent use of the "advanced reading level only" setting when they search things.
If you look at the rankings of nutter pseudo-science sites and fringe political babble, they are strongly correlated with a high "reading level". I can't imagine that it is because of the content -- the content is insane -- but because people on these sites often use big-word babble when elaborating on their delusions. They may be using fluffy prose, but there is no "there" there.
Consequently, I would take the reading level with a grain of salt.
Not counting Google Translate, I think the "difficult" reputation of German writers comes either from bad translators or, more likely, good translators trying their hardest not to lose the nuances of the German language. I think the best translators are the translators that attempt to find equivalent concepts in the target and source languages. Is it okay to lose something in the translation in the effort to make the translation read right? If a translation is too opaque, then you lose any chance of the work being read by readers who can't understand the original language.
No, I actually meant aliterate; not illiterate OR alliterate... the "aliterate" option will return images along with concise Stephenie (--- yes it's spelled like that) Meyer-esque (sordid, teeny-bopper romance-cum-pornography) summaries.
See the second link under the "Advanced" filter: Apparently reading level is not based entirely on the quality, density or accessibility of ideas in prose, but in the element of situational humor as well.
Google is 33% basic, intermediate, and advanced... http://www.google.com/search?q=site:google.com&hl=en&num=10&lr=&ft=i&cr=&safe=images&tbs=rl:1
Considering how little of slashdot is indexed well (if at all), I'm not sure those numbers have any value whatsoever. Unless they are describing the actual code that runs slashdot, in which case the numbers are total bullshit because we all know that slashdot is primarily coded by drunken monkeys.
Damn_registrars has no butt-hole. Damn_registrars has no use for a butt-hole.
It is a common mis-perception that all problems can be solved if we just advance the cause of science by a significant degree in the correct direction, but alas some things can not be remedied by any technological advancement.
Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
Well, the last time I checked the total percentage should be 100, yet the summary only accounts for 99 and nobody seems to have picked up on it, so who knows? (Yes, I know there is missing data to the right of the decimal point to account for the deficit)
Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
Considering that the average reading level of an adult in Canada and the US is between grades 6-9? Dunno, that's the main reason why print media (read: all newspapers) use a simplified format.
Om, nomnomnom...
34% basic
Oddly discarded from the reported results were 2% COBOL and 4% Lisp. C results were discarded for using the "wrong" brace style (regardless of style used).
"There is more worth loving than we have strength to love." - Brian Jay Stanley
In what sense is it a "guideline"? Perfectly clear text can get a poor readability index, incomprehensible text can get good readability.
A reading index is just like a measuring tape. It can't tell you that you built a crappy house with crooked walls and a leaky roof; it can only tell you that something is 40 feet long by 30 feet wide.
A reading index is a tool that simplifies understanding, reducing a very complex thing to a simple number that's useful for comparisons. Just like you can use the measurements of the house to figure out that it's 1,200 square feet, you can compare that to a house that is 2,400 square feet. Neither measurement tells you the quality of the construction, the color, the flooring, the lot size, or the neighborhood. But if you're looking for a home for a family of six, knowing the floor space is one thing that can help weed out the useless candidates quickly. If you're looking for a book for first graders, you don't trot out a book with a reading index of 18.
And claiming it doesn't work on incomprehensible text is like complaining that a measuring tape can't tell you the color of a house. A reading index is not an interpreter of syntax, grammar, spelling, or any other attribute of text. It just measures one simple set of dimensions of text.
A reading scoring system can only give you an indication, not a guarantee, of what kind of audience should be able to comprehend a given piece of text; and it can give you an indication of relative difficulty. For example, the widely used Flesch-Kincaid Readability Index bases its score on the average number of words per sentence and the average number of syllables per word, and outputs a "grade level". The grade levels were probably modeled on the textbooks and lesson books of the era in which it was developed. Is it still relevant? Perhaps the actual grade levels are different these days, but it's still a widely accepted model because it's useful for what it does provide.
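For anyone who wants to poke at the numbers, here is a minimal sketch of the commonly published Flesch-Kincaid grade-level formula. The function name and the decision to take pre-computed counts are my own assumptions for illustration; counting syllables reliably is the hard part and is deliberately left out. Nothing here is claimed to be what Google's filter actually does.

```python
def flesch_kincaid_grade(total_words: int, total_sentences: int, total_syllables: int) -> float:
    """Return the Flesch-Kincaid grade level from raw counts.

    Uses the commonly published constants (0.39, 11.8, 15.59).
    The caller supplies the counts; this helper is hypothetical
    and does no text parsing of its own.
    """
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("need at least one word and one sentence")
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


if __name__ == "__main__":
    # Hypothetical counts: 100 words, 5 sentences, 150 syllables.
    print(round(flesch_kincaid_grade(100, 5, 150), 2))
```

Note that only two ratios go in, which is why the score says nothing about grammar, spelling or whether the text makes any sense.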
John
I tried a few sites of mine. "Downside.com", which has financial predictions (the dot-com crash, the mortgage meltdown, the oil spike, the auto industry bankruptcies), is rated mostly "intermediate", although the material there is heavy going unless you're up to speed on finance. "Animats.com", which has theory papers on some subjects in computer graphics and physics engines, is mostly rated "intermediate".
On the other hand, my fun site for steampunk stuff, "aetherltd.com", is mostly rated as "advanced", presumably because it's deliberately written in an archaic style.
I suspect it's just one of those sentence length and word length count algorithms.
that thou are
How dare you derail a perfectly good rant...
Paying taxes to buy civilization is like paying a hooker to buy love.
Ah? Are Americans actually all alliterate?
A Sarah Palin tag on this story? Seriously? I can understand not liking her but damn, that makes Slashdot just look childish.
Love sees no species.
This is way too intellectual and shows that Google doesn't really grok the Internet. What people really want is an "unsafe search" that returns only images that have been flagged as "unsuitable for minors".
No sig today...
Actually I'm quite pleased with this, because most ultra junk pages are basic so far.
Given our front page stories, this is Google implementing this, not Yahoo. So all you have to do is put about 4 sanity-check algorithms behind it to check coherence and that should nuke most of the cheap SEO attempts for "round 1".
I'm having fun searching on Advanced. I'm a cardinal member of the Teal Deer club. It's proving really funny for NSFW searches!
My first Journal Entry ever, in 8 years! http://slashdot.org/journal/365947/aphelion-scifi-fantasy-horror-poetry-webzine
"At the time of writing, Slashdot is 1 % advanced"
You're welcome.
Hey buddy, can i bum a karma? ~}CinderellaManson{~
Actually, alliteration almost always annoys any average American audience.
Ice Cream has no bones.
Slashdot is only 1% advanced and 1/3rd simple. Not that hard to understand for the average joe!!
Either I have too great an opinion of myself or I am grossly underestimating the average joe!!????
O this learning! What a thing it is - William Shakespeare
I feel like this could be pretty nifty if you're trying to learn a language by using the internet and you want to make sure that what you're looking at isn't going to go over your head for sure. However, while looking at google.de, it seems like the reading level isn't an option in advanced search.
A reading index is just like a measuring tape. It can't tell you that you built a crappy house with crooked walls and a leaky roof; it can only tell you that something is 40 feet long by 30 feet wide.
Not true!
If the measuring tape is wet, then the roof must be leaking!
If the measuring tape is swinging, then the house must have a draft!
If the measuring tape is white, then even snow is getting in!
If you can't see the measuring tape, then your electricity is out!
And if you have a candle, and you still can't see it, then it must be foggy!
I'm sure there is more than this that a measuring tape could tell you, if you would be creative!
I cannot wait until somebody writes a script to rank all Universities in the world.
I just did the top ten in my country and the results are not what we are led to believe according to the current ranking system.
I did harvard.edu and, honestly, kudos.
.
Query: Fuck
Advanced tier
papers.ssrn.com/sol3/papers.cfm?abstract_id=896790
en.wikipedia.org/wiki/Fuck_Off_(art_exhibition)
en.wikipedia.org/wiki/Genderfuck
This is a good way to get articles that you otherwise won't know about in your lifetime.
http://archeleus.com/blog
I just checked two sites I run. Seemingly the site I want to be basic is 60% basic 34% intermediate and 4% advanced. So in parts it's more advanced than slashdot so probably I have failed a little there! The other one is just my general stuff and it works out a quarter basic, half intermediate and a quarter advanced which I guess is probably about right. Thanks Google, I think that can be a great help to me even if I won't be dumbing down my searches. ~~~~
thou discernest my thoughts from afar
I recently read a philosophy/history book about Pythagoras that was written in an excessively simple style. All sentences were "subject verb complement" and shorter than one line, with no adverbs ever and hardly any adjectives. I don't know if the author did this to imitate some ancient style, but it was hell to read. It was like having a clock ticking behind my head at every sentence since they all repeated with the same regularity. Gave me headaches just like good old Proust !!!
Non-Linux Penguins ?
Slashdot is green. It is big. It has lots and lots of users.
Slashdot people talk a lot. They type words.
Slashdot is a good site. I like slashdot.
A lot.
She was like chocolate when she drank... semi-sweet at first and then increasingly bitter.
Google confirms it: nerdy in-jokes alienate most of the population
I mod down anyone who says "I will be modded down for this", regardless of the rest of their comment
Slashdot assumes anybody can compile anything.
You inarticulate clods.
After logging in slashdot still does not take you back to the page you were on. It's been that way for 20 years.
4chan is 3% advanced content compared to Slashdot's 1%. Stormfront is ahead of Slashdot also at 3%. At least we can comfort ourselves on the fact that we're ahead of Chimpout. Good job everyone.
In some portions of America this is true, but they are less PC about it, referring to it as "acting white" as if being white is the only measure of how smart someone is.
When you have a culture driven by hateful music, advancing disrespect for society and morality, how can you expect those who listen to it to get beyond it?
We can spend tens of thousands of dollars per child but if their community does not support their advancement it all goes to waste.
* Winners compare their achievements to their goals, losers compare theirs to that of others.
A reading index is just like a measuring tape. It can't tell you that you built a crappy house with crooked walls and a leaky roof; it can only tell you that something is 40 feet long by 30 feet wide.
No it isn't, and the difference is pretty basic. A measuring tape gives a direct measure of distance, a readability index gives an indirect measure of readability that is only as good as the model relating the measure to the thing you want to measure. And the model relating readability indexes to readability is really very poor indeed. They are widely used because of a near-religious obsession with supposed "objectivity", irrespective of whether what's being objectively measured is what actually matters. The result is that people write in such a way as to get the scores right instead of writing well.
The biggest problem is that few if any of the tests take sentence construction into account. A long sentence that is long because of a lot of coordinated clauses is usually easily readable. One that has a lot of subordinate clauses much less so, and even less if those subordinated clauses are embedded. Young children are particularly prone to producing long sentences that are perfectly readable at a low grade level. A child might well produce a sentence like "The man went to the bus stop and he got on the bus and he paid the driver and he went upstairs and sat down and he stayed on the bus until it came to the library and he went downstairs and he got off the bus and he went into the library and got the book he wanted then he got on another bus and went home and read the book." (Flesch-Kincaid grade index: 26.2.) Whatever is wrong with that sentence -- and there's a lot -- it's not that it's not readable by anybody without a postgraduate education.
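For what it's worth, the 26.2 figure can be checked by hand. Assuming the commonly published Flesch-Kincaid grade-level formula, and a hand count of 72 words, 84 syllables and 1 sentence for the example above (syllable counts are somewhat subjective, so treat those numbers as assumptions), the arithmetic works out roughly as:

```latex
\begin{align*}
\text{grade} &= 0.39\,\frac{\text{words}}{\text{sentences}}
              + 11.8\,\frac{\text{syllables}}{\text{words}} - 15.59 \\
             &= 0.39 \cdot \frac{72}{1} + 11.8 \cdot \frac{84}{72} - 15.59 \\
             &\approx 28.08 + 13.77 - 15.59 \approx 26.26
\end{align*}
```

Most of the score comes from the words-per-sentence term. Chopping the same words into a dozen short sentences would cut that term to about 2.3 without making the text any easier or harder to understand.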
Sure, measurement is a good thing if the measurement is right. Wrong measurements, though, push people into conforming with the measurement instead of doing the thing right. It's the sort of mentality that leads to buses not stopping to pick up passengers because the drivers are measured on adherence to timetables and hospitals abandoning patients that have waited more than a designated time because they've already lost their performance point for that patient. Indirect measures need a lot of care in their application, and very few people understand (or care) enough to take that care. And that makes them dangerous.
Quidnam Latine loqui modo coepi?
I put it to my own personal test, and it passed; for wolfram.com:
Basic - 1%
Intermediate - 18%
Advanced - 79%
80% of Advanced was the word "citation"
"We know what happens to people who stay in the middle of the road. They get run over." - Aneurin Bevan
This reading level filter actually works here. Last night when I noticed this story on slashdot, I decided to try it out the next time I used google.
A few minutes later I set the reading filter to 'advanced' and tried to find a technical specification article. Which surprisingly popped up in the top 3 results.
As a quick test, I turned off the filter and did the search again, all I got this time was links to various forums, a wikipedia entry, and an archived conversation on some mailing list.
I'd say it's great for hunting things down. It's just another 'what' in the 'search for what?' that search engines do.
84% advanced.. seems to be influenced by level of technical terms.
> Slashdot is 1 % advanced, 64 % intermediate and 34 % basic."
What's the missing 1%, then ? CowboyNeal ?
What a depressingly stupid machine.
The results include such a pile of broken/falsified/hardcoded data that it's not even funny.
For example:
4chan.org 39/56/3 (about same as Slashdot)
4chan.org/b/ 100/0/0
8chan.org 0/100/0
er...?
google.com 33/33/33
The creatures outside looked from Alt-Right to Antifa; but already it was impossible to say which was which.
It shows up under basic. What does that tell you?
...Some kind of search engine where I type in words and the search engine returns only pages that contain those words. Can Google work on that next?
I agree wholeheartedly. I'm sick and tired of getting hundreds of totally irrelevant search results because Google can't follow its own 'allintext:' directive, and it just plain pisses me off that there's no way of forcing the engine to perform an EXACT character match, i.e. one that matches punctuation and case. And don't get me started on Google's assumption that I MUST have wanted different search criteria than I entered, forcing me to click again to search for what I damned-well entered in the first place. Google really needs to get the basics right, instead of working hard to make results even less useful than they already are.
I used to say of Microsoft "They always just know what I want, and they're almost always wrong". Lately, I've started saying the same thing about Google.
'The Economy' is a giant Ponzi scheme whose most pitiable suckers are the youngest among us and the yet-unborn.
Anyone who's been in school or has kids in school, knows just how useless the reading level is. It's a useless measurement.
When I was at the beginning of the 2nd grade, I was sitting in the library reading a book I'd taken off the shelf. A teacher walked by and saw me with the book, gasped, and said "you can't read that!"
"Why not?" I asked.
"Well, you just can't."
"Well, WHY NOT?"
"Ok, smartypants, read it out loud," she said, so I did. She gasped again, ran and got another teacher, and had me read more from it. I put the book down and asked what the big deal was; they'd taught us to read the year before, didn't they?
"That book's at a sixth grade level!"
But "reading level" is a median; a 3rd grade reading level is the best the median 3rd grader can do, 12th grade level is the best a median high school graduate can do, and a postdoctoral level means you can understand damned near anything.
The most comfortable reading level for light fiction is 8th grade level. So you can see, there's some use to it, at least.
Free Martian Whores!
I won't argue with what you say the average reading level is, you may be right. But newspapers and novels are written at that level because at an 8th grade reading level, most people can read and understand the material quickly. Especially with a novel, you don't want your readers to see words, you want your readers to see the scene you're painting with those words.
Free Martian Whores!
So what you're saying is that because the thing to measure is way more complex than the simple model beneath it, the whole tool is useless because it can't tell you when it's right and when it's wrong. I'm saying that doesn't make the model useless, because it's representing only the probability that a given subject will have comprehension of a particular text.
I'd like to see it tested and proven or disproven. Draw a fuzzy circle representing actual measured comprehension of a set of texts by a set of students, and another circle representing the readability of those texts as measured by the FKRI. I expect the circles will both be fuzzy and large, but there will be a lot of overlap - enough to make statistically significant predictions. And the FKRI is simple and fast and cheap. It has its place even if it isn't always right.
John
It was dyinobal who said it was useless, not me. I questioned in what sense it is a guideline. A guideline is something one is supposed to follow. Looking at FKGI and considering the implications of it (such as checking passages with poor FKGI to see how readable they really are) is reasonable, setting targets and rejecting/filtering text is more questionable. But apparently even questioning a measure is enough to get modded "troll" by those who worship at the altar of pseudo-objectivity.
Quidnam Latine loqui modo coepi?
Well I certainly didn't mod you troll, and whoever did is pretty damn stupid. You're raising legitimate questions.
A guideline would be to interpret the output of FKRI as the grade level for which a given book would be appropriate. A guideline would say "an FKRI of 1-3 is appropriate for beginning readers, and an FKRI of 28 is appropriate for doctoral candidates."
But what I see you arguing is "look at these exceptions to the rule, therefore the model is wrong." You offer the example of sentences that violate the rules of grammar, and use them to say that a model isn't accurate. I'm saying that the model doesn't and can't take into account bad input. It was modeled after good input.
If I were to apply the FKRI to the output of a publishing house, I would get numbers that are pretty close to realistic, and are useful, at least most of the time. If I were to apply the FKRI to the output of a million monkeys at typewriters, I would get random, useless information.
And that might ultimately be what you're trying to say: Google's input is closer to that of a million monkeys at keyboards instead of the edited and published works of professional authors, therefore Google's number is never going to be right.
John
A guideline would be to interpret the output of FKRI as the grade level for which a given book would be appropriate. A guideline would say "an FKRI of 1-3 is appropriate for beginning readers, and an FKRI of 28 is appropriate for doctoral candidates."
That's what I understand by a guideline too, and I think that in practice it's a bad thing. If somebody says "This is intended for beginning readers, so if anything has an FKRI of higher than 3 we will refer it to a human checker to assess its readability" then it wouldn't be so bad. Unfortunately what happens in practice is that they say "This is intended for beginning readers, so anything with an FKRI of higher than 3 will be automatically rejected" (because (a) that's cheaper and (b) the FKRI is an objective measure, never mind of what, and we have to be objective, don't we?).
The beginning readers are thereby restricted to the blandest of possible material and their reading experience suffers as a result. And so on all the way up the reading scale. And excellent authors don't get their material published (without it being dumbed down), to everybody's loss.
Quidnam Latine loqui modo coepi?