Searching for The New York Times
r.jimenezz writes "Adam L. Penenberg, an assistant professor at New York University, has written an interesting piece over at Wired about the contrast between the New York Times' relevance in the real world and the dismal rankings it gets in modern search engines' results. Penenberg discusses some very interesting ideas about opening up the Times digital archive and the impact this would have on its cyber presence."
Of course, like many things about the business operations of a traditional publisher that has ventured online, the reasons are simple but the solutions complicated. The New York Times requires that its users register, which makes it difficult for search engines to spider its content.
As a rule I do not read any newspaper online that I have to register for. In fact, I refuse to purchase the Star Tribune or Pioneer Press here in Minnesota because of their policy requiring user registration. Fake accounts be damned: if you want me to read your paper and look through your ads, you will let me do so without a cookie linked to information, fake or otherwise.
An even more impenetrable barrier is the Times' paid archive. Because it stows material more than a week old behind an archive wall, you have to cough up $3 per article. Since few are willing to pay for content they can get free elsewhere, search engines, which often base results on relevancy (read: popularity), will continue to dis the Times -- as well as other media sites that make you register or pay for old news (The Washington Post, The Wall Street Journal).
This is a horrible problem that I have run into recently while trying to do simple research on the web. I was looking for articles pertaining to a friend who currently resides in Perrysburg, OH. I did a simple search on the Toledo Blade's website, only to find a link to a third-party archive company that required me to pay a fee to access more than a short blurb about the story. Unwilling to drive the 665 miles to Toledo from where I currently live just to read a hardcopy, I gave up on my search for these articles. But while doing research about NEPA, I found that The Scranton Times has a much better free searchable archive than The Times Leader, which requires you to pay to visit its archive. Wonder who gets my visits?
I really think that these policies could lead to the downfall of traditional news outlets. I have absolutely no desire to pay money for information that should be easily available. Hell, if you are going to charge I can't see a $3 fee! A couple hundred words are worth $3 in storage? No way. Perhaps if I asked them to mail me the copy of the article then $3 would be reasonable.
"There isn't a compelling business argument today that would suggest that giving away our content is a good idea," Nisenholtz said. Even though the Lexis-Nexis deal is an all-you-can-eat model -- not based on usage -- the Times can ill afford to undermine its relationship with such an important customer. It simply can't charge Lexis-Nexis tens of millions of dollars while giving away the same content free over the Web.
The argument that makes sense is that people aren't going to be willing to pay you $3 for a computer copy of an article that is only a couple hundred words. Make the fee something reasonable, or watch as you waste a lot of money paying the third-party archive to host your data while no one retrieves it. Perhaps a rival newspaper would open its database up and people would start going there instead. We can always hope.
Relevance is a highly subjective term. If you're a typical outspoken, liberal New Yorker, then it's your Bible. If you live in a cabin in Montana, you probably don't give a shit. Calling something 'relevant' tells you as much or more about the person doing the calling as it does about the item being discussed.
Personally, I think it's a rag. It's old, it's big, it's supposedly a "standard", but no more relevant than my local paper. And probably LESS relevant than the sum total of what's available online - BBC, London Times, Die Zeit, Drudge, CNN.com, english.aljazeera.net, etc. etc.
I want to delete my account but Slashdot doesn't allow it.
Who needs the NYT! Let the New York POST open up its vast archives! Imagine searching through decades of mindless celebrity gossip and subtle right-wing propaganda!
I think you're painting with too broad a brush, but I don't think that the New York Times has been the 'paper of record' since Watergate.
The entire idea of there *being* such a thing seems a little outdated to me.
The article assumes that the fault lies with the NYT and whether its archives are open. Perhaps the real fault lies with Google. Shouldn't there be something in Google that identifies certain sites as more reliable than others, rather than basing rank solely on links? How many people link to online news articles? You're more likely to link to your friend's beer-and-computer-mods page than to a NYT article about Ashcroft's boot fetish.
While the NYT may fare dismally in search rankings, I suspect its online influence is still strong. Many of the top hits on a given subject may not link to nyt.com, but I'll wager that a number of them are blogs that reference Times material. Just a thought.
I have no problem with registering. If all I have to do is register an email address (heck, even a free Hotmail address that I reserve only for spam) and my name, and maybe even my address, and I can get top-quality news reporting without having to pay for the newspaper, then by all means I'm for it.
The reason why the NY Times is one of the best papers in the world is because they can afford to pay their employees what they deserve. If my registration helps up the amount of money they can get from their advertisers, then I'm all for it. People deserve to be paid for their hard work.
That said, I do believe they need to have better results on Google, and I don't agree with paying $3 for archive articles that I can get at my local library for free.
Think of the children, people.
I think this touches upon a much larger problem.
Traditionally, libraries were the ultimate source of information. They were organised and well indexed - to help one find what they are looking for.
The internet has become an "instant library" to a lot of us. In ways, the internet is better than a library. Searching is trivial and the amount of information staggering. However, a lot of information is getting lost. I'm aware that there are Archiving sites, but often, these sites cannot index or record the information that sites present from their own MySQL/Oracle databases.
Search engines are really only good for searching a static site, and don't scale particularly well to sites whose content changes frequently.
It all boils down to this: HTML+Search Engine is not a good combination for giving people access to information over a long period of time. Web sites come and go (depending on the interest of their maintainers) and when they go, they're gone for good.
We need to start distributing the content on a global scale - the same way books distribute content among many people.
B) Not indexed by search engines
C) Not electronically archived
Yeah, looks like they're really relevant in the 21st century. (And this is a good indication that land-grab IP attitudes have no long term positive benefit in an information society.)
This shouldn't be a surprise. Look at the headlines they give in 50 point type, and then when it turns out to be wrong it doesn't even make front page news.
Yellow cake in Niger, for example: they hail him as nearly a god when he says there was no such thing, and that turns out to be wrong... (see here, here, here, here, here and here).
They've finally run a story about it, but wouldn't it have been a lot better for them to have investigated those Wilson allegations themselves, when they first happened?
That's only one of the latest...
Tons of websites require you to register, not to mention discussion boards of every flava.
/. has the AC option... I wish more websites would offer a similar thing to people, and a few more benefits to registered users, and a few more benefits to paying customers.
I have to admit I have registration-fatigue.
At least
More people would be happy this way.
In an interesting coincidence, just an hour or so ago, I was looking for an article I read online in the NYT. Specifically, I was looking for an interesting image which was in the article. (Not for any specific use, I just wanted to show a friend.)
Besides the fact that the article is in the archive now (yet less than a month old!) and costs money, the page also informs you that:
Please Note: Archive articles do not include photos, charts or graphics. Our photos are available for purchase, please click here for more information.
Clicking the link reveals that you can order a photographic print for $95, and that's if they have it.
I don't even want a photographic print! A 200x200 pixel bitmap would be fine! (and hardly damaging to their photo sales)
As the article points out, why would anyone casually link to a NYT story? There is simply no point in linking to something most can't access without paying.
They certainly deserve that Google ranking.
I'm not so sure the NY Times is outlandish in its pricing for archived articles. Articles from the past are a niche offering, and thus come with niche prices. If you really need an article from 1964, most likely a few bucks won't be too much trouble. The idea that you'll pay a price directly reflective of the cost of goods is ludicrous. If it weren't, we'd be paying 4 cents for a Coke, 2 dollars for a movie, and 5 bucks a month for internet service. Take a trip down to the library and spend a few hours finding the article on microfiche, if you can, or pay a few dollars and get it immediately at home.
Dude, 99.99% of Drudge's big "scoops" are just a sentence leaked from the NY Times newsroom about some big story they're going to publish the next day. Drudge is good at collecting information, but don't kid yourself: his investigative skills are nil.
As a rule I do not read any newspaper online that I have to register for. In fact, I refuse to purchase the Star Tribune or Pioneer Press here in Minnesota because of their policy requiring user registration. Fake accounts be damned: if you want me to read your paper and look through your ads, you will let me do so without a cookie linked to information, fake or otherwise.
So they are supposed to provide world-class journalism and post it on a world-class website and you can't be bothered to host a cookie and look at some ads (which can be easily blocked anyway) in return?
What a massive sense of entitlement you have. Either that or a severe cookie-phobia...
From a newspaper's perspective, open archives aren't always a possibility. I work for a newspaper in a moderately sized (~100,000 people) midwestern city. We currently have about 135 years of paper archives dating back to the late 1800s. While we do have a decent internet presence, we don't have the resources to provide this content online for free.
A recent estimate of mine showed that we would need about $20,000 to get that project started in a very barebones manner. That isn't a small amount of money to throw at a project that you want to give away for free. On the other hand, there is another newspaper in town that charges $90/year for access to its sports archives, and at last estimate it had close to 1,000 subscribers. For a medium-sized paper, that amount of money is hard to pass up.
Now, for a company like the New York Times, that is a different story. They certainly have the resources to get their content online. They have other reasons, though, to keep their content available on a pay basis. They maintain strict controls over all their copyrighted material. It's hard to blame them for this, though, since that content is their lifeblood.
In my opinion, they keep their content under too tight a lock. It's like having a great idea but never letting anyone hear about it because you are afraid they might steal it. Papers must decide between keeping their copyrighted material secure and providing it to readers in a new medium. That is the delicate balance that traditional print publications now face while moving into the digital era.
A pint of high-quality water can be obtained from many municipal water systems for a fraction of a penny.
Yet people are happy to pay $2 for a bottle of the same water.
Things are worth whatever you are willing to pay.
The Times attracts 9 million unique visitors a month, while only about 1 million read the daily paper.
I find the extensive dead-tree version convenient and end up reading more from it than the on-line version that's free.
But, not having a lot of time during the week, I end up buying the print version maybe every 3 days, and quickly scanning the on-line headlines on the off-print days.
The Times really ought to open up its archive and let everyone, including Lexis-Nexis, have free access.
Many years ago at a university library they had an entire special catalog devoted to indexing old NY Times articles that one could read from microfiche. Without the individual paying, either.
There is still a fundamental chasm between archived high-quality material (especially true for scientific journals) and what is freely available and searchable on the web.
Think about how useful it would be for the general public to have access to old, high-quality archives like the NY Times and scientific periodicals; the pursuit of science and other research would be considerably advanced over where it is today. Then there is the reality: copyright protections, and the copyright owners' hope for a few dollars more by charging for access (that only the very wealthy or institutions can afford), still persist.
It's almost enough that I think the government ought to exercise eminent domain (link to counterpoint about possible abuse of eminent domain), just as it does for land when a freeway needs to go through Aunt Tilly's backyard: provide some reasonable compensation to the current copyright owners, appropriate sufficiently old works, and make them available publicly.
"Provided by the management for your protection."
"The Gray Lady is a beautiful clipper ship, but it's losing steam..."
--media consultant Vin Crosbie, from TFA
The Lexis-Nexis agreement is the key bit. NYT Digital profited $25M and they have a $20M agreement with Lexis-Nexis that they wouldn't have if the archive were available free. The archive therefore clearly won't be free as long as Lexis-Nexis "owns" it.
I don't know what else is in Lexis-Nexis, but I imagine they have similar agreements with their other main sources of info. But it seems like they're the ones who are more threatened by Google, since they are so clearly in direct competition. When their first customers start making their content too free on the web, there's going to be a momentum that leads to the decline of Lexis-Nexis's current model--at which point NYT Digital will figure out some other way to make money.
Newspapers rarely make enough in issue sales to pay the cost of printing the issue. They make the money in advertising, plain and simple.
To have a paper like the New York Times, who can command advertising rates as high as any paper in the world, bitching and moaning about their web presence and hoarding their articles like some stupid info-miser shows nothing more than a complete lack of understanding somewhere in the company. There is no excuse for it.
If any website could sell enough ads to keep itself profitable, it would be the website for the New York Times. They could add to their revenue and readership in one fell swoop. But no.
It's dumbass media outlets like this that had better wake up and get with the program. Doing it the way you've always done it will do YOU in the end, and it won't be pretty.
Your local library. Unless you're really in the middle of nowhere and your library has no budget at all, go to the library. Heck, you might not even have to go to the library; many libraries now do chat reference or ask-a-librarian, and all libraries have a phone.
There's more, MUCH more, to doing research than using Google. Paid databases have it all over Google for finding current and historical news information.
If you can't find something local, try the Library of Congress; they do online chat reference.
The Drudge Report proved that the internet is better and more reliable than the New York Times?
An anonymous female intern has informed me that you are almost certainly mistaken.
Libraries are generally wonderful, amazing places: well organised, friendly and incredibly expert staff who do their best to get what you need for little or no cost.
But there is a cost - and people forget about it, because it's in our taxes. (Whether or not we should pay for public libraries out of our taxes, and whether the money is well spent, is another argument.) But the bottom line is that we've had 100 years or so of great services because there has been a general philosophical acceptance that it's a Good Thing for everybody to throw in a few cents for a building in every town, full of good books, staffed by experts, and with an infrastructure to enable gaps in individual library stocks to be covered at a national *and* international level by an interlibrary loan service. Most developed countries now have a superbly developed system for getting paper-based information to their citizens for little cost.
My question is: would we accept paying taxes to do the same via the internet?
I think it's mainly a philosophical, rather than technical, question. If we all agreed to pay additional 'library taxes', then there's no reason why existing sources couldn't be made available to all citizens (e.g. your National Insurance number is your password, now you can get the NYT online for free, and the NYT gets paid by the treasury for its national-to-all-citizens licence each year). And in the same way that many library indexing systems were evolved by librarians working under public funding, why not use public funding to develop internet archiving / retrieval systems of comparable value? It all depends on how you see these technical solutions being funded.