Searching for The New York Times
r.jimenezz writes "Adam L. Penenberg, an assistant professor at New York University, has written an interesting piece over at Wired about the contrast between the New York Times' relevance in the real world and the dismal rankings it gets in modern search engines' results. Penenberg discusses some very interesting ideas about opening up the Times digital archive and the impact this would have on its cyber presence."
Of course, like many things about the business operations of a traditional publisher that has ventured online, the reasons are simple but the solutions complicated. The New York Times requires that its users register, which makes it difficult for search engines to spider its content.
As a rule I do not read any newspaper online that I have to register for. In fact, I refuse to purchase the Star Tribune or Pioneer Press here in Minnesota because of their policy requiring user registration. Fake accounts be damned: if you want me to read your paper and look through your ads, you will let me do so without a cookie linked to information, fake or otherwise.
An even more impenetrable barrier is the Times' paid archive. Because it stows material more than a week old behind an archive wall, you have to cough up $3 per article. Since few are willing to pay for content they can get free elsewhere, search engines, which often base results on relevancy (read: popularity), will continue to dis the Times -- as well as other media sites that make you register or pay for old news (The Washington Post, The Wall Street Journal).
This is a horrible problem that I have run into in recent times trying to do simple research on the web. I was trying to look for articles pertaining to a friend who currently resides in Perrysburg, OH. I did a simple search on the Toledo Blade's website only to find a link to a third-party archive company that required me to pay a fee to access more than a short blurb about the story. Unwilling to drive the 665 miles to Toledo from where I currently live just to read a hardcopy, I gave up on my search for these articles due to this barrier. But while doing research about NEPA I find that The Scranton Times has a much better free searchable archive of information than does The Times Leader, which requires you to pay to visit their archive. Wonder who gets my visits?
I really think that these policies could lead to the downfall of traditional news outlets. I have absolutely no desire to pay money for information that should be easily available. Hell, if you are going to charge I can't see a $3 fee! A couple hundred words are worth $3 in storage? No way. Perhaps if I asked them to mail me the copy of the article then $3 would be reasonable.
"There isn't a compelling business argument today that would suggest that giving away our content is a good idea," Nisenholtz said. Even though the Lexis-Nexis deal is an all-you-can-eat model -- not based on usage -- the Times can ill afford to undermine its relationship with such an important customer. It simply can't charge Lexis-Nexis tens of millions of dollars while giving away the same content free over the Web.
The argument that makes sense is that people aren't going to be willing to pay you $3 for a computer copy of an article that is only a couple hundred words. Make the fee something reasonable or watch as you begin to waste a lot of money paying the third-party archive to host your data while no one retrieves it. Perhaps a rival newspaper would open their database up and people would start going to them instead. We can always hope.
What a bunch of bastards. Great paper though.
Buy the President
While the NYT may fare dismally in search rankings, I suspect their online influence is still strong. Many of the top hits on a given subject may not link to nyt.com, but I'll wager that a number of them are blogs that reference Times material. Just a thought.
harmonious design
I think this touches upon a much larger problem.
Traditionally, libraries were the ultimate source of information. They were organised and well indexed - to help one find what they are looking for.
The internet has become an "instant library" to a lot of us. In ways, the internet is better than a library. Searching is trivial and the amount of information staggering. However, a lot of information is getting lost. I'm aware that there are Archiving sites, but often, these sites cannot index or record the information that sites present from their own MySQL/Oracle databases.
Search engines are really only good for searching a static site, and don't scale particularly well to sites whose content changes frequently.
It all boils down to this: HTML+Search Engine is not a good combination for giving people access to information over a long period of time. Web sites come and go (depending on the interest of their maintainers) and when they go, they're gone for good.
We need to start distributing the content on a global scale - the same way books distribute content among many people.
Tons of websites require you to register, not to mention discussion boards of every flava.
/. has the AC option... I wish more websites would offer a similar thing to people, and a few more benefits to registered users, and a few more benefits to paying customers.
I have to admit I have registration-fatigue.
At least
More people would be happy this way.
An inside source told me that many of the Morris Papers actually discourage spiders and search engine referrals since it does very little for their local based advertisers.
"a specific name in that article" site:nytimes.com
in Google News and it returned that specific article. But then I pressed "Web" search for the same phrase, and it didn't return that article but a couple of older articles with the same name (I guess those were from the time before Google News started).
In an interesting coincidence, just an hour or so ago, I was looking for an article I read online in the NYT. Specifically, I was looking for an interesting image which was in the article. (Not for any specific use, I just wanted to show a friend.)
Besides the fact that the article is in the archive now (yet less than a month old!) and costs money, the page also informs you that:
Please Note: Archive articles do not include photos, charts or graphics. Our photos are available for purchase, please click here for more information.
Clicking the link reveals that you can order a photographic print for $95, and that's if they have it.
I don't even want a photographic print! A 200x200 pixel bitmap would be fine! (and hardly damaging to their photo sales)
As the article points out, why would anyone casually link to a NYT story? There is simply no point in linking to something most can't access without paying.
They certainly deserve that Google ranking.
From a newspaper's perspective, open archives aren't always a possibility. I work for a newspaper in a moderately sized (~100,000 people) midwestern city. We currently have about 135 years of paper archives dating back to the late 1800s. While we do have a decent internet presence, we don't have the resources to provide this content online for free.
A recent estimate of mine showed that we would need about $20,000 to get that project started in a very barebones manner. That isn't a small amount of money to throw at a project that you want to give away for free. On the other hand, there is another newspaper in town that charges $90/year for access to their sports archives, and at last estimate they had close to 1000 subscribers. For a medium-sized paper that amount of money is hard to pass up.
Now for a company like the New York Times, that is a different story. They certainly have the resources to get their content online. They have other reasons, though, to keep their content available on a pay basis. They maintain strict controls over all their copyrighted material. It's hard to blame them for this, though, since that content is their lifeblood.
In my opinion, they keep their content under too tight a lock. It's like having a great idea but never letting anyone hear about it because you are afraid they might steal it. Papers must decide between keeping their copyrighted material secure and providing it to readers in a new medium. That is the delicate balance that traditional print publications now face while moving into the digital era.
On the other hand, there is another newspaper in town that charges $90/year for access to their sports archives, and at last estimate they had close to 1000 subscribers.
That's also $7.50 a month for unlimited usage of their sports archive. That's not $3 for a single article.
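For anyone who wants to check the numbers being thrown around here, a quick sketch of the arithmetic. The $90/year and $3/article figures come from the comments above; the subscriber count is only "close to 1000" in the original, so the round 1000 below is an assumption for illustration, as are all the variable names.

```python
# Figures quoted in the thread.
ANNUAL_SUBSCRIPTION = 90.00   # dollars per year for the sports archive
PER_ARTICLE_FEE = 3.00        # dollars per archived article at the pay-per-view sites
SUBSCRIBERS = 1000            # assumption: "close to 1000" rounded to 1000

# Cost of the subscription spread over twelve months.
monthly_cost = ANNUAL_SUBSCRIPTION / 12

# Number of $3 articles you could buy per year for the price of the subscription.
break_even_articles = ANNUAL_SUBSCRIPTION / PER_ARTICLE_FEE

# Rough yearly revenue for the paper under the rounded subscriber count.
annual_revenue = ANNUAL_SUBSCRIPTION * SUBSCRIBERS

print(f"Monthly cost of subscription: ${monthly_cost:.2f}")
print(f"Articles per year at $3 each for the same money: {break_even_articles:.0f}")
print(f"Approximate annual subscription revenue: ${annual_revenue:,.0f}")
```

So under these assumptions, anyone reading more than about 30 archived articles a year comes out ahead on the flat subscription, which is the point of the comparison.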
So they are supposed to provide world-class journalism and post it on a world-class website and you can't be bothered to host a cookie and look at some ads (which can be easily blocked anyway) in return?
I have no problems looking through the ads. My point was that because I have to look through ads I shouldn't be required to have a trackable piece of information linked to me.
The idea that "Because they've done this, I should pay that" is simply self-serving. In capitalism, sometimes you pay a lot more for something than it cost the seller to procure. If you're not cool with that, you go somewhere else. Their business model is valid, and at this point I think it's safe to say that a lot of people consider it a valuable service. I happen to agree it is not priced correctly, and thus I don't buy articles there.
- The Times should customize its content so that readers could pick and choose which stories they want based on their own particular interests, rather than having to wade through the site's table of contents.
What is being suggested here is personalized news such as Findory News. Take advantage of the online media format. Customize each page to each reader's interests. Make it easier for online readers to find interesting news.
Funny... they call Fox News Special Report "most centrist of all media outlets in our sample" (p.3).
Maybe you'd like to have a look at the other side as well:
'Examining the "Liberal Media" Claim' at fair.org.
So they are supposed to provide world-class journalism and post it on a world-class website and you can't be bothered to host a cookie and look at some ads (which can be easily blocked anyway) in return?
They are not "supposed" to do anything, they can do whatever they like. And so can I; I can choose to look at whatever web content I like. If a website isn't to my liking, because it requires registration or pay-to-view, I'll go elsewhere.
To turn around your complaint:
If someone wants to control me while I use the net, they can pay me an hourly rate to do so. Otherwise, I'll do what I want.
Two principles for web publishers:
To have a paper like the New York Times, who can command advertising rates as high as any paper in the world, bitching and moaning about their web presence and hoarding their articles like some stupid info-miser shows nothing more than a complete lack of understanding somewhere in the company. There is no excuse for it.
Uh, I don't know if you realized this, but newspapers ALSO make a lot -- a LOT -- of money on their archives. In fact, in some areas the only reason the local paper survives is as an archival entity, selling their content digitally and on microfilm/fiche to universities and to services like Lexis-Nexis.
There is a big fear in the newspaper industry that opening their archives online will destroy this revenue stream without introducing a comparable new revenue. It is a very realistic fear...I used to work for an online newspaper company, and it was quite common to have customers putting up less than half of their print content after seeing massive drop offs in print sales. Many clients would ask us to clear their archives, so you could only search a month back.
I mean, the Times is a respected paper. Their articles are linked to all over the net despite the required registration, and they can expect every self respecting university to buy the year's microfilm roll. Offering the content for free could ONLY hurt them, so they'd be stupid to do so.
Libraries are generally wonderful, amazing places: well organised, friendly and incredibly expert staff who do their best to get what you need for little or no cost.
But there is a cost - and people forget about it, because it's in our taxes. (Whether or not we should pay for public libraries out of our taxes, and whether the money is well spent, is another argument.) But the bottom line is that we've had 100 years or so of great services because there has been a general philosophical acceptance that it's a Good Thing for everybody to throw in a few cents for a building in every town, full of good books, staffed by experts, and with an infrastructure to enable gaps in individual library stocks to be covered at a national *and* international level by an interlibrary loan service. Most developed countries now have a superbly developed system for getting paper-based information to their citizens for little cost.
My question is: would we accept paying taxes to do the same via the internet?
I think it's mainly a philosophical, rather than technical question. If we all agreed to pay additional 'library taxes' then there's no reason why existing sources couldn't be made available to all citizens (e.g. your National Insurance number is your password, now you can get the NYT online for free, NYT gets paid by the treasury for its national-to-all-citizens licence each year) and also in the same way that many library indexing systems were evolved by librarians working under public funding, why not use public funding to develop internet archiving / retrieval systems of comparable value? I think it's a philosophical issue, it depends on how you see these technical solutions being funded.
Scott
And probably LESS relevant than the sum total of what's available online - BBC, London Times, Die Zeit, Drudge, CNN.com, english.aljazeera.net, etc. etc.
Wow. You lump Drudge in with those other names? Please don't give him that much credit, considering 95% of his content is from those other names you list, plus the New York Times, Washington Post, and wire services.
If you're a typical outspoken, liberal New Yorker, then it's your Bible.
ROFL! Go to any liberal blog, like DailyKos, and see how much bitching goes on about the NYT. Simply put, liberals are not happy with the NYT. They're a typical corporate media source like CNN, Fox, etc. Every once in a while they'll wake up with some real story or insight, but for the most part they're just doing their part to maintain the status quo and sell papers/ads. Maybe if Bush had a sex scandal they would start doing their job....
You're close, but Drudge's scoops more often come from the RNC, not the NYT. It doesn't take a genius to notice that Rush and Drudge are talking about the same thing every day. He got lucky with the Clinton scoop, and the RNC has been using him ever since. Whether or not he knows it, I can't say. He seems to view himself as a legitimate media source, but that just may be part of the act.
So would you be willing to drive 66 miles? 6 miles? 0.6 miles?
I gave up on my search for these articles due to this barrier.
And was this barrier ($3, you later say) more than fuel and parking, not to mention time spent driving to a nearby library? Heck, it's less than return subway fare in NYC. By your reasoning, unless you can walk to the nearest public library and find it, it's not worth having.
But while doing research about NEPA I find that The Scranton Times has a much better free searchable archive of information than does The Times Leader, which requires you to pay to visit their archive. Wonder who gets my visits?
Well, in your case, the answer seems obvious, but I'd pay for quality and reputation when I have to. I don't subscribe to the New York Times online (I don't think they're worth it) but I do subscribe to the Independent, and if the Guardian charged for archived material, I'd pay them too (I do pay for their crossword, in fact). And rest assured I'm not alone.