Unless your friends are all clones of you, friendships probably aren't the best predictor of your interests. Your friends are different from you. That's what makes them interesting.
What might work better is reaching out to the entire community -- beyond just your friends -- finding the people like you, and having them recommend interesting articles.
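Here's a rough sketch of what "finding the people like you" could look like in code. It is only an illustration of neighbor-based recommendation, not a description of how any particular site works. The reading histories, the Jaccard overlap measure, and the `k` and `n` parameters are all assumptions made up for the example.

```python
from collections import Counter


def jaccard(a, b):
    """Overlap between two sets of read articles, from 0.0 to 1.0."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(reader, history, k=2, n=3):
    """Suggest up to n unread articles from the k readers most like `reader`."""
    mine = history[reader]
    # Similarity of every other reader to this one (hypothetical measure).
    sims = {other: jaccard(mine, read)
            for other, read in history.items() if other != reader}
    neighbors = sorted(sims, key=sims.get, reverse=True)[:k]
    # Score each unread article by the summed similarity of its readers.
    scores = Counter()
    for other in neighbors:
        for article in history[other] - mine:
            scores[article] += sims[other]
    return [article for article, score in scores.most_common(n) if score > 0]


# Made-up reading histories: the friend shares none of your interests,
# while two strangers in the wider community overlap with you heavily.
history = {
    "you":    {"linux-games", "rss-caching", "google-fs"},
    "friend": {"celebrity-news", "sports-scores"},
    "alice":  {"linux-games", "rss-caching", "pagerank"},
    "bob":    {"google-fs", "rss-caching", "msn-search"},
}
print(recommend("you", history))
```

In this toy data the recommendations come from alice and bob, not from the friend, because similarity is measured by what people read, not by who they know.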
Are personalized news sites more shallow or more narrow? Compare a personalized news site to CNN. The unpersonalized front page of CNN provides only a shallow view targeting some mishmash of the general interests of millions of readers. By trying to satisfy everyone, it satisfies no one, a bland blend of interests that results in mediocrity.
Personalized news provides an opportunity to broaden readers' interests, exposing them to news sources, perspectives, and viewpoints they otherwise would never have seen. A personalized news aggregator provides both breadth and focus, sorting through huge numbers of sources and articles and helping you find what you need.
Personalized news helps you discover news you would otherwise miss. It makes it easier to get the information you need to be well-informed about the events that impact your life. If this is the future, it is a future which should excite us.
- Not everyone will agree with this, but games are what will make Linux succeed as the #1 desktop in the world. When you can buy the same games for Linux that you can for Windows and anyone can install them, there will be a massive push behind Linux as an operating system.
Along these lines, I'd love to see a live CD version of Linux (e.g. Knoppix) that contains a collection of the better, easier-to-use, free Linux games. Even better would be several emulators and collections of older games from console or older PC systems (although getting rights could be an issue here) with some kind of trivially easy-to-use interface. Or even just a simple live CD with nothing but Quake or another popular first-person shooter that comes up clean and easy on all systems would be appealing.
I think a free CD that you can drop into your PC and start playing games would attract a lot of new users and give them their first introduction to Linux.
Excellent point. Software engineering is the design stage. The industrial stage is burning the CDs for distribution.
Another analogy might be writing a novel. The actual writing is a creative task that isn't amenable to mass production. But banging out 100k copies of the book once the writing is done is an industrial process.
Could you elaborate on what you see as the parallels between mass production and software engineering? I don't see them.
Seems to me that anything that is well-understood and capable of being banged out many times repeatedly is instead coded up into a library. Reusing a library requires no coding at all.
It's not clear to me what software projects are well-understood and well-specified. Anything that is well-understood, well-specified, and has been done many times before is turned into a library, reusable by all. Everything else is a custom job that requires creative thinking, innovation, and problem solving.
From the article: "One stumbling block is how difficult it is to quantify the product's value versus its price. (Right now, the technology is priced at $45 per square foot.)"
Cutting benefits can be dangerous. Typically, with any salary or benefit cut, your best employees (who have the best job prospects) leave at a disproportionate rate. It almost always has a negative impact on morale and productivity.
Moreover, benefits often are valued by employees at a level beyond the pure monetary value. One of the more interesting books I've read on employee compensation, Strategic Human Resources, makes this point:
- Benefits and perks can also be particularly powerful symbols of gift exchange, moving the employment relationship from one with purely economic connotations to something more along the lines of a kin or friendship relationship. Salary, wages, and even bonus payments all have the connotation of an economic exchange in which each party should attempt to extract the best possible (narrowly selfish) deal. Some forms of benefits and perks are of an entirely different flavor and can cause the worker to respond with reciprocal gifts or by internalizing the welfare of the organization.
The psychological leverage associated with providing benefits is likely to depend on whether the employer is a pioneer in providing this perquisite or instead simply seen to be matching the competition.
Marketing folks will argue that this kind of registration data is valuable for advertisers and to better understand their readers, but I doubt they understand the costs of these kinds of hurdles. Throwing up registration requirements will reduce traffic -- some people just won't bother with it -- and lost traffic means lost advertising dollars.
What I would recommend is voluntary registration and voluntary user surveys to gather the same data on a sample of your audience. For advertising, target the ads to the content of the page, like Google AdSense. If you want to get tricky, start tracking individual behavior -- articles read and advertising viewed -- to personalize the ads to each reader. With these techniques, you'll have the data you need to understand your readers and be able to have effective, targeted advertising programs.
I've posted a deeper discussion of mandatory registration on my weblog with links off to some other interesting discussions at Wired, Poynter, BoingBoing, and elsewhere.
Exactly. Google's rise to success over AltaVista and others was based on quality over quantity. Precision and recall, traditional measures of quality in information retrieval, don't really matter. What matters is making sure the first few results or even just the first result (the "I'm Feeling Lucky" button on Google) are as relevant as possible. It's all about the relevance rank.
It'll be interesting to see if someone can do the same thing for news. Here's our attempt. Findory learns your interests, searches through thousands of news sources, and helps surface interesting news you'd otherwise miss. It's all about relevance.
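To make the point about the first few results concrete, here's a small sketch comparing whole-list precision and recall with precision over only the top of the ranking. The two "engines" and their result lists are invented for illustration. They return exactly the same documents in a different order.

```python
def precision(ranked, relevant):
    """Fraction of all returned results that are relevant."""
    if not ranked:
        return 0.0
    return sum(1 for doc in ranked if doc in relevant) / len(ranked)


def recall(ranked, relevant):
    """Fraction of all relevant documents that were returned."""
    if not relevant:
        return 0.0
    return sum(1 for doc in ranked if doc in relevant) / len(relevant)


def precision_at_k(ranked, relevant, k):
    """Precision computed over only the first k results."""
    return precision(ranked[:k], relevant)


relevant = {"a", "b", "c"}
engine_1 = ["a", "b", "c", "x", "y", "z"]  # relevant results ranked first
engine_2 = ["x", "y", "z", "a", "b", "c"]  # same results, relevant ones buried

for name, ranked in (("engine_1", engine_1), ("engine_2", engine_2)):
    print(name,
          "precision:", precision(ranked, relevant),
          "recall:", recall(ranked, relevant),
          "precision@3:", precision_at_k(ranked, relevant, 3))
```

Both lists score the same on precision and recall, yet only one of them puts anything useful where a searcher actually looks. That difference is what the relevance rank captures.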
Might check out Findory News. Findory doesn't require a login; you're anonymous when you use the site. The design is simple and clean, all text. And the personalization quickly and effectively learns your interests. Try it out and see what you think.
- Who needs 4800 news sources? Too much information = too easy to lose the salient stuff.
If you don't read those 4800 sources, how do you know what you're missing?
Of course, no one can read 4800 news sources. That's why sites like Findory News exist. Findory learns from the news you read, searches thousands of sources, and helps you discover news that you otherwise would have missed.
After all, isn't that what computers are for? Sorting through the piles of too much information and making it easy for you to find the salient stuff?
Microsoft does excel at imitation. MSN Newsbot appears to be a combination of Google News and personalization technology like Amazon.com or Findory News.
MSN Newsbot does look a lot like Google News, but it does have something unusual, personalized news. The site watches which articles you read and attempts to find other interesting articles. Microsoft certainly isn't the first here, but they are the biggest.
If done right, personalized news can work very well. Of course, trust is a big issue. If you don't trust Microsoft, give Findory News or Memigo a try.
There are a variety of ways to deal with this issue. The solution many seem to be suggesting is to randomize request times so that there aren't big spikes in traffic every hour on the hour. That's certainly a good idea. Clients should also respect the ttl (polling at the interval that is listed in the feed), support conditional GET, and handle 304 (not modified) responses to minimize the number of requests they make for the full feed.
But the primary solution will end up being caching. With the exception of personalized RSS feeds, RSS feeds easily can be cached. Web-based RSS readers like Bloglines and My Yahoo already only read the RSS feed once, cache it, and display it to multiple readers. But popular RSS feeds are also easily proxy cached just like web pages, reducing the load on the original source servers.
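For the client side of this, here's a minimal sketch of a polite feed poller using only the Python standard library. The function names, the 25% jitter, and the 30 second timeout are assumptions picked for the example, not anything a particular reader does. It treats the ttl as minutes, as RSS 2.0 defines it.

```python
import random
import urllib.error
import urllib.request


def next_poll_delay(ttl_minutes, jitter=0.25, rng=random):
    """Seconds to wait before the next poll: the feed's ttl plus random jitter.

    The jitter only adds time, so the client never polls more often than
    the ttl asks, and clients stop lining up on the same minute.
    """
    base = ttl_minutes * 60
    return base + rng.uniform(0, jitter * base)


def conditional_headers(etag=None, last_modified=None):
    """Build conditional GET headers from the previous response's validators."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def fetch_feed(url, etag=None, last_modified=None):
    """Return (body, etag, last_modified); body is None if the feed is unchanged."""
    request = urllib.request.Request(
        url, headers=conditional_headers(etag, last_modified))
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return (response.read(),
                    response.headers.get("ETag"),
                    response.headers.get("Last-Modified"))
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # Not modified: keep the cached copy and the old validators.
            return None, etag, last_modified
        raise
```

A reader would store the returned etag and last_modified values alongside its cached copy, pass them back on the next call, and sleep for `next_poll_delay(ttl)` seconds in between.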
- I think it's a classic example of building your business around your strength - the searching capability.
Google's other strength is their massive cluster which, combined with their file system, allows them to store and retrieve tremendous amounts of data. Storing digital images fits nicely into this core competency.
A good review of Blinkx with some discussion. I've also got a post about Blinkx that includes links to discussions of a Linux version of something similar (Dashboard) and Microsoft's attempt (Implicit Query).
The Wired article also talks about the differences between readers online and offline. Readers online only spend 1.5 minutes/day reading the paper compared to 28.2 minutes/day offline. The author goes on to argue that the online site should be more targeted than the offline site:
- The Times should customize its content so that readers could pick and choose which stories they want based on their own particular interests, rather than having to wade through the site's table of contents.
What is being suggested here is personalized news such as Findory News. Take advantage of the online media format. Customize each page to each reader's interests. Make it easier for online readers to find interesting news.
You make a great point. Because of the data size (1T/month) and the description of his needs ("generating, storing and analyzing a large amount of data"), I assumed he was not working with financial data.
The financial transactions needed for business accounting should be handled more carefully, probably both multiply replicated on-site (in a database cluster and with RAID drives) and regularly backed up to tape that is stored off-site.
Build yourself a cluster of cheap boxes with cheap IDE disks and replicate your data across them. Because the data is replicated across your cluster, no need for backups or RAID.
I don't think it's true that "Google won the mindshare a long time ago." As of Jan 2004, Google has less than 40% of the search market, nearly tied with MSN and Yahoo.
Unfortunately, all Microsoft has to do is to catch Google. If the quality is essentially indistinguishable from Google, most people will use MSN Search, since MSN Search will be the default in IE (and probably MS Office and WinXP soon).
As long as Google keeps innovating and stays ahead, they'll do fine. But, if they trip, Microsoft will catch up and trample over them, just like they did to Netscape.
In the article, Wayne Rosing explicitly says that Google is not planning on open-sourcing the Google code base, but that they will publish academic papers on their work. "I'm not saying we're going to open-source Google, because that would be a little dumb when we have these Microsoft guys making noise. . . We're encouraging the software engineers to submit papers where it makes sense, particularly where it is landmark work and it is really important that other people know."
Google already has published a number of papers on their systems, including descriptions of PageRank, their clustering architecture, and their high availability file system (the Google File System). Seems like this is merely an announcement that they intend to do more of the same.
Many have said that the Daily Me, a personalized newspaper, will be the future of news.
JD Lasica wrote a particularly good piece on it.
Seems like Google understands that benefits can be worth more to employees than their pure monetary value. They offer a particularly exceptional benefits package.
There is some question about whether Microsoft has an explicit strategy of using patents as a weapon against open source.
Memigo is very cool with a ton of power user features. It's a fantastic site.
If you're looking for something simple and easy to use, Findory News is another good choice.