Managing Last.FM's "Mountain of Data"
Rob Spengler writes "Last.FM co-founder Richard Jones says the biggest asset the company owns is 'hundreds of terabytes of user data.' Jones adds, '... playing with that data is one of the most fun things about working at the company.' Last.FM, for those who have been living on Mars for the last two years, is the largest online radio outlet, with millions of listeners per day. The company surpassed Pandora and others largely due to its unique datamining features: 'Audioscrobbler,' the company's song/artist naming algorithm, can correctly determine a track even with tens of thousands of false entries. Jones says sitting on that much data has even helped police: 'thieves listening to music on an Audioscrobbler-powered media player have helped police in the US, UK, and other countries track down users' stolen laptops.' Does sitting on a mountain of data make Last.FM powerful enough to start making a stand against the record industry? CBS certainly thinks so — they bought the company for £140 (~$200) million last year."
How I see it: there are people with tons of money. Literally, tons. You can't use only money to make more money - no matter what you do with it, it just won't multiply sitting its ass on the couch all day, watching TV or in a safe somewhere. So what do you need to give that money more value? The answer is simple: information. The only way to make money multiply is if you know what to do with it. You can write the best software in the world, the best OS with the best tools ever, but if you don't know how to make it popular, it will never become popular on its own. The only way to make it popular is to give people as much information about it as possible. Why do we have ads? To send people information about products. Sure, almost every ad is misleading and they give you fake information, but they do tell you something, which you take into account when you make decisions, and you are more likely to buy an advertised product instead of an obscure "noname" (I was cheap enough, often enough, to buy "noname" computer-related products and I was amazed at their quality and I wish someone had told me they existed so I wouldn't have felt so bad and cheap before buying them).
This is the age of communication and nothing is more valuable than information and manipulating that information. How do you manipulate it? To know that, you need another kind of information, which is usually based on statistics on large amounts of data (like Last.FM's database, for example).
So, in today's society, there are three valuable entities: money (manipulated by information, everyone wants it), information (manipulated by more information, any company's dream) and more information (based on statistics, like the Last.FM database) controlling each other in a cascade. Once you have the source you can easily trace it to see how things are flowing, so you may know how to invest your money.
Repeat after me: "I will not disclose the information I have. Information is more valuable than money. If I own a valuable piece of information and I don't make money off it, I'm stupid."
If you could (accurately) answer that question, then you'd act upon the answer...
Why do you think Google ads are Google's bread and butter as far as cashflow goes? The reason is that Google has a treasure trove of user data, probably more than anyone else, so they can really make contextual ads work. Anyone can write an ad engine, but not everyone has access to mountains and mountains of user data.
You might be surprised at how important context is when you're trying to promote something. Say you're trying to promote an online RPG like Game! If you took a random collection of people, probably fewer than 5% of them would be interested in playing, but if you can target gamers specifically, that number might jump to 50%. If you're paying for every impression, that makes a world of difference.
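To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The 5% and 50% interest rates are just the guesses from above, and the $2 per thousand impressions is a made-up price picked purely for illustration, not anything a real ad network charges.

```python
def cost_per_interested_viewer(cpm_dollars, interest_rate):
    """Dollars spent for each impression that reaches an interested person.

    cpm_dollars:   price of 1,000 impressions (hypothetical figure)
    interest_rate: fraction of viewers who actually care, in (0, 1]
    """
    if not 0 < interest_rate <= 1:
        raise ValueError("interest_rate must be in (0, 1]")
    cost_per_impression = cpm_dollars / 1000
    # Only interest_rate of the impressions count, so each useful one
    # effectively costs 1 / interest_rate times the raw price.
    return cost_per_impression / interest_rate


CPM = 2.00  # assumed price per 1,000 impressions, for illustration only

untargeted = cost_per_interested_viewer(CPM, 0.05)  # random audience
targeted = cost_per_interested_viewer(CPM, 0.50)    # gamers only

print(f"untargeted: ${untargeted:.3f} per interested viewer")
print(f"targeted:   ${targeted:.3f} per interested viewer")
print(f"ratio:      {untargeted / targeted:.0f}x")
```

Whatever price you plug in, the ratio only depends on the two interest rates: going from 5% to 50% means the same spend reaches ten times as many people who might actually play.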
So not only do you need to understand your audience, you also need to effectively target them. Now, how do you do that? Data mining of course, and the more data the better.
Pretty much all data has value; figuring out how to turn that data into money is extremely subjective, might involve some black magic, and definitely requires luck too.
Game! - Where the stick is mightier than the sword!
Last.fm has all this data and yet so much gets missed. For instance: why doesn't last.fm have a feature to email you when a band you like comes out with a new album?
Last.FM, for those who have been living on Mars for the last two years, is the largest online radio outlet, with millions of listeners per day.
You know, I'm not exactly what you'd call a Luddite, yet I've never heard of Last.FM. Am I the only one? I kind of doubt it.
I have a general gripe about anyone who writes "for those who have been living on Mars" anytime they reference some moderately popular company, service, or product. It smacks of arrogance, as if to say, if you don't have the same interests as I do, you're obviously disconnected from the mainstream.
Or perhaps I'm just annoyed for being called out on being a bit older and out of touch? Bah!
>>goes back to guarding lawn with a shotgun from an old rocking chair...
Irony: Agile development has too much inertia to be abandoned now.
The company surpassed Pandora and others largely due to its unique datamining features
I would think that being available outside of the USA may have helped quite a bit as well.
Good points. You had me until you said "entity" (do you know what that means? I doubt it) in the place of, I assume, "commodity".
Oh and the repeat after me bit is silly. The "information" you have is worthless on its own. It only becomes valuable when it's coupled with lots of other similar "informations" from other people. By retaining this information you're only preventing someone from making money, without any benefit for yourself, which is arguably dickish. Oh and saying that "information is more valuable than money" is stupid. You can't say that something is superior to what measures it.
You just got troll'd!
Sounds like a slight variation on those people who have TBs of movies/music/videos/TV episodes/etc. that they will never have the time to watch/listen to.
So your contribution, then, is noise.
But this noise does not affect the signal, which is still there. It's just harder to find.
Nobody ever said mining a mountain of data like this would be a trivial task.
Kid-proof tablet..