Managing Last.FM's "Mountain of Data"
Rob Spengler writes "Last.FM co-founder Richard Jones says the biggest asset the company owns is 'hundreds of terabytes of user data.' Jones adds, '... playing with that data is one of the most fun things about working at the company.' Last.FM, for those who have been living on Mars for the last two years, is the largest online radio outlet, with millions of listeners per day. The company surpassed Pandora and others largely due to its unique datamining features: 'Audioscrobbler,' the company's song/artist naming algorithm, can correctly determine a track even with tens of thousands of false entries. Jones says sitting on that much data has even helped police: 'thieves listening to music on an Audioscrobbler-powered media player have helped police in the US, UK, and other countries track down users' stolen laptops.' Does sitting on a mountain of data make Last.FM powerful enough to start making a stand against the record industry? CBS certainly thinks so — they bought the company for £140 (~$200) million last year."
It's not that it's not legal, but it *definitely* is not enforceable anywhere in the US, period.
And no, none of it is really worth the service of a company that sells mined data to third parties and needs to launch a viral marketing campaign on Slashdot just to make their numbers next quarter.
How is Pandora lying to me?
I'm sort of half and half. I collect mountains of ebooks from alt.binaries.ebooks.technical (currently close to 84GB). I probably have fifty thousand dollars' worth of stuff. Ninety-nine percent of it I will never read, because who has the time? BUT it's there as a reference, too. If I need to know something in more depth than I can get with a quick Google, I've got my huge library. And I have a reading list of the most important stuff that I do need to read. I also capture tons of Web pages with information about the subjects I'm interested in.
OTOH, I also download a lot of interview videos of various hot babes off the various talk shows (I should point out I don't have a TV or cable although Comcast is in the building). And most of them I haven't listened to more than once, either. I also download tons of babe photos - but there at least the best end up in my wallpaper rotator - and I've had vague notions of setting up my own ad-supported babe blog someday based on that collection (like the world needs another one).
And of course I have a collection of MP3s and music videos, most of which, other than my top fifty or so favorites, I don't listen to at any given time.
All of these are cheap hobbies - except in terms of time. But they also provide entertainment and information. Periodically I do watch or use downloaded stuff - it's just usually a small percentage of what I've downloaded over time.
And since my current system has a max of a terabyte of HD with about 350GB+ free, I don't expect to have much more than that for some time to come.
Pure quantity doesn't interest me - quality is important, too. For instance, some people download any crappy photo of their favorite babes. I only collect larger HQ shots and periodically weed out the lower-quality stuff. That keeps the collection more manageable and the quality up.
Richard Steven Hack - This sig is TOO GODDAMN SHORT TO DO ANYTHING USEFUL WITH! MORONS!