Managing Last.FM's "Mountain of Data"
Rob Spengler writes "Last.FM co-founder Richard Jones says the biggest asset the company owns is 'hundreds of terabytes of user data.' Jones adds, '... playing with that data is one of the most fun things about working at the company.' Last.FM, for those who have been living on Mars for the last two years, is the largest online radio outlet, with millions of listeners per day. The company surpassed Pandora and others largely due to its unique datamining features: 'Audioscrobbler,' the company's song/artist naming algorithm, can correctly determine a track even with tens of thousands of false entries. Jones says sitting on that much data has even helped police: 'thieves listening to music on an Audioscrobbler-powered media player have helped police in the US, UK, and other countries track down users' stolen laptops.' Does sitting on a mountain of data make Last.FM powerful enough to start making a stand against the record industry? CBS certainly thinks so — they bought the company for £140 (~$200) million last year."
A buddy of mine used to run this matching website for teachers & students. Free for teachers, and the students had to pay a nominal amount to get the teachers' contact info, and after that, it was up to them to arrange for lessons. The site was popular, and he made decent money at it. I bugged him and bugged him to organize parties, and eventually he came around to my way of thinking (he wanted to make some money without his parasite partner getting it). He used the list of emails from his website to send party invitations for a monthly get-together. He made more money from the parties than he did from the website.
Shutting down free speech with violence isn't fighting fascism. It IS fascism!
what i find most interesting is the order certain songs "go together", like listening to a song from Slayer, followed by, say, "someday i suppose" from the bosstones. when composing songlists, i appreciate how similar songs and moods can flow, but also how the contrast of dissimilar songs can SOMETIMES complement each other.
a large database could ferret out such instances that might occur frequently in multiple playlists.
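A rough sketch of what that could look like, assuming the playlists are available as plain ordered lists of track names (the function name, the threshold, and the sample data are all made up for illustration, not anything Last.fm is known to run): count every ordered pair of back-to-back tracks and keep the pairs that show up in more than one playlist.

```python
from collections import Counter


def frequent_transitions(playlists, min_count=2):
    """Return (track, next_track) pairs seen at least min_count times.

    `playlists` is assumed to be an iterable of ordered track-name lists.
    """
    counts = Counter()
    for playlist in playlists:
        # Pair each track with the one played immediately after it.
        for current, following in zip(playlist, playlist[1:]):
            counts[(current, following)] += 1
    # most_common() sorts by count, highest first.
    return [(pair, n) for pair, n in counts.most_common() if n >= min_count]


# Hypothetical sample data, for illustration only.
sample_playlists = [
    ["slayer track", "someday i suppose", "track c"],
    ["slayer track", "someday i suppose", "track d"],
    ["track c", "slayer track", "someday i suppose"],
]
common_pairs = frequent_transitions(sample_playlists)
```

Order matters here on purpose: (A, B) and (B, A) are counted separately, since the point is which song follows which.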
I have a similar site that I wrote (pre-audioscrobbler). Granted it's crap, but I have mountains of data also. Closer to 1 tb than hundreds of tb. The question is, how do you monetize the data?
I just don't see how this data is "worth" 200 million bucks. I have some amazing algorithms to do similar cleaning, caching, and recommendations, but still what is that worth?
This is a fairly legit question. If you can figure it out, I can explain to my wife why I have 3 servers in my closet.
The summary wasn't insulting enough, so I think I'll just add a bit extra.
Last.FM is so popular that if you aren't familiar with the service, you must be a drooling, knuckle dragging luddite.
Apparently I'm not one of the cool kids. I'm sad now, and my feelings are hurt.
Last.fm has all this data and yet so much gets missed. For instance: why doesn't last.fm have a feature to email you when a band you like comes out with a new album?
CBS certainly thinks so -- they bought the company for £140 (~$200) million last year.
Which is why whatever comes of them, at best it will be evolutionary. CBS is part of the old guard RIAA corps, they are just one of the faces of Viacom - all controlled by Sumner Redstone. They may have brought some money to the table, but they brought a whole ton of baggage with them too. Enough baggage to make this privacy freak decide they couldn't be trusted with all that data they've been collecting (for example, if they can track down a stolen laptop, they can track down someone playing an MP3 from an illegally leaked pre-release album).
When information is power, privacy is freedom.
Last.FM, for those who have been living on Mars for the last two years, is the largest online radio outlet, with millions of listeners per day.
You know, I'm not exactly what you'd call a Luddite, yet I've never heard of Last.FM. Am I the only one? I kind of doubt it.
I have a general gripe about anyone who writes "for those who have been living on Mars" anytime they reference some moderately popular company, service, or product. It smacks of arrogance, as if to say, if you don't have the same interests as I do, you're obviously disconnected from the mainstream.
Or perhaps I'm just annoyed for being called out on being a bit older and out of touch? Bah!
>>goes back to guarding lawn with a shotgun from an old rocking chair...
Irony: Agile development has too much inertia to be abandoned now.
The company surpassed Pandora and others largely due to its unique datamining features
I would think that being available outside of the USA may have helped quite a bit as well.
with last.fm is how it feeds my OCD issues regarding song playcounts. I nearly lost it when the stupid scrobbler started randomly recording excess playcounts on one album. It screwed with my numbers. Then it stopped counting that album's plays altogether.
Seriously though, I have found using the site to be pretty enjoyable. And the advertisements are actually worth keeping AdBlock turned off for. I found a few new artists, some unsigned, that way. I like all the various widgets and things that can crunch my data. Songbird has a last.fm plugin/addon that makes for very easy integration. It's just really useful. I've also found concerts on the site.
I rarely use the social side of it, except with friends I already know. But that's me.
http://transformativeworks.org/
Anyway, here is a quote from their Terms of Use agreement.
"It is important for you to refer to these Terms of Use from time to time to make sure that you are aware of any additions, revisions, or modifications that we may have made to these Terms of Use. Your continued use of the Website constitutes your acceptance of the new Terms of Use."
Is this a common practice? One has to agree to something that can change, and you are obligated to adhere to these changes, too? How can this be legal?
They also spell out later their claim to intellectual property rights, including "database rights." Is that a real right or are they just making that up?
They also state "You are responsible for... restricting access to your computer so that others may not access any password-protected portion of the Website or other Properties using your name..."
Yuck. Yuck. Yuck. Is this really worth their service?
So, I got PHORM monitoring my browsing habits and Audioscrobbler monitoring what I listen to. Does anyone here, apart from me, find that just a little bit creepy?
'Without privacy, there cannot be freedom. And without freedom, there cannot be personal or social growth'
Speaking of websites that have lost their reason for existing and have nothing to offer but user data, does anyone remember slashdot?
I'm concerned about their recent attitude towards intellectual property. Their Terms of Use used to say "Your pseudonymous listening habit data will be available to other Last.fm users for non-commercial use under a Creative Commons license" [1] and you could even download snapshots from the database in the past [2]. One day the database snapshots went away, but the Terms of Use didn't change until very recently; now they claim property. I'm not a lawyer, but that sounds like "doing evil" to me.
On the other hand, this data is probably very valuable to Last.fm and CBS (they wouldn't risk a lawsuit otherwise), but the main benefit to the user is supposedly "discovering artists similar to those you like", and there is an easy and less privacy-invasive way of getting this based on the number of times two artists appear together in a Google / Yahoo / whatever search.
I tried that and the results are as coherent as the ones you get from Last.fm, I'm just too lazy to automate the whole thing. If anyone wants to DIY, you can get a huge database of artists for free from MusicBrainz (Last.fm gets a lot of information from there too). Besides, the quality of the information in MusicBrainz is much better than what Last.fm gives you; they are still trying to fix the misspelling problems and they don't seem to be able to fix the "artists share name" problem at all.
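For anyone who does want to automate it, here is a minimal sketch under some loud assumptions: the hit counts come from a lookup function you supply yourself (no real search engine API is called here), and raw co-occurrence is normalized with a Jaccard-style ratio, which is my own choice and not something Last.fm or any search engine is known to use. All names and numbers below are hypothetical.

```python
def cooccurrence_score(hits_a, hits_b, hits_both):
    """Jaccard-style similarity: pages with both / pages with either."""
    union = hits_a + hits_b - hits_both
    if union <= 0:
        return 0.0
    return hits_both / union


def rank_similar(target, candidates, hit_count):
    """Rank candidate artists by how often they co-occur with target.

    `hit_count` is a caller-supplied function mapping a query string to
    a number of results; how those numbers are obtained is up to you.
    """
    hits_target = hit_count(f'"{target}"')
    scored = []
    for artist in candidates:
        hits_artist = hit_count(f'"{artist}"')
        hits_both = hit_count(f'"{target}" "{artist}"')
        score = cooccurrence_score(hits_target, hits_artist, hits_both)
        scored.append((artist, score))
    # Highest score first.
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Made-up counts standing in for real search results.
fake_counts = {
    '"Artist X"': 100,
    '"Artist Y"': 50,
    '"Artist Z"': 200,
    '"Artist X" "Artist Y"': 25,
    '"Artist X" "Artist Z"': 10,
}
ranking = rank_similar(
    "Artist X", ["Artist Z", "Artist Y"], lambda q: fake_counts.get(q, 0)
)
```

The normalization is there so that hugely popular artists don't come out as "similar" to everyone just because they appear on a lot of pages.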
The good thing about Last.fm is music streaming, but you don't need to send them your data for that; in fact, you don't even need to visit their website.
[1] See:
http://www.slideshare.net/trebor/how-the-social-web-came-to-be-part-2
http://www.last.fm/group/STOP+MOD+ABUSE/forum/88174/_/392258/1#f6054783
[2] See:
http://www.last.fm/group/Last.fm+Web+Services/forum/21604/_/239661/1#f3198554
I always see that from the writer's viewpoint, as if he's saying "Look, I know this isn't news, and I'm just getting around to writing about it a few years later, but I really do have something interesting to say about it! So I will acknowledge its apparent staleness with a jokey aside before I get to the point."
Good thing writing isn't some sort of Rorschach test where we can each imbue it with our own insecurities, eh?
The company surpassed Pandora and others largely due to its unique datamining features: 'Audioscrobbler,'
I'd say they surpassed Pandora only because Pandora locked out all non-US users a while back. For people who just wanted to listen to music and find out about new artists, Pandora was so much better IMO, last.fm has a clunky, overloaded UI and is too much like myspace ...
"I love my job, but I hate talking to people like you" (Freddie Mercury)
I've heard of Last.FM and I have been living on Mars, you insensitive clod!
Oddly enough, even here on Mars, just as in the US, they have a 3-listen limit on any track thanks to the RIAA.
So, three shall be the number of the counting. Thanks CBS, thanks Last.FM and thanks RIAA. In fact, I've said thank you back by turning off the autoscrobbler and reducing the data that you can use to make money off of me.
Speaking for my fellow Martians - you're welcome, Last.FM!
Pathological kinda promises Path + Logical - but instead, you get stuck with pathetic.
"CBS certainly thinks so - they bought the company for £140 (~$200) million last year."
Why would someone use current exchange rates? It should be £140 (~$280) million.
i don't think it's fair we got marked trolls for that
Then why the hell is it that when I run the "Recommendations" stream the algorithm occasionally freaks out and starts pushing one unlistenable noise attack after another at me with tags like brutal death metal, cybergrind, czech, death metal, deathgrind, goregrind, grind, grindcore, noisecore, porngrind, pornogrind, etc.? No matter how many times I click the "Do Not Want" button the stuff just keeps coming. It's like a neighbour from hell. And then there's the days when I get nothing but lesbian deathcore vegan grind.
The Last.FM brainfarts seem to persist no matter how many times you try to train the recommendation engine using the like/ban buttons, and the only way to get them to "reset" to something vaguely approximating normality is to log out, log back in, and run the Library stream for a while.
Still, even with this weirdness it's still better than Pandora at finding new music I actually like.
Da Blog
I've found it frustrating to get heard on last.fm. Our music is all freely downloadable (http://www.last.fm/music/The+Willing+Mind and http://www.last.fm/music/Brian+Silberbauer), but we're just not getting hits.
Surely there should be a good way for Creative Commons licensed music to be promoted. As we're not making money out of the downloading, it's difficult to justify buying into the last.fm promotions.
I just think there is something missing out there. I make music; I'm not a marketer.
Or maybe our music just sucks??