Slashdot Mirror


Using Computers for Sophisticated Music Analysis

Tom Avril writes "Need an accompaniment for your melody? Seeking a virtual dancer to try out your new choreography? Or perhaps you're making a new TV commercial, and you need a snippet of music that sounds something like Radiohead, but a bit more mellow. Increasingly, sophisticated software can help with these sorts of tasks. We got a look at the latest from the nascent field of Music Information Retrieval, after its conference in Philadelphia: 'A key part of the conference each year is the announcement of results from a sort of software shoot-out — a competition in which various universities pit their music-analysis algorithms against one another. Entrants from more than a dozen countries competed in 18 tasks, using their computers to "listen" to selections of music, then identify such things as the genre, mood, composer or title. The eventual goal: to help people search for music they might like by combing through millions of audio files in a database. ... In another task, the computer had to identify tunes that someone hummed. "The idea is, you go into the karaoke bar and start humming, and the computer retrieves your song," Downie said.'"

2 of 97 comments

  1. Recognising tunes from a simple rendition by HuguesT · · Score: 5, Informative

    The tune recognition task is easier than it sounds (ha). In fact, it's enough to hum the *contour* of the music, i.e. whether each note goes up, goes down, or repeats the previous one, for a couple of bars, even ignoring the rhythm.

    This way of indexing and recognising music is called the Parsons code and is quite effective.
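For illustration, here is a minimal Python sketch of how a melodic contour could be encoded this way. It assumes the tune is already available as a list of pitches (MIDI note numbers here), which sidesteps the much harder step of extracting pitches from audio. The function name `parsons_code` and the `tolerance` parameter are hypothetical, not taken from any particular library.

```python
def parsons_code(pitches, tolerance=0.0):
    """Encode a pitch sequence as a contour string.

    '*' marks the first note; each later note is 'u' (up), 'd' (down)
    or 'r' (repeat) relative to the note before it. `tolerance` is an
    assumed knob: pitch changes no larger than it count as a repeat.
    """
    if not pitches:
        return ""
    symbols = ["*"]
    for previous, current in zip(pitches, pitches[1:]):
        step = current - previous
        if step > tolerance:
            symbols.append("u")
        elif step < -tolerance:
            symbols.append("d")
        else:
            symbols.append("r")
    return "".join(symbols)


# Opening of "Twinkle, Twinkle, Little Star" as MIDI notes: C C G G A A G
twinkle = [60, 60, 67, 67, 69, 69, 67]
print(parsons_code(twinkle))  # *rururd
```

Because only the direction of each step is kept, the same code comes out whatever key the tune is hummed in, and rhythm plays no part.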

  2. Re:Not new tech by ahankinson · · Score: 5, Informative

    If you think this technology is like a midi->wav converter only better, you're off by orders of magnitude.

    "Simply" data mining for music is a significant problem. What data do you mine? The audio signal does not contain all of the perceptual cues we understand as humans, so things like "rhythm" and "tempo" (i.e. the things in music that get us to dance or tap our feet to it) are hard to pinpoint and even harder to extract.

    Other problems, such as the query-by-humming problem, are further complicated by two intractable problems: 1. People can't sing well from memory, and 2. What they do sing may or may not bear any resemblance to the actual song they're remembering.
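One common way to tolerate imprecise singing, sketched here purely as an illustration and not as the method used by any system mentioned in this thread, is approximate string matching: reduce both the hummed query and each stored tune to a contour string of `u` (up), `d` (down) and `r` (repeat) symbols, then rank the tunes by edit distance to the query. The catalogue below and the names `edit_distance` and `rank_songs` are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: the minimum number of single-symbol
    insertions, deletions and substitutions that turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def rank_songs(query, catalogue):
    """Return the catalogue titles ordered from best to worst match."""
    return sorted(catalogue, key=lambda title: edit_distance(query, catalogue[title]))


# Hypothetical two-song catalogue of contour strings ('*' = first note).
catalogue = {
    "Twinkle, Twinkle, Little Star": "*rururd",
    "Ode to Joy": "*ruurddd",
}

# A hummed query in which the singer got one step of the contour wrong.
print(rank_songs("*rurudd", catalogue)[0])  # Twinkle, Twinkle, Little Star
```

A scheme like this absorbs a few wrong notes, but it does nothing for the second problem above: if what the person sings bears no resemblance to the song, no distance measure will recover it.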

    This research uses the latest advances in signal processing, machine learning, psychoacoustics, computer vision and pattern recognition. Comparing it to a MIDI-to-WAV converter is like comparing a paper airplane to the space shuttle.