Yamaha Releases Singing Synthesis Software
loopdloop writes "The world's first singing synthesis software, Vocaloid, was released by Yamaha this month at the Los Angeles NAMM show. Simply type in the lyrics and notate the vocal expressions to create a completely computer-generated singer. There are also audio demos of the product available." Update: 01/26 21:14 GMT by S : An earlier NYT-authored preview of this software has also been covered on Slashdot.
"I was going to do that?"
This was something I was really interested in when choosing a college major, and thought that I'd get into EE CS and do this. Somehow, I've found myself coding web applications instead.
I'm glad to see somebody's doing it, but man, I think I took a wrong turn somewhere.
Tweet, tweet.
Vocaloid has been covered on Slashdot before. It is one of the many impressive projects to have at least in part come out of the Music Technology Group at Institut Universitari de L'Audiovisual in Barcelona.
This is one of many impressive music technology groups in the world that are kind enough to provide us with open source software such as CLAM. Similarly, there are other groups out there doing interesting things. Needless to say, I could link all day...
I am a graduate student in this field
That's quite amazing. Now we need a computer to write music and songs.
-Tim
Shpongle (trance group) used Vocal Writer in their CD that was released in 1998.
...this is where Britney Spears' talent comes from!
Edward@Tomato - /home/Edward/ man woman
man: no entry for woman in the manual.
"Qua!?"
man, at least Milli Vanilli had singers.
It had multiple voices... and was fun. MC Hawking style.... "mmmmmmmMMMMMMMMM ya"
You talk better than you fool!
Wow, this must make the RIAA's day. An artist who needs absolutely no pay and who really is property...
Since when has this country used intellectual elite as a pejorative term?
"The world's first singing synthesis software, Vocaloid, was released by Yamaha this month at the Los Angeles NAMM show."
Feh! They might be able to program something that sings better than Britney, but until they integrate it with something like this, Ms Spears' talents will continue to be in demand...
*** Where are we going? And what's with this handbasket?
I guess Britney is going to be out of a job now. We all know that any computer can sing better than her, and since neither of them plays an instrument, I guess she's screwed.
Not to mention that the computer is far sexier.
So will we finally get to replace the prime-time T.V. show American Idol??
The logical next step would be a program that would listen to, and enjoy, the music that other computers write and sing.
Think of the time it would free up, and the money it would save - you would never have to buy CDs. *cough* of course, some people have already eliminated that expense.
...first, dancing robots and now singing computers
sigh... dancing, singing robots? It's been done.
Stephen Hawking is trying to start up a band.
Just listened to these.
While their "Vocaloid" tech is nice, their "Lyricoid" tech needs work.
I had the chance to try it out at NAMM and it is VERY difficult to get it to "sing." It can probably be used adequately for backup vocals, but again, it takes a lot of work to get it to sound human. Nevertheless, a step in the right direction.
A blog like any other.
I think my Sound Blaster Pro came with this software. It fit on a floppy disk, and you could make the computer sing whatever you typed in. In fact, it also came with a psychiatrist named Dr. Sbaitso. Just don't cuss at him; he gets offended very easily.
Seriously, I'm sure that this new software is much better. At least I sure hope so...
Interestingly enough, this made me think of something I read in William Gibson's blog a long time ago. I don't know where it is now, though. It was about how, in the future, people will be able to take a movie or something on their computer and tell the computer to replace all the actors' heads with dog heads, for example, and change what they do and say with simple commands. Perhaps this software is the low-level beginning of making that happen; we'd just need some higher-level controls to make it easy for everybody to use.
Since they're so old and obsolete, if you find one in your attic, I'll take it off your hands for a cool $20 USD.
(No disrespect intended).
Sigs are bad for your health.
Fitter, happier, more productive
Comfortable
Not drinking too much
Regular exercise at the gym
(3 days a week)
Getting on better with your associate employee contemporaries
At ease
Eating well
(No more microwave dinners and saturated fats)
A patient better driver
A safer car
(Baby smiling in back seat)
Sleeping well
(No bad dreams)
No paranoia
Careful to all animals
(Never washing spiders down the plughole)
Keep in contact with old friends
(Enjoy a drink now and then)
Will frequently check credit at (moral) bank (hole in the wall)
Favours for favours
Fond but not in love
Charity standing orders
On Sundays ring road supermarket
(No killing moths or putting boiling water on the ants)
Car wash
(Also on Sundays)
No longer afraid of the dark or midday shadows
Nothing so ridiculously teenage and desperate
Nothing so childish - at a better pace
Slower and more calculated
No chance of escape
Now self-employed
Concerned (but powerless)
An empowered and informed member of society
(Pragmatism not idealism)
Will not cry in public
Less chance of illness
Tires that grip in the wet
(Shot of baby strapped in back seat)
A good memory
Still cries at a good film
Still kisses with saliva
No longer empty and frantic like a cat tied to a stick
That's driven into frozen winter shit
(The ability to laugh at weakness)
Calm
Fitter
Healthier and more productive
A pig in a cage on antibiotics
there's no place like ~
I can't be the only one that thinks the background vocals sound like Electric Light Orchestra??
Just like many Yamaha firsts, this one may have been overhyped a bit. This sounds like a real person singing in the way that a synth brass pad sounds like a trumpet. There is no way in hell you would ever even consider that these noises were made by a human being. Yes, I understand that most of the samples are in Japanese and might not sound normal to me anyway. But even if you listen to the ONE in English alone, it sounds like the Bell Labs female voice, but screechy and obnoxious instead of like a drugged-out cigarette smoker after a tracheotomy.
Machines can have emotions if we want them to. There was even an article in last month's Scientific American by an AI researcher who claims that machines will need emotions for real AI to work. There have been several robotics/AI projects that have attempted to incorporate emotions, Cynthia Breazeal's robot Kismet being the most famous.
Emotions are an information processing system that works holistically, priming the logical parts of the brain for the kind of work they will need to do. Big orange and black stripey thing running towards you? Prime the brain for a flight or fight response rather than curiosity, i.e. "Run, it's a tiger!" not "I wonder if this orange and black stripey thing wants to play?"
There remains the problem of qualia. That is, a robot may look for all intents and purposes as if it is having emotions, but does it feel the same things internally as we do? Unfortunately, there's no real way of knowing if even other humans feel the same thing we do.
When the day comes that a robot belts out a blues song about someone done it wrong and broke its heart, we will judge it in the same way we judge human singers: Does it look and sound authentic, or is it faking it? If it looks and sounds authentic, I believe that we will take it for granted that it feels the same as we do, just as we take it for granted in other humans.
- None can love freedom heartily, but good men; the rest love not freedom, but license. -- John Milton
Now we can finally get rid of these whiny musicians, always complaining about "I need to feed my family" and "I'm a professional and should be paid like one." Now all of those unskilled morons can be sent to fill up the thousands of food preparation and customer service jobs that our public school system can't seem to find enough people for.
Sorry about the offtopic (tongue firmly in cheek) rant. You're right. This does sound like a fun toy.
I can't find a link to an actual demo of it simulating a human voice, but here's a page that documents its use to reproduce the sound of a suling (Javanese wooden flute). Does a good job too. I've heard it demo'd with a human voice, and it was pretty good (though the neural net needed additional input, namely the syllable being sung, obviously).
I'm sure that many of the other academic computer music labs around the world had similar software long before Yamaha introduced this package. Still cool, though.
I'm sure Britney's "talent" has absolutely nothing to do with her ability to do vocals, and absolutely everything to do with her ability to take off her clothes...
"Freedom means freedom for everybody" -- Dick Cheney
Talk is cheap, so if you can't implement your invention, punt! Explain it in detail in public, or among a relevant developer group. Then you at least have a chance of being in on the creation, along your lines of vision, and at least get your world bettered, even if you can't cash in by doing the hard part. The joy of coinvention sure beats the bitterness of "coulda, woulda, shoulda". Most of the open source process is based on that crosspollination and mutual assist. Hell, if we all did this better, maybe the docs would get written *first*, rather than never-quite.
--
make install -not war
The one thing I expected they wouldn't get right was what they did the best.
When people hold notes, there are natural fluctuations in the tone; nobody can hold a perfect tone without some audible wax or wane.
But you can hear this simulated amazingly well if you listen to that one Japanese song with the single male "vocalist".
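To make the "wax or wane" idea concrete, here is a minimal sketch of a held note whose pitch wobbles slightly instead of sitting dead on the target. This is not how Vocaloid does it (the thread doesn't say); the function name held_note and the default numbers (a 5.5 Hz wobble of about 30 cents) are assumptions picked for illustration only.

```python
import math

def held_note(freq_hz=220.0, seconds=1.0, sample_rate=8000,
              vibrato_hz=5.5, vibrato_cents=30.0):
    """Return (samples, freqs) for a sine tone with a slow pitch wobble.

    All names and defaults here are hypothetical, chosen for illustration.
    """
    samples, freqs = [], []
    phase = 0.0
    for i in range(int(seconds * sample_rate)):
        t = i / sample_rate
        # Pitch offset in cents, swinging sinusoidally around the target note.
        cents = vibrato_cents * math.sin(2 * math.pi * vibrato_hz * t)
        f = freq_hz * 2 ** (cents / 1200)
        # Accumulate phase so the changing frequency never causes a jump.
        phase += 2 * math.pi * f / sample_rate
        samples.append(math.sin(phase))
        freqs.append(f)
    return samples, freqs
```

Setting vibrato_cents to 0 gives the perfectly steady tone that no human singer produces; a real voice would also add slower, less regular drift on top of the periodic wobble.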
Vonal Declosion
So it fits right in with most of the pap on the top 40.
The real sham is all the manufactured music that's been out there for years and increasing. Just program it a dictionary and it'll do rap, too.
A feeling of having made the same mistake before: Deja Foobar
I totally agree with you 100%. I mean, come on, the Terminator had a hella thick accent, and he came from like 30 years in the future.
Ah'll be BECK.
It's bad enough we have Cher and Madonna doing the fake electronic yodel dance; now every diphthong with a little cash will be creating them.
"I'm just here to regulate funkiness."
I've done my time in a "philosophy of AI" class -- and frankly, as far as I'm concerned, the Chinese Room argument and the like are bogus. If it looks like a duck, walks like a duck, quacks like a duck -- it's a duck, at least until someone finds out that it doesn't in fact keel over when given cyanide-laced bread and has to plug itself in to charge its batteries every so often.
:)
Consider: For the purposes of this thread, AI is considered incomplete due to inability to simulate emotions why? Because it's argued that the desired output (emotionally-charged vocalization) is impossible without a precondition which is argued to be impossible. If the output were indeed achieved, then for the purpose of its singing, would the system not for all intents and purposes be emotional, even if one is unable to demonstrate that the system actually experiences qualia? (After all, if we're unable to make this demonstration wrt other humans, why make it a requirement for nonhuman intelligence?)
Granted, I also think that a machine which can pass an unrestricted Turing Test over an extended period can safely be considered capable of thought, so it's obvious which side of this debate I land on.
So there is prior art spewing out all over the place.
and how could I leave off:
eddie and eedie? (search for "Some Velvet Morning")
-- Real Stupidity is the Artificial Intelligence of the 21st century
The "webmaster" who wrote the linked page of demos is linking to ASX files, which in turn link straight to the self-named mp3 files on the server.
In case the direct "save/play" links do not work with your browser and OS, just replace the asx with mp3, and enjoy.
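If you'd rather not edit each link by hand, the same substitution can be sketched in a few lines. The function name asx_to_mp3 and the example.com URL are made up for illustration; the only thing taken from the tip above is that the mp3 sits at the same address with a different extension.

```python
from urllib.parse import urlsplit, urlunsplit

def asx_to_mp3(url):
    """Swap a trailing .asx in the URL path for .mp3; leave other URLs alone."""
    parts = urlsplit(url)
    path = parts.path
    if path.lower().endswith(".asx"):
        path = path[:-4] + ".mp3"
    # Rebuild the URL with the new path, keeping host, query, etc. as they were.
    return urlunsplit(parts._replace(path=path))

# Hypothetical address, not one of the real demo links.
print(asx_to_mp3("http://example.com/demos/song.asx"))
```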
user@host$ diff
Why eliminate musicians? Is there no room in life for art anymore? Also, if those "unskilled morons" flooded the job market, where would you go? Musicians put hard work into their jobs too, although pop idols today don't seem to. Sorry, I had an offtopic rant as well. I'd be glad to see fewer petty musicians too, but I don't see why all musicians should be eliminated.
You clearly don't understand. This is the last piece in the puzzle of completely eliminating musicians. We have had drum machines, electronic instruments, and MIDI to replace you for a while.
Now we can finally get rid of these whiny musicians, always complaining about "I need to feed my family" and "I'm a professional and should be paid like one."
Close...this is just the piece of the puzzle that gets rid of those money-grubbing vocalists. Combine this latest development with a computer composition engine and we won't need any musicians at all!
Now I can finally make some real sounding prank calls for once! Oh the joy!
Although...most idiots out there fall for the midi based voices anyhow...
/* sig */
I have several vocoders, software and hardware, and this is obviously a very different creature if you took the time to listen to the demos. Also, a proper vocoder needs a carrier, and it does not generate the vocal qualities. It merely functions as a formant filter (where the consonants are provided by the vocalist, and the pitch usually by a synthesizer).
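For anyone unclear on the carrier/vocalist split, here is a heavily simplified sketch. A real channel vocoder splits both signals into many frequency bands and does this per band, which is what shapes the formants; the version below does a single band only, so it shows just the principle that the voice supplies the contour while the synth supplies the pitch. The names envelope and one_band_vocode and the 30 Hz smoothing value are assumptions for illustration.

```python
import math

def envelope(signal, sample_rate, smoothing_hz=30.0):
    """Rectify and low-pass a signal to track its loudness over time."""
    # One-pole low-pass coefficient for the chosen smoothing frequency.
    alpha = 1.0 - math.exp(-2 * math.pi * smoothing_hz / sample_rate)
    level, out = 0.0, []
    for x in signal:
        level += alpha * (abs(x) - level)
        out.append(level)
    return out

def one_band_vocode(modulator, carrier, sample_rate):
    """Impose the modulator's loudness contour on the carrier (one band only)."""
    env = envelope(modulator, sample_rate)
    return [c * e for c, e in zip(carrier, env)]
```

Here the modulator would be the vocalist's microphone signal and the carrier a synthesizer tone: when the vocalist is silent the output is silent, and the pitch you hear always comes from the carrier.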
Frankly, this thing just really needs a good plug-in format, like TDM or VST, and it will be a gold mine, not unlike those god-awful pitch-correction plugins that were reputed to give Cher that plastic effect to her voice (like she doesn't have enough plastic as it is). As a standalone app, it is doomed.
Those that suggest you "dance like no one is watching" really want to see you make a complete fool of yourself.
Something similar was done by Alexei Shulgin in 1998, on a 386. Sure, he did it by writing the phonetic instructions for the speech synthesis engine by hand, but Yamaha's solution is just a much more sophisticated (and better funded) version of that.
Check out 386DX, his band project, which includes the 386 itself. The machine has only 4 MB of RAM and also has to do visualizations and MIDI sound at the same time.
I've had the fortune of seeing him perform live in Linz as well as chatting with him a little, and he came to our school for a lecture. He has a few brilliant projects, maybe you might like WIMP which he developed with a friend.
This just brings us another day closer to when engineers start having groupies.......
I've been playing for about 15-some-odd years. I have both an acoustic set and some Roland V-Drums. The V-Drums sound very, very real if you have them connected to a good PA or a nice recording setup. If you don't have good equipment, and you don't sound check the shit out of them before you play, they sound kinda fake.
With V-Drums you can virtually alter drum woods, alter cymbal metals, alter instrument sizes, switch drum heads (pinstripe, coated, etc.), place tape or foam on your heads, use brushes, grab and mute cymbal crashes, add custom levels to the snare gate, tighten or loosen heads, change room acoustics, stick stuff in the bass drum, etc.
If you know what you're doing, you can make them sound real and imperfect, just like an acoustic set. However, you -need- a good PA, and you need to sound check the shit out of them before you play (quick setup, long sound check).
"Things are more moderner than before- bigger, and yet smaller- it's computers-- San Dimas High School football RULES!"
No, that's alright. I'm in a band that used Fruityloops on a laptop to provide drums until we managed to find a human drummer, but it kept terrible time - it was always out of time with the rest of the band by the end of the song...
++ Say to Elrond "Hello.".
Elrond says "No.". Elrond gives you some lunch.