Synthesized Singers


Posted by michael on Sunday November 23, 2003 @04:53PM from the max-headroom dept.

ctwxman writes "Over the past few decades, advances in computer hardware and software have eliminated many jobs... some technical, some menial, but none artistic. As an on-camera performer in television, I've always believed that I was 'bulletproof' as far as replacement through technology was concerned. Not so fast. Recently, The Sinclair television stations began using 'central casting' to bring news and weather anchors from a central location (near Baltimore) to the local outlets. Still, real people are needed, just not as many. But now, even real performers may be replaced. The New York Times (inhalation of airplane glue required) reports on a new technology which allows synthesized singers to sing. Imagine having a singer with a world-class voice at your disposal, any hour of any day. She's just standing at the ready, game to perform whatever silly song you might make up for her: a ballad about her love for you, a tribute to your best friend's golf game, a stirring rendition of the evening's dinner menu. Scary."

Scary? by FatRatBastard · 2003-11-23 17:00 · Score: 2, Insightful

She's just standing at the ready, game to perform whatever silly song you might make up for her: a ballad about her love for you, a tribute to your best friend's golf game, a stirring rendition of the evening's dinner menu. Scary.

Imagine a composer getting up in the middle of the night, going to his newfangled magical "keyboard" and whipping up an entire symphony without the need for a full orchestra..... ooooh... scary.

Man, for a bunch of geeks sometimes the /. crowd come off as downright luddites.
1. Re:Scary? by cgranade · 2003-11-23 17:25 · Score: 4, Insightful
  
  This isn't about tech. It's about the need for human creativity and artistry being diminished. I, as a geek, like tech to the extent that it reduces the tedium and frees us to be creative. This is realizing that the very thing we love can be used to work against us. And that is the realization that is truly and deeply scary.
  
  --
  #define DRM chmod 000
2. Re:Scary? by John+Miles · 2003-11-23 17:39 · Score: 2, Insightful
  
  It's about the need for human creativity and artistry being diminished.
  
  Aw, c'mon. They said the same thing about player pianos.
  
  I, as a geek, like tech to the extent that it reduces the tedium and frees us to be creative. This is realizing that the very thing we love can be used to work against us. And that is the realization that is truly and deeply scary.
  
  This sort of artistic Luddism has no place in today's world. If you're worthy of the self-applied title "geek," you'll find ways to use this technology to create sounds and effects, maybe even entire musical genres, that were never possible or practical before.
  
  This isn't going to put Loreena McKennitt out of business, but I could see it giving Enya the willies.
  
  --
  Dahlmann tightly grips the knife, which he may have no idea how to use, and steps out into the plain.
3. Re:Scary? by Anonymous Coward · 2003-11-23 21:21 · Score: 1, Insightful
  
  This is realizing that the very thing we love can be used to work against us. And that is the realization that is truly and deeply scary.
  
  I never post here, but I just had to respond to this.
  
  Please. That same, impotent argument can be made against any tool, any technology. And it's always just as pointless and futile. Yes, it's possible that this technology may cost jobs; every new technology does. But they also create new jobs at the same time. But beyond that, there are two other important points to consider that are more specific to this particular technology:
  
  Since the advent of recording and the mass distribution of recorded music it's been as much about who as what. Consumers often care more about who is creating their entertainment than they care about the quality or substance of the entertainment itself. If that wasn't the case, our culture certainly wouldn't have the star-crazy media landscape it does today: Where it must be some kind of odd coincidence that massively popular dreck is always presented by young, attractive people with intriguing and often scandalous personal lives. But this rule still holds true for music that isn't massively popular. Computers have been able to draw pictures for a long time, but they haven't been displacing human artists.
  
  Secondly, this is an *amazing* tool for people who create music. Not because it allows them to kick the lead singer they've always hated out of their band, but because it opens new possibilities. Wow, I can add a quick background vocal in a few minutes. Or hey, I can whip up a quick prototype and see how the tune sounds before I use up my singer's time rehearsing and refining the track. I can use this for all sorts of interesting applications. I can't sing, but I can track songs in Reason or Cakewalk. Now I can add some cool vocals. Wow, I've just added a new dimension to my one-man-band. For the independent home-recording artist in particular, this is an exciting new technology.
  
  So drop the Aldous Huxley BS (no offense to a great author intended) and just accept it. This is here, it has lots of valuable applications, it's never going to replace flesh-and-blood humans, which are inherently social entities.
  
  There's nothing "scary" about it.
This isn't really NEW by Arker · 2003-11-23 17:06 · Score: 4, Insightful

It sounds like they've gone to much greater lengths on this project than any I'm aware of in the past, but the basic thing here has been out for a long time. Most any keyboard you can buy has human voices. A single sample can be spread out over your keyboard and sing any pitch you want, even glides and stuff, pretty easily. But it's generally fairly rudimentary - 'ahh' and 'ohh' or similar, you can actually do some nice sounding background vocals but not sing verses.
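A minimal sketch of how one recorded sample can be "spread out" over a keyboard, assuming the simplest possible approach: play the sample back faster or slower by the equal-temperament ratio 2^(n/12) for a shift of n semitones. The function names and the sine-wave stand-in for a recorded 'ahh' are made up for illustration; real keyboards are more sophisticated than this.

```python
import math

def semitone_ratio(semitones):
    """Playback-speed ratio for a pitch shift in equal temperament."""
    return 2.0 ** (semitones / 12.0)

def resample(sample, ratio):
    """Read `sample` at `ratio` times normal speed, with linear interpolation."""
    if len(sample) < 2:
        return list(sample)
    out = []
    pos = 0.0
    last = len(sample) - 1
    while pos < last:
        i = int(pos)
        frac = pos - i
        # Blend the two neighbouring samples around the fractional position.
        out.append(sample[i] * (1.0 - frac) + sample[i + 1] * frac)
        pos += ratio
    return out

def make_tone(freq_hz, duration_s, rate_hz=8000):
    """Hypothetical stand-in for a recorded 'ahh': a plain sine wave."""
    n = int(duration_s * rate_hz)
    return [math.sin(2 * math.pi * freq_hz * t / rate_hz) for t in range(n)]

base = make_tone(220.0, 0.5)                     # the one recorded note
octave_up = resample(base, semitone_ratio(12))   # same sample, 12 keys higher
```

Note that this naive resampling also shortens the note and shifts the vowel's character along with the pitch, which is one reason a single sample only sounds natural over a limited range of keys.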

From the description in the article, this 'new' thing is really just an inevitable extension of that - they spend about 5 days with a singer, recording her singing many different phonemes and different effects, so that you can then piece together the words to your own song and put it to your own melody in her voice. And, for the moment, they're still aiming at producing background vocals, just more complex ones with the ability to do actual lyrics instead of oohs and aaahs. Could be kind of cool, but it definitely doesn't sound like a 'quantum leap' - just an extension of long-existing technology. I've been expecting to see someone do this for well over 10 years now, ever since I first got to play around with a digital synthesizer.
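As a rough sketch of the "piece together the words" step, assuming the recorded phonemes are just lists of audio samples kept in a lookup table: the snippet below strings units together with a short linear crossfade at each join. The phoneme names and the constant-valued placeholder data are hypothetical, and this is not a description of how the product in the article actually works.

```python
def crossfade_concat(units, overlap):
    """Join sample lists end to end, blending `overlap` samples at each seam."""
    out = list(units[0])
    for unit in units[1:]:
        n = min(overlap, len(out), len(unit))
        for k in range(n):
            w = (k + 1) / (n + 1)          # fade-in weight for the new unit
            idx = len(out) - n + k         # matching position in the tail of `out`
            out[idx] = out[idx] * (1.0 - w) + unit[k] * w
        out.extend(unit[n:])               # append the part that did not overlap
    return out

# Hypothetical phoneme bank: placeholder data standing in for real recordings.
bank = {
    "s": [0.1] * 100,
    "ih": [0.5] * 100,
    "ng": [0.3] * 100,
}

word = crossfade_concat([bank[p] for p in ["s", "ih", "ng"]], overlap=10)
```

Each join consumes `overlap` samples, so three 100-sample units with a 10-sample crossfade come out 280 samples long. Fitting the result to a melody would still need pitch and duration control on top of this.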

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Friends don't let friends enable ecmascript.
1. Re:This isn't really NEW by iabervon · 2003-11-23 19:38 · Score: 4, Insightful
  
  It might actually be easier to do singing than normal speech, because singing replaces intonation, tempo, and some of stress, all of which otherwise have to be determined from a syntactic and semantic analysis of the text in order to really sound right. There have been people who have learned to sing songs in languages they didn't know at all, while I have yet to hear of someone giving a lecture (as convincingly) in a language they didn't know.
Who would do this? by Anonymous Coward · 2003-11-23 17:09 · Score: 2, Insightful

They said that they needed someone to sing 5 hours a day for a week...so that they could make him/her obsolete? In what life would you make yourself obsolete in your chosen profession for a week's pay?!
Re:This is not new technology... by farrellj · 2003-11-23 17:35 · Score: 1, Insightful

I would remove Christina from that list, that chick can sing! At 5 years old that chick could sing better than 90% of the people on the charts today.

I'm not a fan of her music, but credit where credit is due!

--
CAN-CON 2019 - Ottawa's only book oriented Science Fiction Convention! October 18-20, Sheraton Hotel, Ottawa, Canada h
Not really buying it. by Reteo+Varala · 2003-11-23 17:49 · Score: 3, Insightful

I don't know about you guys, but personally, I don't think there's that big a risk of performers really being replaced. At least, not in toto.

Now, "popular music" notwithstanding, it takes more than just hitting the right notes and holding them to make music. This applies muchly to instruments, and doubly so for voices.

First of all, not just any combination of notes makes music... artists have to play with hundreds of variations of tones to find "that perfect sequence," the collection of tones in a specific order, length, and style that produce a pleasing arrangement. Once that has been found, further arrangements of music are patterned and fitted to that sequence. You can have a synthesizer, but someone's still programming it... and not with numbers, either.

Voices are many times more complex than musical instruments, because not only is there tone, volume, and length, but there is, for lack of a better term (in my own knowledge), shape of the sound. The artist Karl Jenkins (of "Adiemus" fame) used singers and a nonsensical language specifically to capitalize on that very set of qualities... using the human voice and speech as another "Instrument," rather than as lyrics.

Now, you could synth using the phonemes and vocal qualities of a singer, but ultimately, without the feeling behind the voice, no amount of coding will put any life to it.

--
The Penguin Producer
1. Re:Not really buying it. by Reteo+Varala · 2003-11-23 18:44 · Score: 4, Insightful
  
  I can see where you're going with that argument, and to be quite honest, I don't put much faith in AI, either. The best example of what I think about it is based in an old Infocom game, "A Mind Forever Voyaging."
  
  Artificial intelligence isn't truly artificial sentience until it has the capability of experiencing its own existence. Living organisms that possess such self-awareness have thousands of input devices, known as nerve receptors, which alert them to the presence of anything to their immediate position. By this, one must learn to recognize the receptors' data. After a long time of learning the abilities of those receptors, and their cousins, the motor nerves (which activate muscle groups for the purpose of movement), self-awareness becomes available, because everywhere on the human body has such receptors, and what doesn't isn't really the human body.
  
  With this knowledge, the person then begins to learn what is a pleasant experience to those receptors, and what is pain. With pleasure/pain, over time, the person begins to develop affections and apprehensions, which give way to full emotional response. Some additional functions in the body help this along, such as endorphins which improve the pleasure state in the brain, and thus, the body... further enhancing the personal experiences.
  
  Now, a computer would have to have MASSIVE amounts of electric and processing power to activate and stimulate such receptors, should miniaturization ever allow such devices to be manufactured cheaply and at such quantity to compare with the human's nervous system. And without that system, a computer cannot develop the deep, intricate levels of affection/apprehensions that would allow for emotional responses.
  
  Add to this the fact that a computer would have to be able to process all of this in realtime, over approximately 12-18 years to truly mature into a true artificial sentience.
  
  Now, what does this have to do with music?
  
  Music is all about experience. People write what they know, and they sing how they feel. Experience is a byproduct of sentience, which most definitely means that computerized music, which can please and FOOL audiences, is yet a long time in coming.
  
  --
  The Penguin Producer
Re:Scary, or progress? by Anonymous Coward · 2003-11-23 17:56 · Score: 4, Insightful

What is wonderful about this is that (initially at least) it will devalue the type of generic boy/grrl band trash music which so saturates the current pop market. That's got to be scary for organisations like the RIAA, who actively market the interchangeable swill.

When pitch perfection and standardised voices are available from a $300.00 software package, music made by people with interesting voices and offbeat musical philosophies will be that much more valuable.

After all, it seems unlikely that there'll be a software Tom Waits or a digital Johnny Rotten in our immediate future. Punk revival anyone?
There are more artists than performance artists by LuxFX · 2003-11-23 18:15 · Score: 5, Insightful

Over the past few decades, advances in computer hardware and software have eliminated many jobs... some technical, some menial, but none artistic

Ever hear of a cel animator?

--
Punctanym: alternate spelling of words using punctuation or numerals in place of some or all of its letters; see 'leet'
Walk before you run, talk before you sing by shirai · 2003-11-23 19:19 · Score: 4, Insightful

Considering that normal speech synthesis has not been done well, singing seems to be hard. Already people can take a bad singer and turn them into a good singer but complete synthesis seems unlikely.

Furthermore, this tech is likely not going to be what you think. What makes a singer good is their INTERPRETATION of the notes. Even with proper synthesis, at its best, it will be like computer animation. It could be very good and maybe even perfect but it would be TIME CONSUMING. Watch the making of Finding Nemo on the DVD to get an idea of how hard it is to understand emoting.

You would really need to spend a large amount of time figuring out how to make the voice sound EMOTIONAL.

--
Sunny
Be my Friend
1. Re:Walk before you run, talk before you sing by CmdrGravy · 2003-11-23 21:03 · Score: 2, Insightful
  
  "You would really need to spend a large amount of time figuring out how to make the voice sound EMOTIONAL."
  Not if you were wanting to create a virtual Kylie, Atomic Kitten, Gareth Gates, Westlife, Robbie Williams, Christina Aguilera, Madonna, Cher, Britney Spears, J-Lo, Random Pop Muppet you wouldn't.
Re:Macintosh speech synthesis by silentbozo · 2003-11-23 19:25 · Score: 2, Insightful

From the samples I've heard, Apple TTS has only made incremental improvements. We're still talking mostly about stringing pre-recorded phonemes together, guided by a semi-intelligent system for decoding written text into the audio equivalent for the speech engine.

What I'd like to see is physical modeling of the speech apparatus - lungs, vocal cords, mouth, tongue, teeth, lips, where you can vary parameters such as articulation, etc. We have the computational power to drive such a simulation, witness the brain-dazzling graphics $200 3d cards can pump out. Couple that on-demand speech engine with a decent text to speech translator, and say goodbye to poor phoneme transitions and inappropriate articulations.
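For a feel of what driving a voice from parameters might look like, here is a much cruder cousin of full physical modeling: a source-filter sketch, in which a pulse train stands in for the vocal cords and two-pole resonators stand in for the resonances of the mouth. This is a swapped-in simplification, not a simulation of lungs, tongue, or lips, and the formant frequencies and bandwidths are rough illustrative values for an "ah"-like vowel rather than measured data.

```python
import math

def pulse_train(f0_hz, n_samples, rate_hz):
    """Crude glottal source: one unit impulse per pitch period."""
    period = int(rate_hz / f0_hz)
    return [1.0 if i % period == 0 else 0.0 for i in range(n_samples)]

def resonator(signal, center_hz, bandwidth_hz, rate_hz):
    """Two-pole resonator: y[n] = x[n] + a1*y[n-1] + a2*y[n-2]."""
    r = math.exp(-math.pi * bandwidth_hz / rate_hz)   # pole radius, < 1 so stable
    theta = 2.0 * math.pi * center_hz / rate_hz       # pole angle
    a1 = 2.0 * r * math.cos(theta)
    a2 = -r * r
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

RATE = 8000
source = pulse_train(110.0, RATE // 4, RATE)   # a quarter second at 110 Hz

# Assumed formants (center Hz, bandwidth Hz); illustrative values only.
vowel = source
for center, bandwidth in [(700.0, 80.0), (1100.0, 90.0)]:
    vowel = resonator(vowel, center, bandwidth, RATE)
```

Changing the pitch means changing `f0_hz`, and changing the vowel means moving the resonator centers, which is the kind of on-demand control that stitched-together recordings do not give you.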

Of course, this technology would probably find its first applications in interactive 3d porn... Grant applications anybody?
Re:I hate to shoot your ego, but... by littlerubberfeet · 2003-11-23 19:39 · Score: 2, Insightful

Ouch.

But you are mostly correct unfortunately. Most Discovery/TLC programs use library or "needle drop" music. It is usually pretty crappy. The studio I work at writes new music for shows, and tailors it to scenes. There is a cymbal crash when "Dr. Brady" gets bitten by a viper. A bowed gong plays when the rodent gets squeezed to death by an Anaconda. This is a simplification, but it is all possible with computers, instead of a 1,000 dollar gong. But yes, most TV music is utter crap. I will shut up about now...

--
Sig (appended to the end of comments you post, 120 chars)
Re:Jazz by jellomizer · 2003-11-24 00:15 · Score: 2, Insightful

That is a good point. So far as I see it Jazz has seemed to be the pinnacle of good intelligent music (lyrics aside). Forms after Jazz have seemed to become more minimalist and simpler to produce, and much easier to be analyzed using music theory. Heck techno music can be analyzed by a cartoon character. Jazz as an art form is a very complex and demanding performance where every player is allowed their self expression but yet they need to work in a team to make sure that they are fitting in with the other players. It requires a strong understanding of what sounds good, a keen ear, and understanding of the other players and their own style. These are a lot of features that cannot be broken down so mathematically. Unlike most other forms of music Jazz requires real teamwork. Classical music requires perfection, which a computer can reproduce. Rock (well, more Pop rock, since a lot of older Rock had more of a Jazz element in it) is a very simple type of music, with a simple rhythm followed by simple chords and a solo on top of it. The solo can be complex, but it just has to fit into the simple music behind it, so it is still easy for a computer to follow.

--
If something is so important that you feel the need to post it on the internet... It probably isn't that important.
Re:At last, the ultimate weapon against the RIAA by Zeriel · 2003-11-24 04:06 · Score: 3, Insightful

You're misreading the bolded statement.

Basically, that means that the copyright owner must have released the song for sale in some form...if it's on an album you could have bought at some point, the artist HAS to let you cover it for the stated fees--that's the point of compulsory licensing, the songwriter doesn't get a choice.

The clause you bolded is to prevent me from doing something like singing a previously unreleased Johnny Cash (for example) song without permission by citing the compulsory licensing law.

--
"America has done some terrible things. But I know that Americans don't cheer when innocents die." -Dave Barry