Digital Mouths, Synthetic Faces at MIT and Lucasfilm

Posted by chrisd on Wednesday May 15, 2002 @02:12PM from the as-if-memory-wasn't-flawed-enough dept.

jfengel writes "Two separate articles about generating faces automatically. From the Boston Globe, there is a story about MIT scientists putting words into somebody's mouth by splicing together footage. In the samples, I couldn't tell the difference between the synthetic footage and the same person really saying the same thing. (Though it's a little hard to tell at only 81kbps video.) And Wired has a lengthy article about generating purely synthetic faces at Lucasfilm. It discusses some of the difficulties in getting it right."

In related news... by Fat+Casper · 2002-05-15 14:15 · Score: 3, Funny

Forrest Gump and Max Headroom will be hosting a morning show starting next month.

--
I spent a year in Iraq looking for WMD and all I found was this lousy sig.
1. Re:In related news... by sharkey · 2002-05-15 14:24 · Score: 3, Funny
  
  Forrest Gump and Max Headroom will be hosting a morning show starting next month.
  
  "L-l-l-i-f-f-fe is l-l-like a box of ch-ch-ch-ocolates. You n-n-never kn-kn-know what you're gonna get-get-get."
  
  --
  "Outlook not so good." That magic 8-ball knows everything! I'll ask about Exchange Server next.
great news! by larry+bagina · 2002-05-15 14:23 · Score: 3, Funny

this is great. Maybe the lip-syncing in Britney Spears' videos won't be so obvious.

--
Do you even lift?
These aren't the 'roids you're looking for.
1. Re:great news! by cymraeg · 2002-05-15 14:49 · Score: 3, Funny
  
  I don't know about you, but I'm not looking at her mouth...
  
  --
  you don't have to outrun the bear, just the slowest person in your group.
Damn the ethics- full speed ahead! by Fat+Casper · 2002-05-15 14:24 · Score: 5, Interesting

The last thing we need is for the ethical arguments to shut down any of this public research. The uses of it are ethically scary, but I'd feel a lot better with MIT pushing forward with the research than any company doing it. The school will keep people updated on where they've gotten with it, and the world will be better able to judge how much to believe video. It'll be really interesting to see what constitutes proof in 20 years. If the research is done in the open, we might even still be able to believe in it.

--
I spent a year in Iraq looking for WMD and all I found was this lousy sig.
1. Re:Damn the ethics- full speed ahead! by DrSkwid · 2002-05-15 14:38 · Score: 3, Funny
  
  It's easy to spot when a politician is lying - his lips are moving. boom boom!
  
  they may as well be spared having to turn up in person to read today's lies out.
  
  --
  There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
George W. Bush by Devil's+BSD · 2002-05-15 14:25 · Score: 3, Funny

Read my lips: Strategerie means no new taxes. P-o-t-a-t-o-e.

--
I'm the Devil the Windows users warned you about.
subsurface scattering and the bssrdf by kawaldeep · 2002-05-15 14:29 · Score: 4, Informative

henrik wann jensen is developing some of the most usable algorithms for skin and other translucent materials. He gave a talk last month at Cal as a prospective faculty member. It was fairly impressive.

his home page

rendering skin

rendering smoke

--
replace 'berserkeley' with 'berkeley' to respond via email.
Rendering of surface is also critical by Zergwyn · 2002-05-15 14:36 · Score: 3, Informative

I work with 3D design, and can certainly attest to the difficulty in mimicking people. The huge number of muscles and the tiny details of morphology that make up a human face are a tremendously important part of achieving realism. However, ultimately a surface is needed, as it is, in the end, the light that is reflected back to our eyes. How real the surface looks is a required part of the equation, and some of the new advancements being made in rendering are quite exciting to me. For instance, many older raytracers only handle how light directly reflects off the surface of a texture. But in reality, things like human skin are not opaque, but are slightly translucent. The light passes into the skin, reflects off things like blood vessels, and exits again. Light also behaves in other interesting ways in certain situations. And some effects are simply dependent on computational power. Radiosity, for instance, can make scenes look much more realistic, but is too cycle-hungry to be used all the time in full-screen video. Being able to set these sorts of properties without having to program complex custom render modules for each movie will go a long way towards making artificial people more common.
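
To make the opaque-versus-translucent point concrete, here is a minimal sketch in Python. It is not the BSSRDF approach mentioned elsewhere in this thread, and not what any studio renderer does; it uses "wrap lighting", a well-known cheap stand-in that lets light bleed a little past the edge of the lit region, roughly the way slightly translucent skin looks. The function names and the default wrap value of 0.5 are assumptions chosen for illustration.

```python
import math


def lambert(cos_theta: float) -> float:
    """Opaque diffuse term: the surface is lit only where it faces the light."""
    return max(0.0, cos_theta)


def wrap_diffuse(cos_theta: float, wrap: float = 0.5) -> float:
    """Wrap lighting: a crude approximation of translucency.

    `wrap` (hypothetical parameter, 0 = fully opaque behaviour) shifts the
    cutoff so that points angled slightly away from the light still
    receive some illumination.
    """
    return max(0.0, (cos_theta + wrap) / (1.0 + wrap))


if __name__ == "__main__":
    # Angle between the surface normal and the light direction, in degrees.
    for deg in (0, 60, 90, 100, 120):
        c = math.cos(math.radians(deg))
        print(f"{deg:3d} deg  opaque={lambert(c):.3f}  wrapped={wrap_diffuse(c):.3f}")
```

With the opaque term, everything at or past 90 degrees goes black, which gives the hard shadow edge that makes rendered skin look like plastic. The wrapped term still returns a small value just past 90 degrees, softening that edge. A real subsurface model would also account for colour shifts and for how far light travels under the surface, which this sketch ignores.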
Does this mean... by jesser · 2002-05-15 14:41 · Score: 3, Interesting

we'll soon see a video of Dan Rather singing Rocked by Rape?

--
The shareholder is always right.
Not that hard to tell by MikeLambert · 2002-05-15 14:41 · Score: 4, Interesting

I don't think I'm special in this respect, but I didn't find the example clips that were given too hard to discern.

Look for enunciation of certain letters such as P and M, and you should be able to tell the difference. The generated image gives a sense of moving the mouth but not enunciating the words clearly, almost as if she is gliding over the words. With the real movie, however, you can see the woman completely changing her mouth formation to form the sounds required to pronounce the words.
Uses in classic sci fi literature & entertainm by edo-01 · 2002-05-15 14:44 · Score: 5, Interesting

This reminds me of the novel Stainless Steel Rat for President by Harry Harrison. In it Slippery Jim DiGriz is rigging an election, and at one point cuts into the local news broadcast and replaces the newsreader with a digital version that reads the results he wants. It was written 10, 20 years ago? Seems almost prescient considering what happened in Florida in 2000 :-)
Another, more benign use of the tech could be in entertainment. There was that episode of Star Trek: Deep Space Nine where they integrated the actors in with footage from the classic ep, Trouble With Tribbles. Great fun, but they were limited to using footage that existed from the original series for interacting with Kirk, Spock et al. Imagine being able to track Shatner's 60's face onto an actor and use this tech to lipsync 21st century Shatner's dialog. Best. Time Travel. Episode. Ever.
And I don't even like Trek that much :-)
Re: LOTR with your choice of actors.... by texchanchan · 2002-05-15 16:01 · Score: 5, Interesting

the year is 2095. the reviewer speaks:

Let me begin by once again repeating the truism: no video whatsoever can match the scenes as they appear to your imagination during a simple, unaided reading of the three volumes of Tolkien's original text.

With that out of the way, I will say that my own favorite among the video versions is the recent blockbuster edition, followed by the "Midlands" OSc 2072 dist (tuned 2,-1,4,0); and after that, the 2001-2003 movies using the Gibson/Taylor overlay. This review concentrates on videos; I will leave VRs for another day.

There is no need, at this remove, to cite the failings of the Bakshi anime (1978) or Jackson's groundbreaking 2001-2003 live action movie.... However, when WWM re-released the "long" version on tab with a selection of overlays, including Mercer/Tran/Lopez and Gibson/Taylor, the movie was transformed from a mere classic to a paradigm of style. Its effect on a generation resembled the effect of the original books on the "Sixties Era" (roughly 1964-1972). The wildly popular M/T/L overlay, its unearthly beauty toning down the somewhat brutal original video, went straight to the heart of the virals.

At the same time, the first underground OSc version, "OS-LOTR", was in process. Remember that this was before the Hurst case and copyright law was still in the postmillennial phase. Nevertheless, thousands of people participated. By any standard, the first version was pretty primitive. The base disappeared during Hurst. Only 18 snaps survive; ...[and they] show a wide range of competence. Some scenes, such as //this//, are nothing short of brilliant. However, I can't agree with those who believe that a large quantity of sublime art was lost. OSc was in its infancy, and the original consensualists tended to be technical personnel with vivid but unsophisticated imaginations. I have seen all 18 remaining snaps of OS-LOTR, and am convinced that nothing of value was lost to the Tolkienist or to the viewing public.

The first legal OSc version ("OurRing") is also available at universities, but is not worth the casual viewer's time. The maintainers provided no guidance. Story elements of an unsavory nature, having nothing to do with the original books, found their way into the base. Tuning was in its infancy: OurRing provides only five settings in each of three dimensions. The project became overlarge, and never gained popularity outside a hobbyist community. It is of historical interest only, as is the short-lived "Bakshi", based on the anime, begun and closed within a year after OurRing.

"Midlands", on the other hand, became a classic within weeks of startup. It derives most of its visual imagery and pacing from the centennial remake, but retains none of the bizarrer elements. A comparison of snaps is extremely revealing. The earliest still archived (two days in) is almost an exact copy of LOTR-100. In one week more, participation skyrocketed by 6000 percent, and the nine-day snap contains none at all of the odd politico-academic coloration. Note the gradients in this //graph// of the isologs: precipitous in the higher dimensions, almost flat in D1 through D5. Midlands is universally available and is the vehicle through which most young people first meet Tolkien. It is still maintained, although the classic version stabilized in 2072.

Midlands is far more tunable than OurRing. The original tuner, which is part of the OSc v. 5.4 kernel, allowed for 15 dimensions. Addicts and purists apply the 500-dimension Gordon tuner. I have viewed several allegedly "perfectly" Gordon-tuned versions and could see no difference at all. These decimal-place variations invisible to anyone else fuel quite vitriolic disputes in the hobbyist community.

"Zealand" and "Hildebrandt", Midlands' two nearest competitors, have a much smaller following. Zealand is of course based on the 2003 video. Hildebrandt is experimental; it combines OSc and overlay technologies. There is no dist--as the maintainer states in true twentieth-century fashion, it is intended to be a "work in progress", to be "as dynamic as the events it portrays". This can lead to surprises if you view over a period of days instead of capturing the whole thing at once. Its consos also tend to be outside the standard demo.

Last year's remake is, in my opinion, the best of all. Yes, it condenses the story, but this is not a bad thing, as anyone will agree who has played one of the realtime VRs. Stern's directorial imagination could not possibly be closer to Tolkien's original vision. There is, of course, no truth to the rumor that he is a clone of Tolkien made for the purpose. ....
Re:FF? by vitalidea · 2002-05-15 16:25 · Score: 3, Informative

It's easy to build a cartoon of a human but it is difficult to animate a real person that you can compare videos with.

Huh?! I work as a Sr. VFX guy, and CGI (Computer-Generated Imagery) for facial animation is one of the most complex things to do!

Basically, there are so many muscles in the face and so many nuances that it is very difficult to emulate a realistic face. Chris Landreth is a director at Alias|wavefront with whom I had the "pleasure" of working. His entire focus has mainly been on facial animation. And even with his talent, facial animation still doesn't look 100% realistic.

Check out the book: Computer Facial Animation to get a glimpse at the mathematics, anatomy, and other technical hurdles being overcome in this arena.

--

Vital Idea
Not to worry by jcsehak · 2002-05-15 17:03 · Score: 3, Insightful

I mean, this is pretty cool and all, but there's no reason to start worrying if someone's gonna put words in your mouth anytime soon. First they'd need:

1. a few minutes of footage of you saying stuff that has the full range of mouth movements directly into a camera.

2. an audio recording of you actually saying what it is that they want you to say. It's possible to cut and splice separate recordings together, but 99% of the time, differences in the sound space would make it obvious that the recording was spliced together.

And then after that, all they'd have is a video of you saying the thing and staring like a zombie into the camera.

It's cool in theory, but I think Hollywood has already achieved much better results.

Mmm, Gummi Venus De Milo...

--

c-hack.com