Pencigraphy: Image Composites from Video
jafuser writes: "Prof. Steve Mann (of cyborg fame) has a detailed technical description on his site that demonstrates a method of transforming video into a high resolution composite image. Pictures are seamlessly mosaiced together to form one larger picture of the scene. Portions of the video that were "zoomed in" will result in a much clearer region in the final picture. I wonder if this could be used in a linear sequence to 'restore' old video to higher resolutions? It's on sourceforge; download and play!" Mann has been experimenting with such composites using personal video cameras for years.
Video Orbits of the Projective Group:
A New Perspective on Image Compositing.
Steve Mann
Abstract
A new technique has been developed for estimating the projective (homographic) coordinate transformation between pairs of images of a static scene, taken with a camera that is free to pan, tilt, rotate about its optical axis, and zoom. The technique solves the problem for two cases:
* images taken from the same location of an arbitrary 3-D scene, or
* images taken from arbitrary locations of a flat scene.
The technique, first published in 1993,
@INPROCEEDINGS{mannist,
AUTHOR = "S. Mann",
TITLE = "Compositing Multiple Pictures of the Same Scene",
Organization = {The Society of Imaging Science and Technology},
BOOKTITLE = {Proceedings of the 46th Annual {IS\&T} Conference},
Address = {Cambridge, Massachusetts},
Month = {May 9-14},
pages = "50--52",
note = "ISBN: 0-89208-171-6",
YEAR = {1993}
}
has recently been published in more detail in:
@techreport{manntip,
author = "S. Mann and R. W. Picard",
title = "Video orbits of the projective group;
A simple approach to featureless estimation of parameters",
institution = "Massachusetts Institute of Technology",
type = "TR",
number = "338",
address = "Cambridge, Ma",
month = "See http://n1nlf-1.eecg.toronto.edu/tip.ps.gz",
note = "Also appears {IEEE} Trans. Image Proc., Sept 1997, Vol. 6 No. 9",
year = 1995}
(The aspect of the 1993 paper dealing with differently exposed pictures to appear in a later Proc IEEE paper; please contact author of this WWW page if you're interested in knowing more about extending dynamic range by combining differently exposed pictures, or getting a preprint.)
A pdf file of the above publication, as it originally appeared, with the original pagination, etc., is also available.
The new algorithm is applied to the task of constructing high resolution still images from video. This approach generalizes inter-frame camera motion estimation methods which have previously used an affine model and/or which have relied upon finding points of correspondence between the image frames.
The new method, which allows an image to be created by ``painting with video'' is used in conjunction with a wearable wireless webcam, so that image mosaics can be generated simply by looking around, in a sense, ``painting with looks''.
Introduction
Combining multiple pictures of the same static scene allows for a higher ``resolution'' image to be constructed. [Figure: example of image composite from the IS&T 1993 paper.] In the above example, the spatial extent of the image is increased by panning the camera while mosaicing, and the spatial resolution is increased by zooming the camera and by combining overlapping frames from different viewpoints.
Note that the author overran the panning to appear twice in the composite picture (this is an old trick dating back to the days of the 1904 Kodak Cirkut No. 10 camera, which is still used to take the freshman portraits in Killian Court, and there are several people who still overrun the camera to get in the picture twice). Note also that the author appears sharper on the right than on the left because of the zooming in (``saliency'') at that region of the image.
Note also that, unlike previous methods based on the affine model, the inserts are not parallelogram-shaped (i.e. not affine), because a projective (homographic) coordinate transformation is used here rather than the affine coordinate transformation.
The difference between the affine model and the projective model is evident in the following figure:
For completeness, other coordinate transformations, such as bilinear and pseudo-perspective, are also shown. Note that the models are presented in two categories, models that exhibit the ``chirping'' effect, and those that do not.
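The parallelogram remark can be checked numerically. The following is a minimal Python sketch, not taken from VideoOrbits: it writes both models as 3x3 matrices acting on homogeneous coordinates, which is the standard textbook form and an assumption about layout here, and the matrix entries are made-up illustration values. An affine map sends the unit square to a parallelogram; putting nonzero entries in the bottom row (the two extra ``chirp'' parameters, eight in total) does not.

```python
import numpy as np

def apply_homography(H, pts):
    """Map an (N, 2) array of points through a 3x3 projective matrix H."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])  # lift to homogeneous coords
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]              # divide out the scale

def is_parallelogram(quad, tol=1e-9):
    """Opposite sides of a parallelogram are equal as vectors."""
    a, b, c, d = quad
    return bool(np.allclose(b - a, c - d, atol=tol))

square = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)

# Affine: bottom row fixed at (0, 0, 1), six free parameters (illustrative values).
affine = np.array([[1.0, 0.3, 0.1],
                   [0.2, 1.0, 0.0],
                   [0.0, 0.0, 1.0]])

# Projective: a nonzero bottom row adds two more parameters (illustrative values).
projective = affine.copy()
projective[2, :2] = [0.4, 0.1]

affine_quad = apply_homography(affine, square)
projective_quad = apply_homography(projective, square)

print(is_parallelogram(affine_quad))      # affine image of the square
print(is_parallelogram(projective_quad))  # projective image of the square
```

The division by the third homogeneous coordinate is what makes the projective case position-dependent: points farther along the direction of the bottom-row vector are scaled down more, so opposite sides stop being parallel.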
Examples
1. Extreme wide-angle architectural shot. A wide-sweeping panorama is presented in a distortion-free form (i.e. where straight lines map to straight lines).
2. My point of view at Wal-Mart (medium-resolution greyscale image; a somewhat higher resolution image is also available; a much higher resolution version of this same picture, in either 192 bit color (double) or 24 bit color (uchar), is available upon request).
3. ``Claire'' image sequence Paul Hubel aims a hand-held video camera at his wife. Although the scene is not completely static and there is no constraint to keep the camera center of projection (COP) fixed, the algorithm produces a reasonable composite image.
4. An ``environment map'' of the Media Lab's ``computer garden''.
5. Head-mounted camera at a restaurant
6. Outdoor scene with people, close-up (Alan Alda interviewing me for Sci.Am "FRONTIERS").
7. National Geographic visit
See a gallery of quantigraphic image composites
Obtain (download) latest version of VideoOrbits freesource from sourceforge
or, if you prefer, take a look at an older version (download of old version); if you don't want to obtain the whole tar file, you can take a look at the README of the old version. Send bugs, bug reports, suggestions for features, etc. to: mann@eecg.toronto.edu, fungja@eyetap.org, corey@eyetap.org
My original Matlab files upon which the C version of orbits is based (these in-turn were based on my PV-Wave and FORTRAN code)
For more info on orbits, see chapter 6 of the textbook. Steve's personal Web page
List of publications
Cretin - a powerful and flexible CD reencoder
Come on man, would u think that guy was a prof or some geek.. Looms to me he is from the CIA... check out his pic ;))
I doubt this could be used to "restore" old film or videos to a higher resolution. From my understanding, that would require many different angles and positions. You can't just take a single camera position and improve the picture based just on that picture itself.
All the video footage I have is only close-ups with no camera movements to speak of.
I love you, Stéphanie
Judging from the description found in that article, I believe that it is possible to enhance old video to higher quality. However, the quality of color sometimes cannot be enhanced no matter what. Unless one has access to the original film reel, it is unlikely that any sort of improvements could be made; video copies are utterly useless in this manner. Anything from before 1990 in VHS is much worse quality, case in point being the John Woo film A Better Tomorrow. The problem with these videos is that not only is the picture blurry, but the color blending is off and sometimes exceeds the lines it should, creating distorted images. I've seen this in a lot of older movies... I wonder if there's a way to correct this. :)
At any rate this looks very promising indeed... it'd be cool to see some of the old classics in better quality.
All your hi-res video are belong to us.
"PC Load Letter? What the $@#% does that mean?!"
Can we be sure his head didn't explode?
-=- I heard rumours about an OS called "Social Life", heard of it? Is it stable? -=-
Yes, he's managed to do what every genius in the FBI, CIA, and NSA have been doing for years with a few lines of code.
Idiot.
"Good things don't end with eum, they end with mania or teria." - H. Simpson
this was on memepool yesterday. gr.
What other apps could this be used for? Sure, it's fun now, but what could it do for humanity?
Surgery Camera? there's already some out there, but they have very distorted views from the lens and displays
Security cameras? They could make a picture easier to interpret
Movies? They'd look a lot different, like a Fear and Loathing look. but it'd be cool!
improved pr0n? w00h00!
Other ideas? Reply here!
The new method, which allows an image to be created by ``painting with video'' is used in conjunction with a wearable wireless webcam, so that image mosaics can be generated simply by looking around, in a sense, ``painting with looks''.
Just in case anyone was wondering - this wasn't being done in anything close to real-time the last time I checked. There's a cluster in Prof. Mann's lab which is dedicated to compositing these images (my cube is in the next room).
Still an interesting project. The affine transformation approach has been well-understood for some time (you do a brute force and ignorance test of promising-looking affine transformations [rotations and scalings] to find one that matches the new image to the old). As far as I can tell, he's doing the same thing with a different coordinate system that has a bit less distortion.
This can of course be done (compositing multiple images to create a large image) but the problem is that each lens appears to see a slightly different image (much like human vision with two eyes) and as such the stereo effect is present. You can create an image but it will not work perfectly for all cases. If the scene is far enough away the drawbacks will be minimal, but as objects get closer this will have effects. What would be more interesting would be to use the dual cameras to generate two video feeds that could be piped into an HMD (head mounted display) with two displays (one for each eye), and then the stereo effect would produce a 3D view for the camera source, increasing realism. The larger image would let you see more; just don't expect four 640x480 images to create a seamless 1280x960. You will need some overlap, and the four images will not be from the same perspective, so it will always look like four images pasted together.
First he whines about there being spy cameras everywhere (IEEE ISWC 2001 Zurich) and then he does work to make them more effective. What's the deal?
Video Orbits of the Projective Group.
here
Candygram for Mongo!
Actually, No one at any of those TLAs has thought of doing this.
I wonder if using this on the Zapruder film will show anything interesting.
This new development of high-res composite images, along with the series of volcano eruptions that have been occurring in Japan, is clearly another sign that Linux will triumph over Microsoft, and who knows, maybe one day, even over Apple! That's not all; just as Spider-man is a pinnacle of the American patriotic awakening against the forces of Axis of Evil risen out of the ashes of post-9/11, this development is a milestone that sets the end of LucasArts' Star Wars empire, making the way for Lord of the Rings as a ray of light against Lucas' seemingly everlasting hold on nerd culture. Please, do mod me as troll/flame bait.
All of the different recordings for a given movie are commensurably low-quality, but wouldn't it be great if you combine the best aspects of each (a "greater of goods") to generate one sharp quality movie. Testing it should be a little easier since you could use the rectangular silk-screen to calibrate the images. Food for thought.
-jc
This is a very interesting and inspired use of technologies that is giving some great results. However, one thing that is not being taken into account here is that video is shot over time: subsequent frames of a scene represent a change in the scene as things progress over time. Thus for anything other than a static scene (which is not of too much use) this can cause problems.
Take for instance the example on the main page of this (if it's not slashdotted already), the two swimmers standing ready to dive in. In a real-world situation, by the time the first picture of the swimmer on the left was taken, the one on the right may have already dived in; when it comes time to take that one's picture, he would be already swimming away. Hence if these images were composited, it would look like one dived in while the other was still on the blocks.
Possibly of artistic interest, but otherwise a bit of an annoyance in what is definitely a very cool use of technology. It's interesting that after 100 years or so, we could be back at the point where someone says "hold still for a few seconds, i'm going to take a picture".
Fross
The site's very /.'ed, but I believe what's done is similar to a technology used by security firms and the military. Essentially, when you take a picture of a given object/scene, the "true" resolution (comprised of each individual photon bouncing off the objects and striking the lens) is always downsampled, to varying degrees, depending on the resolution of the camera. However, if a camera is moving, while each individual frame will be of equal resolution, the particular data that each is storing will contain different information about the object/scene. If, for example, the camera is pointed at a grayscale gradient that's so small it only occupies one pixel, that pixel might appear white, black, or somewhere in between depending on the exact orientation of the camera, and in a regular video would probably look like some indistinct blur between these colors. With analysis, the changes can be examined and used to create an image that accurately portrays the gradient.
Traditionally, this has only been done with motionless cameras; it sounds like what this professor has done is to extend these capabilities to moving and zooming video, which is extremely cool (and I really want to check out his site, so everyone else stop going there :).
Mod me down, and I will become more powerful than you can possibly imagine!
Read Mortal Error. One of his own Secret Service men hanging off the back of the car shot JFK by accident as the driver suddenly accelerated after hearing Oswald's shot. Good book.
Wow. You've proven you read memepool. Congratulations.
Any bets on how long the government has had this technology?
I think it's a fantastic proof-of-concept, and I'm also glad it is open source simply because it is so very useful. Ever watch COPS on Fox, or America's Most Wanted? Say goodbye to those grainy security camera images. I don't see why this couldn't be applied _overnight_ at every precinct in the country.
https://www.accountkiller.com/removal-requested
This method sounds like a very sophisticated form of interlacing often found on TVs and monitors.
But instead of just merging in adjacent horizontal lines in subsequent frames, he is applying - like it says in the article - a more seamless approach in two dimensions rather than one.
Interesting stuff, indeed.
Grabbed this a little before the server collapsed
The four main programs you need to use to assemble such image sets are estpchirp2m, pintegrate, pchirp2nocrop, and cement (Computer Enhanced Multiple Exposure Numerical Technique).
The programs use the ``isatty'' feature of the C programming language to provide documentation which is accessed by running them with no command line arguments (e.g. from a TTY) to get a help screen. The sections for each program give usage hints where appropriate. Future versions will support the ``pipe'' construct (e.g. some programs may be used without command line arguments but will still do the right thing in this case rather than just printing a help message).
The first program you need to run is estpchirp2m, which estimates coordinate transformation parameters between pairs of images. These ``chirp'' parameters are sets of eight real-valued quantities which indicate a projective (i.e., affine plus chirp) coordinate transformation on an image.
The images are generally numbered sequentially, for example, v000.ppm, v001.ppm, ... v116.ppm (e.g. for an image sequence with 117 pictures in it).
After you run estpchirp2m on all successive pairs of input images in the sequence, the result will be a set of sets of eight numbers, in ASCII text, one set of numbers per row of text (the numbers separated by white space). The number of lines in the output ASCII text file will be one less than the total number of frames in the input image sequence. For example, if you have a 117-frame sequence (e.g. image files numbered v000.ppm to v116.ppm), there will be 116 lines of ASCII text in the output file from estpchirp2m.
The first row of the text file (i.e. the first set of numbers) indicates the coordinate transformation between frame 0 and frame 1; the second row, the coordinate transformation between frame 1 and frame 2, and so on. A typical filename for these parameters is ``parameters_pairwise.txt''.
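The file layout described above (one row of eight whitespace-separated reals per successive image pair, so one row fewer than there are frames) is easy to sanity-check before going further. A minimal Python sketch follows; the helper names and the sample numbers are made up for illustration and are not part of VideoOrbits, and the meaning of each of the eight columns is deliberately left uninterpreted because the README excerpt does not give it.

```python
def parse_chirp_rows(text):
    """Parse whitespace-separated rows of eight real numbers (hypothetical helper)."""
    rows = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # tolerate blank lines
        if len(fields) != 8:
            raise ValueError(f"line {line_no}: expected 8 numbers, got {len(fields)}")
        rows.append([float(f) for f in fields])
    return rows


def matches_frame_count(rows, n_frames):
    """Pairwise output should have one row fewer than there are frames."""
    return len(rows) == n_frames - 1


# Made-up parameter values for a three-frame sequence (two pairwise rows).
sample = """\
1.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0
1.0 0.0 5.0 0.0 1.0 0.0 0.0 0.0
"""
rows = parse_chirp_rows(sample)
print(len(rows), matches_frame_count(rows, 3))
```

For the 117-frame example in the text, the same check would be `matches_frame_count(rows, 117)` against a 116-row file.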
These pairwise relative parameter estimates are then to be converted into ``integrated'' absolute coordinate transformation parameters (i.e. coordinate transformations with respect to some selected `reference frame'). This conversion is done by a program called pintegrate.
This program takes as input the filename of the ASCII text file of parameters produced by estpchirp2m (e.g. ``parameters_pairwise.txt'') and a `reference frame' (specified by the user), and calculates the coordinate transformation parameters from each frame in the image sequence to this specified `reference frame'.
The output of pintegrate is another ASCII text file which lists the set of chirp parameters (again, 8 numbers per chirp parameter, each set of 8 numbers in ASCII, on a new row of text), this time one parameter per frame, designed to be used in order. That is, the first row of the output text file (first set of 8 real numbers) provides the coordinate transformation from frame 0 to the reference frame, the second from frame 1 to the reference frame, and so on.
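The relative-to-absolute conversion described for pintegrate amounts to chaining transformations. Here is a minimal Python sketch of that idea, with loud assumptions: it is not pintegrate's code, it works on 3x3 homogeneous matrices rather than the eight-number rows (how the eight numbers pack into a matrix is not stated in the excerpt), and it assumes pairwise entry i maps frame i coordinates into frame i+1 coordinates.

```python
import numpy as np

def integrate(pairwise, ref):
    """Turn N-1 pairwise transforms into N transforms onto frame `ref`.

    Assumption: pairwise[i] is a 3x3 matrix mapping frame i into frame i+1.
    absolute[i] then maps frame i into the reference frame.
    """
    n = len(pairwise) + 1
    absolute = [None] * n
    absolute[ref] = np.eye(3)
    # Frames before the reference: step forward (i -> i+1), then reuse i+1's result.
    for i in range(ref - 1, -1, -1):
        absolute[i] = absolute[i + 1] @ pairwise[i]
    # Frames after the reference: step backward (i -> i-1) via the inverse.
    for i in range(ref + 1, n):
        absolute[i] = absolute[i - 1] @ np.linalg.inv(pairwise[i - 1])
    # Normalise the overall scale, which is arbitrary for a projective matrix.
    return [A / A[2, 2] for A in absolute]

def translation(tx, ty):
    """Pure-translation transform, used only to make the example easy to check."""
    return np.array([[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]])

# Three frames, two pairwise transforms, frame 1 as the reference.
pairwise = [translation(1.0, 0.0), translation(2.0, 0.0)]
absolute = integrate(pairwise, ref=1)
for A in absolute:
    print(A[0, 2], A[1, 2])
```

Two pairwise transforms in, three absolute transforms out, which matches the row counts the README gives. The normalisation by `A[2, 2]` assumes that entry is nonzero, which holds for transforms close to the identity such as those between overlapping frames.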
The program called pchirp2nocrop takes the ppm or pgm image for each input frame, together with the chirp parameter for this frame from pintegrate, and `dechirps' it (applies the coordinate transformation to bring it into the same coordinates as the reference frame). Generally the parameters passed to pchirp2nocrop are those which come from pintegrate (i.e. absolute parameters, not relative parameters). The output of pchirp2nocrop is another ppm or pgm file.
The program called cement (CEMENT is an acronym for Computer Enhanced Multiple Exposure Numerical Technique) takes the dechirped images (which have been processed by pchirp2nocrop) and assembles them onto one large image `canvas'.
Next up: A 360 panoramic view of the server room exploding :)
"I'd rather have a full bottle in front of me than a full frontal lobotomy"
It is argued that, hidden within the flow of signals from typical cameras, through image processing, to display media, is a homomorphic filter. While homomorphic filtering is often desirable, there are some occasions where it is not. Thus cancellation of this implicit homomorphic filter is proposed, through the introduction of an anti--homomorphic filter. This concept gives rise to the principle of quantigraphic image processing, wherein it is argued that most cameras can be modelled as an array of idealized light meters each linearly responsive to a semi--monotonic function of the quantity of light received, integrated over a fixed spectral response profile. This quantity is neither radiometric nor photometric, but, rather, depends only on the spectral response of the sensor elements in the camera.
He has had this software out for a while, I've tried to play with it. NOT easy stuff to pick up and figure out the guts, the source code wasn't meant for your average curious person with coding skills. (Non-OO C code, not that many comments.)
To tell the truth, I'm amazed this hasn't been snapped up by some of the digital camera manufacturers. I know Canon already has a panoramic "helper" that shows part of your last image so you can position the next one.... imagine if it had a built in "Hold down the button and wave your camera around a bit to take a wild angle pic"
It would be really neat if it could interlace multiple video streams into a higher-resolution single stream.
Use of such a technique to defeat no-copy flags left as an exercise.
I saw an article a few weeks ago about some DoD fooling about with tech that merged multiple cameras (at fixed locations) into a 3-D model that could be viewed from different angles in realtime. Anybody have a link to that one?
Can I descramble cable television pr0n, yet?
dmarien
Please stop with the all your base shit. It stopped being funny about 15 minutes after the shockwave file was released.
Thx.
- A.P.
"Remember when the U.S. had a drug problem, and then we declared a War On Drugs, and now you can't buy drugs anymore?"
Why don't you just include the memepool link next time? It'd save you space.
The Snappy video snapshot from Play Inc did this years ago IIRC. Even though NTSC res is 720x480, the Snappy was able to squeeze high res pictures out by sampling 2 frames, then performed mathematical magic to achieve resolutions over 1280x1024.
I saw a special on the Discovery Channel where a bank robbery in Britain was foiled because police were able to clean up the grainy, blurry surveillance camera footage using a similar technique.
I wonder how this could be used for stenography...
My Other Computer Is A Data General Nova III.
Ready the Slashdotting!
Right here
"I'd rather have a full bottle in front of me than a full frontal lobotomy"
What if you could make a video recording device that acted like a snapshot camera, where the lens captures images in a fast circumference sweep? I saw JPL in Pasadena had those ultra-fast video recording devices about 15-20 years ago where they could film a balloon popping, and it recorded at like 200 frames per second. Surely we have the technology available at a more reasonable price now. What if you combined a fast frame recording technology, either recording in a horizontal scan or a circular sweep like the hands on a clock, at 200 fps (or 360 fps?). I wonder what kind of resolution you could get from something like that?
I make these: http://beatseqr.com
what a lucky guess
here
dmarien
of marketing from that annoying airbag.
First, he puts a small radio under his skin and he calls himself a 'cyborg', now he takes what NASA did decades ago with probe pictures and calls it his own?
This guy needs to be removed from universities. He is contributing NOTHING.
Oops! Looks like it's /.'ed already or it's not responding for some reason... (?)
My undergraduate design project was with Steve Mann on this technology (objective was the "parallelization" of the software on a Beowulf cluster - shout out to Mike and Anna :) ).
:)
The main use of this system so far has been to stitch multiple images into one panoramic shot. Like any auto-stitching program, this requires a certain amount of overlap between frames - the more overlap, the better the stitching. The code works remarkably well, automatically rotating, zooming, skewing and otherwise transforming the images to fit together and then mapping them into a "flat" image as opposed to a parallelogram-shaped one.
Yes, the higher resolution from multiple shots of the same scene works, and is a very cool effect of the system. Of course, this requires a more or less static scene.
Finally, it's not necessarily "video" that it uses, although pulling individual frames from a video would work. It's based on the head-mounted cameras of the wearcam systems, which essentially use a stripped-down webcam for image-gathering, so you already know the fps and resolution limitations involved with those.
Of course, in the 2 years since I've been there, the technology has probably improved, although I doubt the webpage has.
Mann has a bunch of cool projects involved with the wearcam/wearcomps. This is a great one, another is the Photoquantigraphic Lightspace Rendering (painting with light), which can also be found on the wearcam site.
- In hell, treason is the work of angels.
This is a gateway to pingpong-ball-less motion capture. In future, with sufficient processing power and algorithms, it ought to be possible to combine two lenses spaced apart for stereo with x, y, and z axis positioning sensors. Such a device could combine stereo data, positional data and the understanding that objects "grow" as they come closer to make 3D models of anything it sees. The more time it can watch an object and rotate/zoom around it, the more detailed the model can be. It doesn't even have to make the model in realtime, just record as much data as it can then upload it to more powerful computers later. When does Minority Report take place? 2050 or so? Well by then I fully expect that instead of the flat holograms Tom Cruise watched we'll have full 3D.
One more thing - this isn't done in real-time. It can be run on a single machine and take a fair bit of time as it works through image pairs. Therefore, the more images you use, the longer it takes.
i.e., with 5 images (1, 2, 3, 4, 5) it
compares 1 & 2, 2 & 3, 3 & 4, and 4 & 5. The co-ordinate transformations for each pair are relative to the base image (so you don't have to re-transform after stitching).
There has been work to farm out the comparisons across a Beowulf cluster (the one built when I was there, was of some impressive VA Linux boxes, I believe it's been expanded since). But this still takes some time. So unless someone's going to get a parallel computing cluster inside a single package and make it affordable, this won't be rolled-out nationwide overnight.
- In hell, treason is the work of angels.
You know how television shows will pixelate the face of someone who doesn't want to be shown on television? Sometimes it is just a passerby on MTV's Real World who won't sign a release, but sometimes it's somebody a little more important, like a corporate or federal whistle-blower.
I've long thought that pixelization wasn't a very good way to protect the identities of these people because when they are on video, they move around and the camera sometimes moves around, but often the pixelization is applied in post-production so it stays in a relatively constant location rather than tracking the features on the person's face. Anyone sufficiently motivated and sufficiently equipped with the right tools ought to be able to reconstruct a much higher resolution, non-pixelated image of the secret person's face by extracting all of the useful information from each frame and then correlating it all together with the general movements of the person in the frame.
It sounds to me like pencigraphy is exactly the kind of science required to do something like that. So now the question is, who do we want to unmask? Too bad Deep Throat never made an on camera appearance.
When information is power, privacy is freedom.
Think of this: several small video cams mounted in a hat or headband or anything that the person wears around their person, all the images stitched together to form a mosaic panorama with no distortion, the image itself projected and visible inside the glasses that you are wearing.... Those secret service guys in dark shades would be able to view 360 degrees simultaneously, with the true front of them being the center of the display.
True, this is not possible now; a wearable computer would never have the power to do this in real time, but with Moore's law, you know, it could happen in a few years.
On the scarier side......get some sort of combat suit, that enhances the wearers strength / speed / endurance and provides additional armor and firepower, add this capability and the wearer can suddenly move faster, longer see in all directions simultaneously and target enemies....
I reject your reality
Using it to stitch mosaics together.
Using it to use overlapping images to increase the effective resolution of the camera. This is called "super resolution".
Computer vision types have been doing this for years. Shmuel Peleg of Hebrew University has done some good work and had the work show up in commercial products, including Videobrush: you could take a webcam, wave it around, and in real time get a mosaic. In 1995 (I think). Don't know if you can still get that product.
Do a Google search for him and you'll find his home page and superresolution papers (Peleg and Irani is an accessible paper and one of the first; the concept they used, however, comes from Bruce Lucas' thesis at CMU).
Applications include: combining NASA satellite images of the Earth to get higher resolution, ditto for images of the human retina; and, a personal favorite, smoothing images of the system used at the Superbowl a couple of years ago where they had 75 cameras and could show any play from any angle in live video. That was done by Takeo Kanade, Lucas' advisor.
This was posted on memepool yesterday with an interesting link. This seems to be happening more and more.
Building Better Software
They did something similar some years ago with the mars pathfinder mission. By combining all the images from the stereo-imager and the rover they were able to glue everything together into a textured 3d model.
karma police: arrest this man, he talks in maths; he buzzes like a fridge, he's like a detuned radio. [radiohead]
But I can't find any shell scripts anywhere.
I have a "panorama" series of jpgs that I'd like to stitch together with this package (I already did it by hand in Photoshop, but automated would be sweet.)
Some cameras used for research work (especially in the field of explosives) can go up to (possibly past) 10,000 FPS.
This is film, mind you, not digital, but the image correlation we're discussing isn't realtime anyway - might as well add the step of doing a bulk scan on the film to the equation for the extra FPS.
Whoever modded this guy troll has no sense of humor.
This crap Mann has been pushing is THE *EXACT* SAME program written back almost a decade ago now with only minor bug fixes. Don't feel bad if you can't run it, it's horribly written (massive memory leaks) and Steve obfuscates its mechanism and use with dramatic acronyms. His "cement" program is a glorified raw image loader that applies a lookup table to the values...something you could probably do in a few lines of perl.
Contrary to what some people may infer:
It does not work in real time,
It does not work on images from multiple viewpoints. It doesn't work with sets of images that have parallax, and it's really not all that generally useful.
Well, not exactly, but something using the same principle to effectively antialias and dispeckle your pictures. It only works with a tripod and a static scene.
First, take a few identical pictures of the same scene.
Then, superimpose them in your favorite photo editor.
e.g. if you take 5 pictures, you can decrease the brightness of each to 20%, then add them together, or take a fractal sum average, say the first picture contributes 50%, the second 25%, the third 12.5% etc.
The results are usually very impressive, especially for older cameras.
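To make the averaging recipe above concrete, here is a minimal Python sketch using NumPy on synthetic data; the function names are made up for illustration and the "frames" are a flat grey image plus random noise, not real photographs. One caveat worth labeling: halving weights of 50%, 25%, 12.5%, ... add up to slightly less than 100%, so this sketch rescales them to sum to one, which is an addition not in the comment.

```python
import numpy as np

def equal_average(frames):
    """Each of the N frames contributes 1/N (e.g. 20% each for 5 frames)."""
    return np.mean(np.stack(frames).astype(float), axis=0)

def halving_weights(n):
    """Weights 1/2, 1/4, 1/8, ... rescaled so that they sum to exactly 1."""
    w = 0.5 ** np.arange(1, n + 1)
    return w / w.sum()

def weighted_average(frames, weights):
    """Weighted sum over the frame axis."""
    stack = np.stack(frames).astype(float)
    return np.tensordot(weights, stack, axes=1)

# Synthetic test: a flat grey scene photographed 16 times with sensor noise.
rng = np.random.default_rng(0)
clean = np.full((32, 32), 100.0)
frames = [clean + rng.normal(0.0, 10.0, clean.shape) for _ in range(16)]

err_single = np.std(frames[0] - clean)
err_equal = np.std(equal_average(frames) - clean)
err_halving = np.std(weighted_average(frames, halving_weights(16)) - clean)
print(err_single, err_equal, err_halving)
```

For independent noise of equal strength in every frame, equal weights give the largest noise reduction (roughly the square root of the number of frames); the halving scheme leans on the first frame or two, so it smooths less.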
I don't know anyone that has hyped up their thesis work like he has. For god's sake it's a poorly written 10-year old program that doesn't work very well.
Pretty extravagant name for dynamic range recovery....
If you read the literature, there are other techniques that actually are more sound and do a better job (Debevec's work is at least visually impressive). Most importantly they don't obfuscate the technique by introducing weird ass names and terminology to be special.
Don't worry about it cheezfreak.
I've worked in this guy's lab and you are right; the technique isn't all that useful. Sorry that you got modded down.
Quite similar to Image Mosaics, a project we did in the Image and Video Processing class with Prof. Sclaroff. Here's my take on the project (including the source code), with a pretty good explanation of how to do this: Go here...
Can we say "documentation", people?
I have three pictures, with roughly 2/3rds overlap.
I ran them pairwise (1 and 2, then 2 and 3) through estpchirp2m. Good, I get two output sets of 8 reals. I stuff them into a single file, one on each line.
So I pintegrate that file, using picture #2 as the reference frame. Cool, I now have three sets of eight reals.
Next, I pchirp2nocrop all three separately, passing the appropriate line from pintegrate on the command line (why bother with text files here, if I need to cut-and-paste at this step anyway?). I now have three new .pbm files, which seems like what I should have according to the extremely limited documentation.
Step four, I cement the three new .pbm's together, and get a single file as the output. "Great!", I think, it worked and didn't give me too many problems.
So I open up the picture. Or try to. It seems that whatever the output file has in it, valid .pbm data doesn't top that list.
I tried again, but since I had followed the (limited) directions carefully the first time, my results did not differ.
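For anyone puzzling over the pintegrate step in the workflow above: presumably it chains the pairwise transforms so that every picture maps into the reference picture's coordinates. Here is a rough Python sketch of that idea. It assumes each set of 8 reals fills a 3x3 projective matrix row by row with the last entry fixed at 1; that ordering is a guess, not something taken from the program or its docs, and all function names are made up for illustration.

```python
def to_matrix(p):
    """Pack 8 parameters into a 3x3 matrix (row-major, last entry 1; assumed layout)."""
    return [[p[0], p[1], p[2]], [p[3], p[4], p[5]], [p[6], p[7], 1.0]]


def to_params(m):
    """Scale so the last entry is 1, then return the first 8 entries."""
    s = m[2][2]
    flat = [v / s for row in m for v in row]
    return flat[:8]


def matmul(a, b):
    """3x3 matrix product a * b."""
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def inverse(m):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if det == 0:
        raise ValueError("transform is not invertible")
    return [[(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
            [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
            [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det]]


def integrate(pairwise_params, ref):
    """pairwise_params[i] maps picture i into picture i+1 (0-based).

    Returns one parameter set per picture, each mapping that picture
    into the coordinates of picture `ref`.
    """
    pairwise = [to_matrix(p) for p in pairwise_params]
    n = len(pairwise) + 1
    to_ref = [None] * n
    to_ref[ref] = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    for i in range(ref - 1, -1, -1):      # pictures before the reference
        to_ref[i] = matmul(to_ref[i + 1], pairwise[i])
    for i in range(ref + 1, n):           # pictures after the reference
        to_ref[i] = matmul(to_ref[i - 1], inverse(pairwise[i - 1]))
    return [to_params(m) for m in to_ref]
```

With two pairwise estimates and `ref=1` (picture #2, counting from zero), this returns three parameter sets, which at least matches the "three sets of eight reals" seen above: the reference picture gets the identity, and the picture after it gets the inverse of its pairwise transform.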
So, I have three suggestions for Mr. Cyborg...
First, it doesn't matter *how* cool of a program you write, if no one can figure out how to use it (WRITE SOME REAL DOCS!!!).
Second, it doesn't matter how cool your program *sounds*, if it doesn't work.
Third, 99% of people playing with this will not want to tweak any of the in-between stages' results. Of those that *do*, 99% will just hack the source. Ditch the four (and then some) programs, and make a single executable that takes as its arguments just the names of the input files, in order, and perhaps a *few* tweaking options (like enable or disable filtering, which sounds useful, except YOU DON'T HAVE IT DESCRIBED ANYWHERE!).
Ahem.
Otherwise, great program. No doubt one of the many companies doing the same thing for the past 20 years will soon have their lawyers send their congrats.
Oops, pretend I replied to the parent article (Yeah, I said "No Text", but slashbot won't let me actually omit a message body).
Or similar, Salient Stills has a similar product.
I'm glad to see these products because I proposed doing this in a graduate seminar in the early nineties (was CS undergrad at the time) and the PhD candidate leading the class went on about how it was mathematically impossible (and by extension how dumb I was because I didn't understand that particular math). Righto, Charles.
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
The Hosemaster spun a set of filters sequentially in front of the lens and would set off a different flash for each filter. By lighting different parts of a scene with different flashes, and using different filters with each flash, the photographer could effectively apply different filters to different objects in the scene. For instance you could have two people standing next to each other with one of them shot through a diffusion (fuzzy) filter and the other person sharp. In the late eighties there was a cliché to do portraits with a diffused backlight and the rest sharp. That's how all those pictures of CEOs standing in a murky foggy environment were done.
There is also a fiber optic bundle light source involved, hence the "hose" in the name.
I used to do lots of time exposures outdoors at night using hundreds of flashes from a small Vivitar 283 strobe to illuminate things.
I'm playing with something similar now using a digital camera.
...Imagine how this technology could be used to compress simpsons / futurama episodes?
Hear me out... Mpeg (and DivX) are good, but they're like jpg. If we had something that was good at determining the parts of a screen that don't really change enough, they could be used as a "background" sort of thing, and instead of mpeg or divx, we could encode it into something more along the lines of the old autodesk animator .flc files. This could bring a high-quality simpsons episode down to only a couple of meg + sound.
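The "static background plus changes" idea can be sketched as plain frame differencing: keep the first frame whole, then store only the pixels that differ from the previous frame. The Python below is a toy illustration under simple assumptions (frames are flat lists of pixel values, changes are detected by exact equality) and is not the actual .flc format; the function names are made up.

```python
def encode(frames):
    """Return (key_frame, deltas).

    Each delta is a dict mapping pixel index -> new value, holding only the
    pixels that changed relative to the previous frame.
    """
    key_frame = list(frames[0])
    deltas = []
    previous = key_frame
    for frame in frames[1:]:
        deltas.append({i: new for i, (old, new) in enumerate(zip(previous, frame))
                       if old != new})
        previous = frame
    return key_frame, deltas


def decode(key_frame, deltas):
    """Rebuild every frame by applying the deltas in order."""
    frames = [list(key_frame)]
    for delta in deltas:
        frame = list(frames[-1])
        for index, value in delta.items():
            frame[index] = value
        frames.append(frame)
    return frames
```

Flat-shaded animation with a mostly static background would produce small deltas under a scheme like this. Real captured video has noise in every pixel, so a practical version would need a change threshold rather than exact comparison.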
Send lawyers, guns, and money!
I first saw this same example of converting video to a single image at MIT in the early 90's, and it got me thinking... If you had enough computer power to solve thousands of simultaneous equations, you could in effect capture not only the image, but also the geometry of an environment. So, you could walk through a building with your video camera, then download it to your Beowulf cluster, and come back in a couple of days and see the actual environment in 3D, texture mapped with the images. In fact, you could also calculate many of the properties in the scene by doing lookups on the diffuse, reflective and specular values for the 3D objects (as you walk around, different objects reflect and scatter light differently). I bet we see this in our lifetime. Think how cool this would be for designing sets and virtual environments...
Actually, that was the TurboFilter, also made by HoseMaster (Aaron Jones, Inc. - if you want to be pedantic :).
I know that product intimately, since my (former) company designed their revised unit with built-in radio triggers (the PocketWizard, which we also designed), the smaller wheel, and two speeds (I don't remember if their original unit had the two speed modes or not). The TurboFilter had 3 filters - clear, light diffusion, and heavy diffusion. The camera would be set for 1/30 or 1/15 second exposure, and the 3 strobes would fire in sequence, and that was it.
The HoseMaster was a fiber optic light painting device. It was generally used on very long bulb exposures. The photographer could direct a narrow (or shaped, in later versions), controlled intensity light source at a subject, and very carefully "paint" exposure differently for each part of the frame.
The company, Aaron Jones Inc., was known as HoseMaster because of their popular product of the same name. My old company had the same thing happen - the company is LPA Design, a name nobody knows, so people call the company PocketWizard, the name of their most popular product. Go figure.
Aaah - the Vivitar 283. You have no idea how many times I've been shocked by the trigger circuit on those (at about 283 volts, by the way :)....
We did a mega-version of that kind of multi-pop photography at the Photo East show in 1999 (or 2000). We shot the Intrepid aircraft carrier from the top of the UPS building across the street from the pier. It was illuminated by thousands of people - about 3000 on board and another 2-3K on the next pier (plus 18 moderate power Metz packs doing multiple full dumps). Each person just walked all over the ship popping their little flashes as fast as they would go. The total exposure time was about 2 minutes, and the shot came out great. You should have seen the faces of the New Yorkers driving by while all the flashing was going on - hilarious.
Have fun (and be careful of the sync jack on that V283 :)
- The Sigless Wonder
When I was a senior in high school I attended a "science and engineering" conference for college-bound seniors. The main presenter at the conference was a researcher at the NASA JPL and Caltech.
He used earth-based telescopes to take pictures of asteroids. The problem was that the pictures were very blurry. They were almost unusable.
To solve the problem, they took ten or fifteen pictures, each from a slightly different angle. The pictures were scanned into a computer, and then a software program would analyze the pictures, producing one much sharper picture. The results were incredible. Of course, that was the point: impress students enough to make them want to be engineers. :)
--Bruce
(My memory of this is a little fuzzy, so a few details might be off.)
There are 10 kinds of people in the world: those who understand binary, and those who don't.
Hmm
I never saw one without the other, so I assumed they were one product.
I was too cheap to rent the wheel, but I faked a similar effect.
The 283 is nothing. I once had an ancient Norman, one of the big heavy ones from the 60s with a transformer in it. It had a switch that was on the capacitor side of the circuit.
Every so often it would arc, and either weld the switch solid or blow it up. One of the scariest things I ever owned.
I traded it in on a P800D in 1979 that I still have.
The nutcase with the implant is Kevin Warwick, a professor of Cybernetics at Reading University, UK.
Steve Mann, who wrote the compositing code that this Slashdot article is about, is a professor at the University of Toronto, teaching wearable computing, and is the one who had his (non-implanted, despite what the Slashdot post says) hardware ripped off in Newfoundland.
THEY ARE NOT THE SAME PERSON.
I'll be presenting a paper which will demonstrate otherwise. It's based on research to be presented at the IEEE International Symposium on Wearable Computing ISWC2002.
I encourage interested persons to read about it when the conference proceedings are released.
I downloaded this stuff about a year ago thinking it would be cool to build a GIMP plugin on top of it to make the whole process a little simpler.
However when I downloaded the tarball it already included a plugin contributed by someone else. This was in one of the 1.x releases directly off Steve's site, not from sourceforge. I just did a quick google for 'video orbits gimp plugin' and nothing leaps out.
I don't think I have the older software - I switched machines since and dumped a lot of stuff - but I'll dig around this afternoon. Anyone remember this plugin and know where the hordes can find it?
So now we can really do:
Enhance 224 to 176. Enhance, stop. Move in, stop. Pull out, track right, stop. Center in, pull back. Stop. Track 45 right. Stop. Center and stop. Enhance 34 to 36. Pan right and pull back. Stop. Enhance 34 to 46. Pull back. Wait a minute, go right, stop. Enhance 57 to 19. Track 45 left. Stop. Enhance 15 to 23. Give me a hard copy right there. W00T! IT'S A FUCKING UNICORN! 0WN3D!
How do you calculate the exposure for that, such that double-flashing the same area doesn't cause it to wash compared to the rest of the image?
Also, this sounds like a great shot, can I see it online anywhere?
Sorry, I left my account details at work, so I'm an AC here:
- get the RPM, not the source
- try it on his example:
#!/bin/sh
# step 1 - motion estimation
estpchirp2m -outp est.txt -steps 4 16 8 4 2 /usr/images/v04.pgm /usr/images/v06.pgm
# keep only last result
tail -1 est.txt > est_last.txt
# step 2 - integrate
pintegrate est_last.txt 0 > int.txt
# once again, keep only last result
tail -1 int.txt > int_last.txt
# step 3 - modify 2nd frame
pchirp2nocrop /usr/images/v06.pgm ./v06_mod.pgm `cat int_last.txt`
# step 4 - merge both images
cement v_final.pgm /usr/images/v04.pgm ./v06_mod.pgm
If you do get a correct result, then you are home and dry; try it on your own data. Cheers.
Not in real-time. -WRONG!
I'll be presenting a paper which will demonstrate otherwise. It's based on research to be presented at the IEEE International Symposium on Wearable Computing ISWC2002.
Without custom ICs or using the Transmogrifier?
Damn, that guy looks strikingly like Inspector (etc) Monkfish offa The Fast Show
without either.
You probably meant to reply to SWPadnos, but the way you calculate the exposure so that double-flashing the same area doesn't cause it to wash out compared to the rest of the image is by double flashing.
What I always did was set the flash on automatic, but for about one stop less exposure than the aperture and film combination called for. Then I would walk around the area I was photographing, flashing each part of it about twice. The idea was to keep moving so that each flash would overlap, and everything would get about two flashes' worth of light. I'd overexpose or underexpose some areas. The really neat part of this technique was that if I got in the picture accidentally in one of the exposures, I wouldn't show up in the photo. This is easier than it sounds.
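The arithmetic behind "one stop under, flashed twice" is easy to check. The small Python sketch below assumes that light from repeated flashes simply adds up and that each stop halves or doubles the light; it ignores film reciprocity effects, and the function names are made up for illustration.

```python
import math


def relative_exposure(stops_under, flashes):
    """Total light on an area, relative to one correctly exposed flash.

    Each flash is set `stops_under` stops below correct exposure, and the
    area receives `flashes` overlapping flashes.
    """
    return flashes * 2.0 ** (-stops_under)


def stops_off(stops_under, flashes):
    """Resulting exposure error in stops (0 means correctly exposed)."""
    return math.log2(relative_exposure(stops_under, flashes))
```

Two flashes at one stop under add up to a correct exposure. An area that gets only one flash ends up one stop under, and one that gets four ends up one stop over, which is the kind of spread negative film tends to tolerate.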
Oops. Yeah, I did.
Ok, so basically you're saying a rough estimate and some guesswork, but also that it's not important to get it perfect.
That's what I would have done, but I thought there might be a trick.
without either.
...We should probably take this up in person. Look at the cube seating list in 2206 to find me (or just look for my name on a blackboard). I'll be in on Monday (no more paper means no more living at the university).
Without using the cluster, for images larger than a postage stamp?
Robustly for just about all video streams you've tried, as opposed to special cases that work well?
The only way you could do this is with an (at worst) O(n log n) or O(n [log n]^2) algorithm for finding features present in multiple image frames and judging the correct transform to use, and being right _all_ the time. This would be quite the accomplishment.
Because the moderator was a moron. Welcome to Slashdot. I M2'd him down, it's our only hope.