Slashdot Mirror


Hardware for Homebrew Motion Capture?

goruka asks: "We are a small garage game development and 3D animation group, and as such, we try to develop by reducing costs as much as we can. Recently, it occurred to us that we could set up and develop a home-brew motion capture system by using three consumer USB web-cams to motion track bright objects attached to the body. However, we don't know which web-cam models: can capture at a decent frame rate (25fps) and resolution; are supported and easily programmed under GNU/Linux, since we'd like to later release our software as open source; and lastly, won't cost us a fortune. What are your experiences with such devices?"

18 of 82 comments

  1. Different solution by JanneM · · Score: 5, Interesting

    Since you probably don't need to do anything real-time with the capture data, I'd suggest that you use whatever inexpensive cameras you can - and record streams onto video. Ideally, you'd borrow three camcorders and use them. Then you can at your leisure transfer the streams to a machine via firewire and calculate 3d-data to your heart's content.

    The benefit of this setup is that you can get away with very cheap hardware (you can probably borrow needed camcorders from friends and family if it's just a temporary deal), and the image quality - resolution, dynamic range, low-light performance, noise - will be a lot better than with a heavily compressed usb-cam stream.

    As for syncing the streams, you have that problem with three usb cams as well (can't capture three usb-streams on the same computer), and with camcorders at least one step up from the bargain bin, you should be able to use sync cabling if you're really concerned about capturing frames at the same instant. I doubt that would be necessary, though, for the kind of precision you're looking at getting (just doing a linear interpolation between captured points as an approximate soft sync should be fine for any movement you can hope to capture at 25/50 frames/s anyway).
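
    A minimal sketch of the "soft sync" idea above, assuming each camera delivers timestamped 2D marker positions. The function name, the data layout and the sample values are hypothetical, not something from the post.

```python
from bisect import bisect_right

def interpolate_marker(samples, t):
    """Linearly interpolate a marker's (x, y) position at time t.

    samples: list of (timestamp_seconds, x, y) tuples sorted by timestamp.
    """
    times = [s[0] for s in samples]
    if not times[0] <= t <= times[-1]:
        raise ValueError("t is outside the recorded time range")
    i = bisect_right(times, t)
    if i == len(samples):  # t is exactly the last timestamp
        return samples[-1][1], samples[-1][2]
    t0, x0, y0 = samples[i - 1]
    t1, x1, y1 = samples[i]
    w = (t - t0) / (t1 - t0)  # 0 at the earlier frame, 1 at the later one
    return x0 + w * (x1 - x0), y0 + w * (y1 - y0)

# Hypothetical data: camera B's frames land 20 ms after camera A's at 25 fps.
camera_b = [(0.02, 100.0, 50.0), (0.06, 104.0, 52.0), (0.10, 108.0, 54.0)]

# Estimate where camera B would have seen the marker at camera A's frame time.
x, y = interpolate_marker(camera_b, 0.04)
```

    This only works if the offset between the streams is known (for example from a clap or a flash visible to all cameras), and it assumes the marker moves roughly in a straight line between frames.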

    --
    Trust the Computer. The Computer is your friend.
    1. Re:Different solution by monopole · · Score: 2, Insightful

      Firewire cameras such as the UniBrain Fire-I allow for synced capture along a firewire daisy chain. You have to adjust the framerate and dynamic range to allow for bandwidth issues.

  2. Axis by Southpaw018 · · Score: 2, Informative

    We run an Axis 207 at work. Pair it up with Zoneminder and you've got yourself a motion capture system, albeit in the form of home security system software.

    --
    ACs are modded -6. I don't read you, I don't mod you, I don't see you. Don't like it? Don't be a coward.
  3. Get yourself a Philips SPC 900NC by Ayanami+Rei · · Score: 3, Interesting

    They run for about $100 and they are available at most CompUSA stores (and nowhere else, it seems).

    Features:
        * 640x480@30fps w/high compression enabled (15 or 10 without)
        * 35mm camera screw mount
        * Manual adjustments on camera (sensor angle and focus ring)
        * Lots of software settings to play with (AGC, white balance, shutter speed, aperture)
        * Compatible with the PWC 10.0.12 drivers from http://saillard.org/linux/pwc/
        * Above all: stable.

    --
    THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
  4. Sorry, here's the URL: by Ayanami+Rei · · Score: 2, Informative

    http://www.compusa.com/products/product_info.asp?pfp=SEARCH&Ntt=philips+900&N=0&Dx=mode+matchall&Nty=1&D=philips+900&Ntk=All&product_code=337160&pfp=srch1

    The reviews are not exaggerating, it's a nice camera.

    I forgot, it has a usb-audio device endpoint too, which is a built-in mic, but that's not important.
    The 1280x960 modes mentioned are software scaling, so they're useless. It's a fairly standard CCD board in the unit that is 640x480. Since it uses a Bayer pattern to filter color, you're going to want to throw away the chroma components in your analysis. You might be able to use chroma for helping it distinguish the balls from the background, but you'll want to use the luma information for accurate tracking.
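
    A rough sketch of luma-based tracking as described above, assuming frames arrive as rows of 8-bit (r, g, b) tuples. The Rec. 601 luma weights are standard; the threshold value and function names are illustrative assumptions, and this version only handles a single bright marker per image.

```python
def luma(r, g, b):
    """Rec. 601 luma from 8-bit RGB components."""
    return 0.299 * r + 0.587 * g + 0.114 * b

def bright_centroid(image, threshold=200.0):
    """Return the luma-weighted (x, y) centroid of all pixels at or above
    the threshold, or None if no pixel is bright enough.

    image: list of rows, each row a list of (r, g, b) tuples.
    """
    total = sum_x = sum_y = 0.0
    for y, row in enumerate(image):
        for x, (r, g, b) in enumerate(row):
            value = luma(r, g, b)
            if value >= threshold:
                total += value
                sum_x += value * x
                sum_y += value * y
    if total == 0:
        return None
    return sum_x / total, sum_y / total

# Tiny synthetic frame: a 2x2 white blob in a dark 4x4 image.
dark, white = (10, 10, 10), (255, 255, 255)
frame = [[dark] * 4 for _ in range(4)]
for yy in (1, 2):
    for xx in (1, 2):
        frame[yy][xx] = white

center = bright_centroid(frame)
```

    Weighting by brightness gives a sub-pixel position estimate, which matters when a marker only covers a handful of pixels.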

    --
    THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
    1. Re:Sorry, here's the URL: by josepha48 · · Score: 2, Informative
      Yeah or you can get a quickcam 4000, they work also. Or if he goes here http://linux-uvc.berlios.de/ he can get a quickcam 5000.

      I've got a 4000 and it does 640x480 at about 30fps.

      --

      Only 'flamers' flame!
      Does slashdot hate my posts?

  5. Did this 6 years ago with camcorders for a dem by Hufo · · Score: 4, Interesting

    This is a lot of work but also a lot of fun! I did it for a real-time demo project with a few friends. We used Christmas fairy lights and 5 mini-VHS camcorders. You can see the result at the very end of our Childbone demo.

    Nowadays, using webcams will save you a lot of trouble, and you can find lots of very useful code on the Internet (such as Intel's OpenCV). However, major issues that you still have to solve would be calibrating camera positions and reliably tracking crossing markers in images. In my system I had to write an editor to manually reassign markers when incorrectly detected or labeled, which can be a very tedious task...

    I would recommend Logitech Quickcam Pro 5000 webcams, as they are USB 2.0, can do 640x480 at 30 fps, and most importantly use the somewhat recent generic USB Video Class spec, for which a driver for linux is available. I have a few of those and the image quality is quite good :)

    Good luck!

    1. Re:Did this 6 years ago with camcorders for a dem by munpfazy · · Score: 2, Interesting

      Interesting.

      I wonder how the big studios deal with marker crossings? (Then again, perhaps they just pay humans to do tedious work.)

      Seems like there must be a cheap hardware solution, given enough time and energy.

      For example, one could put colored filters on the reflectors. By replacing each dot with a cluster of colored dots and then selectively blackening them you could code each point uniquely. It would take some experimentation to figure out how to get the results you need with consumer gear. Presumably you won't have the resolution you'd need to pick out the individual dots, and it might take some work to identify an unresolved marker in a color image. If cameras are either very cheap or very fast with an externally accessible frame sync, you could either use single color filtered cameras or a chopper wheel with several filters on each camera.

      Or you could try to use single-color dots with a range of different colors. It would take some experimentation to see how many unique IDs can be reliably identified. Seems like it could be in the low hundreds though, given the number of unique colors a human can pick out of an image taken with a cheap webcam. Finding suitable reflectors might be a challenge. Perhaps mixing dyes with corner reflector granules would work.

      If you replace passive targets with LEDs and use very fast cameras, you could conceivably identify the dots by strobing them. If you can run at, say, 8 times the rate you need for motion capture then you've got plenty of bits to work with. Finding cheap consumer gear that will give you low noise images that fast may not be trivial. (Or, you could always do your motion capture work in slow motion, I guess.)
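
      To illustrate the strobing idea above: if each LED blinks out a fixed 8-frame on/off pattern, the per-frame brightness at the marker's position can be read back as an 8-bit ID. This is only a sketch. The threshold, the bit order and the assumption that you already know where the pattern starts are all mine; a real scheme would also need a start marker and a way to follow the LED while it is off.

```python
def decode_strobe_id(brightness, threshold=128, bits=8):
    """Turn a sequence of per-frame brightness samples into an integer ID.

    brightness: samples for one marker, first frame = most significant bit.
    """
    if len(brightness) < bits:
        raise ValueError("not enough frames to decode an ID")
    marker_id = 0
    for value in brightness[:bits]:
        bit = 1 if value >= threshold else 0
        marker_id = (marker_id << 1) | bit
    return marker_id

# Hypothetical samples for one LED over 8 consecutive frames.
samples = [250, 10, 240, 245, 5, 12, 230, 8]
marker_id = decode_strobe_id(samples)  # bits 10110010 -> 178
```

      With 8 frames per ID you can tell 256 patterns apart in principle, though the all-off pattern is useless in practice because the marker never shows up.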

      A final option is to add extra cameras. By adding additional nonorthogonal views, it should be possible to unambiguously decipher most crossings. Actually, that might not be very much harder than doing motion capture with a small number of cameras, especially if it doesn't need to happen in real time. Might be harder to align things properly - but you could imagine playing tricks with a rigid grid to try to automate most of that.

    2. Re:Did this 6 years ago with camcorders for a dem by Niet3sche · · Score: 2, Informative

      I can second the OpenCV nomination.

      However, I think I may be able to add something to the puzzle: I was informed (but have not yet tested) that IEEE1394 (Firewire) cams will synch across the bus. This means that you no longer have to worry about adjusting for framedrops or timing or whatnot. Rather, the two cameras "see" their fields in lock-step with respect to time. I know that some folks here locally have had great success with Uni-Brain Fire-i cameras, but earlier in the thread someone reported a bad experience with them.

      However, this being slashdot, I must remind you that YMMV.

  6. Cheap Mocap Solution by MacroMegaMan · · Score: 2, Interesting

    You could try "Optitrack" by naturalpoint software. Seems really useful, actually, and for $249 it's worth taking a chance on too.
    Here's their link:
    http://www.naturalpoint.com/optitrack/

    If you have Poser (and free time), you can also try the Rotoscoper plugin by PhilC as well.
    Huge link follows:
    http://istore.mikrotec.com/philc/index1.html?page=catalog&trackerid=1661406456&category=a&vid=2080245373&pid=924839477&oldvid=2143420604

  7. Hitlab by kramulous · · Score: 2, Interesting

    Howdy,

    Hitlab (NZ [hitlabnz.org] but also an American office [somewhere]) have also come out with some pretty funky motion tracking. Albeit for other purposes, but the source is available (via SourceForge: ARToolkit).

    It may not be exactly what you are wanting, but with a little modification it should work, and, importantly, it is CHEAP.

    Good luck. Hope to see some break-through gaming experiences. Hooroo

    --
    .
  8. This may be a much bigger job than you think... by Anonymous Coward · · Score: 2, Informative
    I don't do MoCap, but I have worked in face tracking and I have been in major film industry motion capture studios and seen their set ups. It is very complex.

    First, they have many, many cameras, because you have to have 3 unobscured camera views to triangulate a point. I assume you want to mocap people doing game moves, so multiple cameras are required. Also, they capture with infra-red, not visible light. They put infra-red reflective spheres on the people at key locations, so what the camera picks up are dots on a black background. This is for 2D tracking.

    Some real time tracking occurs, and a rough 3D wireframe is generated so they can see if they have a good take. Note that it's not one computer for all the cameras because of bandwidth limits. You may not be able to support very many cameras per computer, because you need to save all the frames for post processing. The rough tracking data is sent via UDP to a single computer that does the wire frame display.

    After you have captured the data, then it gets really hard. You need to calibrate all the cameras so you can combine their data. You need to survey their positions and their pointing angles. It is possible to calibrate by locking down the camera and shooting targets before you start. I don't know if you need to correct for lens distortion or not, it may depend on your cameras.

    The cameras have to be synched. If they are taking pictures at different times then the 3D point positions will not be right. Web cams don't have external synch.

    First, you have to do 2D tracking for each camera. Then you have to figure out which 2D tracked point on camera A corresponds to the same point on cameras B, C, D for each frame, typically while the mocap actors are being very athletic. Then you need to combine multiple 2D points to 3D points. Remember that 2D points will disappear and reappear during a move.
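
    The "combine 2D points to 3D points" step can be sketched in its simplest form with two cameras: each calibrated camera turns a 2D detection into a ray in space, and the 3D point is taken as the midpoint of the shortest segment between the two rays. This assumes calibration has already given you each ray's origin and direction; the function name and the sample numbers are made up for illustration.

```python
def triangulate_midpoint(o1, d1, o2, d2):
    """Return the midpoint of the closest approach between two 3D rays.

    o1, o2: ray origins (camera centres); d1, d2: ray directions.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    w = [a - b for a, b in zip(o1, o2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; cannot triangulate")
    s = (b * e - c * d) / denom  # parameter along ray 1
    t = (a * e - b * d) / denom  # parameter along ray 2
    p1 = [o + s * k for o, k in zip(o1, d1)]
    p2 = [o + t * k for o, k in zip(o2, d2)]
    return tuple((u + v) / 2 for u, v in zip(p1, p2))

# Two hypothetical cameras 4 m apart, both looking at a marker at (2, 1, 5).
point = triangulate_midpoint((0, 0, 0), (2, 1, 5), (4, 0, 0), (-2, 1, 5))
```

    With noisy detections the rays never quite intersect, so the length of that shortest segment is a handy sanity check on whether two 2D points really belong to the same marker.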

    After you have 3D points then you need to connect those dots into motion paths. This takes a lot of very complex motion filtering software. People often use Kalman filters for this. Sometimes they do Kalman filtering in 2D and in 3D. Multi pass filters can be used, where you go from 2D to 3D to motion paths, and then you take the estimated 3D position and project it back to 2D. The back projected data is combined with the captured images to get better data for the next pass.
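
    For a feel of what a Kalman filter does here, below is a bare-bones constant-velocity filter for a single coordinate of a single marker. It is a textbook sketch, not what any studio actually runs: the noise values q and r, the large initial uncertainty, and the simplified process noise added only to the diagonal are all assumptions you would have to tune.

```python
def kalman_track(measurements, dt=1.0, q=1e-3, r=1.0):
    """Filter 1D position measurements with a constant-velocity model.

    Returns a list of (position, velocity) estimates, one per measurement.
    q: process noise added to the covariance diagonal (simplified model).
    r: measurement noise variance.
    """
    p, v = measurements[0], 0.0
    # Covariance matrix entries; start very uncertain about both states.
    p00, p01, p10, p11 = 1000.0, 0.0, 0.0, 1000.0
    estimates = []
    for z in measurements:
        # Predict: move the position forward by velocity * dt.
        p = p + v * dt
        p00, p01, p10, p11 = (
            p00 + dt * (p01 + p10) + dt * dt * p11 + q,
            p01 + dt * p11,
            p10 + dt * p11,
            p11 + q,
        )
        # Update: blend the prediction with the new position measurement.
        s = p00 + r
        k0, k1 = p00 / s, p10 / s
        residual = z - p
        p += k0 * residual
        v += k1 * residual
        p00, p01, p10, p11 = (
            (1 - k0) * p00,
            (1 - k0) * p01,
            p10 - k1 * p00,
            p11 - k1 * p01,
        )
        estimates.append((p, v))
    return estimates

# Hypothetical marker moving 2 units per frame, measured without noise.
track = kalman_track([2.0 * k for k in range(20)])
final_position, final_velocity = track[-1]
```

    The predicted position is also what lets you keep a track alive for a few frames when the marker is occluded, by running the predict step without an update.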

    Assuming all that works, then you can take 3D path data and translate to the frame of reference on the person so you can animate the character. Are you going to use inverse kinematics to derive joint angles from end positions of arms and legs? Often times you measure points on both sides of a joint and directly measure joint angles so you can directly apply the measured data to the 3D model.
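
    Measuring a joint angle directly from markers on both sides of the joint comes down to the angle between two limb-segment directions. A small sketch, assuming two 3D markers per segment; the names and coordinates are hypothetical, and a single angle like this ignores twist around the limb axis.

```python
import math

def segment_angle(upper_a, upper_b, lower_a, lower_b):
    """Angle in degrees between two limb segments, each given by two markers."""
    u = [b - a for a, b in zip(upper_a, upper_b)]
    v = [b - a for a, b in zip(lower_a, lower_b)]
    norm_u = math.sqrt(sum(c * c for c in u))
    norm_v = math.sqrt(sum(c * c for c in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("markers on a segment must not coincide")
    cos_angle = sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
    return math.degrees(math.acos(cos_angle))

# Upper arm pointing straight down, forearm pointing forward along x.
elbow = segment_angle((0, 2, 0), (0, 1, 0), (0, 0.9, 0), (1, 0.9, 0))
```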

    Heck, I don't do this stuff, I just have been around it, and these are things I remember. Optical tracking is very hard. People still use magnetic tracking and joint flex tracking, sometimes in combination with optical, because they are better for some kinds of measurements. Now you know why movies and high end video games are so expensive....

  9. cheap homebrew infra-red mocap rig by Robbat2 · · Score: 4, Informative

    Couple of things in here, from researching the field with a university research lab to see about buying commercial gear, I have a lot of suggestions.

    - For your camera, look for cheap used DV cameras on ebay. Not super high res, but get lots of them: 3 ain't going to cut it, consider at least 9 (high/low from each of the cardinal directions, and on top [might want a few for different sectors]) - occlusion is an absolute bitch of a problem.
    - This will provide reliable time-synced data, and NOT max out your USB bus.
    - USB cannot provide you with images from 3 cameras with the same timesync, it's just not capable of such behavior.
    - Firewire has a longer length limit on the cables, which is a big help for your work.

    - Cheap PCI firewire cards - two should be enough, this will give you 6 separate firewire busses, and put you at the limit of your PCI bus.
    - Find filters that fit said cameras, and are opaque to visible light, but transparent to infrared.
    - Rig up really bright infra-red lighting, ideally with a low quantity of visible light output.
    - Go to a burglar alarm supply place, and buy infra-red reflective tape - I learnt this tip from the EA guys a couple of years back, the 'official' reflective tape from 3M costs too much, and is a pain to order, but alarm places stock stuff that works even better, and is cheaper to boot.
    - Buy really small polystyrene balls, and cover with infra-red tape. On one small part of the ball, put the hook side of a velcro dot. These are reusable now, avoiding problems with tape waste. You can also clean them easily to keep them very reflective.
    - For your subjects, get them to wear any clothing that velcro will hook reliably onto (pretty easy choice)
    - Place the reflective balls on either side of every joint, spaced not more than 90 degrees apart - eg your elbow should have 8 balls.

    Using infra-red helps bring the data-set size way down, and also lets you use the cameras in monochrome for capturing, which reduces it even further.
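
    A quick back-of-the-envelope calculation shows why monochrome helps. It assumes uncompressed frames at 640x480 and 30 fps, with 1 byte per pixel for monochrome and 3 for RGB; DV cameras compress their streams, so real figures for such a rig will differ.

```python
def raw_rate_bytes_per_s(width, height, bytes_per_pixel, fps, cameras=1):
    """Uncompressed video data rate in bytes per second."""
    return width * height * bytes_per_pixel * fps * cameras

mono = raw_rate_bytes_per_s(640, 480, 1, 30)      # 9,216,000 bytes/s
rgb = raw_rate_bytes_per_s(640, 480, 3, 30)       # 27,648,000 bytes/s
nine_mono = raw_rate_bytes_per_s(640, 480, 1, 30, cameras=9)

print(f"mono: {mono / 1e6:.1f} MB/s, rgb: {rgb / 1e6:.1f} MB/s, "
      f"nine mono cameras: {nine_mono / 1e6:.1f} MB/s")
```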

    From working with several commercial mocap rigs, I'll say that the calibration routines are extremely important. You need to accurately map the entire volume that you wish to capture in. Depending on space available to you, consider building a simple frame or using a lighting rig to attach the cameras to.

    I will repeat again, occlusion is an absolute killer problem. From visiting the EA facilities in Burnaby BC to specifically research their systems (I was working with a university research lab at the time), they estimated that they lost 2 hours of production a day to occlusion problems during mocap shoots.
    Your system must be capable of tracking all the balls, all of the time. If it loses one, it's almost impossible to pick it up again properly during a runtime - you'd need to recode the relative location of that ball before it gave you useful data again.

    --
    ICQ# : 30269588
    "I used to be an idealist, but I got mugged by reality."
    1. Re:cheap homebrew infra-red mocap rig by Emil+Brink · · Score: 2, Interesting

      Sounds interesting, the tip about IR-reflective tape especially. It got me thinking ... if reflective tape is expensive enough to warrant hunting for cheaper sources ... And you also need to get IR light sources, wouldn't it make sense to invert the lighting, and put IR-emissive dots directly on the mocap actor? Something like LED throwies but with IR LED(s) rather than visible light? Perhaps it's still too expensive and/or impractical what with batteries and so on, though. I do wonder how it would compare, brightness-wise. Anyone tried it?

      --
      main(O){10<putchar(4^--O?77-(15&5128 >>4*O):10)&&main(2+O);}
  10. Re:Unibrain Fire-i by alazor · · Score: 2, Informative

    The iSight was only discontinued in Europe it turns out.

    http://en.wikipedia.org/wiki/ISight#Discontinued_in_Europe

    --

    -
    Systems Administrators: We read the manual so you don't have to.
  11. Would RFID chips? by mikael · · Score: 2, Interesting

    Could you have each marker represented by an RFID chip? Then detecting the position of each marker would only require four RFID transmitters. The time delay would give you the distance to each marker and you could use triangulation to determine the current position. And each RFID tag would be easy to label.
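
    The geometry behind this idea is usually called trilateration: given the distance from a tag to four known points, you can solve for its position. Below is a sketch of just the math, with hypothetical anchor positions and perfect distances. Whether RFID hardware can time signals well enough is a separate question, since a radio signal covers about 30 cm per nanosecond.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second

def distance_from_delay(one_way_delay_s):
    """Distance travelled by a radio signal in the given one-way time."""
    return SPEED_OF_LIGHT * one_way_delay_s

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def trilaterate(anchors, distances):
    """Solve for (x, y, z) from four anchor positions and four distances.

    Subtracting the first sphere equation from the other three leaves a
    3x3 linear system, solved here with Cramer's rule.
    """
    a0, d0 = anchors[0], distances[0]
    n0 = sum(c * c for c in a0)
    rows, rhs = [], []
    for ai, di in zip(anchors[1:4], distances[1:4]):
        rows.append([2 * (ai[k] - a0[k]) for k in range(3)])
        rhs.append(sum(c * c for c in ai) - n0 - di * di + d0 * d0)
    det = det3(rows)
    if abs(det) < 1e-12:
        raise ValueError("anchors are coplanar; position is ambiguous")
    solution = []
    for col in range(3):
        m = [row[:] for row in rows]
        for i in range(3):
            m[i][col] = rhs[i]
        solution.append(det3(m) / det)
    return tuple(solution)

# Hypothetical transmitters at the corners of a 4 m cube corner, tag at (1, 2, 3).
anchors = [(0, 0, 0), (4, 0, 0), (0, 4, 0), (0, 0, 4)]
distances = [math.dist(a, (1, 2, 3)) for a in anchors]
position = trilaterate(anchors, distances)
```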

    --
    Vintage computer adverts: http://www.vintageadbrowser.com/computers-and-software-ads
  12. Clean solution by John+Sokol · · Score: 2, Interesting

    I use the Kodicom 4400r board http://dvr.videotechnology.com/ this uses 4 Conexant 878 chips (formerly called Brooktree BT878)
      The default bt878 driver in FreeBSD works but I had to write a small driver to init the video switcher on the board.
      Using very simple code you can capture and process 4 full motion video channels in FreeBSD.

      There is also the BTTV Linux driver for this board.

      CCTV Cameras can be had for $35 each and the board is $200, for a total cost of about $300 for 3 cameras to do motion capture.

      I have used Blinking dual color LEDS on the target very successfully.
      Retroreflective balls and LED lighting also work well. The $35 black and white versions of these cameras come with IR LEDs for so-called "Night Vision" and work great with the 3M reflective tape.
      See http://www.videotechnology.com/old1104.html Retroreflective Materials for more info on that also.

    --
    I am always doing that which I can not do, in order that I may learn how to do it. - Pablo Picasso
  13. Your idea is good, but the implementation... by cr0sh · · Score: 2, Informative
    ...is the most important thing. First off, I want to give kudos to everyone who has responded to these - just about everyone here has given great ideas and suggestions to the problem. These ideas should be listened to and evaluated. I myself have been researching the idea of sourceless and sourced motion capture and position tracking for a long time now, as it relates to virtual reality applications, simply because there is nothing commercial for the task that comes down to homebrew pricing. Personally, I only want to track two things - the position and orientation of my hand, and the orientation of my head. The second I (and anyone else) can easily do today with a cheap 3-axis accelerometer/compass system. The first, though, is not easy at all. Position is one thing, but the orientation is a completely different beast.

    With that said, your idea of using webcams is spot on, but you are going to need more than three, mainly for occlusion handling. For the rig I was contemplating (using webcams much the same as you), I was thinking of at least four cameras. The main problem I ran into (just in thinking about it, no actual implementation), and as others have described, was timing issues. For best results, you need all the frames captured from the cameras to happen at the exact same time. Since with USB webcams this isn't possible, you either need to come up with another solution (people here have mentioned some "high end" cameras that have syncing systems), or deal with it in software (very difficult to do, in addition to dealing with everything else, and still getting a high frame rate).

    Another problem you are going to run into (and has been mentioned by others, but not much on the reason) is webcamera resolution. Most webcams that capture at decent framerates do so at QVGA (320x240). Even those that capture at a real 640x480 typically do so at only around 15fps, instead of 24 or 30. Rare (and more expensive) is the webcam that will capture at 24-30fps with VGA resolution. Even at VGA resolution, though, you are going to have to deal with the angular vs pixel resolution of the camera. What I mean by this is that as an object moves through the FOV of the camera, it is going to only be imaged by certain pixels of the CCD imaging device. Depending on the distance away from the camera, the object may move say a foot, and only move (on camera) a pixel or so. The further away the moving object, the fewer pixels covered due to perspective. This translates into a lower resolution of pixels (on camera) to inches/cm (in real motion). In fact, this is almost the inverse problem of HMDs, where you can have high resolution, and low FOV, or vice-versa. In order to have both (in either cameras or HMDs), you have to pay a lot of money. In optical camera-based mocap, this means HDTV or better resolution cameras. I hope you understand what I mean here, because it is important for motion capture where you may be capturing large amounts of motion over a lot of area. For close-ups (like facial capture) it is less important - but remember, the higher the resolution of the camera, the finer the motion you can capture at all distances from the person/object to the camera. Higher resolution cameras translate into higher prices for the system, because you have to deal with more data, all in realtime. Not easy, not cheap.
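
    The pixel-versus-distance trade-off above can be put into numbers. For a camera with horizontal field of view θ at distance d, the visible scene width is 2·d·tan(θ/2), and dividing by the horizontal pixel count gives how much real-world motion one pixel covers. The 45 degree field of view and 4 m distance below are example values, not measurements of any particular webcam.

```python
import math

def metres_per_pixel(distance_m, fov_degrees, pixels_across):
    """Real-world width covered by one pixel at the given distance."""
    scene_width = 2 * distance_m * math.tan(math.radians(fov_degrees) / 2)
    return scene_width / pixels_across

# Example: VGA and QVGA cameras with an assumed 45 degree horizontal FOV at 4 m.
vga = metres_per_pixel(4.0, 45.0, 640)
qvga = metres_per_pixel(4.0, 45.0, 320)

print(f"VGA: {vga * 1000:.1f} mm per pixel, QVGA: {qvga * 1000:.1f} mm per pixel")
```

    Under those assumptions one VGA pixel covers roughly 5 mm of motion, and halving the resolution or doubling the distance doubles that figure. Sub-pixel centroiding of the markers can claw some of it back.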

    You might best be able to deal with this by going the custom camera route. What you would want to do is build a custom frame capturing system, using 640x480 (or better) b&w CCD cameras (you don't need color, you just need IR sensitivity - even with B/W cameras, you are going to filter the final image down so far that it is mostly only a true b&w 2bpp image - so the closer you can do that in hardware, the less you have to do in software). This won't be easy, but many people have done similar systems for homebrew robotic vision systems, so look there. Realize that this kind of a project will likely dwarf your game development project in both hardware and software needs, and you might end up with a system

    --
    Reason is the Path to God - Anon