User Guide for Multiple Depth Sensors Configuration

From iPi Docs
Revision as of 12:08, 21 October 2015 by Vmaslov (Talk | contribs)


System Requirements

iPi Recorder

  • Computer (desktop or laptop):
    • CPU: x86 compatible (Intel Pentium 4 or higher, AMD Athlon or higher, 2 GHz); dual- or quad-core is preferable
    • Operating system: Windows 10 / 8.1 / 8 / 7 (32-bit or 64-bit)
    • USB: at least two USB 2.0 or USB 3.0 controllers
      For more info see USB controllers
      Important! Kinect 2 for Windows and Kinect for Xbox One require a USB 3.0 controller. Kinect SDK 2.0 supports only a single sensor on one PC; libfreenect2 supports multiple sensors on one PC.
    • ExpressCard or eSATA slot (for laptops)
      Optional, but highly recommended. It allows you to install an external USB controller in case of compatibility issues between cameras and built-in USB controllers, or if all USB ports are in fact connected to a single USB controller
    • Storage system: HDD or SSD or RAID with write speed:
      • For Kinect 2 for Windows / Kinect for Xbox One sensors:
        • 2 sensors — not less than 66.0 MByte/sec
        • 3 sensors — not less than 99.0 MByte/sec
      • For Microsoft Kinect sensors:
        • 2 sensors — not less than 55.0 MByte/sec
        • 3 sensors — not less than 82.5 MByte/sec
Note: If your write speed is lower, you can use background subtraction mode. Alternatively, you can use compressed mode (which gives a 3-5 times lower required write speed, but CPU performance may become a bottleneck)
  • 2 or 3 depth sensors: Microsoft Kinect 2 for Windows / Microsoft Kinect for Xbox One, or Microsoft Kinect
Note: For Microsoft Kinect for Xbox One you will need Microsoft Kinect Adapter cable
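The write-speed requirements above scale linearly with the number of sensors. A minimal sketch of that arithmetic, with per-sensor rates inferred from the table above (the compression factor of 3 is the conservative end of the stated 3-5x range):

```python
# Estimate the minimum storage write speed for a multi-sensor setup.
# Per-sensor rates (MByte/sec) are inferred from the requirements above.
PER_SENSOR_RATE = {
    "kinect2": 33.0,   # Kinect 2 for Windows / Kinect for Xbox One
    "kinect1": 27.5,   # original Microsoft Kinect
}

def required_write_speed(sensor_model: str, sensor_count: int,
                         compressed: bool = False) -> float:
    """Return the minimum write speed in MByte/sec.

    Compressed mode reduces the requirement 3-5x; we assume the
    conservative factor of 3 here.
    """
    speed = PER_SENSOR_RATE[sensor_model] * sensor_count
    return speed / 3 if compressed else speed

print(required_write_speed("kinect2", 2))  # 66.0
print(required_write_speed("kinect1", 3))  # 82.5
```

If the result exceeds what your disk can sustain, consider background subtraction or compressed mode as noted above.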

iPi Mocap Studio

  • Computer (desktop or laptop):
    • CPU: x86/x64 compatible (Intel Pentium 4 or higher, AMD Athlon or higher); dual- or quad-core is preferable.
    • Operating system: Windows 10 / 8.1 / 8 / 7 (32-bit or 64-bit).
    • Video card: DirectX 11 capable gaming-class graphics card.
GPUz example.gif
Note: Before you start working with multiple depth sensors, it is highly recommended to get appropriate results with single depth sensor configuration: User Guide for Single Depth Sensor Configuration

Software Installation

iPi Recorder

Important! Please unplug all cameras from computer before installation.

Download and run the setup package of the latest version of iPi Recorder. You will be presented with the following dialog.

iPi Recorder 3 Setup.png
  1. Select needed components
  2. Read and accept the license agreement by checking appropriate checkbox
  3. Press the Install button to begin installation
Note: Most of the components require administrative privileges because they install device drivers or write to Program Files and other system folders. You will be presented with UAC prompts when appropriate during installation. If you plan to use iPi Recorder under a user account that has no administrative rights, you can pre-install the other components separately using an administrator's account.
Important! Mind USB bandwidth when connecting cameras:
  1. You can plug only one depth sensor into one USB controller. A single USB controller's bandwidth is not enough to record from 2 sensors.
  2. You can plug no more than 2 Sony PS Eye cameras into one USB controller; otherwise you will not be able to capture at 60 fps with 640 x 480 resolution.
For more info see USB controllers.
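A rough back-of-envelope check shows why two PS Eye cameras are the limit per controller. This sketch assumes raw Bayer frames at 1 byte per pixel, which is an assumption for illustration only; the actual transfer format may differ:

```python
# Back-of-envelope USB 2.0 bandwidth check for Sony PS Eye cameras,
# assuming raw Bayer frames at 1 byte per pixel (an illustrative
# assumption; the actual transfer format may differ).
width, height, fps, bytes_per_pixel = 640, 480, 60, 1
per_camera = width * height * fps * bytes_per_pixel / 1e6  # MByte/sec

# USB 2.0 high-speed is 480 Mbit/s in theory; roughly 35-40 MByte/sec
# is usable in practice.
usb2_practical = 40.0  # MByte/sec, approximate

print(per_camera)       # ~18.4 MByte/sec per camera
print(2 * per_camera)   # ~36.9 MByte/sec: two cameras fit, barely
print(3 * per_camera)   # ~55.3 MByte/sec: three cameras exceed USB 2.0
```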

Once installation is complete, iPi Recorder will launch automatically. Continue with the user's guide to learn how to use the software.


If a component is already installed, it has no checkbox and is marked with the ALREADY INSTALLED label. You should not install optional components in advance without necessity: all of them can be installed separately at a later time. The component descriptions below contain corresponding download links.

  • (Windows 8, 8.1, 10) Microsoft Kinect 2: MS Kinect SDK 2.0. Check this if you plan to work with Kinect 2 for Windows or Kinect for Xbox One depth sensors, but do not plan to connect multiple Kinects to a single PC.
    Device drivers and software libraries for Microsoft Kinect 2. Requires 64-bit Windows 8+ and USB 3.0.
  • iPi Recorder 4.x.x.x. This is a required component and cannot be unchecked.
    iPi Recorder itself.

iPi Mocap Studio

Download and run the latest setup package of iPi Mocap Studio. You will be presented with the following dialog:

iPi Mocap Studio 3 Setup.png

All components are required for installation.

Note: The installation of Microsoft .NET Framework 4.5.1 requires an Internet connection. If needed, you can download the offline installer for Microsoft .NET separately and run it before iPi Mocap Studio setup.

Other components are included with iPi Mocap Studio setup.

Note: Shell Extensions for Video and Project files are needed to show thumbnails and preview for iPi video and project files in Windows Explorer
  1. Press the Install button.
  2. You will be prompted to read and accept the license agreement(s) by checking the corresponding checkbox.
    iPi Mocap Studio 3 Setup Accept License.png
  3. Press the Install button again to begin installation.
  4. Once installation is complete, you will be prompted to launch iPi Mocap Studio.
    iPi Mocap Studio 3 Setup Launch.png
  5. As soon as iPi Mocap Studio launches, you will be prompted to enter your license key or start a 30-day free trial period.
    For more info about license protection see Licensing Policy.
    Welcome to ipistudio dlg.png
  6. Ensure that your graphics hardware is set to maximum performance with iPi Mocap Studio.

Recording Video from Multiple Depth Sensors


Minimum space requirements for depth sensor configurations depend on sensor model and number of sensors:

Single Depth Sensor:
  • Azure Kinect (WFOV mode): 7 by 5 feet = 2 by 1.5 meters
  • Azure Kinect (NFOV mode) / Kinect 2 (Kinect for Xbox One): 8 by 5 feet = 2.5 by 1.5 meters
  • 1st gen. depth sensors / Orbbec Astra (Pro): 10 by 10 feet = 3 by 3 meters

2 Depth Sensors (90-degrees config.):
  • Azure Kinect (WFOV mode): 7 by 7 feet = 2 by 2 meters
  • Azure Kinect (NFOV mode) / Kinect 2 (Kinect for Xbox One): 8 by 8 feet = 2.5 by 2.5 meters
  • 1st gen. depth sensors / Orbbec Astra (Pro): 10 by 10 feet = 3 by 3 meters

2 Depth Sensors (180-degrees config.):
  • Azure Kinect (WFOV mode): 12 by 7 feet = 3.5 by 2 meters
  • Azure Kinect (NFOV mode) / Kinect 2 (Kinect for Xbox One): 16 by 8 feet = 5 by 2.5 meters
  • 1st gen. depth sensors / Orbbec Astra (Pro): 20 by 10 feet = 6 by 3 meters

3+ Depth Sensors:
  • Azure Kinect (WFOV mode): 12 by 12 feet = 3.5 by 3.5 meters
  • Azure Kinect (NFOV mode) / Kinect 2 (Kinect for Xbox One): 16 by 16 feet = 5 by 5 meters
  • 1st gen. depth sensors / Orbbec Astra (Pro): 20 by 20 feet = 6 by 6 meters

Maximum capture area is about 7 by 7 feet (= 2 by 2 meters) for all sensor models and configurations.
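The metric dimensions above are feet converted to meters and rounded to the nearest half meter. A sketch of the conversion the table appears to use:

```python
def feet_to_rounded_meters(feet: float) -> float:
    """Convert feet to meters, rounded to the nearest 0.5 m
    (the rounding the space-requirements table appears to use)."""
    return round(feet * 0.3048 * 2) / 2

for ft in (5, 7, 8, 10, 12, 16, 20):
    print(ft, "ft ->", feet_to_rounded_meters(ft), "m")
```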

The pictures below will help you understand the possible capture area and required space. Dimensions differ between Azure Kinect, Kinect 2 (Kinect for Xbox One) and 1st gen. depth sensors due to their different fields of view.

Azure Kinect in NFOV mode and Kinect 2 for Windows (Kinect for Xbox One)

Kinect 2 (Kinect for Xbox One) sensors and Azure Kinect sensors in NFOV mode (narrow-view mode) have very close view angles (less than Azure Kinect in WFOV mode, but greater than 1st gen. depth sensors and Orbbec Astra):

Click to enlarge
Click to enlarge
Side view
Top view

Azure Kinect in WFOV mode

The wide-view (WFOV) mode of Azure Kinect has an extremely wide view angle:

Click to enlarge
Click to enlarge
Side view
Top view
Note: Be aware that the quality of the depth map is better in narrow-view (NFOV) mode. For this reason, if you have enough space, it is recommended to use NFOV mode for Azure Kinect sensors.

First Generation Depth Sensor or Orbbec Astra (Pro)

If you're using an outdated 1st generation depth sensor like Kinect v1, or an Orbbec Astra (Pro) sensor (which is almost the same as a 1st gen. depth sensor):

Click to enlarge
Click to enlarge
Side view
Top view

Below, in the #Calibration section, you will find information on the 2 recommended mutual sensor configurations.

Actor Clothing

The current version uses only depth information to track motions, so the clothing requirements are:

  • no restrictions on clothing colors (just avoid shiny fabrics)
  • wear slim-fitting clothes to reduce noise in the resulting animation

Recording Process

Please record a video using iPi Recorder application. It supports recording with Sony PS Eye cameras, depth sensors (Kinect) and DirectShow-compatible webcams (USB and FireWire).

iPi Recorder is a stand-alone application and does not require a powerful video card. You may choose to install it on a notebook PC for portability. Since it is free, you can install it on as many computers as you need.

Please run iPi Recorder and complete setup and background recording steps following the instructions: iPi Recorder Setup


Calibration

Calibration is the process of computing accurate camera positions and orientations from a video of the user waving a small glowing object called a marker (for color/color+depth cameras). This step is essential and required for multi-camera system setup.

Important! Once you have calibrated the camera system, you should not move your cameras for subsequent video shoots. If you move even one camera, you need to perform calibration again.
Tip: We recommend running calibration twice: before and after the capture session. If any camera was moved during the capture session, the calibration made after the session can give you correct camera positions.

Dual Depth Sensors Configurations

There are two possible arrangements of the two sensors:

  1. the angle between sensors is between 60 and 90 degrees;
  2. the angle between sensors is near 180 degrees, which means the sensors are placed opposite each other.
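To check which arrangement you have, you can compute the angle between the sensors' view directions. A hypothetical sketch, assuming both sensors are aimed at the center of the capture area (positions and coordinates are illustrative):

```python
import math

def sensor_angle(pos_a, pos_b, center=(0.0, 0.0)):
    """Angle in degrees between the view directions of two sensors,
    assuming both are aimed at the capture-area center."""
    ax, ay = center[0] - pos_a[0], center[1] - pos_a[1]
    bx, by = center[0] - pos_b[0], center[1] - pos_b[1]
    dot = ax * bx + ay * by
    return math.degrees(math.acos(dot / (math.hypot(ax, ay) * math.hypot(bx, by))))

# First configuration: sensors on the +x and +y axes, 3 m from center
print(sensor_angle((3, 0), (0, 3)))   # 90.0
# Second configuration: sensors facing each other across the center
print(sensor_angle((3, 0), (-3, 0)))  # 180.0
```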

First configuration

Kinect 2

Second configuration

Kinect 2
Tip: If you see a lot of noise (many yellow points) in any pair of sensors, try slightly changing the direction of one of the sensors; this may substantially decrease noise caused by mutual interference.

Triple Depth Sensors Configuration

Ideally, depth sensors should be placed at the apexes of an equilateral triangle with sides of about 6 meters.

Ideal Configuration for Kinect 2
Ideal Configuration for Kinect

In practice this may not always be achievable due to the size and configuration of the available space, mount positions, USB and/or power cord lengths, etc. Thus, sensor positions may differ slightly from the ideal ones. Please look at the example below from a real mocap session.

Real-life Scene Configuration Example
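The ideal triple-sensor layout can be generated by placing sensors on a circle at 120-degree intervals; for an equilateral triangle with 6 m sides, the circumradius is 6/sqrt(3), about 3.46 m. A sketch (the coordinate frame is hypothetical):

```python
import math

def triangle_positions(side: float = 6.0):
    """Positions of 3 sensors at the apexes of an equilateral triangle
    with the given side length, centered on the capture area."""
    radius = side / math.sqrt(3)  # circumradius of an equilateral triangle
    return [(radius * math.cos(math.radians(90 + 120 * i)),
             radius * math.sin(math.radians(90 + 120 * i)))
            for i in range(3)]

pts = triangle_positions(6.0)
for i in range(3):
    (x1, y1), (x2, y2) = pts[i], pts[(i + 1) % 3]
    print(round(math.hypot(x2 - x1, y2 - y1), 3))  # 6.0 for every pair
```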

Four Depth Sensors Configuration

Ideally, depth sensors should be placed at the sides of a square, with each pair of sensors facing each other.

Tip: If you see a lot of noise (many yellow points) in any pair of sensors, try slightly changing the direction of one of the sensors; this may substantially decrease noise caused by mutual interference.
Ideal Configuration for Kinect 2
Ideal Configuration for Kinect

Recording Calibration Video

We use a flashlight or other small glowing marker to perform calibration for depth sensors. This calibration procedure is very similar to the one used for the Multiple Sony PS Eye Cameras Configuration, but in the case of depth sensors the overall workflow is simpler:

  • There is no need to touch the floor with the flashlight during recording.
  • No marking of points as ground in iPi Mocap Studio.
    (depth data is used to detect ground plane automatically)
  • No manual adjustments of scene scale in iPi Mocap Studio.
    (depth data is used to determine scale automatically)
Note: For the Dual Depth Sensor configuration you can use calibration based on a 3D plane. The accuracy of both methods is the same; using the flashlight marker is recommended as the easier method.

Glowing Marker

The Mini Maglite flashlight is recommended for calibration. It is a very common flashlight in the US and many other countries. Removing the flashlight reflector converts it into an ideal glowing marker that is easily detectable by motion capture software.


If you cannot get a Mini Maglite, you can use another similar flashlight.

Noname flashlight.jpg

Alternatively, you can use Sony Move motion controller with white light turned on.


Recording Calibration Sequence

  • Run iPi Recorder.
  • The color stream should be recorded along with depth data, which is why only "(depth+color)" modes can be used.
IPi Recorder video mode.png
  • Avoid bright lighting and white objects in the background while recording the calibration video.
  • Start recording.
  • Move the marker slowly through your entire capture volume (front-top-right-bottom-left-back-top-right-bottom-left). Start from the top and move the marker in a descending spiral motion.
Tip: The exact trajectory of the marker is not that important; just try to cover the whole capture volume, or at least its perimeter.
Tip: Keep the marker visible to both depth sensors at all times. Hold the marker in an outstretched arm, away from your body.

Processing Calibration Video

  • Create new calibration project in iPi Mocap Studio:
    • Press New button or select File > New Project menu item or use Ctrl+N (1)
    • Choose Calibration project type in New Project Wizard.
    Select Project Type Calibration.png
  • Adjust the Region-of-Interest so that the glowing marker is visible at the beginning and at the end (2).
  • Click Calibrate Based on Light Marker button (3).
    Kinect Calibration Flashlight Start.png
  • Wait while automatic calibration is performed.
  • Make sure you have a Good or Perfect calibration result (5).
Important! A Failed calibration is not recommended for use, as you will not be able to get accurate tracking results. However, the Failed status is sometimes misdetected: if the detected marker positions are close to the marker image on video in all frames for all sensors, you can use the calibration for tracking (5).
  • Save the results to the calibration project file, or use the Save scene... button on the Scene tab (6).
Kinect Calibration Flashlight Completed.png Depth Project Scene Tab.png

Recording Actor's Performance

After completing the Setup and Background recording steps, press the Record button to begin video recording. To stop recording, press the Stop button.

Recommended Layout of an Action Video

  • Enter the actor.
  • Strike a T-pose.
  • Action
Click to enlarge


As soon as the recorder starts, go to the capture area and stand in a T-pose. After that you can act out the desired motions.

Tip: Sometimes it is inconvenient or impossible to stand in a T-pose. You can do without it, but in this case the initial alignment of the actor model in iPi Mocap Studio will require more manual actions.

If you make several takes with one actor, we recommend striking a T-pose at the beginning of each take.

Tip: We use tracking data from modern depth sensors for workflow improvements. You can get the most out of it if you follow the recommendations described here.


A take is a concept originating from cinematography. In a nutshell, a take is a single continuous recorded performance.

Usually it is a good idea to record multiple takes of the same motion, because a lot of things can go wrong for purely artistic reasons.


A common problem with motion capture is “clipping” in the resulting 3D character animation, for example arms entering the body of the animated computer-generated character. Many CG characters have various items and attachments like a bullet-proof vest, a fantasy armor or a helmet, and it can be easy for an actor to forget about the shape of the CG model.

For this reason, you may need to schedule more than one motion capture session for the same motions. Recommended approach is:

  • Record the videos
  • Process the videos in iPi Mocap Studio
  • Import your target character into iPi Mocap Studio and review the resulting animation
  • Give feedback to the actor
  • Schedule another motion capture session if needed

Ian Chisholm's hints on motion capture

Ian Chisholm is a machinima director and actor and the creator of critically acclaimed Clear Skies machinima series. Below are some hints from his motion capture guide based on his experience with motion capture for Clear Skies III.

Three handy hints for acting out mocap:

  1. Don’t weave and bob around like you’re in a normal conversation – it looks terrible when finally onscreen. You need to be fairly (but not completely) static when acting.
  2. If you are recording several lines in one go, make sure you have lead in and lead out between each one, i.e. stand still! Otherwise, the motions blend into each other and it’s hard to pick a start and end point for each take.
  3. Stand a bit like a gorilla – have your arms out from your sides:

    Well, obviously not quite that much. But anyway, if you don’t, you’ll find the arms clip slightly into the models and they look daft.

If you have a lot of capture to do, you need to strike a balance between short and long recordings. Aim for 30 seconds to 2 minutes. Too long is a pain to work on later due to the fiddlyness of setting up takes, and too short means you are forever setting up T-poses.


Because motion capture is not a perfect art, and neither is acting, it’s best to perform multiple takes. I found that three was the best amount for most motion capture. Take less if it’s a basic move, take more if it’s complex and needs to be more accurate. It will make life easier for you in the processing stage if you signal the break between takes – I did this by reaching out one arm and holding up fingers to show which take it was.

Naming conventions

As it’s the same actor looking exactly the same each and every time, and there is no sound, and the capture is in lowres 320*200, you really need to name the files very clearly so that you later know which act, scene, character, and line(s) the capture is for.

My naming convention was based on act, scene, character, page number of the scene, line number, and take number. You end up with something unpleasant to read like A3S1_JR_P2_L41_t3 but it’s essential when you’ve got 1500 actions to record.
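A small sketch of generating file names under such a convention (the field names and helper are illustrative, not part of any iPi tool):

```python
def take_name(act: int, scene: int, character: str,
              page: int, line: int, take: int) -> str:
    """Build a file name like A3S1_JR_P2_L41_t3 from its parts."""
    return f"A{act}S{scene}_{character}_P{page}_L{line}_t{take}"

print(take_name(3, 1, "JR", 2, 41, 3))  # A3S1_JR_P2_L41_t3
```

Generating names programmatically (or from a shot list spreadsheet) keeps them consistent across 1500 recordings.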

Processing Video from Multiple Depth Sensors

  • Run iPi Mocap Studio
  • Press Ctrl+N or push the New button on the toolbar to create a new project (1)
  • Choose the recorded *.iPiVideo file
  • Select the "Action" project type
    Select Project Type Action.png
  • To load camera calibration data, select the corresponding calibration project file (.iPiCalib) or scene file (.iPiScene)
  • Save the created project by pressing Ctrl+S or pushing the Save button on the toolbar (2)
  • Position the timeline slider at a frame where the actor is in a T-pose (3)
  • Adjust the actor height using the appropriate slider on the Actor tab (4)
  • Select Move tool on toolbar. (5)
  • Move the actor model left or right to roughly match the actor silhouette in the video.
Note: The actor model may look smaller due to its position along the view axis. Don't pay attention to this at this step.
  • Switch to the Tracking tab and push the Refit pose button. (6)
  • As a result, the model should match the actor image from the video. If it does not, delete the result using Edit > Delete pose in the main menu and repeat the above actions.
  • Using the slider to the right of the Show Skin button on the toolbar, make sure that the shape of the model corresponds to the actor image (7).
    • If not, adjust the chest/bust/waist/hips/belly morphs using the appropriate sliders on the Actor tab (8).
  • Set the beginning of the Region-of-Interest (ROI) to the current frame with the T-pose by pressing the I key on the keyboard or by double-clicking the left edge of the ROI bar under the timeline (7).
  • Switch to the Tracking tab and change the tracking options (Head tracking, Shoulders and Spine) if required (9).
Tip: Check the Use fast tracking algorithm (BETA) option to use the fast algorithm. This option only affects Track Forward / Track Backward. The fast algorithm may be less accurate in some cases, but its tracking speed is up to 2.5 times higher, depending on the particular hardware and tracking options.
  • To start tracking, just push the Track Forward button (10).
  • Wait and watch...
Note: The Use old tracking algorithm option (11) allows switching to the old algorithm. You can use this in the rare case that the default tracking crashes.

Tracking Tips and Tricks

Using Pose Mismatch View

The Pose Mismatch window is a very useful tool that lets you understand how actor settings affect tracking.

  • Pose Mismatch window is shown using View > Pose Mismatch menu item
  • The Mismatch number at the top evaluates how well the actor model matches the video in the current frame
  • You need to run Refit Pose to match the actor model to the video before comparing Mismatch numbers
  • A lower Mismatch number means a better match, so you need to minimize the Mismatch number while choosing settings
Note: When the Mismatch number is negative, a greater absolute value means a better match.
Pose Mismatch View
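When comparing candidate actor settings, you simply pick the one with the lowest Mismatch number (for negative numbers, the largest absolute value, which is the same ordering). A trivial sketch; the setting names and values are hypothetical:

```python
def best_settings(candidates):
    """Given {settings_name: mismatch_number}, return the name with the
    best (lowest) mismatch. Negative numbers with a larger absolute
    value are lower, so a plain min() already implements the rule."""
    return min(candidates, key=candidates.get)

print(best_settings({"default": 0.35, "wider chest": 0.21, "slim": -0.10}))
# -> slim (-0.10 is the lowest value)
```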

Checking Sensors Calibration

A frequent cause of tracking errors is incorrect sensor calibration, which can result from moving sensor(s) after the calibration recording. You can use the T-pose to detect this problem:

  • Select frame with T-Pose
  • Select View > Show Depth from All Sensors menu item
  • Select View > Color Point Cloud with RGB Data menu item (if available)
  • If any sensor was shifted, point clouds from different sensors will not match properly, and you'll see weird shifts in the point clouds (see screenshots)
Tip: We recommend running calibration twice: before and after the capture session. If any camera was moved during the capture session, the calibration made after the session can give you correct camera positions.
Incorrect calibration
Correct calibration
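The visual check above can also be thought of numerically: with correct calibration, overlapping points from two sensors land close together. Here is a hypothetical sketch computing the mean nearest-point distance between two small clouds (this is not iPi Mocap Studio's actual check, just an illustration of the idea):

```python
import math

def mean_nearest_distance(cloud_a, cloud_b):
    """Mean distance from each point in cloud_a to its nearest point
    in cloud_b. Large values suggest a sensor moved after calibration."""
    total = 0.0
    for a in cloud_a:
        total += min(math.dist(a, b) for b in cloud_b)
    return total / len(cloud_a)

aligned = [(0, 0, 0), (1, 0, 0), (0, 1, 0)]
shifted = [(x + 0.5, y, z) for x, y, z in aligned]  # simulated sensor shift

print(mean_nearest_distance(aligned, aligned))  # 0.0
print(mean_nearest_distance(aligned, shifted))  # 0.5
```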

Manual Clean-up

Once initial tracking is performed on all (or part) of your video, you can begin cleaning up tracking errors (if any). Automatic Refinement and Filtering should be applied after clean-up.

Cleaning up tracking gaps

Clean-up Steps

Tracking errors usually happen in a few specific video frames and propagate to multiple subsequent frames, resulting in tracking gaps. Examples of problematic frames:

  • Occlusion (like one hand not visible in any of the cameras)
  • Indistinctive pose (like hands folded on chest).
  • Very fast motion with motion blur.

To clean up a sequence of incorrect frames (a tracking gap), you should use backward tracking:

  1. Go toward the last frame of tracking gap, to a frame where actor pose is distinctive (no occlusion, no motion blur etc.).
  2. If necessary, use Rotate, Move and IK (Inverse Kinematics) tools to edit character pose to match actor pose on video.
  3. Turn off Trajectory Filtering (set it to zero) so that it does not interfere with your editing.
  4. Click Refit Pose button to get a better fit of character pose.
  5. Click Track Backward button.
  6. Stop backward tracking as soon as it comes close to the nearest good frame.
  7. If necessary, go back to remaining parts of tracking gap and use forward and backward tracking to clean them up.

Individual body parts tracking

Tracking tab individual body parts.png

In most cases tracking errors affect some of the limbs. The Individual Body Parts Tracking settings on the Tracking tab allow you to redo tracking for specified body parts.

  • Tracking will be done for selected body parts only.
  • Unselected body parts will keep the same rotations.

Cleaning up individual frames

To clean up individual frames you should use a combination of editing tools (Rotate, Move and Inverse Kinematics) and Refit Pose button.

Note: after a Refit Pose operation, iPi Mocap Studio automatically applies Trajectory Filtering to produce a smooth transition between frames. As a result, the pose in the current frame is affected by nearby frames, which may look confusing. If you want to see the exact result of the Refit Pose operation in the current frame, turn off Trajectory Filtering (set it to zero), but do not forget to change it back to a suitable value later.

Tracking errors that cannot be cleaned up using iPi Mocap Studio

Not all tracking errors can be cleaned up in iPi Mocap Studio using automatic tracking and Refit Pose button.

  • Frames immediately affected by occlusion sometimes cannot be corrected. Recommended workarounds:
    • Manually edit problematic poses (not using Refit Pose button).
    • Record a new video of the motion and try to minimize occlusion.
    • Record a new video of the motion using more cameras.
  • Frames immediately affected by motion blur sometimes cannot be corrected. Recommended workarounds:
    • Manually edit problematic poses (not using Refit Pose button).
    • Edit problematic poses in some external animation editor.
    • Record a new video of the motion using higher framerate.
  • Frames affected by strong shadows on the floor sometimes cannot be corrected. Typical example is push-ups. This is a limitation of current version of markerless mocap technology. iPi Soft is working to improve tracking in future versions of iPi Mocap Studio.

Automatic Refinement and Filtering

Automatic Refinement and Filtering should be applied after Manual Clean-up, if there were tracking errors.

This final step is also called Post-Processing and includes:

  1. Tracking Refinement
  2. Jitter Removal
  3. Trajectory Filtering

Tracking refinement

After the primary tracking and cleanup are complete, you can optionally run the Refine pass (see Refine Forward and Refine Backward buttons). It slightly improves accuracy of pose matching, and can automatically correct minor tracking errors. However, it takes a bit more time than the primary tracking, so it is not recommended for quick-and-dirty tests.

Important! Refine should be applied:
  • Using the same tracking parameters as the primary tracking (e.g. feet tracking, head tracking) in order not to lose previously tracked data.
  • Before applying motion controller data.
  • If you plan to manually edit the animation (not related to automatic cleanup with Refit Pose).

In contrast to the primary tracking, Refine does no pose prediction; it is based only on the current pose in a frame. Essentially, running Refine is equivalent to automatically applying Refit Pose to a range of previously tracked frames.

Post-processing: Jitter Removal

  • Jitter Removal filter is a powerful post-processing filter. It should be applied after cleaning up tracking gaps and errors.
  • It is recommended that you always apply Jitter Removal filter before exporting animation.
  • Jitter Removal filter suppresses unwanted noise and at the same time preserves sharp, dynamic motions. By design, this filter should be applied to relatively large segments of animation (no less than 50 frames).
  • Range of frames affected by Jitter Removal is controlled by current Region of Interest (ROI).
  • You can configure Jitter Removal options for specific body parts. Default setting for Jitter Removal “aggressiveness” is 1 (one tick of corresponding slider). Oftentimes, you can get better results by applying a slightly more aggressive Jitter Removal for torso and legs. Alternatively, you may want to use less aggressive Jitter Removal settings for sharp motions like martial arts moves.
  • Jitter Removal filter makes an internal backup of all data produced by tracking and clean up stages. Therefore, you can re-apply Jitter Removal multiple times. Each subsequent run works off original tracking/clean-up results and overrides previous runs.

Post-processing: Trajectory Filtering

  • Trajectory Filter is a traditional digital signal filter. Its purpose is to filter out minor noise that remains after Jitter Removal filter.
  • Trajectory Filter is very fast. It is applied on-the-fly to current Region of Interest (ROI).
  • Default setting for Trajectory Filter is 1. Higher settings result in multiple passes of Trajectory Filter. It is recommended that you leave it at the default setting.
  • Trajectory Filter can be useful for “gluing” together multiple segments of animation processed with different Jitter Removal options: change the Region of Interest (ROI) to cover all of your motion (e.g. multiple segments processed with different jitter removal setting); change Trajectory Filtering setting to 0 (zero); then change it back to 1 (or other suitable value).
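To illustrate what this kind of traditional digital filter does, here is a simple centered moving-average pass over a 1-D joint trajectory, with higher settings modeled as multiple passes. This is a stand-in sketch under stated assumptions, not iPi Mocap Studio's actual filter:

```python
def smooth(trajectory, passes=1, window=3):
    """Apply a centered moving average `passes` times; endpoints are
    kept unchanged. A stand-in for a trajectory-smoothing filter."""
    result = list(trajectory)
    half = window // 2
    for _ in range(passes):
        prev = list(result)
        for i in range(half, len(prev) - half):
            result[i] = sum(prev[i - half:i + half + 1]) / window
    return result

noisy = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]   # jittery joint angle, degrees
print(smooth(noisy, passes=1))  # interior jitter reduced
print(smooth(noisy, passes=2))  # a higher setting means more passes
```

Setting `passes=0` returns the trajectory unchanged, mirroring how a Trajectory Filtering setting of 0 disables the filter.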

Export and Motion Transfer

Animation export and motion transfer

Video Materials

  • From Jimer Lins: Part 1 - Setting up your Kinects and Calibrating
  • From Jimer Lins: Part 2 - Recording the Action
  • From Jimer Lins: Part 3 - Processing the Recorded Action