Mono-STAR: Mono-camera Scene-level Tracking and Reconstruction

Overview

We present Mono-STAR, the first real-time 3D reconstruction system that simultaneously supports semantic fusion, fast motion tracking, non-rigid object deformation, and topological change under a unified framework. The system solves a new optimization problem that incorporates optical-flow-based 2D constraints to handle fast motion, and it uses a novel semantic-aware deformation graph (SAD-graph) to handle topology change. We test the system on a variety of challenging scenes and demonstrate that it significantly outperforms existing state-of-the-art systems.

Proposed System

Overview of the proposed system. The system runs two parallel threads, one for measurement and one for geometry. At each time step $t$, the measurement thread loads a measurement $M_t$ from images or a camera buffer, and a segmentation network generates a set of semantic labels $L^m_{t}$ for it. Once the measurement is in GPU memory, $M_t$ and the previous alignment rendering $R^a_{t-1}$ are fed into an optical-flow network, which estimates the optical flow $OF_t$ from the previous geometry $S_{t-1}$ to the measurement $M_t$. The optical flow $OF_t$, the geometry rendering $R_t$, and the measurement $M_t$ are then used to compute the warp field $W_t$ through non-rigid alignment. After alignment, the previous geometry $S_{t-1}$ is warped to $S_{t-1}^{warp}$, and the fusion render map $R^g_{t-1}$ is rendered from $S_{t-1}^{warp}$. $R^g_{t-1}$ and $S_{t-1}^{warp}$ are then combined with the semantic labels $L^m_t$ from the measurement thread to produce the updated geometry $S_t$. Finally, a new alignment rendering $R^a_t$ is rendered from the updated geometry $S_t$ for use in processing the next frame.
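The data flow described above can be sketched as a small producer/consumer program. This is only an illustrative sketch, not the StarHub implementation: every stage is stubbed with a string, and the names `Measurement`, `measurement_worker`, `geometry_worker`, and `run_pipeline` are hypothetical. It also assumes that an initial geometry `S_0` and alignment rendering `Ra_0` already exist, and that the optical-flow step runs on the geometry side, neither of which is stated in the description.

```python
import queue
import threading
from dataclasses import dataclass


@dataclass
class Measurement:
    t: int
    frame: str   # placeholder for the measurement M_t
    labels: str  # placeholder for the semantic labels L^m_t


def measurement_worker(num_frames, out_queue):
    """Load each frame and attach semantic labels (both stubbed as strings)."""
    for t in range(1, num_frames + 1):
        out_queue.put(Measurement(t, f"M_{t}", f"L_{t}"))
    out_queue.put(None)  # sentinel: no more frames


def geometry_worker(in_queue, trace):
    """Consume measurements and record the order of the geometry steps."""
    # Assumption: an initial geometry and alignment rendering are available.
    geometry, align_render = "S_0", "Ra_0"
    while True:
        m = in_queue.get()
        if m is None:
            break
        # Optical flow from the previous alignment rendering to the new frame.
        trace.append(f"flow OF_{m.t}: {align_render} + {m.frame}")
        # Non-rigid alignment yields the warp field W_t.
        trace.append(f"align W_{m.t}: OF_{m.t} + rendering + {m.frame}")
        # Warp the previous geometry with W_t.
        warped = f"{geometry}^warp"
        trace.append(f"warp {geometry} -> {warped}")
        # Render the fusion map from the warped geometry.
        fusion_render = f"Rg_{m.t - 1}"
        trace.append(f"render {fusion_render} from {warped}")
        # Fuse with the semantic labels to obtain the updated geometry S_t.
        geometry = f"S_{m.t}"
        trace.append(f"fuse {fusion_render} + {warped} + {m.labels} -> {geometry}")
        # Render the alignment map used by the next frame.
        align_render = f"Ra_{m.t}"
        trace.append(f"render {align_render} from {geometry}")


def run_pipeline(num_frames):
    """Run both workers to completion and return the recorded step trace."""
    frames = queue.Queue(maxsize=2)  # small buffer between the two threads
    trace = []
    producer = threading.Thread(target=measurement_worker, args=(num_frames, frames))
    consumer = threading.Thread(target=geometry_worker, args=(frames, trace))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return trace
```

Calling `run_pipeline(2)` returns six trace entries per frame, in the order the steps are listed above. The bounded queue lets the measurement side work slightly ahead of the geometry side without buffering an unbounded number of frames.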

Experiment Results

Fast Motion - BasketBall

Color	Graph	Semantic
geometry_color.mp4	geometry_graph.mp4	geometry_seg.mp4

Fast Motion - Falling Cup

Color	Graph	Semantic
geometry_color.mp4	geometry_graph.mp4	geometry_seg.mp4

Topology Change - Pick Cup

Color	Geometry	Semantic
geometry_color.mp4	geometry_normal.mp4	geometry_seg.mp4

Topology Change - Move Toy

Color	Geometry	Semantic
geometry_color.mp4	geometry_normal.mp4	geometry_seg.mp4

Topology Change - Push Coffee

Color	Geometry	Semantic
geometry_color.mp4	geometry_normal.mp4	geometry_seg.mp4

Deformation - Deform Pillow

Color	Geometry	Graph
geometry_color.mp4	geometry_normal.mp4	geometry_graph.mp4

Deformation - Deform Umbrella

Color	Geometry	Graph
geometry_color.mp4	geometry_normal.mp4	geometry_graph.mp4

Conclusion

We presented Mono-STAR, a single-view solution for the semantic-aware STAR problem. Mono-STAR uses a novel semantic-aware and adaptive deformation graph for simultaneous tracking and reconstruction, and it can handle topology changes as well as semantic fusion. Experiments show that Mono-STAR achieves promising results in non-rigid object reconstruction while remaining robust to semantic segmentation errors and capturing fast motion in various challenging scenes. We believe that this system can inspire and support future research on imitation learning, dexterous manipulation, and many other related robotics problems.

Source Code

For the source code, see https://github.com/changhaonan/StarHub. The code has only been tested on Ubuntu 20.04.

Cite the work

@article{chang2023mono,
  title={Mono-STAR: Mono-camera Scene-level Tracking and Reconstruction},
  author={Chang, Haonan and Ramesh, Dhruv Metha and Geng, Shijie and Gan, Yuqiu and Boularias, Abdeslam},
  journal={arXiv preprint arXiv:2301.13244},
  year={2023}
}
