This repo introduces TOPO-DataGen, an open and scalable workflow for generating aerial synthetic data. It takes common geo-data as input and outputs diverse synthetic visual data, such as 2D images, 3D geometry, semantics, and camera poses. The rendering engine is built on CesiumJS.
Supported input formats:
- orthophoto (usually .tif)
- digital terrain model (usually .tif)
- point cloud with classification (usually .las)
Expected output:
- synthetic RGB image
- scene coordinates in ECEF, i.e., the point cloud associated with each pixel
- semantic labels
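For readers unfamiliar with ECEF (Earth-centered, Earth-fixed) coordinates, the following minimal sketch converts a geodetic position to ECEF. It assumes the WGS84 ellipsoid; the function name and the sample point are hypothetical and not taken from this repo.

```python
import math

# WGS84 ellipsoid constants (assumption: the scene coordinates use WGS84)
WGS84_A = 6378137.0                    # semi-major axis in meters
WGS84_F = 1.0 / 298.257223563          # flattening
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)   # first eccentricity squared


def geodetic_to_ecef(lat_deg, lon_deg, height_m):
    """Convert latitude/longitude (degrees) and ellipsoidal height (m) to ECEF (m)."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    # prime vertical radius of curvature at this latitude
    n = WGS84_A / math.sqrt(1.0 - WGS84_E2 * math.sin(lat) ** 2)
    x = (n + height_m) * math.cos(lat) * math.cos(lon)
    y = (n + height_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - WGS84_E2) + height_m) * math.sin(lat)
    return x, y, z


if __name__ == "__main__":
    # hypothetical point, for illustration only
    print(geodetic_to_ecef(46.52, 6.57, 400.0))
```

ECEF values are on the order of millions of meters, so storing them in 64-bit floats is a reasonable precaution.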
One may further create 2D/3D data using the above outputs such as:
- depth (z-buffer or Euclidean)
- surface normal vectors
- 2D/3D keypoints
Last update: 12.28.2021
The TOPO-DataGen workflow is officially presented in the paper
CrossLoc: Scalable Aerial Localization Assisted by Multimodal Synthetic Data
Qi Yan, Jianhao Zheng,
Simon Reding, Shanci Li,
Iordan Doytchinov
École Polytechnique Fédérale de Lausanne (EPFL)
Links: arXiv | code repos
Minimum system requirements:
- OS: Ubuntu 20.04 / 18.04
- firefox browser (headless node might not work)
- conda environment
- a reliable Internet connection is preferred
Install dependencies:
sudo bash setup/install.sh
Note: sudo rights are needed for some Linux dependencies such as docker and nodejs.
Data preprocessing:
Please refer to data_preprocess/notes.md for data preprocessing steps. We provide a quick demo dataset for you to explore and reproduce the workflow using an open-source geodata database. The provided EPFL and comballaz assets are used to produce the urbanscape and naturescape sets, respectively, in our CrossLoc Benchmark Datasets.
Please note that there are no strictly standardized data preprocessing steps, and they are outside the scope of this repo.
Configuration of data generation:
- LHS synthetic data generation
Configure the sampling boundary in scripts/presets/scene_name.json. These configuration parameters strongly affect the subsequent rendering of synthetic images. Please make sure the height is about 100~200 meters above the ground of the area.
- Matched synthetic image with given camera pose from real data collected by the DJI drone
export scene=demo
export OUT_DIR=$HOME/Documents/Cesium
python scripts/start_generate.py $scene-matching $scene -matchPhantom Path_to_real_images -cesiumhome $OUT_DIR
# The -matchPhantom argument will call exiftool to automatically extract the metadata, including
# camera poses and orientation of the images, into a .csv file and start generating the matching synthetic images.
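To illustrate how camera poses could be sampled inside a boundary, here is a minimal sketch of Latin hypercube sampling, assuming that is what LHS stands for here. The function name, the boundary values, and the choice of dimensions are hypothetical and do not reflect the actual preset file format.

```python
import numpy as np


def latin_hypercube(n_samples, bounds, seed=0):
    """Draw n_samples points inside axis-aligned bounds with Latin hypercube sampling.

    bounds: sequence of (low, high) pairs, one per dimension.
    """
    rng = np.random.default_rng(seed)
    bounds = np.asarray(bounds, dtype=float)  # shape (D, 2)
    n_dims = bounds.shape[0]
    # one random point inside each of the n_samples equal-width strata, per dimension
    unit = (np.arange(n_samples)[:, None] + rng.random((n_samples, n_dims))) / n_samples
    # shuffle the strata independently in every dimension
    for d in range(n_dims):
        unit[:, d] = rng.permutation(unit[:, d])
    # rescale from the unit cube to the requested bounds
    return bounds[:, 0] + unit * (bounds[:, 1] - bounds[:, 0])


if __name__ == "__main__":
    # hypothetical boundary: x and y offsets (m) and height above ground (m)
    poses = latin_hypercube(5, [(-500.0, 500.0), (-500.0, 500.0), (100.0, 200.0)])
    print(poses.round(1))
```

Compared with plain uniform sampling, each dimension is covered evenly, since every stratum receives exactly one sample.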
Start data generation:
To start rendering the demo data, simply run:
export scene=demo
export OUT_DIR=$HOME/Documents/Cesium
python scripts/start_generate.py $scene-LHS $scene -p scripts/presets/$scene.json -cesiumhome $OUT_DIR
After the rendering is finished, we suggest running the helper scripts to clean the data and run some simple sanity checks as follows:
# please refer to the code for detailed implementation
# the data cleaning and sanity check are based on some intuitive rules, and they do not guarantee a perfect dataset afterwards
export LAS_DIR=$(pwd)/data_preprocess/$scene/****-surface3d/ecef-downsampled # demo data preprocessing default path
python scripts/remove_outliers.py --input_path $OUT_DIR/$scene-LHS --las_path $LAS_DIR --save_backup
python scripts/tools/scan_npy_pointcloud.py --label_path $OUT_DIR/$scene-LHS --threshold 25
Necessary sanity check:
scan_npy_pointcloud.py deletes synthetic images whose reprojection error is above 5 pixels. Such errors may be caused by fluctuations in the data streaming from the Cesium ion server or by local file loading issues. After that, run the following script to regenerate these images, and repeat until all the images look good and pass the scan_npy_pointcloud.py check:
python scripts/start_generate.py $scene-LHS $scene -cesiumhome $OUT_DIR
We also suggest a manual check if you have time.
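The idea behind a reprojection check can be sketched as follows. This is not the implementation in scan_npy_pointcloud.py: the function name, the pinhole model with the principal point at the image center, the OpenCV-style camera axes, and the `focal_px` parameter are all assumptions made for illustration.

```python
import numpy as np


def reprojection_error(scene_coords, cam_to_world, focal_px):
    """Per-pixel reprojection error (in pixels) of a scene coordinate map.

    scene_coords: (H, W, 3) world coordinates, one 3D point per pixel.
    cam_to_world: (4, 4) homogeneous camera pose.
    focal_px:     focal length in pixels (hypothetical intrinsics).
    Assumes camera axes x right, y down, z forward.
    """
    h, w, _ = scene_coords.shape
    world_to_cam = np.linalg.inv(cam_to_world)
    pts = scene_coords.reshape(-1, 3)
    pts_cam = pts @ world_to_cam[:3, :3].T + world_to_cam[:3, 3]
    z = pts_cam[:, 2]
    valid = z > 1e-6  # points behind the camera cannot be projected
    u = np.full(z.shape, np.nan)
    v = np.full(z.shape, np.nan)
    u[valid] = focal_px * pts_cam[valid, 0] / z[valid] + (w - 1) / 2.0
    v[valid] = focal_px * pts_cam[valid, 1] / z[valid] + (h - 1) / 2.0
    # each 3D point should project back onto the pixel it is stored at
    cols, rows = np.meshgrid(np.arange(w), np.arange(h))
    err = np.hypot(u - cols.ravel(), v - rows.ravel())
    return err.reshape(h, w)
```

A frame could then be flagged when, for example, `np.nanmean(err)` exceeds a chosen pixel threshold; how the actual script aggregates the error is not specified here.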
Retrieve semantics:
Please note that we retrieve the pixel-wise semantic label based on the classified point cloud and scene coordinate. For each pixel in the frame, the closest matching point in the classified point cloud is identified and its class is used as the label.
We highly recommend first cleaning the data (previous step) to remove the outliers outside the boundary of the classified point cloud, as this improves the efficiency and quality of the semantic recovery.
# CUDA device is preferred as the matrix computation could be much faster
export SM_DIST_DIR=$OUT_DIR/$scene-LHS-sm-dist
python scripts/semantics_recovery.py --input_path $OUT_DIR/$scene-LHS --las_path $LAS_DIR \
--output_path_distance $SM_DIST_DIR
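The nearest-point labeling described above can be sketched with plain NumPy. This is a brute-force illustration under stated assumptions, not the code in semantics_recovery.py: the function name, argument layout, and chunk size are hypothetical, and a real point cloud would call for a spatial index or a GPU.

```python
import numpy as np


def recover_semantics(scene_coords, cloud_xyz, cloud_labels, chunk=4096):
    """Label each pixel with the class of its nearest point in a classified cloud.

    scene_coords: (H, W, 3) per-pixel 3D coordinates.
    cloud_xyz:    (N, 3) point cloud coordinates in the same frame.
    cloud_labels: (N,) integer class per cloud point.
    Returns (labels, distances), both of shape (H, W).
    """
    cloud_xyz = np.asarray(cloud_xyz, dtype=float)
    cloud_labels = np.asarray(cloud_labels)
    h, w, _ = scene_coords.shape
    pix = scene_coords.reshape(-1, 3)
    labels = np.empty(len(pix), dtype=cloud_labels.dtype)
    dists = np.empty(len(pix), dtype=float)
    # process pixels in chunks to bound the size of the distance matrix
    for start in range(0, len(pix), chunk):
        block = pix[start:start + chunk]
        # squared distances between the pixel block and every cloud point: (B, N)
        d2 = ((block[:, None, :] - cloud_xyz[None, :, :]) ** 2).sum(axis=-1)
        nearest = d2.argmin(axis=1)
        labels[start:start + chunk] = cloud_labels[nearest]
        dists[start:start + chunk] = np.sqrt(d2[np.arange(len(block)), nearest])
    return labels.reshape(h, w), dists.reshape(h, w)
```

Returning the distance alongside the label makes it possible to spot pixels whose nearest classified point is far away, which is one reason cleaning outliers beforehand helps.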
Preview the data:
Please refer to the scripts/tools/data_preview.ipynb notebook to visualize and preview the data.
Other helper scripts such as npy_to_txt_pointcloud.py, plot_depth_images.py, and plot_semantics_images.py can also be found in scripts/tools.
Please note that many other 2D/3D data could be generated based on the raw data output, for example:
- camera pose + absolute scene coordinate ---> depth / surface normals
- RGB image + absolute scene coordinate ---> point cloud with color
- camera pose + RGB image ---> temporal video sequence
The many possibilities to generate downstream tasks are omitted in the diagram.
Note that we produce depth and surface normals through homogeneous transformation and open3d, respectively. As long as the scene coordinates and 6D camera pose are available, many other 3D labels are relatively easy to compute.
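As an illustration of the homogeneous transformation mentioned above, the sketch below derives both depth variants from a scene coordinate map and a camera pose. The function name and the convention that the camera looks along its +z axis are assumptions, not details confirmed by this repo.

```python
import numpy as np


def depth_from_scene_coords(scene_coords, cam_to_world):
    """Compute z-buffer and Euclidean depth maps from per-pixel world coordinates.

    scene_coords: (H, W, 3) world coordinates, one 3D point per pixel.
    cam_to_world: (4, 4) homogeneous camera pose.
    Assumes the camera looks along its +z axis.
    """
    h, w, _ = scene_coords.shape
    world_to_cam = np.linalg.inv(cam_to_world)
    # append a column of ones to obtain homogeneous coordinates: (H*W, 4)
    pts_h = np.concatenate([scene_coords.reshape(-1, 3), np.ones((h * w, 1))], axis=1)
    pts_cam = (pts_h @ world_to_cam.T)[:, :3]
    z_depth = pts_cam[:, 2].reshape(h, w)                         # distance along the optical axis
    euclid_depth = np.linalg.norm(pts_cam, axis=1).reshape(h, w)  # distance to the camera center
    return z_depth, euclid_depth
```

The two maps coincide only at the principal point; elsewhere the Euclidean depth is larger than the z-buffer depth.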
├── assets # doc assets
├── data_preprocess # utility to preprocess the data, see `notes.md` for guidelines
├── scripts
│ ├── presets # preset files defining camera poses in the virtual world
│ ├── tools # helper scripts for sanity check and visualization
│ ├── random_walk.py # script to generate random walk camera poses
│ ├── reframeLib.jar # coordinate conversion utility adopted from https://github.com/hofmann-tobias/swissREFRAME
│ ├── reframeTransform.py # script to convert coordinate reference system
│ ├── remove_outliers.py # script to remove outlier points in the generated scene coordinate data
│ ├── semantics_recovery.py # script to recover the pixel-wise semantic label
│ └── start_generate.py # main script for data generation
├── setup # utility to setup the environment
├── source # JS utilities
│ ├── app.js # main JS script for data generation
│ ├── nd.min.js # utility JS script to support numpy data
│ ├── npyjs.js # utility JS script to support numpy data
│ └── testmap.js # JS script for debug mode
├── index.css # CesiumJS utility
├── index.html # CesiumJS utility
├── package.json # nodeJS dependency list
├── server.js # CesiumJS utility
└── testmap.html # web for quick debug mode
Some tips here:
- To change the image resolution, find and modify container.style.width and container.style.height defined in scripts/start_generate.py. The default resolution is 720*480 px. Beware that a higher resolution takes more time to render.
- To fine-tune the detailed rendering parameters in source/app.js, the official documentation could be very helpful.
In particular, we thank swisstopo and CesiumJS respectively for their open-sourced geodata and open-sourced rendering engine. We also appreciate the following open-sourced tools, which greatly simplify the workflow:
If you find our code useful for your research, please cite the paper:
@article{yan2021crossloc,
title={CrossLoc: Scalable Aerial Localization Assisted by Multimodal Synthetic Data},
author={Yan, Qi and Zheng, Jianhao and Reding, Simon and Li, Shanci and Doytchinov, Iordan},
journal={arXiv preprint arXiv:2112.09081},
year={2021}
}