Update ScanNet (#21)
* update ScanNet preprocessing

* merge (batch_)load_scannet_data.py from mmdetection3d

* fix --dataset in readme
filaPro authored Jul 21, 2021
1 parent bceee4b commit fd87424
Showing 11 changed files with 471 additions and 240 deletions.
36 changes: 4 additions & 32 deletions README.md
@@ -4,6 +4,7 @@
# ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection

**News**:
* :fire: July, 2021. We update `ScanNet` image preprocessing both [here](https://github.com/saic-vul/imvoxelnet/pull/21) and in [mmdetection3d](https://github.com/open-mmlab/mmdetection3d/pull/696).
* :fire: June, 2021. `ImVoxelNet` for `KITTI` is now [supported](https://github.com/open-mmlab/mmdetection3d/tree/master/configs/imvoxelnet) in [mmdetection3d](https://github.com/open-mmlab/mmdetection3d).

This repository contains implementation of the monocular/multi-view 3D object detector ImVoxelNet, introduced in our paper:
@@ -38,7 +39,7 @@ We support three benchmarks based on the **SUN RGB-D** dataset.
you should follow the instructions in [sunrgbd](data/sunrgbd).
* For the [PerspectiveNet](https://papers.nips.cc/paper/2019/hash/b87517992f7dce71b674976b280257d2-Abstract.html)
benchmark with 30 object categories, the same instructions can be applied;
you only need to pass `--dataset sunrgbd_monocular` when running `create_data.py`.
you only need to set `dataset` argument to `sunrgbd_monocular` when running `create_data.py`.
* The [Total3DUnderstanding](https://github.com/yinyunie/Total3DUnderstanding)
benchmark implies detecting objects of 37 categories along with camera pose and room layout estimation.
Download the preprocessed data as
@@ -49,38 +50,9 @@ We support three benchmarks based on the **SUN RGB-D** dataset.
python tools/data_converter/sunrgbd_total.py
```

**ScanNet.** Please follow instructions in [scannet](data/scannet).
Note that `create_data.py` works with point clouds, not RGB images; thus, you should do some preprocessing before running `create_data.py`.
1. First, you should obtain RGB images. We recommend using a script from [SensReader](https://github.com/ScanNet/ScanNet/tree/master/SensReader/python).
2. Then, copy the camera pose `.txt` files and `.jpg` images to the `scannet/sens_reader` folder.
3. Copy axis alignment matrix `.txt` files to the `scannet/txts` folder.
4. Move the results of `batch_load_scannet_data.py` to the `scannet/mmdetection3d` folder. Final directory structure:
```
scannet
├── sens_reader
│ ├── scans
│ │ ├── scene0000_00
│ │ │ ├── out
│ │ │ │ ├── frame-000001.color.jpg
│ │ │ │ ├── frame-000001.pose.txt
│ │ │ │ ├── frame-000002.color.jpg
│ │ │ │ ├── ...
│ │ ├── ...
├── txts
│ ├── scene0000_00.txt
│ ├── ...
├── mmdetection3d
│ ├── scene0000_00_bbox.npy
│ ├── scene0000_00_ins_label.npy
│ ├── scene0000_00_sem_label.npy
│ ├── scene0000_00_vert.npy
│ ├── scene0000_01_bbox.npy
│ ├── ...
```
Now, you may run `create_data.py` with `--dataset scannet_monocular`.

For **ScanNet** please follow instructions in [scannet](data/scannet).
For **KITTI** and **nuScenes**, please follow instructions in [getting_started.md](docs/getting_started.md).
For `nuScenes`, set `--dataset nuscenes_monocular`.
For `nuScenes`, set `dataset` argument to `nuscenes_monocular`.

### Getting Started

32 changes: 27 additions & 5 deletions data/scannet/README.md
@@ -1,23 +1,30 @@
### Prepare ScanNet Data
### Prepare ScanNet Data for Indoor Detection or Segmentation Task

We follow the procedure in [votenet](https://github.com/facebookresearch/votenet/).

1. Download ScanNet v2 data [HERE](https://github.com/ScanNet/ScanNet). Link or move the 'scans' folder to this level of directory.
1. Download ScanNet v2 data [HERE](https://github.com/ScanNet/ScanNet). Link or move the 'scans' folder to this level of directory. If you are performing segmentation tasks and want to upload the results to its official [benchmark](http://kaldir.vc.in.tum.de/scannet_benchmark/), please also link or move the 'scans_test' folder to this directory.

2. In this directory, extract point clouds and annotations by running `python batch_load_scannet_data.py`. Add the `--max_num_point 50000` flag if you only use the ScanNet data for the detection task. It downsamples each scene to fewer points.

3. In this directory, extract RGB images with poses by running `python extract_posed_images.py`. This step is optional; skip it if you don't plan to use multi-view RGB images. Add `--max-images-per-scene -1` to disable the limit on the number of images per scene. ScanNet scenes contain up to 5000+ frames each. After extraction, all the .jpg images require 2 TB of disk space. The recommended 300 images per scene require less than 100 GB. For example, the multi-view 3D detector ImVoxelNet samples 50 and 100 images per training and test scene, respectively.

2. In this directory, extract point clouds and annotations by running `python batch_load_scannet_data.py`.
4. Enter the project root directory, generate training data by running

3. Enter the project root directory, generate training data by running
```bash
python tools/create_data.py scannet --root-path ./data/scannet --out-dir ./data/scannet --extra-tag scannet
```

The overall process could be achieved through the following script

```bash
python batch_load_scannet_data.py
python extract_posed_images.py
cd ../..
python tools/create_data.py scannet --root-path ./data/scannet --out-dir ./data/scannet --extra-tag scannet
```
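Step 3 caps the number of images extracted per scene. How `extract_posed_images.py` chooses frames is not shown in this diff, so the sketch below only illustrates one plausible policy: evenly spaced frames, with a negative limit meaning "no limit" as the flag description suggests. `select_frame_indices` is a hypothetical helper, not a function from the repository.

```python
def select_frame_indices(num_frames, max_images_per_scene=300):
    """Pick at most `max_images_per_scene` evenly spaced frame indices.

    Hypothetical helper (assumption): a negative limit disables the cap,
    mirroring `--max-images-per-scene -1`.
    """
    if max_images_per_scene < 0 or num_frames <= max_images_per_scene:
        return list(range(num_frames))
    if max_images_per_scene == 0:
        return []
    if max_images_per_scene == 1:
        return [0]
    # Integer arithmetic keeps the first and last frame and spreads the
    # remaining picks evenly between them.
    last = num_frames - 1
    slots = max_images_per_scene - 1
    return [i * last // slots for i in range(max_images_per_scene)]


if __name__ == '__main__':
    picked = select_frame_indices(5000, 300)
    print(len(picked), picked[:3], picked[-1])
```

With a 5000-frame scene and the recommended limit of 300, this keeps one frame out of roughly every 17, which is where the disk-space saving quoted in step 3 comes from.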

The directory structure after pre-processing should be as below

```
scannet
├── scannet_utils.py
@@ -26,11 +33,26 @@ scannet
├── scannet_utils.py
├── README.md
├── scans
├── scannet_train_instance_data
├── scans_test
├── scannet_instance_data
├── points
│ ├── xxxxx.bin
├── instance_mask
│ ├── xxxxx.bin
├── semantic_mask
│ ├── xxxxx.bin
├── seg_info
│ ├── train_label_weight.npy
│ ├── train_resampled_scene_idxs.npy
│ ├── val_label_weight.npy
│ ├── val_resampled_scene_idxs.npy
├── posed_images
│ ├── scenexxxx_xx
│ │ ├── xxxxxx.txt
│ │ ├── xxxxxx.jpg
│ │ ├── intrinsic.txt
├── scannet_infos_train.pkl
├── scannet_infos_val.pkl
├── scannet_infos_test.pkl
```
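A quick way to confirm that pre-processing produced the layout above is to check for the listed entries. The sketch below is illustrative only: the entry names are copied from the tree, while the split into required and optional entries is an assumption based on the README marking image extraction and test data as optional. `missing_entries` is a hypothetical helper.

```python
from pathlib import Path

# Names copied from the directory tree above. The required/optional split is
# an assumption, not something the repository defines.
REQUIRED_ENTRIES = [
    'scans',
    'scannet_instance_data',
    'points',
    'instance_mask',
    'semantic_mask',
    'scannet_infos_train.pkl',
    'scannet_infos_val.pkl',
]
OPTIONAL_ENTRIES = [
    'scans_test',
    'seg_info',
    'posed_images',
    'scannet_infos_test.pkl',
]


def missing_entries(root, names):
    """Return the entries from `names` that do not exist under `root`."""
    root = Path(root)
    return [name for name in names if not (root / name).exists()]


if __name__ == '__main__':
    for name in missing_entries('data/scannet', REQUIRED_ENTRIES):
        print(f'missing required entry: {name}')
    for name in missing_entries('data/scannet', OPTIONAL_ENTRIES):
        print(f'missing optional entry: {name}')
```

Run it from the project root after the last step; an empty output means every listed entry was found.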
124 changes: 83 additions & 41 deletions data/scannet/batch_load_scannet_data.py
@@ -16,58 +16,81 @@
from load_scannet_data import export
from os import path as osp

SCANNET_DIR = 'scans'
DONOTCARE_CLASS_IDS = np.array([])
OBJ_CLASS_IDS = np.array(
[3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16, 24, 28, 33, 34, 36, 39])


def export_one_scan(scan_name, output_filename_prefix, max_num_point,
label_map_file, scannet_dir):
def export_one_scan(scan_name,
output_filename_prefix,
max_num_point,
label_map_file,
scannet_dir,
test_mode=False):
mesh_file = osp.join(scannet_dir, scan_name, scan_name + '_vh_clean_2.ply')
agg_file = osp.join(scannet_dir, scan_name,
scan_name + '.aggregation.json')
seg_file = osp.join(scannet_dir, scan_name,
scan_name + '_vh_clean_2.0.010000.segs.json')
# includes axisAlignment info for the train set scans.
meta_file = osp.join(scannet_dir, scan_name, f'{scan_name}.txt')
mesh_vertices, semantic_labels, instance_labels, instance_bboxes, \
instance2semantic = export(mesh_file, agg_file, seg_file,
meta_file, label_map_file, None)

mask = np.logical_not(np.in1d(semantic_labels, DONOTCARE_CLASS_IDS))
mesh_vertices = mesh_vertices[mask, :]
semantic_labels = semantic_labels[mask]
instance_labels = instance_labels[mask]

num_instances = len(np.unique(instance_labels))
print(f'Num of instances: {num_instances}')

bbox_mask = np.in1d(instance_bboxes[:, -1], OBJ_CLASS_IDS)
instance_bboxes = instance_bboxes[bbox_mask, :]
print(f'Num of care instances: {instance_bboxes.shape[0]}')

N = mesh_vertices.shape[0]
if N > max_num_point:
choices = np.random.choice(N, max_num_point, replace=False)
mesh_vertices = mesh_vertices[choices, :]
semantic_labels = semantic_labels[choices]
instance_labels = instance_labels[choices]
mesh_vertices, semantic_labels, instance_labels, unaligned_bboxes, \
aligned_bboxes, instance2semantic, axis_align_matrix = export(
mesh_file, agg_file, seg_file, meta_file, label_map_file, None,
test_mode)

if not test_mode:
mask = np.logical_not(np.in1d(semantic_labels, DONOTCARE_CLASS_IDS))
mesh_vertices = mesh_vertices[mask, :]
semantic_labels = semantic_labels[mask]
instance_labels = instance_labels[mask]

num_instances = len(np.unique(instance_labels))
print(f'Num of instances: {num_instances}')

bbox_mask = np.in1d(unaligned_bboxes[:, -1], OBJ_CLASS_IDS)
unaligned_bboxes = unaligned_bboxes[bbox_mask, :]
bbox_mask = np.in1d(aligned_bboxes[:, -1], OBJ_CLASS_IDS)
aligned_bboxes = aligned_bboxes[bbox_mask, :]
assert unaligned_bboxes.shape[0] == aligned_bboxes.shape[0]
print(f'Num of care instances: {unaligned_bboxes.shape[0]}')

if max_num_point is not None:
max_num_point = int(max_num_point)
N = mesh_vertices.shape[0]
if N > max_num_point:
choices = np.random.choice(N, max_num_point, replace=False)
mesh_vertices = mesh_vertices[choices, :]
if not test_mode:
semantic_labels = semantic_labels[choices]
instance_labels = instance_labels[choices]

np.save(f'{output_filename_prefix}_vert.npy', mesh_vertices)
np.save(f'{output_filename_prefix}_sem_label.npy', semantic_labels)
np.save(f'{output_filename_prefix}_ins_label.npy', instance_labels)
np.save(f'{output_filename_prefix}_bbox.npy', instance_bboxes)


def batch_export(max_num_point, output_folder, train_scan_names_file,
label_map_file, scannet_dir):
if not test_mode:
np.save(f'{output_filename_prefix}_sem_label.npy', semantic_labels)
np.save(f'{output_filename_prefix}_ins_label.npy', instance_labels)
np.save(f'{output_filename_prefix}_unaligned_bbox.npy',
unaligned_bboxes)
np.save(f'{output_filename_prefix}_aligned_bbox.npy', aligned_bboxes)
np.save(f'{output_filename_prefix}_axis_align_matrix.npy',
axis_align_matrix)


def batch_export(max_num_point,
output_folder,
scan_names_file,
label_map_file,
scannet_dir,
test_mode=False):
if test_mode and not os.path.exists(scannet_dir):
# test data preparation is optional
return
if not os.path.exists(output_folder):
print(f'Creating new data folder: {output_folder}')
os.mkdir(output_folder)

train_scan_names = [line.rstrip() for line in open(train_scan_names_file)]
for scan_name in train_scan_names:
scan_names = [line.rstrip() for line in open(scan_names_file)]
for scan_name in scan_names:
print('-' * 20 + 'begin')
print(datetime.datetime.now())
print(scan_name)
@@ -78,7 +101,7 @@ def batch_export(max_num_point, output_folder, train_scan_names_file,
continue
try:
export_one_scan(scan_name, output_filename_prefix, max_num_point,
label_map_file, scannet_dir)
label_map_file, scannet_dir, test_mode)
except Exception:
print(f'Failed export scan: {scan_name}')
print('-' * 20 + 'done')
@@ -88,14 +111,18 @@ def main():
parser = argparse.ArgumentParser()
parser.add_argument(
'--max_num_point',
default=50000,
default=None,
help='The maximum number of the points.')
parser.add_argument(
'--output_folder',
default='./scannet_train_instance_data',
default='./scannet_instance_data',
help='output folder of the result.')
parser.add_argument(
'--scannet_dir', default='scans', help='scannet data directory.')
'--train_scannet_dir', default='scans', help='scannet data directory.')
parser.add_argument(
'--test_scannet_dir',
default='scans_test',
help='scannet data directory.')
parser.add_argument(
'--label_map_file',
default='meta_data/scannetv2-labels.combined.tsv',
@@ -104,10 +131,25 @@ def main():
'--train_scan_names_file',
default='meta_data/scannet_train.txt',
help='The path of the file that stores the scan names.')
parser.add_argument(
'--test_scan_names_file',
default='meta_data/scannetv2_test.txt',
help='The path of the file that stores the scan names.')
args = parser.parse_args()
batch_export(args.max_num_point, args.output_folder,
args.train_scan_names_file, args.label_map_file,
args.scannet_dir)
batch_export(
args.max_num_point,
args.output_folder,
args.train_scan_names_file,
args.label_map_file,
args.train_scannet_dir,
test_mode=False)
batch_export(
args.max_num_point,
args.output_folder,
args.test_scan_names_file,
args.label_map_file,
args.test_scannet_dir,
test_mode=True)


if __name__ == '__main__':
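
The two array operations that `export_one_scan` applies in this diff, filtering boxes by class id and downsampling points together with their labels, can be sketched in isolation as follows. This is a minimal illustration, not the commit's code: `filter_boxes` and `downsample` are hypothetical helpers, `np.isin` is swapped in for `np.in1d`, and a seeded `np.random.default_rng` replaces `np.random.choice` so the result is reproducible. The only assumption about the box layout is that the class id sits in the last column, as in the diff.

```python
import numpy as np

OBJ_CLASS_IDS = np.array(
    [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16, 24, 28, 33, 34, 36, 39])


def filter_boxes(unaligned, aligned, class_ids=OBJ_CLASS_IDS):
    """Keep only boxes whose last column (class id) is in `class_ids`."""
    unaligned = unaligned[np.isin(unaligned[:, -1], class_ids)]
    aligned = aligned[np.isin(aligned[:, -1], class_ids)]
    # Both arrays describe the same instances, so the counts must match.
    assert unaligned.shape[0] == aligned.shape[0]
    return unaligned, aligned


def downsample(vertices, semantic_labels, instance_labels,
               max_num_point=None, seed=0):
    """Randomly keep at most `max_num_point` points and their labels."""
    if max_num_point is None:
        return vertices, semantic_labels, instance_labels
    # argparse passes the flag as a string unless a type is set.
    max_num_point = int(max_num_point)
    num_points = vertices.shape[0]
    if num_points <= max_num_point:
        return vertices, semantic_labels, instance_labels
    rng = np.random.default_rng(seed)
    choices = rng.choice(num_points, max_num_point, replace=False)
    # One index array is reused so points and labels stay aligned.
    return (vertices[choices], semantic_labels[choices],
            instance_labels[choices])
```

Indexing the vertices and both label arrays with the same `choices` is what keeps each surviving point paired with its semantic and instance label.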