Merge pull request #316 from princeton-vl/develop
Release v1.10
pvl-bot authored Oct 28, 2024
2 parents 0fd9e54 + 4b2119a commit 98295c7
Showing 40 changed files with 897 additions and 352 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -110,6 +110,7 @@ Conference on Computer Vision and Pattern Recognition (CVPR) 2024
- ["Hello World": Generate your first Infinigen-Nature scene](docs/HelloWorld.md)
- ["Hello Room": Generate your first Infinigen-Indoors scene](docs/HelloRoom.md)
- [Configuring Infinigen](docs/ConfiguringInfinigen.md)
- [Configuring Cameras](docs/ConfiguringCameras.md)
- [Downloading pre-generated data](docs/PreGeneratedData.md)
- [Generating individual assets](docs/GeneratingIndividualAssets.md)
- [Exporting to external fileformats (OBJ, OpenUSD, etc)](docs/ExportingToExternalFileFormats.md)
9 changes: 8 additions & 1 deletion docs/CHANGELOG.md
@@ -129,4 +129,11 @@ v1.9.1
- Fix gin configs not correctly passed to slurm jobs in generate_individual_assets
- Fix integration test image titles
- Fix integration test asset image alignment
- Make multistory houses disabled by default
- Make multistory houses disabled by default

v1.10.0
- Add Configuring Cameras documentation
- Add config for multiview cameras surrounding a point of interest
- Add MaterialSegmentation output pass
- Add passthrough mode to direct manage_jobs stdout to the terminal
- Add "copyfile:destination" upload mode
102 changes: 102 additions & 0 deletions docs/ConfiguringCameras.md
@@ -0,0 +1,102 @@
# Configuring Cameras

This document gives examples of how to configure cameras in Infinigen for various computer vision tasks.

### Example Commands

##### Stereo Matching

Generate many nature scenes, each with 1 stereo camera:
```bash
python -m infinigen.datagen.manage_jobs --output_folder outputs/stereo_nature --num_scenes 30 \
--pipeline_configs stereo.gin local_256GB.gin cuda_terrain.gin blender_gt.gin --configs high_quality_terrain
```

Generate many indoor rooms, each with 20 stereo cameras:
```bash
python -m infinigen.datagen.manage_jobs --output_folder outputs/stereo_indoors --num_scenes 30 \
--pipeline_configs local_256GB.gin stereo.gin blender_gt.gin indoor_background_configs.gin --configs singleroom \
--pipeline_overrides get_cmd.driver_script='infinigen_examples.generate_indoors' \
--overrides camera.spawn_camera_rigs.n_camera_rigs=20 compute_base_views.min_candidates_ratio=2 compose_indoors.terrain_enabled=False compose_indoors.restrict_single_supported_roomtype=True
```

We recommend 20+ cameras per indoor room: room generation is not view-dependent, so each room can be rendered from many angles, and rendering many frames per generated scene improves overall GPU utilization. In nature scenes, the current camera code places cameras very far apart, so their visible content does not overlap and there is little benefit to simply increasing `n_camera_rigs` without also customizing the camera arrangement. If you wish to extract more stereo frames per nature scene, we instead recommend rendering a low-fps video using the "Random Walk Videos" commands below.

##### Random Walk Videos

Nature video, slow & smooth random walk camera motion:
```bash
python -m infinigen.datagen.manage_jobs --output_folder outputs/video_smooth_nature --num_scenes 30 \
--pipeline_configs monocular_video.gin local_256GB.gin cuda_terrain.gin blender_gt.gin --configs high_quality_terrain \
--pipeline_overrides iterate_scene_tasks.cam_block_size=24
```

Nature video, fast & noisy random walk camera motion:
```bash
python -m infinigen.datagen.manage_jobs --output_folder outputs/video_noisy_nature --num_scenes 30 \
--pipeline_configs monocular_video.gin local_256GB.gin cuda_terrain.gin blender_gt.gin --configs high_quality_terrain noisy_video \
--pipeline_overrides iterate_scene_tasks.cam_block_size=24 --overrides configure_render_cycles.adaptive_threshold=0.05
```

Indoor video, slow-moving camera motion:
```bash
python -m infinigen.datagen.manage_jobs --output_folder outputs/video_slow_indoor --num_scenes 30 \
--pipeline_configs local_256GB.gin monocular_video.gin blender_gt.gin indoor_background_configs.gin --configs singleroom \
--pipeline_overrides get_cmd.driver_script='infinigen_examples.generate_indoors' \
--overrides compose_indoors.terrain_enabled=False compose_indoors.restrict_single_supported_roomtype=True AnimPolicyRandomWalkLookaround.speed=0.5 AnimPolicyRandomWalkLookaround.step_range=0.5 compose_indoors.animate_cameras_enabled=True
```

:warning: Random walk camera generation is very unlikely to find paths between indoor rooms, and therefore will fail to generate long or fast-moving videos for indoor scenes. We will follow up soon with a pathfinding-based camera trajectory generator to handle these cases.

##### Multi-view Camera Arrangement (for Multiview Stereo, NeRF, etc.)

Many tasks require cameras placed in a roughly circular arrangement around a point of interest, as shown below, with some noise added to their angle, roll, pitch, and yaw with respect to the object.

<p align="center">
<img src="images/multiview_stereo/mvs_indoors.png"/>
<img src="images/multiview_stereo/mvs_indoors_2.png">
<img src="images/multiview_stereo/mvs_nature.png"/>
<img src="images/multiview_stereo/mvs_ocean.png"/>
</p>
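
For intuition, here is a hand-written sketch of roughly the geometry such a configuration produces. It is illustrative only, not the actual `multiview_stereo.gin` implementation; it assumes a list of `camera_rigs` (as in the "Custom Camera Arrangement" section below), and the point of interest `poi`, radius, and noise magnitudes are made-up values:

```python
import numpy as np

# Illustrative sketch only: place each camera rig on a noisy circle around an
# assumed point of interest, roughly facing inward.
poi = np.array([0.0, 0.0, 1.5])  # assumed point of interest
radius = 4.0                     # assumed distance from the point of interest
for i, rig in enumerate(camera_rigs):
    angle = 2 * np.pi * i / len(camera_rigs) + np.random.uniform(-0.1, 0.1)
    rig.location = (
        poi[0] + radius * np.cos(angle),
        poi[1] + radius * np.sin(angle),
        poi[2] + np.random.uniform(-0.5, 0.5),
    )
    # With a 90 degree pitch the camera looks along +Y, so a Z rotation of
    # angle + 90 degrees points it back toward the center; the small random
    # offsets perturb roll, pitch and yaw.
    rig.rotation_euler = np.deg2rad(
        np.array([
            90 + np.random.uniform(-5, 5),
            np.random.uniform(-5, 5),
            np.degrees(angle) + 90 + np.random.uniform(-5, 5),
        ])
    )
```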

Generate a quick test scene (an indoor room with no furniture, etc.) with 5 multiview cameras:
```bash
python -m infinigen.datagen.manage_jobs --output_folder outputs/mvs_test --num_scenes 1 --configs multiview_stereo.gin fast_solve.gin no_objects.gin --pipeline_configs local_256GB.gin monocular.gin blender_gt.gin cuda_terrain.gin indoor_background_configs.gin --overrides camera.spawn_camera_rigs.n_camera_rigs=5 compose_nature.animate_cameras_enabled=False compose_indoors.restrict_single_supported_roomtype=True --pipeline_overrides get_cmd.driver_script='infinigen_examples.generate_indoors' iterate_scene_tasks.n_camera_rigs=5
```

Generate a dataset of indoor rooms with 30 multiview cameras:
```bash
python -m infinigen.datagen.manage_jobs --output_folder outputs/mvs_indoors --num_scenes 30 --pipeline_configs local_256GB.gin monocular.gin blender_gt.gin indoor_background_configs.gin --configs singleroom.gin multiview_stereo.gin --pipeline_overrides get_cmd.driver_script='infinigen_examples.generate_indoors' iterate_scene_tasks.n_camera_rigs=30 --overrides compose_indoors.restrict_single_supported_roomtype=True camera.spawn_camera_rigs.n_camera_rigs=30
```

Generate a dataset of nature scenes with 30 multiview cameras:
```bash
python -m infinigen.datagen.manage_jobs --output_folder outputs/mvs_nature --num_scenes 30 --configs multiview_stereo.gin --pipeline_configs local_256GB.gin monocular.gin blender_gt.gin cuda_terrain.gin --overrides camera.spawn_camera_rigs.n_camera_rigs=30 compose_nature.animate_cameras_enabled=False --pipeline_overrides iterate_scene_tasks.n_camera_rigs=30
```

##### Custom Camera Arrangement

Camera poses can be easily manipulated using the Blender API to create any camera arrangement you wish.

For example, you could replace our `pose_cameras` step in `generate_nature.py` or `generate_indoors.py` with code like the following:

```python
import numpy as np

for i, rig in enumerate(camera_rigs):
    # place the rigs along the x-axis at random heights, each facing a different horizontal direction
    rig.location = (i, 0, np.random.uniform(0, 10))
    rig.rotation_euler = np.deg2rad(np.array([90, 0, 180 * i / len(camera_rigs)]))
```

If you wish to animate the camera rigs to move over the course of a video, you would use code similar to the following:

```python
import bpy
import numpy as np

for i, rig in enumerate(camera_rigs):
    for t in range(bpy.context.scene.frame_start, bpy.context.scene.frame_end + 1):
        # keyframe a forward-moving position and a slightly jittered orientation at each frame
        rig.location = (t, i, 0)
        rig.keyframe_insert(data_path="location", frame=t)
        rig.rotation_euler = np.deg2rad(np.array((90, 0, np.random.uniform(-10, 10))))
        rig.keyframe_insert(data_path="rotation_euler", frame=t)
```
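
As a further illustration, the sketch below keyframes each rig to orbit the scene origin over the course of the video. Again, this is a hand-written example rather than an Infinigen API; the orbit radius, height, and the choice of the world origin as the center are assumptions:

```python
import bpy
import numpy as np

# Illustrative sketch only: each rig orbits the world origin once over the video.
radius, height = 5.0, 2.0  # assumed orbit radius and camera height
start, end = bpy.context.scene.frame_start, bpy.context.scene.frame_end
n_frames = end - start + 1
for i, rig in enumerate(camera_rigs):
    phase = 2 * np.pi * i / len(camera_rigs)  # spread the rigs evenly around the circle
    for t in range(start, end + 1):
        angle = phase + 2 * np.pi * (t - start) / n_frames
        rig.location = (radius * np.cos(angle), radius * np.sin(angle), height)
        rig.keyframe_insert(data_path="location", frame=t)
        # With a 90 degree pitch the camera looks along +Y; a Z rotation of
        # angle + 90 degrees keeps it pointed back at the center.
        rig.rotation_euler = np.deg2rad(np.array([90, 0, np.degrees(angle) + 90]))
        rig.keyframe_insert(data_path="rotation_euler", frame=t)
```
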
4 changes: 2 additions & 2 deletions docs/ConfiguringInfinigen.md
@@ -101,12 +101,12 @@ If you have more than one GPU and are using a `local_*.gin` compute config, each

### Rendering Video, Stereo and other data formats

Generating a video, stereo or other dataset typically requires more render jobs, so we must instruct `manage_jobs.py` to run those jobs. `datagen/configs/data_schema/` provides many options for you to use in your `--pipeline_configs`, including `monocular_video.gin` and `stereo.gin`. <br> These configs are typically mutually exclusive, and you must include at least one </br>
Generating a video, stereo or other dataset typically requires more render jobs, so we must instruct `manage_jobs.py` to run those jobs. `datagen/configs/data_schema/` provides many options for you to use in your `--pipeline_configs`, including `monocular_video.gin`, `stereo.gin` and `multiview_stereo.gin`. <br> These configs are typically mutually exclusive, and you must include at least one </br>


To create longer videos, modify `iterate_scene_tasks.frame_range` in `monocular_video.gin` (note: we use 24fps video by default). `iterate_scene_tasks.view_block_size` controls how many frames will be grouped into each `fine_terrain` and render / ground-truth task.

If you need more than two cameras, or want to customize their placement, see `infinigen_examples/configs_nature/base.gin`'s `camera.spawn_camera_rigs.camera_rig_config` for advice on existing options, or write your own code to instantiate a custom camera setup.
If you need more than two cameras, or want to customize their placement, see `infinigen_examples/configs_nature/base.gin`'s `camera.spawn_camera_rigs.camera_rig_config` for advice on existing options, or write your own code to instantiate a custom camera setup. For multiview stereo data, you may include `multiview_stereo.gin` in `--configs`, which creates 30 cameras by default.

### Config Overrides to Customize Scene Content

Binary file added docs/images/multiview_stereo/mvs_indoors.png
Binary file added docs/images/multiview_stereo/mvs_indoors_2.png
Binary file added docs/images/multiview_stereo/mvs_nature.png
Binary file added docs/images/multiview_stereo/mvs_ocean.png
2 changes: 1 addition & 1 deletion infinigen/__init__.py
@@ -6,7 +6,7 @@
import logging
from pathlib import Path

__version__ = "1.9.2"
__version__ = "1.10.0"


def repo_root():
8 changes: 4 additions & 4 deletions infinigen/assets/objects/rocks/boulder.py
@@ -35,15 +35,15 @@ class BoulderFactory(AssetFactory):
def __init__(
self,
factory_seed,
meshing_camera=None,
meshing_cameras=None,
adapt_mesh_method="remesh",
cam_meshing_max_dist=1e7,
coarse=False,
do_voronoi=True,
):
super(BoulderFactory, self).__init__(factory_seed, coarse)

self.camera = meshing_camera
self.cameras = meshing_cameras
self.cam_meshing_max_dist = cam_meshing_max_dist
self.adapt_mesh_method = adapt_mesh_method

@@ -166,10 +166,10 @@ def geo_extrusion(nw: NodeWrangler, extrude_scale=1):
nw.new_node(Nodes.GroupOutput, input_kwargs={"Geometry": geometry})

def create_asset(self, i, placeholder, face_size=0.01, distance=0, **params):
if self.camera is not None and distance < self.cam_meshing_max_dist:
if self.cameras is not None and distance < self.cam_meshing_max_dist:
assert self.adapt_mesh_method != "remesh"
skin_obj, outofview, vert_dists, _ = split_inview(
placeholder, cam=self.camera, vis_margin=0.15
placeholder, cameras=self.cameras, vis_margin=0.15
)
butil.parent_to(outofview, skin_obj, no_inverse=True, no_transform=True)
face_size = detail.target_face_size(vert_dists.min())
8 changes: 4 additions & 4 deletions infinigen/assets/objects/trees/generate.py
@@ -58,7 +58,7 @@ def __init__(
child_col,
trunk_surface,
realize=False,
meshing_camera=None,
meshing_cameras=None,
cam_meshing_max_dist=1e7,
coarse_mesh_placeholder=False,
adapt_mesh_method="remesh",
@@ -73,7 +73,7 @@
self.trunk_surface = trunk_surface
self.realize = realize

self.camera = meshing_camera
self.cameras = meshing_cameras
self.cam_meshing_max_dist = cam_meshing_max_dist
self.adapt_mesh_method = adapt_mesh_method
self.decimate_placeholder_levels = decimate_placeholder_levels
@@ -168,12 +168,12 @@ def create_asset(
),
)

if self.camera is not None and distance < self.cam_meshing_max_dist:
if self.cameras is not None and distance < self.cam_meshing_max_dist:
assert self.adapt_mesh_method != "remesh"

skin_obj_cleanup = skin_obj
skin_obj, outofview, vert_dists, _ = split_inview(
skin_obj, cam=self.camera, vis_margin=0.15
skin_obj, cameras=self.cameras, vis_margin=0.15
)
butil.parent_to(outofview, skin_obj, no_inverse=True, no_transform=True)

30 changes: 17 additions & 13 deletions infinigen/core/execute_tasks.py
@@ -47,7 +47,7 @@ def get_scene_tag(name):
def render(
scene_seed,
output_folder,
camera_id,
camera,
render_image_func=render_image,
resample_idx=None,
hide_water=False,
@@ -59,7 +59,7 @@
if resample_idx is not None and resample_idx != 0:
resample_scene(int_hash((scene_seed, resample_idx)))
with Timer("Render Frames"):
render_image_func(frames_folder=Path(output_folder), camera_id=camera_id)
render_image_func(frames_folder=Path(output_folder), camera=camera)


def is_static(obj):
@@ -87,8 +87,9 @@ def is_static(obj):

@gin.configurable
def save_meshes(
scene_seed,
output_folder,
scene_seed: int,
output_folder: Path,
cameras: list[bpy.types.Object],
frame_range,
resample_idx=False,
point_trajectory_src_frame=1,
@@ -108,8 +109,9 @@
for obj in bpy.data.objects:
obj.hide_viewport = not (not obj.hide_render and is_static(obj))
frame_idx = point_trajectory_src_frame
frame_info_folder = Path(output_folder) / f"frame_{frame_idx:04d}"
frame_info_folder = output_folder / f"frame_{frame_idx:04d}"
frame_info_folder.mkdir(parents=True, exist_ok=True)

logger.info("Working on static objects")
exporting.save_obj_and_instances(
frame_info_folder / "static_mesh",
@@ -128,17 +130,17 @@
):
bpy.context.scene.frame_set(frame_idx)
bpy.context.view_layer.update()
frame_info_folder = Path(output_folder) / f"frame_{frame_idx:04d}"
frame_info_folder = output_folder / f"frame_{frame_idx:04d}"
frame_info_folder.mkdir(parents=True, exist_ok=True)
logger.info(f"Working on frame {frame_idx}")
logger.info(f"save_meshes processing {frame_idx=}")

exporting.save_obj_and_instances(
frame_info_folder / "mesh",
previous_frame_mesh_id_mapping,
current_frame_mesh_id_mapping,
)
cam_util.save_camera_parameters(
camera_ids=cam_util.get_cameras_ids(),
camera_ids=cameras,
output_folder=frame_info_folder / "cameras",
frame=frame_idx,
)
@@ -246,12 +248,15 @@
with open(outpath / "info.pickle", "wb") as f:
pickle.dump(info, f, protocol=pickle.HIGHEST_PROTOCOL)

cam_util.set_active_camera(*camera_id)
camera_rigs = cam_util.get_camera_rigs()
camrig_id, subcam_id = camera_id
active_camera = camera_rigs[camrig_id].children[subcam_id]
cam_util.set_active_camera(active_camera)

group_collections()

if Task.Populate in task and populate_scene_func is not None:
populate_scene_func(output_folder, scene_seed)
populate_scene_func(output_folder, scene_seed, camera_rigs)

need_terrain_processing = "atmosphere" in bpy.data.objects

@@ -267,10 +272,9 @@
whole_bbox=info["whole_bbox"],
)

cameras = [cam_util.get_camera(i, j) for i, j in cam_util.get_cameras_ids()]
terrain.fine_terrain(
output_folder,
cameras=cameras,
cameras=[c for rig in camera_rigs for c in rig.children],
optimize_terrain_diskusage=optimize_terrain_diskusage,
)

@@ -326,7 +330,7 @@
render(
scene_seed,
output_folder=output_folder,
camera_id=camera_id,
camera=active_camera,
resample_idx=resample_idx,
)
