Commit message:

* generalized depth module
* update CLI options
* setup save as validator
* move comments into PR
* put back to depth
* fixes and flake8
* add depth tests
* make detection indices a list
* add cli tests
* add docs
* flake8
* fix save bug so overwrite works now; print out filepath csv not list of filepaths
* fix bug where frame of zeros is wrong dimension; this happens when detection is at end of video
* use noqa
* first pass at code review comments
* make parent directories and always include distance col even if null
* add docs intro sentence
* expose num_workers
* expose gpu
* use duplicated
* remove log as this is just for calculating train metrics
* change order around
* cleanup; add gpu validation
* fix typo
* add comma
* clean up transforms
* remove unused code
* fix docstring
* set order explicitly
* couple homepage updates
* add section to homepage on depth
* tweaks
* tweak
* add window order to det frame number
* expose DepthDataset
* flake8
* remove extra code
* fix test
* check values on real pred
* round
* use gpus if available
* make tests more flexible to support various ffmpeg versions
* round to 1 decimal place
* avoid a copy
* account for os differences
* reflect rounding
* alphabetize
* use v3 as v1 is deprecated
* normalize in get_item; conserve memory
* lint
* add note about memory
* working tweak
Showing 19 changed files with 859 additions and 112 deletions.
New file (`@@ -0,0 +1,3 @@`):

```md
# zamba.models.depth_estimation.config

::: zamba.models.depth_estimation.config
```
New file (`@@ -0,0 +1,3 @@`):

```md
# zamba.models.depth_estimation.depth_manager

::: zamba.models.depth_estimation.depth_manager
```
New file (`@@ -0,0 +1,105 @@`):
# Depth estimation

## Background

Our depth estimation model predicts the distance between an animal and the camera, which is an input to models used to estimate animal abundance.

The depth model comes from one of the winners of the [Deep Chimpact: Depth Estimation for Wildlife Conservation](https://www.drivendata.org/competitions/82/competition-wildlife-video-depth-estimation/) machine learning challenge hosted by DrivenData. The goal of this challenge was to use machine learning and advances in monocular (single-lens) depth estimation techniques to automatically estimate the distance between a camera trap and an animal in its video footage. The challenge drew on a unique labeled dataset from research teams at the Max Planck Institute for Evolutionary Anthropology (MPI-EVA) and the Wild Chimpanzee Foundation (WCF).

The Zamba package supports running the depth estimation model on videos. Under the hood, it does the following:

- Resamples the video to one frame per second
- Runs the [MegadetectorLite](../models/species-detection.md#megadetectorlite) model on each frame to find frames with animals in them
- Estimates depth for each detected animal in the frame
- Outputs a CSV with predictions
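The per-video loop behind those steps can be sketched as follows. This is illustrative pseudologic, not zamba's internal API; the `detect` and `estimate_depth` callables stand in for MegadetectorLite and the depth network:

```python
def estimate_depths(frames, detect, estimate_depth):
    """Sketch of the per-second loop: one output row per detection,
    or a single null-distance row when nothing is detected."""
    rows = []
    for second, frame in enumerate(frames):  # video resampled to 1 fps
        detections = detect(frame)  # e.g. MegadetectorLite bounding boxes
        if not detections:
            rows.append({"time": second, "distance": None})
        for box in detections:
            rows.append({"time": second, "distance": estimate_depth(frame, box)})
    return rows


# Toy stand-ins so the sketch runs end to end
frames = ["frame0", "frame1", "frame2"]  # one frame per second
detect = lambda frame: ["box"] if frame == "frame0" else []
rows = estimate_depths(frames, detect, lambda frame, box: 7.4)
```

Note how a second with no detections still produces a row, matching the null-distance behavior described in the output format below.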
## Output format

The output of the depth estimation model is a CSV with the following columns:

- `filepath`: video name
- `time`: seconds from the start of the video
- `distance`: distance between the detected animal and the camera

There will be multiple rows per timestamp if multiple animals are detected in the frame. If there is no animal in the frame, the distance will be null.

For example, the first few rows of `depth_predictions.csv` might look like this:

```
filepath,time,distance
video_1.avi,0,7.4
video_1.avi,0,7.4
video_1.avi,1,
video_1.avi,2,
video_1.avi,3,
```
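A quick sketch of working with these predictions using only the standard library. The empty `distance` field reads back as an empty string, which is treated here as "no detection" (the sample data is the example above):

```python
import csv
import io

# The example rows above, as they would appear in depth_predictions.csv
sample = """\
filepath,time,distance
video_1.avi,0,7.4
video_1.avi,0,7.4
video_1.avi,1,
video_1.avi,2,
video_1.avi,3,
"""

rows = list(csv.DictReader(io.StringIO(sample)))

# An empty distance field means no animal was detected at that second
distances = [float(r["distance"]) if r["distance"] else None for r in rows]

# Seconds with at least one detected animal
detected_seconds = sorted({int(r["time"]) for r in rows if r["distance"]})
```

In practice you would pass the real output file to `csv.DictReader` (or a dataframe library) instead of the inline string.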

## Installation

The depth estimation model is included by default. If you've already [installed zamba](/docs/install/), there's nothing more you need to do.

## Running depth estimation

Here's how to run the depth estimation model.

=== "CLI"
    ```bash
    # output a csv with depth predictions for each frame in the videos in PATH_TO_VIDEOS
    zamba depth --data-dir PATH_TO_VIDEOS
    ```
=== "Python"
    ```python
    from zamba.models.depth_estimation import DepthEstimationConfig

    depth_conf = DepthEstimationConfig(data_dir="PATH_TO_VIDEOS")
    depth_conf.run_model()
    ```
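To predict on a specific set of videos rather than a whole directory, the `--filepaths` option accepts a CSV containing a `filepath` column. A minimal sketch for generating one with the standard library (the video names here are made up for illustration):

```python
import csv

# Hypothetical video names; point these at your actual files
videos = ["video_1.avi", "video_2.avi"]

with open("filepaths.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["filepath"])  # the column header `--filepaths` expects
    writer.writerows([v] for v in videos)
```

Then run `zamba depth --filepaths filepaths.csv`.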

### Debugging

Unlike the species classification models, the depth model stores selected frames in memory rather than caching them to disk. If you run out of memory, try predicting on a smaller number of videos at a time. If you hit a GPU memory error, try reducing the [number of workers](../../debugging/#reducing-num_workers) or the [batch size](../../debugging/#reducing-the-batch-size).

## Getting help

To see all of the available options, run `zamba depth --help`.

```console
$ zamba depth --help

Usage: zamba depth [OPTIONS]

Estimate animal distance at each second in the video.

╭─ Options ───────────────────────────────────────────────────────────────╮
│ --filepaths                 PATH          Path to csv containing        │
│                                           `filepath` column with       │
│                                           videos.                      │
│                                           [default: None]              │
│ --data-dir                  PATH          Path to folder containing    │
│                                           videos.                      │
│                                           [default: None]              │
│ --save-to                   PATH          An optional directory or csv │
│                                           path for saving the output.  │
│                                           Defaults to                  │
│                                           `depth_predictions.csv` in   │
│                                           the working directory.       │
│                                           [default: None]              │
│ --overwrite             -o                Overwrite output csv if it   │
│                                           exists.                      │
│ --batch-size                INTEGER       Batch size to use for        │
│                                           inference.                   │
│                                           [default: None]              │
│ --num-workers               INTEGER       Number of subprocesses to    │
│                                           use for data loading.        │
│                                           [default: None]              │
│ --gpus                      INTEGER       Number of GPUs to use for    │
│                                           inference. If not specified, │
│                                           will use all GPUs found on   │
│                                           machine.                     │
│                                           [default: None]              │
│ --model-cache-dir           PATH          Path to directory for        │
│                                           downloading model weights.   │
│                                           Alternatively, specify with  │
│                                           environment variable         │
│                                           `MODEL_CACHE_DIR`. If not    │
│                                           specified, user's cache      │
│                                           directory is used.           │
│                                           [default: None]              │
│ --weight-download-region    [us|eu|asia]  Server region for            │
│                                           downloading weights.         │
│                                           [default: None]              │
│ --yes                   -y                Skip confirmation of         │
│                                           configuration and proceed    │
│                                           right to prediction.         │
│ --help                                    Show this message and exit.  │
╰─────────────────────────────────────────────────────────────────────────╯
```