% rpichap7.tex

\chapter{CONCLUSION} \label{chapter:conclusion}

\noindent The proposed photographic censusing methodology encompasses the entire process: from the engagement of citizen scientists for decentralized image collection; to the parallel annotation of new training data; to the training and inference of automated decision making with computer vision algorithms; to the final population estimates with their valuable individual ecological, social, and temporal data.  Furthermore, this dissertation demonstrates the effectiveness of a detection pipeline for filtering raw images into identifiable annotations and for mitigating common identification errors.  Any errors that are made by the automated algorithms can also be factored into the final population estimate. Thus, the power of photographic censusing is driven by comprehensive detection and identification computer vision pipelines and a thorough understanding of how modern ecological studies are performed in the field.

\section{Contributions}

The research presented in this dissertation is heavily applied and represents the culmination of over a decade's worth\footnote{and has required the focus of more than one Ph.D. dissertation, see~\cite{crall_identifying_2017}.} of computer vision and ecology research.  Furthermore, the trajectory of this work has been heavily influenced by the real-world implications and implementation details of performing photographic census events on endangered animal populations.

\begin{enumerate}
    \item \textbf{Animal Detection Pipeline} - a comprehensive detection pipeline for animals for use in photographic censusing.  The pipeline is designed to be easily bootstrappable for new species with relatively minimal annotation work.  The output of the detection pipeline is customized for the task of animal instance recognition and comprises the following modularized components:
          \begin{enumerate}
              \item Whole Image Classification (WIC) - A CNN that performs multi-label classification for high-level filtering.
              \item Localization - A CNN that performs bounding box localization and classification to find animals.
              \item Annotation Classification (Labeler) - A CNN that performs single-label classification for annotating species and viewpoint.
              \item Coarse Background Segmentation - An FCNN that attempts to provide an approximate segmentation for a given species to mask out background pixels.
              \item Annotation of Interest - We present the concept of AoI and evaluate its effectiveness for filtering irrelevant annotations in an identification pipeline.
              \item Specialized Needs - Additional detection components to rotate annotations or find animals from overhead imagery are useful for specific needs.
          \end{enumerate}
    \item \textbf{Census Annotation} - a novel concept that is designed to reduce incomparable and incidental matching during animal identification.
          \begin{enumerate}
              \item Census Annotation (CA) - selects annotations that are identifiable and show a consistent part of the animal body, reducing the amount of human work that is needed during a photographic census.
              \item Census Annotation Region (CA-R) - reduces the impact of incidental matching by creating more focused regions within existing detected Census Annotations, drastically reducing the amount of human effort by increasing the separability of automated ID verifiers.
          \end{enumerate}
    \item \textbf{Photographic Censusing} - a comprehensive process for building an animal ID database from scratch, relying on the concepts of verification and the continual curation of IDs.  The formal definition also includes a new Automated Lincoln-Petersen Estimator to better estimate populations when machine learning methods are involved.
    \item \textbf{Photographic Censusing Rallies} - an organized data collection event where ``citizen scientist'' volunteer photographers are trained and tasked to take photos with GPS-enabled cameras over two back-to-back days.  The results of the Great Gr\'evy's Rally 2016 (GGR-16) and Great Gr\'evy's Rally 2018 (GGR-18) censusing rallies are significant contributions of this work.
    \item \textbf{Animal Datasets} - new public datasets for animal detection and ID research.  Common public datasets for computer vision tasks like object detection generally do not provide associated ID information when they include boxes of animals.  Likewise, animal ID datasets often only include pre-cropped images of animals and rarely focus on herding species.
          \begin{enumerate}
              \item WILD - a dataset with six species containing comprehensive bounding boxes and AoI flags on challenging scenarios.
              \item DETECT - a database of only plains and Gr\'evy's zebras that focuses on even more visual nuances for detection.
              \item Gr\'evy's Zebra Census Dataset (GZCD) - a dataset of Gr\'evy's zebra that focuses on the problem of incidental matching.
              \item Great Gr\'evy's Rally 2016 (GGR-16) - A first-of-its-kind photographic census of the Gr\'evy's zebra in Kenya, producing a baseline population estimate.
              \item Great Gr\'evy's Rally 2018 (GGR-18) - A second census of the Gr\'evy's zebra and reticulated giraffe in Kenya, providing a population estimate of Gr\'evy's zebra with improved sampling and measuring the increase of the population.
          \end{enumerate}
\end{enumerate}

\section{Future Work}

The field of automated wildlife conservation is in its infancy, and there seems to be a lack of widely available animal ID datasets.  Building large-scale animal datasets with ID as a first principle seems to be the fastest way to unlock the interest of the larger research community in animal detection and re-ID.  Furthermore, my hope is that the analysis provided on the Gr\'evy's zebra species has demonstrated the overwhelming benefits of photographic censusing as a population monitoring methodology, where the principles that have been described form a general framework that can be easily adapted for other endangered species.  If both cases are true, then the next step is, quite simply, to get to work protecting some animals that need our help to survive.

The study of endangered species is tricky when state-of-the-art research methods also produce fantastic tools for poachers.  The availability of an ID database for conservation policy has the apparent downside of being a clearinghouse for the size and location of a given population.  Continued research should be focused on one-shot learning to reduce the exposure of a species to only what is essential for machine learning training.  A focus on one-shot or few-shot learning also comes with the obvious benefit of not building animal population monitoring systems that are brittle when sightings are sparse.

A major missing component of this dissertation is a robust analysis of true segmentation methods.  The ability to segment a mother from her foal is likely the only way the mother-and-foal problem will be solved long-term.  Likewise, some species are poor candidates for bounding boxes (e.g., giraffes) because their annotations are filled with a lot of background noise.  The downside is that segmentation algorithms have historically been very data-hungry to get good performance, but the future looks bright for more accurate filtering methods prior to ID.

The definition of a given species' Census Annotation Region relies primarily on human intuition and depends to some extent on the ranking and verification algorithms used during ID curation. There needs to be a more principled way to locate the visual information needed for effective automated ID curation. Related work on attention mechanisms with deep convolutional neural networks may provide a mechanism for automatically defining these regions on a per-species basis. Furthermore, there may be a need to decouple the results of an ID algorithm from the visualization of its suggested correspondences. For example, the PIE algorithm does not natively visualize the matching regions between two annotations that are found to be close in its learned embedding space. While a given ID algorithm may offer useful intermediate primitives to visualize, this cannot be guaranteed, and a more standardized visualization approach may be possible.

The process of photographic censusing is presented here as a comprehensive, bootstrappable, and end-to-end option for wildlife conservation managers. The motivating use case for this dissertation has been the management of large \textit{megafauna} populations in Kenya. As such, one of the core implementation decisions with photographic censusing rallies is their reliance on two days of collection.  This two-day structure attempts to follow the guidance of historical surveys done in that country, but it is not the only valid time frame option for a census. There exists a clear need for more flexible collection options because some species cannot be comprehensively censused in a single day. For example, whale watching seasons typically involve several months of image collection and cannot be expected to cover the entire migratory range in only a handful of days. Instead, photographic censusing needs to be extended to support a longer, more continual collection of images.  The Lincoln-Petersen estimator is not compatible with such a design, so additional statistical methods and validating experiments are needed to apply this work more broadly.
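
As a point of reference, the classical two-sample Lincoln-Petersen estimator (not the automated variant defined in this dissertation) can be sketched as follows.  The symbols are illustrative notation assumed here: $n_1$ is the number of individuals sighted on the first day, $n_2$ is the number sighted on the second day, and $m$ is the number of individuals sighted on both days.
\begin{equation*}
    \hat{N} = \frac{n_1 \, n_2}{m}
\end{equation*}
The estimate assumes that the population is closed between the two sampling occasions, which is why it maps naturally onto two back-to-back days of collection and does not carry over directly to months of continual collection.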

Finally, the research studies that have successfully used citizen science efforts have almost exclusively been focused on species classification and do not meaningfully engage with ID verification. For example, there does not seem to exist experimental data on how well the general public can verify, in a fixed amount of time, whether two zebra sightings show the same animal, or two beluga whales seen from above, or the flippers of two sea turtle sightings.  Furthermore, there is no established international standard for how photographic censusing should be performed, particularly one that focuses on less invasive collection and automated analysis.  A recognized portfolio of species that are good candidates for photographic re-ID, together with recommendations for how an average citizen may best safely collect and contribute valuable data, would unlock new avenues for data collection.