kabr-tools

This repository contains tools to calculate time budget analysis from drone videos of zebras and giraffes, using the KABR model to label behavior automatically.

Figure 1: Overview of the pipeline for KABR dataset preparation.

KABR tools requires that torch be installed.

The KABR tools used in this process can be installed with:

pip install torch torchvision
pip install git+https://github.com/Imageomics/kabr-tools

Notes:

  • Refer to pytorch.org to install specific versions of torch/CUDA.
  • detectron2 requires Linux or macOS.
  • If building detectron2's wheel fails, check gcc & g++ ≥ 5.4 (run gcc --version and g++ --version).
  • SlowFast's setup.py is outdated; our workaround is pip install git+https://github.com/Imageomics/SlowFast@797a6f3ae81c49019d006296f1e0f84f431dc356, which is included when installing kabr_tools.

Each KABR tool can be run from the command line (as described below) or imported as a Python module. Each tool has help text, available on the command line via <tool-name> -h.
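As a quick way to confirm which tools are available after installation, the sketch below calls a tool with -h from Python and returns its help text. The helper name tool_help is hypothetical and not part of kabr-tools; it relies only on the standard library and on the -h flag described above.

```python
import shutil
import subprocess


def tool_help(tool_name):
    """Return the text printed by `<tool_name> -h`, or None if the tool is not on PATH.

    Hypothetical helper for illustration; it is not part of kabr-tools.
    """
    if shutil.which(tool_name) is None:
        return None  # the tool is not installed in this environment
    result = subprocess.run(
        [tool_name, "-h"], capture_output=True, text=True, check=False
    )
    return result.stdout
```

For example, tool_help("detector2cvat") should return the usage text once the package is installed, and None otherwise.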

Please refer to our KABR Project Page for additional details on the dataset and original paper.

Step 1: Video Data Collection with Drones

Figure 2: Clip of drone video containing Plains and Grevy's zebras, plus some impalas.

The drone videos for the KABR dataset were collected at the Mpala Research Centre in January 2023. The missions were flown manually, using a DJI Air 2S drone.

We collaborated with expert ecologists to ensure minimal disturbance to the animals. We launched the drone approximately 200 meters horizontally from the animals and at an altitude of 30-40 meters. We gradually approached the herd from the side by reducing the altitude and horizontal distance and monitoring the animals for signs of vigilance.

Note that the vigilance exhibited by wildlife varies widely by species, habitat, sex, and the level to which animals may be habituated to anthropogenic noise. So, we recommend that you tailor your approach to your particular species and setting.

Please refer to our papers for details on the data collection process.

Step 2: Data Pre-processing with CVAT

In order to automatically label the animal videos with behavior, we must first create mini-scenes of each individual animal captured in the frame, illustrated below.

See the Wiki CVAT User Guide and Data Management Tips for detailed instructions and recommendations.

Figure 3: A mini-scene is a sub-image cropped from the drone video footage centered on and surrounding a single animal. Mini-scenes simulate the camera as well-aligned with each animal in the frame, compensating for the drone's movement by focusing on just the animal and its immediate surroundings. The KABR dataset consists of mini-scenes and their frame-by-frame behavior annotation.

To create mini-scenes, we must first run the detection step, drawing a bounding box around each animal in the frame.

See data/mini_scenes on Hugging Face for example mini-scenes.
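To make the idea of a mini-scene concrete, here is a minimal sketch of how a fixed-size crop window could be centered on one animal's bounding box. The function name, the 400-pixel default, and the choice to shift the window back inside the frame at the edges are all assumptions made for illustration; tracks_extractor may handle these details differently.

```python
def mini_scene_window(box, frame_width, frame_height, size=400):
    """Return (left, top, right, bottom) of a size x size crop centered on a box.

    box is (x_min, y_min, x_max, y_max) in pixels. The default size and the
    edge handling are illustrative assumptions, not kabr-tools behavior.
    """
    x_min, y_min, x_max, y_max = box
    center_x = (x_min + x_max) / 2
    center_y = (y_min + y_max) / 2
    # Center the window on the animal.
    left = int(round(center_x - size / 2))
    top = int(round(center_y - size / 2))
    # Shift the window back inside the frame when the animal is near an edge.
    left = max(0, min(left, frame_width - size))
    top = max(0, min(top, frame_height - size))
    return left, top, left + size, top + size
```

Applying the same window computation to every frame of a track keeps the animal near the middle of the cropped clip even as the drone moves. If the frame is smaller than size, the returned window extends past the frame and would need padding.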

Step 2A: Perform detections to create tracks

Option 1: Manual detections in CVAT

Figure 4: Simplified CVAT annotation tool interface

Upload your raw videos to CVAT and perform the detections by drawing bounding boxes manually. This can be quite time-consuming, but has the advantage of generating highly accurate tracks. Depending on the resolution of your raw video, you may encounter out-of-space issues with CVAT. You can use downgrade.sh to reduce the size of your videos.

Option 2: Automatic detections with YOLO

You may use YOLO to automatically perform detection on your videos. Use the script below to convert YOLO detections to CVAT format.

detector2cvat: Detect objects with Ultralytics YOLO, apply SORT tracking, and convert the tracks to CVAT format.

detector2cvat --video path_to_videos --save path_to_save [--imshow]
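SORT links detections across frames mainly by how much their bounding boxes overlap. The sketch below shows only that overlap score, intersection over union (IoU); the full method also uses a Kalman filter for motion prediction and an assignment step, both omitted here. The function is illustrative and is not taken from the detector2cvat source.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    # Width and height of the overlapping region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0
```

A detection whose IoU with a track's predicted box is high is likely the same animal, which is why slow-moving animals filmed from a stable altitude tend to track well.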

Step 2B: Create mini-scenes from tracks

Once you have your tracks generated, use them to create mini-scenes from your raw footage.

tracks_extractor: Extract mini-scenes from CVAT tracks.

tracks_extractor --video path_to_videos --annotation path_to_annotations [--tracking] [--imshow]

Step 3: Label mini-scenes with behavior

You can use the KABR model on Hugging Face to label the mini-scenes with behavior. See the ethogram folder for the list of behaviors used to label the zebra videos.

To use the KABR model, download checkpoint_epoch_00075.pyth.zip from Hugging Face, unzip checkpoint_epoch_00075.pyth, and install SlowFast. Then run miniscene2behavior.py.

Label the mini-scenes:

miniscene2behavior [--config path_to_config] --checkpoint path_to_checkpoint [--gpu_num number_of_gpus] --miniscene path_to_miniscene [--output path_to_output_csv]

Notes:

  • If the config hasn't been extracted yet, the script will write it to config.
  • checkpoint should be the path to checkpoint_epoch_00075.pyth.
  • If gpu_num is 0, the model will use CPU. Using at least 1 GPU greatly increases inference speed. If you're using OSC, you can request a node with one GPU by running sbatch -N 1 --gpus-per-node 1 -A [account] --time=[minutes] [bash script].
  • Mini-scenes are clipped videos focused on individual animals, and video is the raw video file from which the mini-scenes were extracted.

See these CSV files on Hugging Face for examples of annotated mini-scene outputs.

Step 4: Calculate time budgets

See the time budgets notebook for the code to create these visualizations.


Figure 5: Example flight path and video clip from the KABR dataset: 2 male Grevy's zebras observed for 10 minutes on 01/18/23.


Figure 6: Overall time budget for the duration of the 10-minute observation
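A time budget is the share of observed time spent in each behavior. As a minimal sketch, assuming you already have one behavior label per frame for an animal (the label strings in the usage example are placeholders, not the project's ethogram), the proportions can be computed as follows. The time budgets notebook remains the reference implementation.

```python
from collections import Counter


def time_budget(labels):
    """Return the fraction of frames spent in each behavior.

    labels is a sequence with one behavior label per frame; how those labels
    are read from the output CSV is left out, since its layout is not assumed here.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {behavior: count / total for behavior, count in counts.items()}
```

For instance, time_budget(["Graze"] * 6 + ["Walk"] * 3 + ["Head Up"]) gives 60% grazing, 30% walking, and 10% head up.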


Figure 7: Gantt chart for each zebra (3-minute duration)
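A Gantt chart like this needs behavior segments rather than per-frame labels. The sketch below collapses consecutive identical labels into (behavior, start, end) intervals in seconds. The function name and the default frame rate of 30 are assumptions for illustration; use the frame rate of your own footage.

```python
from itertools import groupby


def behavior_segments(labels, fps=30.0):
    """Collapse per-frame labels into (behavior, start_s, end_s) segments.

    fps is an assumed frame rate, not a value taken from the KABR dataset.
    """
    segments = []
    frame = 0
    for behavior, run in groupby(labels):
        length = sum(1 for _ in run)  # number of consecutive frames with this label
        segments.append((behavior, frame / fps, (frame + length) / fps))
        frame += length
    return segments
```

Each tuple maps directly onto one horizontal bar in a Gantt-style plot.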

Optional Steps:

Fine-tune YOLO for your dataset

If you wish to use YOLO to automatically generate detections, you may want to fine-tune your YOLO model for your dataset using the train_yolo notebook.

cvat2ultralytics: Convert CVAT annotations to an Ultralytics YOLO dataset.

cvat2ultralytics --video path_to_videos --annotation path_to_annotations --dataset dataset_name [--skip skip_frames]
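The core of this conversion is a change of box representation: CVAT stores boxes as pixel corner coordinates (xtl, ytl, xbr, ybr), while YOLO label files use a class index followed by the box center and size, normalized by the image dimensions. The sketch below illustrates that mapping for a single box. It is not the cvat2ultralytics source, and class_id=0 is a placeholder.

```python
def cvat_box_to_yolo(xtl, ytl, xbr, ybr, image_width, image_height, class_id=0):
    """Convert pixel corner coordinates to one YOLO label line.

    Output format: "class x_center y_center width height", all normalized to [0, 1].
    class_id is a placeholder; map your own labels to indices as needed.
    """
    x_center = (xtl + xbr) / 2 / image_width
    y_center = (ytl + ybr) / 2 / image_height
    width = (xbr - xtl) / image_width
    height = (ybr - ytl) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"
```

For a 1000 x 800 image, the box (100, 200, 300, 400) becomes "0 0.200000 0.375000 0.200000 0.250000".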

Extras

player: Player for tracking and behavior observation.

player --folder path_to_folder [--save] [--imshow]

Figure 8: Example player.py output.

cvat2slowfast: Convert CVAT annotations to a dataset in Charades format.

cvat2slowfast --miniscene path_to_mini_scenes --dataset dataset_name --classes path_to_classes_json [--old2new path_to_old2new_json] [--no_images]