Bird's Eye View Prediction

The following scripts assume that the data is stored at artifacts/data and that the annotation.csv file is at artifacts/data/annotation.csv.

The data path can be changed by editing each of the following shell scripts (a minimal loading sketch follows the list):

  • final_integration/train_autoencoder.sh
  • final_integration/train_rm.sh
  • final_integration/train_rotation.sh
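
For reference, here is a minimal sketch of loading the annotation file from the default location. The pandas usage and printed summary are illustrative assumptions, not this repo's actual data loader.

```python
# Minimal sketch: load annotation.csv from the default data root.
# The pandas usage here is an assumption, not this repo's actual loader.
import os

import pandas as pd

DATA_ROOT = "artifacts/data"  # change this if the shell scripts point elsewhere
annotation_path = os.path.join(DATA_ROOT, "annotation.csv")

annotations = pd.read_csv(annotation_path)
print(f"Loaded {len(annotations)} annotation rows from {annotation_path}")
```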

Training

Since training happens in stages, the following scripts must be run in order.

Run the scripts from the final_integration directory:

  1. final_integration/train_autoencoder.sh - Trains the autoencoder model on top views split by camera angle
  2. final_integration/train_rm.sh - Trains 6 CNNs (ResNet18), one per camera angle, to predict the encodings generated by the best encoder from step 1, given the true front view for that angle
  3. final_integration/train_rotation.sh - Pretrains a ResNet18 model using the self-supervised rotation pretext task (see the sketch after this list)
  4. final_integration/generate_mono_data.sh - Generates training data for predicting masks for dynamic elements
  5. final_integration/train_bb.sh - Trains the GAN for predicting masks, using the data generated in step 4 and the pretrained ResNet18 from step 3
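
For intuition, here is a generic sketch of the rotation pretext task used in step 3: images are rotated by 0/90/180/270 degrees and a ResNet18 is trained to classify the rotation. This is a conceptual illustration, not the repo's training code; the batch size, optimizer, and learning rate are assumptions.

```python
# Generic sketch of an SSL rotation pretext task for pretraining ResNet18.
# Not the repo's actual code; the data pipeline and hyperparameters are assumptions.
import torch
import torch.nn as nn
from torchvision.models import resnet18


def make_rotation_batch(images):
    """Rotate each image by 0/90/180/270 degrees; return rotated images and rotation labels."""
    rotated, labels = [], []
    for k in range(4):  # k * 90 degrees
        rotated.append(torch.rot90(images, k, dims=(2, 3)))
        labels.append(torch.full((images.size(0),), k, dtype=torch.long))
    return torch.cat(rotated), torch.cat(labels)


model = resnet18(num_classes=4)  # 4-way rotation classification head
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on a dummy batch of camera views.
images = torch.randn(8, 3, 224, 224)
inputs, targets = make_rotation_batch(images)
loss = criterion(model(inputs), targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Step 5 then reuses this pretrained ResNet18; in typical SSL setups the rotation head is discarded and only the backbone weights are kept.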

Testing

The core testing code is condensed into a single file, "final_integration/test_model/model_loader.py", so it may be a little hard to read.

Run the following script to test the overall performance of our model:

  1. final_integration/run_test.sh - Runs the testing script used for evaluation in the competition.

Note:

At this point, run_test.sh assumes that the following files exist:

  • "artifacts/models/topview_resnet/front/best_performing.pt"
  • "artifacts/models/topview_resnet/front_left/best_performing.pt"
  • "artifacts/models/topview_resnet/front_right/best_performing.pt"
  • "artifacts/models/topview_resnet/back/best_performing.pt"
  • "artifacts/models/topview_resnet/back_left/best_performing.pt"
  • "artifacts/models/topview_resnet/back_right/best_performing.pt"
  • "artifacts/models/autoencoder/best_performing.pt"
  • "artifacts/models/rotation_ssl/best_performing.pt"
  • "artifacts/models/mono/front/monolayout/best/encoder.pth"
  • "artifacts/models/mono/front_left/monolayout/best/encoder.pth"
  • "artifacts/models/mono/front_right/monolayout/best/encoder.pth"
  • "artifacts/models/mono/back/monolayout/best/encoder.pth"
  • "artifacts/models/mono/back_left/monolayout/best/encoder.pth"
  • "artifacts/models/mono/front/monolayout/best/encoder.pth"
  • "artifacts/models/mono/front/monolayout/best/decoder.pth"
  • "artifacts/models/mono/front_left/monolayout/best/decoder.pth"
  • "artifacts/models/mono/front_right/monolayout/best/decoder.pth"
  • "artifacts/models/mono/back/monolayout/best/decoder.pth"
  • "artifacts/models/mono/back_left/monolayout/best/decoder.pth"
  • "artifacts/models/mono/front/monolayout/best/decoder.pth"

The above files can be generated by running the training shell scripts in the default configuration.
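
Before launching run_test.sh, a quick existence check over these paths can save a failed run. The helper below is a convenience sketch and not part of the repo; it mirrors the checkpoint list above across all six camera angles.

```python
# Convenience sketch (not part of the repo): verify the checkpoints expected by run_test.sh exist.
import os

ANGLES = ["front", "front_left", "front_right", "back", "back_left", "back_right"]

expected = [
    "artifacts/models/autoencoder/best_performing.pt",
    "artifacts/models/rotation_ssl/best_performing.pt",
]
expected += [f"artifacts/models/topview_resnet/{a}/best_performing.pt" for a in ANGLES]
expected += [
    f"artifacts/models/mono/{a}/monolayout/best/{part}.pth"
    for a in ANGLES
    for part in ("encoder", "decoder")
]

missing = [path for path in expected if not os.path.exists(path)]
if missing:
    print("Missing checkpoints:")
    for path in missing:
        print("  ", path)
else:
    print("All expected checkpoints found.")
```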

Acknowledgements

A huge thank you to Kaustubh Mani, Swapnil Daga, Shubhika Garg, Sai Shankar Narasimhan, Madhava Krishna, and Krishna Murthy Jatavallabhula. Much of our inspiration came from their research paper, MonoLayout: Amodal scene layout from a single image, and their code.