Annotation-free Audio-Visual Segmentation

Official implementation of Annotation-free Audio-Visual Segmentation.

The paper was accepted at WACV 2024. The project page is https://jinxiang-liu.github.io/anno-free-AVS/.



Requirements

Installation

Create a conda environment and install dependencies:

conda create -n sama python=3.10.11
conda activate sama

pip install -r requirements.txt

Dataset

1. Download the datasets

2. Configure the dataset locations

After downloading the datasets with annotations, set the dataset directory and file locations in the configs/sam_avs_adapter.yaml file.
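Before launching a run, it can help to confirm that the locations written into the config actually exist. The following is a minimal sketch, not part of this repository: the key names in configs/sam_avs_adapter.yaml are not listed in this README, so the hypothetical helper `find_missing_paths` walks every value and treats any string containing `/` as a path, which is only a heuristic. The optional `check_config_file` wrapper assumes PyYAML is installed.

```python
import os


def find_missing_paths(config, prefix=""):
    """Collect (key, value) pairs whose value looks like a path but does not exist.

    `config` is an already-parsed config: nested dicts, lists and scalars.
    A string counts as path-like if it contains "/" (a heuristic, not a rule).
    """
    missing = []
    if isinstance(config, dict):
        for key, value in config.items():
            missing += find_missing_paths(value, f"{prefix}{key}.")
    elif isinstance(config, list):
        for index, value in enumerate(config):
            missing += find_missing_paths(value, f"{prefix}{index}.")
    elif isinstance(config, str) and "/" in config and not os.path.exists(config):
        missing.append((prefix.rstrip("."), config))
    return missing


def check_config_file(path="configs/sam_avs_adapter.yaml"):
    """Load a YAML config and report path-like entries that are missing on disk."""
    import yaml  # PyYAML; assumed to be available in the environment

    with open(path) as handle:
        return find_missing_paths(yaml.safe_load(handle))
```

Calling `check_config_file()` from the repository root returns an empty list when every path-like entry resolves. Each returned pair gives the dotted key and the offending value.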


Get Started

Evaluation

Model weights: All weights, including the SAM image backbone, the VGGish audio backbone, and our pretrained models, are available from the OneDrive link.

  • Please place vggish-10086976.pth and sam_vit_h_4b8939.pth in the assets sub-folder.
  • Please place the pretrained model weights in the ckpts sub-folder.
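The layout described above can be checked with a short script before running the test scripts. This is a minimal sketch rather than a tool shipped with the repository: `missing_weights` is a hypothetical helper, and because the pretrained checkpoint file names are not given in this README, it only checks that the ckpts sub-folder exists and is non-empty.

```python
from pathlib import Path

# Backbone weight files named in this README; both belong in assets/.
REQUIRED_ASSETS = ("vggish-10086976.pth", "sam_vit_h_4b8939.pth")


def missing_weights(repo_root="."):
    """Return a list of problems; an empty list means the layout looks right."""
    root = Path(repo_root)
    problems = []
    for name in REQUIRED_ASSETS:
        if not (root / "assets" / name).is_file():
            problems.append(f"missing assets/{name}")
    ckpts = root / "ckpts"
    # Checkpoint file names are not listed in this README, so only check
    # that the folder exists and holds at least one entry.
    if not ckpts.is_dir() or not any(ckpts.iterdir()):
        problems.append("ckpts/ is missing or empty")
    return problems


if __name__ == "__main__":
    for problem in missing_weights():
        print(problem)
```

Run it from the repository root. No output means both backbone files are in place and ckpts contains at least one file.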

Test

  • Test on AVS-Synthetic test set
bash scripts/synthetic_test.sh
  • Test on AVSBench S4 test set
bash scripts/s4_test.sh
  • Test on AVSBench MS3 test set
bash scripts/ms3_test.sh

Training

  • Train AVS-Synthetic
bash scripts/synthetic_train.sh
  • Train AVSBench S4
bash scripts/s4_train.sh
  • Train AVSBench MS3
bash scripts/ms3_train.sh

Citation

@inproceedings{liu2024annotation,
  title={Annotation-free audio-visual segmentation},
  author={Liu, Jinxiang and Wang, Yu and Ju, Chen and Ma, Chaofan and Zhang, Ya and Xie, Weidi},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  pages={5604--5614},
  year={2024}
}

Contact

If you have any questions, feel free to contact jinxliu#sjtu.edu.cn (replace # with @).