TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning

OpenGVLab/TimeSuite

🦜 Introduction

This paper proposes TimeSuite, a collection of new designs for adapting existing short-form video MLLMs to long video understanding. It includes a simple yet efficient framework for processing long video sequences, a high-quality video dataset for grounded tuning of MLLMs, and a carefully designed instruction tuning task that explicitly incorporates grounding supervision into the traditional QA format.

State-of-the-art performance: VideoChat-T demonstrates high performance for both long-form video question answering and temporal grounding.

Highly efficient model architecture with exceptional inference speed: each video frame is encoded into just 3 tokens, so the FLOPs of VideoChat-T are 5.1% of those of Llava-OneVision.
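To make the 3-tokens-per-frame figure concrete, here is a minimal sketch of how the visual token count grows with the number of sampled frames. The function name `visual_token_count` and the frame counts are hypothetical and are not taken from the TimeSuite code. The sketch also assumes no extra special or separator tokens are added per frame. It only counts tokens and does not reproduce the 5.1% FLOPs comparison, which depends on the full model architecture.

```python
def visual_token_count(num_frames: int, tokens_per_frame: int = 3) -> int:
    """Return the total number of visual tokens for one video.

    tokens_per_frame defaults to 3, the per-frame figure stated in this README.
    Assumption: no additional special tokens are inserted between frames.
    """
    if num_frames < 0 or tokens_per_frame < 0:
        raise ValueError("counts must be non-negative")
    return num_frames * tokens_per_frame


# Hypothetical frame counts, chosen only for illustration.
for n in (16, 128, 512):
    print(n, "frames ->", visual_token_count(n), "visual tokens")
```

Because the count is linear in the number of frames, a small per-frame token budget keeps the visual sequence short even when many frames are sampled from a long video.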

High-quality data

  • We introduced the comprehensive dataset TimePro, which includes 9 task types with video sources from 15 different datasets.
  • We designed a novel Temporal Grounded Caption fine-tuning task to effectively mitigate hallucinations in MLLMs.

🔥 Updates

TODO

Inference & Demo

TODO

Evaluation Results

TODO

Grounded Training

TODO

📄 Citation

If you find this project useful in your research, please consider citing:

@article{zeng2024timesuite,
  title={Timesuite: Improving mllms for long video understanding via grounded tuning},
  author={Zeng, Xiangyu and Li, Kunchang and Wang, Chenting and Li, Xinhao and Jiang, Tianxiang and Yan, Ziang and Li, Songze and Shi, Yansong and Yue, Zhengrong and Wang, Yi and others},
  journal={arXiv preprint arXiv:2410.19702},
  year={2024}
}

💫 Acknowledgement
