Best Student Paper Award Honourable Mention (top 0.5%) at the 30th British Machine Vision Conference (BMVC) 2019
Code for the paper "Show, Infer and Tell: Contextual Inference for Creative Captioning" (BMVC 2019). Paper link: https://bmvc2019.org/wp-content/uploads/papers/0655-paper.pdf
- Self-critical sequence training (SCST)
- Model ensemble averaging
- Multi-GPU training
- Adam optimizer with learning-rate reduction on plateau (see the sketch after this list)
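For readers unfamiliar with the first and last items, here is a minimal PyTorch sketch of the standard self-critical loss (a reward-weighted policy gradient with a greedy-decoding baseline) together with Adam plus ReduceLROnPlateau. The tensor shapes, learning rate, and scheduler hyperparameters are illustrative assumptions, not the exact values used in this repository.

```python
import torch
import torch.optim as optim

def scst_loss(sample_log_probs, sample_rewards, greedy_rewards, mask):
    # Advantage = reward of the sampled caption minus the reward of the
    # greedily decoded caption (the self-critical baseline).
    advantage = (sample_rewards - greedy_rewards).unsqueeze(1)
    # REINFORCE-style loss, masked so padding tokens contribute nothing.
    loss = -sample_log_probs * advantage * mask
    return loss.sum() / mask.sum()

# Shape-only toy check; in real training the rewards are CIDEr scores.
log_probs = torch.randn(4, 16)   # (batch, seq_len) token log-probs
mask = torch.ones(4, 16)         # 1 for real tokens, 0 for padding
loss = scst_loss(log_probs, torch.rand(4), torch.rand(4), mask)

# Adam with learning-rate reduction when a validation metric plateaus.
params = [torch.nn.Parameter(torch.randn(2, 2))]  # stand-in for model.parameters()
optimizer = optim.Adam(params, lr=5e-4)           # lr is an assumed value
scheduler = optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode='max', factor=0.5, patience=3)
scheduler.step(0.95)  # pass e.g. the validation CIDEr once per epoch
```

In the actual pipeline, the log-probabilities come from sampling the captioning model and the rewards from the CIDEr scorer (the cider dependency listed below).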
- Python 2.7 (coco-caption is not available for Python 3)
- PyTorch 0.4 (along with torchvision)
- cider
Extract `dataset_coco.json` from the zip file included in this repository and copy it into `data/`. This file provides preprocessed captions and the standard train-val-test splits.
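If you prefer to do this from Python, here is a minimal sketch; the archive name `dataset_coco.zip` is hypothetical, so substitute the zip actually shipped with this repository:

```python
import zipfile

# Hypothetical archive name; replace with the zip included in this repo.
with zipfile.ZipFile('dataset_coco.zip') as zf:
    zf.extract('dataset_coco.json', path='data/')
```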
The code is based on ruotianluo's image captioning code and neuraltalk2; please refer to them for more details on setup.
The final model, with an LSTM size of 2048 units, is over 2 GB. With a batch size of 10, it takes about 3 GB of GPU memory.
The code includes a COCO split option and an evaluation script for testing on the COCO evaluation server.
If you use this code, please consider citing:
@inproceedings{DBLP:conf/bmvc/KhareH19,
author = {Ankit Khare and
Manfred Huber},
title = {Show, Infer and Tell: Contextual Inference for Creative Captioning},
booktitle = {30th British Machine Vision Conference 2019, {BMVC} 2019, Cardiff,
UK, September 9-12, 2019},
pages = {20},
publisher = {BMVA Press},
year = {2019},
url = {https://bmvc2019.org/wp-content/uploads/papers/0655-paper.pdf},
timestamp = {Tue, 21 Apr 2020 23:13:25 +0200},
biburl = {https://dblp.org/rec/conf/bmvc/KhareH19.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
Thanks to Google Cloud for the free trial credits, to the original neuraltalk2, to ruotianluo, and to the awesome PyTorch team.