Download the Flickr30k dataset from this link.
Pre-computed bounding boxes are extracted with Faster R-CNN.
We use the config "e2e_faster_rcnn_R_50_C4_1x.yaml" to train the object detector on the MSCOCO dataset and extract the feature map at the C4 layer.
Language graphs are extracted with SceneGraphParser. The resulting sg_anno.json has been uploaded to Google Drive and can be downloaded from there.
Some pre-processed data, such as sentence annotations and box annotations.
You need to create the './flickr_datasets' folder and put all annotations in it. I highly recommend working out every data path used in this project before running anything. Refer to the two files "maskrcnn_benchmark/config/paths_catalog.py" and "maskrcnn_benchmark/data/flickr.py" for details.
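
The folder setup described above can be sanity-checked with a small helper before training. This is only a minimal sketch: the function names `prepare_dataset_dir` and `summarize_json` are hypothetical and not part of this repository, and the only file name taken from this README is sg_anno.json. The full list of required annotation files and their internal layout should be read from "maskrcnn_benchmark/config/paths_catalog.py" and "maskrcnn_benchmark/data/flickr.py".

```python
import json
from pathlib import Path


def prepare_dataset_dir(root="./flickr_datasets", expected=("sg_anno.json",)):
    """Create the annotation folder if needed and list expected files that are missing.

    `expected` is an assumption: extend it with the file names that
    paths_catalog.py actually points to.
    """
    root_path = Path(root)
    root_path.mkdir(parents=True, exist_ok=True)
    missing = [name for name in expected if not (root_path / name).is_file()]
    return root_path, missing


def summarize_json(path):
    """Load a JSON file and report only its top-level container type and size.

    The schema of sg_anno.json is not documented here, so nothing deeper
    than the top level is assumed.
    """
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)
    size = len(data) if isinstance(data, (dict, list)) else None
    return type(data).__name__, size
```

Calling `prepare_dataset_dir()` with no arguments creates './flickr_datasets' relative to the current working directory and returns the folder path together with the names of any expected files not yet copied into it. Once sg_anno.json is in place, `summarize_json` gives a quick confirmation that the download is valid JSON.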

The pretrained object detector weights and annotations can be found on Baidu disk (link: https://pan.baidu.com/s/1bYbGUsHcZJQHele87MzcMg, password: 5ie6) or Google Drive.

training

You can train our model by running the training script:

sh scripts/train.sh

citation

If you are interested in our paper, please cite it.

@inproceedings{liu2019learning,
  title={Learning Cross-modal Context Graph for Visual Grounding},
  author={Liu, Yongfei and Wan, Bo and Zhu, Xiaodan and He, Xuming},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  year={2020}
}


youngfly11 / LCMCG-PyTorch

