WeitaoVan / Sohu-LuckData-Image-Text-Matching-Competition

Licence: other
Sohu 2017 competition. We won the third prize.

Sohu Chinese Image-Text Matching Competition 2017. We won the third prize.

Requirements

Clone the Chinese word-segmentation tool THULAC-Python (https://github.com/thunlp/THULAC-Python) into ./NLP.

Clone a Caffe repository into ./caffe.

Download a VGG-16 Caffe model pre-trained on ImageNet.

Platform:

TensorFlow 1.0

Caffe

Model Architecture:

Image -> VGG-16 -> 4096-d feature -> fc1 -> fc2 -> distance(image, text)

Text -> TF-IDF or GMM feature -> fc1 -> fc2 -> distance(image, text)
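
The two branches above can be sketched in plain NumPy to show how the shapes fit together. This is only an illustration, not the training code in this repo: the hidden size (512), embedding size (128), text feature size (1000), the ReLU between fc1 and fc2, the L2 normalisation, and the squared Euclidean distance are all assumptions. Only the 4096-d image feature comes from the description above.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_branch(in_dim, hidden_dim, embed_dim):
    """Random weights for one branch: fc1 (in -> hidden) and fc2 (hidden -> embed)."""
    return {
        "w1": rng.normal(0.0, 0.01, (in_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "w2": rng.normal(0.0, 0.01, (hidden_dim, embed_dim)),
        "b2": np.zeros(embed_dim),
    }

def embed(x, p):
    """fc1 -> ReLU -> fc2 -> L2 normalisation (ReLU and the norm are assumptions)."""
    h = np.maximum(x @ p["w1"] + p["b1"], 0.0)
    z = h @ p["w2"] + p["b2"]
    return z / (np.linalg.norm(z, axis=1, keepdims=True) + 1e-12)

def pairwise_distance(img_emb, txt_emb):
    """Squared Euclidean distance between every image and every text."""
    diff = img_emb[:, None, :] - txt_emb[None, :, :]
    return np.sum(diff ** 2, axis=2)

# Hypothetical sizes: 4096-d VGG-16 features, 1000-d text features.
image_branch = init_branch(4096, 512, 128)
text_branch = init_branch(1000, 512, 128)

images = rng.normal(size=(4, 4096))   # stand-in for VGG-16 features
texts = rng.random((3, 1000))         # stand-in for TF-IDF or GMM features

dist = pairwise_distance(embed(images, image_branch), embed(texts, text_branch))
print(dist.shape)  # (4, 3): one distance per (image, text) pair
```

Both branches map into the same embedding space, so a small distance means the image and the text are predicted to match.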

References:

"Learning Two-Branch Neural Networks for Image-Text Matching Tasks"

"Fisher Vectors Derived from Hybrid Gaussian-Laplacian Mixture Models for Image Annotation"

Files:

caffe-recurrent/
- extract_feature.py # extract features using VGG-16

Tensorflow/
- BidirectionNet_tfidf.py # train using the TF-IDF features
- BidirectionNet_lstm.py # train the LSTM model
- test_match_pairList.py # generate the top-10 matching results from the image/text embeddings
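
The top-10 matching step can be illustrated with a short NumPy sketch. It is not the code in test_match_pairList.py: the function name `top_k_matches`, the use of squared Euclidean distance, and the toy embeddings are assumptions made for the example.

```python
import numpy as np

def top_k_matches(query_emb, candidate_emb, k=10):
    """Return, for each query row, the indices of the k nearest candidates."""
    # Squared Euclidean distances, shape (num_queries, num_candidates).
    d = (
        np.sum(query_emb ** 2, axis=1)[:, None]
        - 2.0 * query_emb @ candidate_emb.T
        + np.sum(candidate_emb ** 2, axis=1)[None, :]
    )
    k = min(k, candidate_emb.shape[0])
    # Sort each row by increasing distance and keep the first k columns.
    return np.argsort(d, axis=1)[:, :k]

# Toy data: 50 image embeddings; 5 text embeddings that are noisy copies
# of the first 5 images, so text i should retrieve image i first.
rng = np.random.default_rng(0)
image_emb = rng.normal(size=(50, 32))
text_emb = image_emb[:5] + 0.01 * rng.normal(size=(5, 32))

matches = top_k_matches(text_emb, image_emb, k=10)
print(matches.shape)  # (5, 10)
```

For large candidate sets, `np.argpartition` would avoid a full sort; the full `argsort` is kept here for clarity.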

NLP/
- run_SH_split.py # segment words and generate the vocabulary
- word_type.py # filter nouns and verbs
- run_create_train_txt.py # generate a single .txt file from the word-segmented files and a vocabulary
- tfidf_from_seg.py # compute TF-IDF
- kmeans_word2vec.py # cluster words based on word2vec (fastText word vectors are recommended; search for the 'fasttext' repo)
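
As a rough illustration of the TF-IDF step, the sketch below computes TF-IDF vectors from documents that are already word-segmented. It is not the code in tfidf_from_seg.py: the function name `tfidf`, the weighting (term frequency times log(N / document frequency)), and the toy tokens are assumptions.

```python
import math
from collections import Counter

def tfidf(docs):
    """Compute TF-IDF vectors for a list of tokenised documents.

    Returns the sorted vocabulary and one dense vector per document.
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each word.
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    vocab = sorted(df)
    index = {word: i for i, word in enumerate(vocab)}

    vectors = []
    for doc in docs:
        vec = [0.0] * len(vocab)
        for word, count in Counter(doc).items():
            tf = count / len(doc)
            idf = math.log(n_docs / df[word])
            vec[index[word]] = tf * idf
        vectors.append(vec)
    return vocab, vectors

# ASCII tokens standing in for the output of the word-segmentation step.
docs = [
    ["sohu", "news", "image"],
    ["sohu", "text", "match"],
    ["sohu", "image", "match", "match"],
]
vocab, vectors = tfidf(docs)
print(vocab)  # ['image', 'match', 'news', 'sohu', 'text']
```

With this weighting, a word that appears in every document (here "sohu") gets weight 0, which is why very common words contribute nothing to the text feature.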

Keys to performance improvement:

model ensemble

OCR (which we did not use)
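
This README does not describe how the models were ensembled. One common approach, shown here purely as an assumption, is to average the image-text distance matrices produced by several trained models before ranking. The function name `ensemble_distances` and the toy matrices are hypothetical.

```python
import numpy as np

def ensemble_distances(distance_matrices, weights=None):
    """Weighted average of several (num_queries, num_candidates) distance matrices."""
    stacked = np.stack(distance_matrices, axis=0)  # (num_models, Q, C)
    if weights is None:
        weights = np.ones(len(distance_matrices))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalise so the weights sum to 1
    # Contract the model axis: result has shape (Q, C).
    return np.tensordot(weights, stacked, axes=1)

# Toy distance matrices from two hypothetical models (2 texts x 2 images).
d_a = np.array([[0.2, 0.9], [0.8, 0.3]])
d_b = np.array([[0.6, 0.5], [0.7, 0.1]])

combined = ensemble_distances([d_a, d_b])
best = np.argmin(combined, axis=1)  # closest image for each text
print(best)
```

Averaging assumes the models produce distances on comparable scales, for example because all of them L2-normalise their embeddings.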
