bearcatt / LaBERT

Licence: other

A length-controllable and non-autoregressive image captioning model.

Programming Languages

python

Projects that are alternatives of or similar to LaBERT

Image-Captioning-with-Beam-Search

Generating image captions using Xception Network and Beam Search in Keras

Stars: ✭ 18 (-64%)

Mutual labels: image-captioning

Udacity

This repo includes all the projects I have finished in the Udacity Nanodegree programs

Stars: ✭ 57 (+14%)

Mutual labels: image-captioning

tfvaegan

[ECCV 2020] Official Pytorch implementation for "Latent Embedding Feedback and Discriminative Features for Zero-Shot Classification". SOTA results for ZSL and GZSL

Stars: ✭ 107 (+114%)

Mutual labels: eccv2020

deep-atrous-guided-filter

Deep Atrous Guided Filter for Image Restoration in Under Display Cameras (UDC Challenge, ECCV 2020).

Stars: ✭ 32 (-36%)

Mutual labels: eccv2020

PiP-Planning-informed-Prediction

(ECCV 2020) PiP: Planning-informed Trajectory Prediction for Autonomous Driving

Stars: ✭ 101 (+102%)

Mutual labels: eccv2020

DeepPS

[ECCV 2020] "Deep Plastic Surgery: Robust and Controllable Image Editing with Human-Drawn Sketches"

Stars: ✭ 63 (+26%)

Mutual labels: eccv2020

SRResCycGAN

Code repo for "Deep Cyclic Generative Adversarial Residual Convolutional Networks for Real Image Super-Resolution" (ECCVW AIM2020).

Stars: ✭ 47 (-6%)

Mutual labels: eccv2020

People-Flows

The code for our ECCV 2020 paper: Estimating People Flows to Better Count Them in Crowded Scenes

Stars: ✭ 44 (-12%)

Mutual labels: eccv2020

SAN

[ECCV 2020] Scale Adaptive Network: Learning to Learn Parameterized Classification Networks for Scalable Input Images

Stars: ✭ 41 (-18%)

Mutual labels: eccv2020

BUTD model

A pytorch implementation of "Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering" for image captioning.

Stars: ✭ 28 (-44%)

Mutual labels: image-captioning

udacity-cvnd-projects

My solutions to the projects assigned for the Udacity Computer Vision Nanodegree

Stars: ✭ 36 (-28%)

Mutual labels: image-captioning

IAST-ECCV2020

IAST: Instance Adaptive Self-training for Unsupervised Domain Adaptation (ECCV 2020) https://teacher.bupt.edu.cn/zhuchuang/en/index.htm

Stars: ✭ 84 (+68%)

Mutual labels: eccv2020

WS3D

Official version of 'Weakly Supervised 3D object detection from Lidar Point Cloud'(ECCV2020)

Stars: ✭ 104 (+108%)

Mutual labels: eccv2020

catr

Image Captioning Using Transformer

Stars: ✭ 206 (+312%)

Mutual labels: image-captioning

FFWM

Implementation of "Learning Flow-based Feature Warping for Face Frontalization with Illumination Inconsistent Supervision" (ECCV 2020).

Stars: ✭ 107 (+114%)

Mutual labels: eccv2020

CS231n

CS231n Assignments Solutions - Spring 2020

Stars: ✭ 48 (-4%)

Mutual labels: image-captioning

Expressive-FastSpeech2

PyTorch Implementation of Non-autoregressive Expressive (emotional, conversational) TTS based on FastSpeech2, supporting English, Korean, and your own languages.

Stars: ✭ 139 (+178%)

Mutual labels: non-autoregressive

visdial

Visual Dialog: Light-weight Transformer for Many Inputs (ECCV 2020)

Stars: ✭ 27 (-46%)

Mutual labels: eccv2020

VAENAR-TTS

PyTorch Implementation of VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis.

Stars: ✭ 66 (+32%)

Mutual labels: non-autoregressive

Cross-Speaker-Emotion-Transfer

PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech

Stars: ✭ 107 (+114%)

Mutual labels: non-autoregressive

Length-Controllable Image Captioning (ECCV2020)

This repo provides the implementation of the paper Length-Controllable Image Captioning.

Install

conda create --name labert python=3.7
conda activate labert

conda install pytorch=1.3.1 torchvision cudatoolkit=10.1 -c pytorch
pip install h5py tqdm transformers==2.1.1
pip install git+https://github.com/salaniz/pycocoevalcap
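
After installing, it can help to confirm which versions actually ended up in the environment. The following is a minimal sketch, not part of the repo: `installed_versions` is a hypothetical helper, and it assumes each package exposes a `__version__` attribute (it reports "unknown" otherwise).

```python
import importlib


def installed_versions(module_names):
    """Return {module name: version string, or None if the import fails}."""
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
            continue
        # Most packages expose __version__; fall back to "unknown" otherwise.
        versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    # Import names of the packages installed by the commands above.
    packages = ["torch", "torchvision", "transformers", "h5py", "tqdm", "pycocoevalcap"]
    for name, version in installed_versions(packages).items():
        print(f"{name:15s} {version or 'NOT INSTALLED'}")
```

With the install commands above, `torch` should report 1.3.1 and `transformers` 2.1.1; any other value suggests a different environment is active.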

Data & Pre-trained Models

Prepare the MSCOCO data by following the linked instructions.
Download the pretrained BERT and Faster R-CNN checkpoint from Baidu Cloud Disk [code: 0j9f] or Google Drive.
- It is a unified checkpoint file containing a pretrained BERT-base and the fc6 layer of the Faster R-CNN.
Download our pretrained LaBERT model from Baidu Cloud Disk [code: fpke] or Google Drive.

Scripts

Train

python -m torch.distributed.launch \
  --nproc_per_node=$NUM_GPUS \
  --master_port=4396 train.py \
  save_dir $PATH_TO_TRAIN_OUTPUT \
  samples_per_gpu $NUM_SAMPLES_PER_GPU

Continue training

python -m torch.distributed.launch \
  --nproc_per_node=$NUM_GPUS \
  --master_port=4396 train.py \
  save_dir $PATH_TO_TRAIN_OUTPUT \
  samples_per_gpu $NUM_SAMPLES_PER_GPU \
  model_path $PATH_TO_MODEL
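
The two training commands differ only in the trailing `model_path` override, so they can be assembled from one place when launching runs from Python. This is a rough sketch under stated assumptions: `build_train_command` is a hypothetical helper that is not in the repo, it uses `sys.executable` instead of a bare `python` so the active environment's interpreter is used, and the paths and GPU count in the usage section are placeholders.

```python
import shlex
import sys


def build_train_command(num_gpus, save_dir, samples_per_gpu,
                        model_path=None, master_port=4396):
    """Assemble the launch command as an argument list for subprocess.run."""
    cmd = [
        sys.executable, "-m", "torch.distributed.launch",
        f"--nproc_per_node={num_gpus}",
        f"--master_port={master_port}",
        "train.py",
        "save_dir", str(save_dir),
        "samples_per_gpu", str(samples_per_gpu),
    ]
    if model_path is not None:
        # Adding model_path matches the "continue training" command above.
        cmd += ["model_path", str(model_path)]
    return cmd


if __name__ == "__main__":
    # Placeholder values; substitute your own output directory and GPU count.
    cmd = build_train_command(num_gpus=4, save_dir="output/train", samples_per_gpu=32)
    print(" ".join(shlex.quote(part) for part in cmd))
    # To actually start training: subprocess.run(cmd, check=True)
```

Building the command as a list avoids shell quoting problems when paths contain spaces.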

Inference

python inference.py \
  model_path $PATH_TO_MODEL \
  save_dir $PATH_TO_TEST_OUTPUT \
  samples_per_gpu $NUM_SAMPLES_PER_GPU

Evaluate

python evaluate.py \
  --gt_caption data/id2captions_test.json \
  --pd_caption $PATH_TO_TEST_OUTPUT/caption_results.json \
  --save_dir $PATH_TO_TEST_OUTPUT
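
Before running the evaluation, a quick check that the predictions cover the same images as the ground truth can catch a mismatched split early. The sketch below is an illustration only. It assumes, without confirmation from the repo, that both JSON files are objects keyed by image id; `compare_ids` and `load_json` are hypothetical helpers, and the toy dictionaries stand in for the real files.

```python
import json


def compare_ids(gt, pd):
    """Return (ids only in ground truth, ids only in predictions), both sorted."""
    gt_ids = {str(key) for key in gt}
    pd_ids = {str(key) for key in pd}
    return sorted(gt_ids - pd_ids), sorted(pd_ids - gt_ids)


def load_json(path):
    """Read a JSON file; intended for the --gt_caption and --pd_caption paths."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


if __name__ == "__main__":
    # Toy data in the ASSUMED layout (image id -> captions), not real results.
    gt = {"1": ["a dog runs"], "2": ["a cat sits"]}
    pd = {"1": ["a dog running"], "3": ["a bird"]}
    missing, unexpected = compare_ids(gt, pd)
    print("ids without predictions:", missing)
    print("ids without ground truth:", unexpected)
```

If the real files use a different layout, for example a list of records, only the key extraction in `compare_ids` would need to change.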

Cite

Please consider citing our paper in your publications if the project helps your research.

@article{deng2020length,
  title={Length-Controllable Image Captioning},
  author={Deng, Chaorui and Ding, Ning and Tan, Mingkui and Wu, Qi},
  journal={arXiv preprint arXiv:2007.09580},
  year={2020}
}
