eric-xw / Video-guided-Machine-Translation

Licence: other
Starter code for the VMT task and challenge

Programming Languages

python
139335 projects - #7 most used programming language
shell
77523 projects

Projects that are alternatives of or similar to Video-guided-Machine-Translation

Distill-BERT-Textgen
Research code for ACL 2020 paper: "Distilling Knowledge Learned in BERT for Text Generation".
Stars: ✭ 121 (+168.89%)
Mutual labels:  machine-translation
BSD
The Business Scene Dialogue corpus
Stars: ✭ 51 (+13.33%)
Mutual labels:  machine-translation
deepl-rb
A simple ruby gem for the DeepL API
Stars: ✭ 38 (-15.56%)
Mutual labels:  machine-translation
skt
Sanskrit compound segmentation using seq2seq model
Stars: ✭ 21 (-53.33%)
Mutual labels:  machine-translation
inmt
Interactive Neural Machine Translation tool
Stars: ✭ 44 (-2.22%)
Mutual labels:  machine-translation
parallel-corpora-tools
Tools for filtering and cleaning parallel and monolingual corpora for machine translation and other natural language processing tasks.
Stars: ✭ 35 (-22.22%)
Mutual labels:  machine-translation
OPUS-MT-train
Training open neural machine translation models
Stars: ✭ 166 (+268.89%)
Mutual labels:  machine-translation
omegat-tencent-plugin
This is a plugin to allow OmegaT to source machine translations from Tencent Cloud.
Stars: ✭ 31 (-31.11%)
Mutual labels:  machine-translation
urbans
A tool for translating text from source grammar to target grammar (context-free) with corresponding dictionary.
Stars: ✭ 19 (-57.78%)
Mutual labels:  machine-translation
Machine-Translation-v2
English-to-Chinese machine text translation
Stars: ✭ 48 (+6.67%)
Mutual labels:  machine-translation
MetricMT
The official code repository for MetricMT - a reward optimization method for NMT with learned metrics
Stars: ✭ 23 (-48.89%)
Mutual labels:  machine-translation
SequenceToSequence
A seq2seq with attention dialogue/MT model implemented by TensorFlow.
Stars: ✭ 11 (-75.56%)
Mutual labels:  machine-translation
masakhane-web
Masakhane Web is a translation web application solely for African languages.
Stars: ✭ 27 (-40%)
Mutual labels:  machine-translation
tai5-uan5 gian5-gi2 kang1-ku7
Taiwanese Language Tools (臺灣言語工具)
Stars: ✭ 79 (+75.56%)
Mutual labels:  machine-translation
NiuTrans.NMT
A fast neural machine translation system, developed in C++ and built on NiuTensor for fast tensor APIs.
Stars: ✭ 112 (+148.89%)
Mutual labels:  machine-translation
extreme-adaptation-for-personalized-translation
Code for the paper "Extreme Adaptation for Personalized Neural Machine Translation"
Stars: ✭ 42 (-6.67%)
Mutual labels:  machine-translation
Machine-Translation-Hindi-to-english-
Machine translation is the task of converting text from one language into another. Unlike traditional phrase-based translation systems, which consist of many small sub-components tuned separately, neural machine translation attempts to build and train a single, large neural network that reads a sentence and outputs a correct translation.
Stars: ✭ 19 (-57.78%)
Mutual labels:  machine-translation
mtdata
A tool that locates, downloads, and extracts machine translation corpora
Stars: ✭ 95 (+111.11%)
Mutual labels:  machine-translation
Deep-NLP-Resources
Curated list of all NLP Resources
Stars: ✭ 65 (+44.44%)
Mutual labels:  machine-translation
dynmt-py
Neural machine translation implementation using dynet's python bindings
Stars: ✭ 17 (-62.22%)
Mutual labels:  machine-translation

Video-guided Machine Translation

This repo contains the starter code for the VATEX Translation Challenge for Video-guided Machine Translation (VMT), which aims to translate a source-language description into the target language using video information as additional spatiotemporal context.

VMT is introduced in our ICCV oral paper "VATEX: A Large-Scale, High-Quality Multilingual Dataset for Video-and-Language Research". VATEX is a new large-scale multilingual video description dataset containing over 41,250 videos and 825,000 captions in English and Chinese; half of the captions are English-Chinese translation pairs. For more details, please see the latest version of the paper: https://arxiv.org/abs/1904.03493.
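The central idea of VMT is that the translation model can condition on the video's frame-level features as extra context. As an illustrative sketch (not the repo's actual architecture), dot-product attention of a decoder state over I3D segment features might look like this; the function and variable names here are hypothetical:

```python
import numpy as np

def attend_video(decoder_state, video_feats):
    """Dot-product attention of one decoder state over video segment features.

    decoder_state: shape (d,)  -- current decoder hidden state
    video_feats:   shape (T, d) -- one feature vector per video segment
    Returns attention weights (T,) and the pooled context vector (d,).
    """
    scores = video_feats @ decoder_state             # (T,) similarity scores
    scores = scores - scores.max()                   # stabilize the softmax
    weights = np.exp(scores) / np.exp(scores).sum()  # (T,) sum to 1
    context = weights @ video_feats                  # (d,) weighted average
    return weights, context

# Demo with random stand-ins for a 1024-d decoder state and 32 video segments.
rng = np.random.default_rng(0)
w, c = attend_video(rng.normal(size=1024), rng.normal(size=(32, 1024)))
```

The context vector can then be fed into the decoder alongside the source-sentence representation, which is what makes the translation "video-guided".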

Prerequisites

  • Python 3.7
  • PyTorch 1.4 (1.0+)
  • nltk 3.4.5

Training

1. Download corpus files and the extracted video features

First, under the vmt/ directory, download the train/val/test JSON files:

./data/download.sh

Then download the I3D video features for trainval and for the public test set:

# set up DIR/vatex_features for storing the large video features
mkdir -p DIR/vatex_features

wget https://vatex-feats.s3.amazonaws.com/trainval.zip -P DIR/vatex_features
unzip DIR/vatex_features/trainval.zip -d DIR/vatex_features
wget https://vatex-feats.s3.amazonaws.com/public_test.zip -P DIR/vatex_features
unzip DIR/vatex_features/public_test.zip -d DIR/vatex_features

cd vmt/
ln -s DIR/vatex_features data/vatex_features
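Once the symlink is in place, training code can read features per video. As a minimal, hypothetical sketch (assuming one .npy array per clip under data/vatex_features; the loader below uses a temporary directory with a fake file so it is self-contained):

```python
import tempfile
from pathlib import Path
import numpy as np

def load_video_feature(feat_dir, video_id):
    """Load the feature array for one video, assuming one <video_id>.npy per clip."""
    return np.load(Path(feat_dir) / f"{video_id}.npy")

# Self-contained demo: a fake (T=32, d=1024) feature file stands in for the
# real downloaded I3D features.
demo_dir = tempfile.mkdtemp()
np.save(Path(demo_dir) / "demoVideo.npy",
        np.zeros((32, 1024), dtype=np.float32))
feats = load_video_feature(demo_dir, "demoVideo")
```

In practice you would point `feat_dir` at data/vatex_features and key by the dataset's video IDs; check the repo's data-loading code for the authoritative file layout.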

2. Training

To train the baseline VMT model:

python train.py

The default hyperparameters are set in configs.yaml.
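For orientation, a config of this kind typically looks like the fragment below; these key names are purely illustrative, and the authoritative keys and defaults are the ones shipped in configs.yaml:

```yaml
# Illustrative only -- see configs.yaml for the real keys and defaults.
model_name: baseline_vmt
batch_size: 64
learning_rate: 0.001
num_epochs: 30
```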

Evaluation

Run

python eval.py

Specify the model name in configs.yaml. The script will generate a json file for submission to the VMT Challenge on CodaLab.
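The exact submission schema is defined by eval.py and the CodaLab challenge page; as a hypothetical sketch, a submission file is just predictions serialized to JSON, e.g. a mapping from video ID to translated sentence (the IDs and format below are illustrative):

```python
import json

# Hypothetical predictions; the real keys and structure come from eval.py.
predictions = {
    "video-0001": "一个男人在弹吉他。",
    "video-0002": "两个孩子在公园里跑步。",
}

# ensure_ascii=False keeps Chinese output human-readable in the file.
submission = json.dumps(predictions, ensure_ascii=False, indent=2)
```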

Results

The baseline VMT model achieves the following corpus-level BLEU scores. (The numbers here differ slightly from those in the paper due to different evaluation setups; for a fair comparison, use the numbers reported here.)

Model    EN -> ZH    ZH -> EN
BLEU-4   31.1        24.6

On the evaluation server, we report cumulative corpus-level BLEU score (up to 4-gram) and each individual n-gram score for reference, shown as B-1, ..., B-4.

Model performance is evaluated by cumulative BLEU-4 score in the challenge.
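For intuition about the metric, cumulative BLEU-4 combines clipped n-gram precisions for n = 1..4 with uniform 1/4 weights and a brevity penalty. A stdlib sketch of the corpus-level computation, simplified to one reference per sentence (the official scorer may handle multiple references and tokenization differently):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def corpus_bleu4(references, hypotheses):
    """Corpus-level cumulative BLEU-4 with one reference per hypothesis."""
    p_num = [0] * 4  # clipped n-gram matches, per n
    p_den = [0] * 4  # total hypothesis n-grams, per n
    ref_len = hyp_len = 0
    for ref, hyp in zip(references, hypotheses):
        ref_len += len(ref)
        hyp_len += len(hyp)
        for n in range(1, 5):
            h, r = ngrams(hyp, n), ngrams(ref, n)
            p_num[n - 1] += sum(min(c, r[g]) for g, c in h.items())  # clipping
            p_den[n - 1] += sum(h.values())
    if min(p_num) == 0:          # any zero precision makes the geometric mean 0
        return 0.0
    log_prec = sum(0.25 * math.log(num / den)
                   for num, den in zip(p_num, p_den))
    bp = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return bp * math.exp(log_prec)
```

A perfect hypothesis set scores 1.0; the leaderboard reports the same quantity scaled to 0-100.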

Reference

Please cite our paper if you use our code or dataset:

@InProceedings{Wang_2019_ICCV,
  author    = {Wang, Xin and Wu, Jiawei and Chen, Junkun and Li, Lei and Wang, Yuan-Fang and Wang, William Yang},
  title     = {VaTeX: A Large-Scale, High-Quality Multilingual Dataset for Video-and-Language Research},
  booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
  month     = {October},
  year      = {2019}
}
Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].