
malllabiisc / DiPS

License: Apache-2.0
NAACL 2019: Submodular optimization-based diverse paraphrasing and its effectiveness in data augmentation

Programming Languages

Python
139,335 projects; #7 most used programming language

Projects that are alternatives to or similar to DiPS

text2text
Text2Text: Cross-lingual natural language processing and generation toolkit
Stars: ✭ 188 (+218.64%)
Mutual labels:  natural-language-generation, data-augmentation
SGCP
TACL 2020: Syntax-Guided Controlled Generation of Paraphrases
Stars: ✭ 67 (+13.56%)
Mutual labels:  paper, natural-language-generation
Documents
Documentation for Phase 4 Ground
Stars: ✭ 31 (-47.46%)
Mutual labels:  paper
Transferring Gans
ECCV 2018
Stars: ✭ 54 (-8.47%)
Mutual labels:  diversity
Worldgeneration
Generating Interactive Fiction worlds from story plots
Stars: ✭ 43 (-27.12%)
Mutual labels:  natural-language-generation
Dlow
Official PyTorch Implementation of "DLow: Diversifying Latent Flows for Diverse Human Motion Prediction". ECCV 2020.
Stars: ✭ 32 (-45.76%)
Mutual labels:  diversity
Contributor covenant
Pledge your respect and appreciation for contributors of all kinds to your open source project.
Stars: ✭ 1,044 (+1669.49%)
Mutual labels:  diversity
Infogan
Code for reproducing key results in the paper "InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets"
Stars: ✭ 948 (+1506.78%)
Mutual labels:  paper
Pytorch Classification Uncertainty
This repo contains a PyTorch implementation of the paper: "Evidential Deep Learning to Quantify Classification Uncertainty"
Stars: ✭ 59 (+0%)
Mutual labels:  paper
Ludwig
Data-centric declarative deep learning framework
Stars: ✭ 8,018 (+13489.83%)
Mutual labels:  natural-language-generation
Handwriting recogition using adversarial learning
[CVPR 2019] "Handwriting Recognition in Low-resource Scripts using Adversarial Learning", IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2019.
Stars: ✭ 52 (-11.86%)
Mutual labels:  data-augmentation
Describing a knowledge base
Code for Describing a Knowledge Base
Stars: ✭ 42 (-28.81%)
Mutual labels:  natural-language-generation
Pqg Pytorch
Paraphrase Generation model using pair-wise discriminator loss
Stars: ✭ 33 (-44.07%)
Mutual labels:  natural-language-generation
Convai Baseline
ConvAI baseline solution
Stars: ✭ 49 (-16.95%)
Mutual labels:  natural-language-generation
Essentials
The essential plugin suite for Minecraft servers.
Stars: ✭ 957 (+1522.03%)
Mutual labels:  paper
Multiagent Particle Envs
Code for a multi-agent particle environment used in the paper "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments"
Stars: ✭ 1,086 (+1740.68%)
Mutual labels:  paper
Nlp xiaojiang
Natural language processing (NLP): Xiaojiang bot (retrieval-based chit-chat chatbot), BERT sentence embeddings and similarity (Sentence Similarity), XLNet sentence embeddings and similarity (text xlnet embedding), text classification (Text classification), entity extraction (NER: BERT+BiLSTM+CRF), data augmentation (text augment, data enhance), synonymous-sentence and synonym generation, sentence-trunk extraction (mainpart), Chinese short-text similarity, text feature engineering, and a keras-http-service interface
Stars: ✭ 954 (+1516.95%)
Mutual labels:  data-augmentation
Neural Architecture Search With Rl
Minimal Tensorflow implementation of the paper "Neural Architecture Search With Reinforcement Learning" presented at ICLR 2017
Stars: ✭ 37 (-37.29%)
Mutual labels:  paper
Style Transfer In Text
Paper List for Style Transfer in Text
Stars: ✭ 1,030 (+1645.76%)
Mutual labels:  paper
Bert In Production
A collection of resources on using BERT (https://arxiv.org/abs/1810.04805) and related Language Models in production environments.
Stars: ✭ 58 (-1.69%)
Mutual labels:  paper

Submodular Optimization-based Diverse Paraphrasing and its Effectiveness in Data Augmentation

Source code for the NAACL 2019 paper: Submodular Optimization-based Diverse Paraphrasing and its Effectiveness in Data Augmentation

[Figure] Overview of DiPS during decoding to generate k paraphrases: at each time step, a set of N sequences V(t) is used to determine k < N sequences (X) via submodular maximization. The figure illustrates the motivation behind each submodular component; please see Section 4 of the paper for details.
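As a rough illustration of the selection step (not the authors' exact objective), a monotone submodular score can be greedily maximized with a (1 - 1/e) approximation guarantee: at each round, add the candidate with the largest marginal gain. The sketch below uses a hypothetical unigram-coverage score as a stand-in for the paper's actual submodular components:

# Illustrative sketch: greedy maximization of a monotone submodular score
# to select k diverse candidates out of N. `coverage` is a hypothetical
# stand-in for the paper's combined fidelity/diversity components.
def greedy_submodular_select(candidates, score, k):
    selected, remaining = [], list(candidates)
    for _ in range(min(k, len(remaining))):
        base = score(selected)
        # Pick the candidate with the largest marginal gain.
        best = max(remaining, key=lambda c: score(selected + [c]) - base)
        selected.append(best)
        remaining.remove(best)
    return selected

def coverage(subset):
    # Number of distinct unigrams covered -- monotone and submodular.
    return len({tok for sent in subset for tok in sent.split()})

beams = [
    "how do i learn python",
    "how can i learn python",
    "what is the best way to learn python",
]
print(greedy_submodular_select(beams, coverage, k=2))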

Dependencies

  • Compatible with Python 3.6
  • Dependencies can be installed using requirements.txt

Dataset

Download the following datasets:

Extract and place them in the data directory. Path: data/<dataset-folder-name>. A sample dataset folder might look like data/quora/<train/test/val>/<src.txt/tgt.txt>.
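For concreteness, the quora example above expands to the following (hypothetical) file tree:

data/
└── quora/
    ├── train/
    │   ├── src.txt
    │   └── tgt.txt
    ├── val/
    │   ├── src.txt
    │   └── tgt.txt
    └── test/
        ├── src.txt
        └── tgt.txt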

Setup

To get the project's source code, clone the GitHub repository:

$ git clone https://github.com/malllabiisc/DiPS

Install virtualenv using the following (optional):

$ [sudo] pip install virtualenv

Create and activate your virtual environment (optional):

$ virtualenv -p python3 venv
$ source venv/bin/activate

Install all the required packages:

$ pip install -r requirements.txt

Install the submodopt package by running the following commands from the root directory of the repository:

$ cd ./packages/submodopt
$ python setup.py install
$ cd ../../
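To confirm the installation succeeded in the active environment (assuming the installed module is importable as submodopt), a quick check:

$ python -c "import submodopt"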

Training the sequence-to-sequence model

python -m src.main -mode train -gpu 0 -use_attn -bidirectional -dataset quora -run_name <run_name>

Create the dictionary for submodular subset selection, used for the semantic similarity component (L2).

To use trained embeddings:

python -m src.create_dict -model trained -run_name <run_name> -gpu 0

To use pretrained word2vec embeddings:

python -m src.create_dict -model pretrained -run_name <run_name> -gpu 0

This will generate the word2vec.pickle file in data/embeddings.
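To sanity-check the generated file, you can load it directly. The snippet below is only for inspection and assumes the pickle holds a token-to-vector mapping; the exact structure may differ:

import pickle

with open('data/embeddings/word2vec.pickle', 'rb') as f:
    word2vec = pickle.load(f)

print(type(word2vec))
# If it is a plain dict of token -> vector, peek at one entry.
if isinstance(word2vec, dict):
    token = next(iter(word2vec))
    print(token, len(word2vec[token]))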

Decoding using submodularity

python -m src.main -mode decode -selec submod -run_name <run_name> -beam_width 10 -gpu 0

Citation

Please cite the following paper if you find this work relevant to your application:

@inproceedings{dips2019,
    title = "Submodular Optimization-based Diverse Paraphrasing and its Effectiveness in Data Augmentation",
    author = "Kumar, Ashutosh  and
      Bhattamishra, Satwik  and
      Bhandari, Manik  and
      Talukdar, Partha",
    booktitle = "Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)",
    month = jun,
    year = "2019",
    address = "Minneapolis, Minnesota",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/N19-1363",
    pages = "3609--3619"
}

For any clarification, comments, or suggestions, please create an issue or contact [email protected] or Satwik Bhattamishra.
