quincyliang / Nlp Data Augmentation
Data Augmentation for NLP.
NLP Data Augmentation
Paper
- Unsupervised Data Augmentation
- Unsupervised Question Answering by Cloze Translation
- Distilling Task-Specific Knowledge from BERT into Simple Neural Networks
- How Effective is Task-Agnostic Data Augmentation for Pretrained Transformers?
- It’s Morphin’ Time! Combating Linguistic Discrimination with Inflectional Perturbations
Overview
- A Visual Survey of Data Augmentation in NLP
- Task-independent data augmentation for NLP
- Robust, Unbiased Natural Language Processing (PDF)
Methods
- General
- Random insertion, random deletion, and word or sentence shuffling
- Replacing words with synonyms
- Replacing words with other words from a dictionary of the same label
- Perturbations (letter, word, or sentence level)
- Language model
- Back translation
- Round-trip translation
- Leverage External Data
- Use external data derived from Wikipedia by linking Wikipedia articles to arbitrary input text. The idea is that if the input text were on Wikipedia, it would contain links to other Wikipedia articles that are semantically related and provide additional information.
- Break the input text into n-grams.
- Check whether each n-gram exists as a Wikipedia article to build a set of "candidate links".
- Prune the candidate links by computing the similarity between the input text and the abstract of each candidate.
- Conversational Systems
- Reading Comprehension
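The general token-level operations listed under Methods (random insertion, random deletion, word shuffling, synonym replacement) can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function names, the `SYNONYMS` table, and the parameters are assumptions made for the example and do not come from any of the papers or libraries listed here.

```python
import random

def random_deletion(tokens, p, rng):
    """Drop each token with probability p, always keeping at least one token."""
    if len(tokens) <= 1:
        return list(tokens)
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]

def random_swap(tokens, n_swaps, rng):
    """Swap two randomly chosen positions n_swaps times (word-level shuffling)."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out

def synonym_replacement(tokens, synonyms, n, rng):
    """Replace up to n tokens that have an entry in the synonym table."""
    out = list(tokens)
    candidates = [i for i, t in enumerate(out) if t in synonyms]
    rng.shuffle(candidates)
    for i in candidates[:n]:
        out[i] = rng.choice(synonyms[out[i]])
    return out

def random_insertion(tokens, synonyms, n, rng):
    """Insert a synonym of a randomly chosen token at a random position, n times."""
    out = list(tokens)
    for _ in range(n):
        candidates = [t for t in out if t in synonyms]
        if not candidates:
            break
        new_word = rng.choice(synonyms[rng.choice(candidates)])
        out.insert(rng.randint(0, len(out)), new_word)
    return out

# Toy synonym table (hypothetical); a real one could come from a thesaurus.
SYNONYMS = {"quick": ["fast", "speedy"], "happy": ["glad", "cheerful"]}

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    sentence = "the quick dog looks happy today".split()
    print(random_deletion(sentence, 0.2, rng))
    print(random_swap(sentence, 2, rng))
    print(synonym_replacement(sentence, SYNONYMS, 1, rng))
    print(random_insertion(sentence, SYNONYMS, 1, rng))
```

Each function returns a new list and takes an explicit `random.Random` instance, so the same seed reproduces the same augmented sentences.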
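The Wikipedia-linking steps under "Leverage External Data" (split into n-grams, look up candidate articles, prune by similarity to the abstract) can be sketched as follows. This is a minimal sketch under stated assumptions: the `WIKI` dictionary is a tiny hypothetical stand-in for a real title-to-abstract index, bag-of-words cosine similarity is just one possible similarity measure, and the `max_n` and `threshold` values are arbitrary.

```python
import math
import re
from collections import Counter

# Toy stand-in for Wikipedia: article title -> abstract (hypothetical data).
WIKI = {
    "machine translation": "machine translation is the automatic translation "
                           "of text from one language to another",
    "neural network": "a neural network is a model built from layers of "
                      "connected units trained on data",
    "network": "a network is a group of connected things such as computers "
               "roads or people",
}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def ngrams(tokens, max_n):
    """Yield every n-gram of length 1..max_n as a space-joined string."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])

def cosine(a, b):
    """Cosine similarity between two token lists using raw term counts."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0

def link_articles(text, wiki, max_n=3, threshold=0.3):
    """Return article titles linked to the text, most similar first."""
    tokens = tokenize(text)
    # Steps 1 and 2: n-grams that match an article title become candidate links.
    candidates = {g for g in ngrams(tokens, max_n) if g in wiki}
    # Step 3: prune candidates whose abstract is not similar enough to the text.
    scored = {t: cosine(tokens, tokenize(wiki[t])) for t in candidates}
    kept = [t for t in scored if scored[t] >= threshold]
    return sorted(kept, key=lambda t: -scored[t])

if __name__ == "__main__":
    text = ("We train a neural network for machine translation "
            "of text from one language to another")
    print(link_articles(text, WIKI))
```

The abstracts of the surviving articles could then be appended to the input as additional training text. In practice, stop-word removal or TF-IDF weighting would likely give a more meaningful similarity score than raw counts.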
Library