outcastofmusic / Quick Nlp
License: MIT
Pytorch NLP library based on FastAI
Stars: ✭ 279
Programming Languages: Python
Projects that are alternatives to or similar to Quick NLP
classy
classy is a simple-to-use library for building high-performance Machine Learning models in NLP.
Stars: ✭ 61 (-78.14%)
Mutual labels: seq2seq, nlp-library
Transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
Stars: ✭ 55,742 (+19879.21%)
Mutual labels: nlp-library, seq2seq
DLCV2018SPRING
Deep Learning for Computer Vision (CommE 5052) in NTU
Stars: ✭ 38 (-86.38%)
Mutual labels: seq2seq
TaLKConvolutions
Official PyTorch implementation of Time-aware Large Kernel (TaLK) Convolutions (ICML 2020)
Stars: ✭ 26 (-90.68%)
Mutual labels: seq2seq
Seq2seq chatbot links
Links to the implementations of neural conversational models for different frameworks
Stars: ✭ 270 (-3.23%)
Mutual labels: seq2seq
dts
A Keras library for multi-step time-series forecasting.
Stars: ✭ 130 (-53.41%)
Mutual labels: seq2seq
Nagisa
A Japanese tokenizer based on recurrent neural networks
Stars: ✭ 260 (-6.81%)
Mutual labels: nlp-library
NeuralTextSimplification
Exploring Neural Text Simplification
Stars: ✭ 64 (-77.06%)
Mutual labels: seq2seq
NLP-tools
Useful python NLP tools (evaluation, GUI interface, tokenization)
Stars: ✭ 39 (-86.02%)
Mutual labels: nlp-library
chatbot
🤖️ A task-oriented chatbot based on PyTorch (supports private deployment and Docker deployment)
Stars: ✭ 77 (-72.4%)
Mutual labels: seq2seq
2D-LSTM-Seq2Seq
PyTorch implementation of a 2D-LSTM Seq2Seq Model for NMT.
Stars: ✭ 25 (-91.04%)
Mutual labels: seq2seq
Deepqa
My tensorflow implementation of "A neural conversational model", a Deep learning based chatbot
Stars: ✭ 2,811 (+907.53%)
Mutual labels: seq2seq
clj-duckling
Language, engine, and tooling for expressing, testing, and evaluating composable language rules on input strings. (a duckling clojure fork)
Stars: ✭ 15 (-94.62%)
Mutual labels: nlp-library
Keras Text Summarization
Text summarization using seq2seq in Keras
Stars: ✭ 260 (-6.81%)
Mutual labels: seq2seq
Giveme5W
Extraction of the five journalistic W-questions (5W) from news articles
Stars: ✭ 16 (-94.27%)
Mutual labels: nlp-library
torch-asg
Auto Segmentation Criterion (ASG) implemented in pytorch
Stars: ✭ 42 (-84.95%)
Mutual labels: seq2seq
keras seq2seq word level
Implementation of seq2seq word-level model using keras
Stars: ✭ 12 (-95.7%)
Mutual labels: seq2seq
Chatbot ner
chatbot_ner: Named Entity Recognition for chatbots.
Stars: ✭ 273 (-2.15%)
Mutual labels: nlp-library
Quick NLP
=========

Quick NLP is a deep learning NLP library inspired by the `fast.ai library <https://github.com/fastai/fastai>`_.
It follows the same API as fast.ai and extends it, allowing for quick and easy training of NLP models.
Features
--------

- Python 3.6 code
- Tight-knit integration with the fast.ai library:

  - Fast.ai style ``DataLoader`` objects for sentence-to-sentence algorithms
  - Fast.ai style ``DataLoader`` objects for dialogue algorithms
  - Fast.ai style ``DataModel`` objects for training NLP models

- Can run a seq2seq model with a few lines of code, similar to existing fast.ai examples
- Easy to expand/train and try different models or use different data
- Ready-made algorithms to try out:

  - `Seq2Seq <https://arxiv.org/abs/1506.05869>`_
  - `Seq2Seq with Attention <https://arxiv.org/abs/1703.03906>`_
  - `HRED <http://www.aaai.org/ocs/index.php/AAAI/AAAI17/paper/download/14567/14219>`_
  - `Attention is all you need <http://papers.nips.cc/paper/7181-attention-is-all-you-need>`_
  - Depthwise Separable Convolutions for Neural Machine Translation (TODO) `<https://arxiv.org/abs/1706.03059>`_
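
The attention-based models listed above all share one core operation: score a query against a set of keys, normalize the scores with a softmax, and return the weighted average of the values. The following is a toy, framework-free sketch of scaled dot-product attention for illustration only; it is not quick-nlp's implementation, and all names in it are hypothetical:

.. code-block:: python

    import math

    def softmax(xs):
        """Numerically stable softmax over a list of floats."""
        m = max(xs)
        exps = [math.exp(x - m) for x in xs]
        total = sum(exps)
        return [e / total for e in exps]

    def dot_product_attention(query, keys, values):
        """Attention-weighted sum of `values`, scored by query-key dot products."""
        d = len(query)
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
                  for key in keys]
        weights = softmax(scores)
        context = [sum(wt * v[i] for wt, v in zip(weights, values))
                   for i in range(len(values[0]))]
        return context, weights

    # A query aligned with the second key attends mostly to the second value.
    ctx, w = dot_product_attention([0.0, 1.0],
                                   [[1.0, 0.0], [0.0, 1.0]],
                                   [[1.0, 2.0], [3.0, 4.0]])

In the real models, the queries, keys, and values are learned projections of hidden states, but the weighting logic is the same.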
Installation
------------

Installation of the fast.ai library is required. Please install it using the instructions `here <https://github.com/fastai/fastai>`_.
It is important to use the latest version of fast.ai, not the pip version, which is not up to date.
After setting up an environment using the fast.ai instructions, please clone the quick-nlp repo and use pip to install the package as follows:
.. code-block:: bash

    git clone https://github.com/outcastofmusic/quick-nlp
    cd quick-nlp
    pip install .
Docker Image
------------

A Docker image with the latest master is available. To use it, please run:

.. code-block:: bash

    docker run --runtime nvidia -it -p 8888:8888 --mount type=bind,source="$(pwd)",target=/workspace agispof/quicknlp:latest

This will mount your current directory to ``/workspace`` and start a Jupyter Lab session in that directory.
Usage Example
-------------
The main goal of quick-nlp is to provide the easy interface of the fast.ai library for seq2seq models.

For example, let's assume that we have a ``dataset_path`` directory with subfolders for training and validation files.
Each file is a tsv file where each row is two sentences separated by a tab. For example, a file inside the train folder can be an ``eng_to_fr.tsv`` file with the following first few lines::
    Go. Va !
    Run!    Cours !
    Run!    Courez !
    Wow!    Ça alors !
    Fire!   Au feu !
    Help!   À l'aide !
    Jump.   Saute.
    Stop!   Ça suffit !
    Stop!   Stop !
    Stop!   Arrête-toi !
    Wait!   Attends !
    Wait!   Attendez !
    I see.  Je comprends.
Loading the data from the directory is as simple as:

.. code-block:: python

    from fastai.plots import *
    from torchtext.data import Field
    from fastai.core import SGD_Momentum
    from fastai.lm_rnn import seq2seq_reg
    from quicknlp import SpacyTokenizer, print_batch, S2SModelData

    INIT_TOKEN = "<sos>"
    EOS_TOKEN = "<eos>"
    DATAPATH = "dataset_path"
    fields = [
        ("english", Field(init_token=INIT_TOKEN, eos_token=EOS_TOKEN, tokenize=SpacyTokenizer('en'), lower=True)),
        ("french", Field(init_token=INIT_TOKEN, eos_token=EOS_TOKEN, tokenize=SpacyTokenizer('fr'), lower=True))
    ]
    batch_size = 64
    data = S2SModelData.from_text_files(path=DATAPATH, fields=fields,
                                        train="train",
                                        validation="validation",
                                        source_names=["english", "french"],
                                        target_names=["french"],
                                        bs=batch_size
                                        )
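
Independently of quick-nlp, the two-column TSV format described above is easy to sanity-check with the standard library alone. This is a minimal sketch; the sample strings are taken from the file excerpt above:

.. code-block:: python

    import csv
    import io

    # Two columns per row: source sentence <TAB> target sentence,
    # matching the eng_to_fr.tsv example above.
    sample = "Go.\tVa !\nRun!\tCours !\nWait!\tAttends !\n"

    pairs = [(src, tgt) for src, tgt in csv.reader(io.StringIO(sample), delimiter="\t")]

Running such a check on your own files before training helps catch rows with missing or extra tab-separated columns early.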
Finally, to train a seq2seq model with the data we only need to do:

.. code-block:: python

    emb_size = 300
    nh = 1024
    nl = 3
    learner = data.get_model(opt_fn=SGD_Momentum(0.7), emb_sz=emb_size,
                             nhid=nh,
                             nlayers=nl,
                             bidir=True,
                             )
    clip = 0.3
    learner.reg_fn = seq2seq_reg
    learner.clip = clip
    learner.fit(2.0, wds=1e-6)
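
The ``learner.clip`` setting above caps gradient magnitudes during training to keep RNN updates stable. Fast.ai's exact clipping implementation may differ; as a rough illustration, norm-based gradient clipping works like this (toy code, names are hypothetical):

.. code-block:: python

    import math

    def clip_grad_norm(grads, max_norm):
        """Scale a flat list of gradients so their L2 norm is at most max_norm."""
        total_norm = math.sqrt(sum(g * g for g in grads))
        if total_norm > max_norm:
            scale = max_norm / total_norm
            grads = [g * scale for g in grads]
        return grads

    # A large gradient (norm 5.0) is rescaled down to norm 0.3;
    # a small gradient passes through unchanged.
    clipped = clip_grad_norm([3.0, 4.0], 0.3)
    unchanged = clip_grad_norm([0.1, 0.1], 0.3)

Without clipping, a single exploding-gradient step can destabilize a recurrent seq2seq model, which is why the example sets ``clip = 0.3`` before calling ``fit``.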