iwasaki-kenta / Keita

My personal toolkit for PyTorch development.


Projects that are alternatives of or similar to Keita

Pynlpl
PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build a simple language model. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL), as well as clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).
Stars: ✭ 426 (+243.55%)
Mutual labels:  library, natural-language-processing
Nlp bahasa resources
A Curated List of Dataset and Usable Library Resources for NLP in Bahasa Indonesia
Stars: ✭ 158 (+27.42%)
Mutual labels:  library, natural-language-processing
Ml Classify Text Js
Machine learning based text classification in JavaScript using n-grams and cosine similarity
Stars: ✭ 38 (-69.35%)
Mutual labels:  library, natural-language-processing
Acl Anthology
Data and software for building the ACL Anthology.
Stars: ✭ 168 (+35.48%)
Mutual labels:  library, natural-language-processing
Lingua Franca
Mycroft's multilingual text parsing and formatting library
Stars: ✭ 51 (-58.87%)
Mutual labels:  library, natural-language-processing
Mapdrawingtools
A library for drawing polygons, polylines, and points in Google Maps and returning the coordinates to your app
Stars: ✭ 122 (-1.61%)
Mutual labels:  library
Ratifier
Ratifier is a form validation library for Android.
Stars: ✭ 123 (-0.81%)
Mutual labels:  library
Hubspot3
Python 3.5+ HubSpot client based on hapipy, modified to use the newer endpoints and non-legacy Python
Stars: ✭ 121 (-2.42%)
Mutual labels:  library
Meta Learning Lstm Pytorch
PyTorch implementation of "Optimization as a Model for Few-shot Learning"
Stars: ✭ 121 (-2.42%)
Mutual labels:  meta-learning
Gradle Maven Plugin
Gradle 5.x Maven Publish Plugin to deploy artifacts
Stars: ✭ 124 (+0%)
Mutual labels:  library
Aws Machine Learning University Accelerated Nlp
Machine Learning University: Accelerated Natural Language Processing Class
Stars: ✭ 1,695 (+1266.94%)
Mutual labels:  natural-language-processing
Angular Feather
A-la-carte integration of Feather Icons in Angular applications
Stars: ✭ 123 (-0.81%)
Mutual labels:  library
Files2rouge
Calculating ROUGE score between two files (line-by-line)
Stars: ✭ 120 (-3.23%)
Mutual labels:  natural-language-processing
Fnc 1 Baseline
A baseline implementation for FNC-1
Stars: ✭ 123 (-0.81%)
Mutual labels:  natural-language-processing
Nlp Pretrained Model
A collection of pre-trained Natural Language Processing models.
Stars: ✭ 122 (-1.61%)
Mutual labels:  natural-language-processing
Riko
A Python stream processing engine modeled after Yahoo! Pipes
Stars: ✭ 1,571 (+1166.94%)
Mutual labels:  library
Imagezipper
An image compression library for Android.
Stars: ✭ 121 (-2.42%)
Mutual labels:  library
Clicr
Machine reading comprehension on clinical case reports
Stars: ✭ 123 (-0.81%)
Mutual labels:  natural-language-processing
Colore
A powerful C# library for Razer Chroma's SDK
Stars: ✭ 121 (-2.42%)
Mutual labels:  library
Spacy Js
🎀 JavaScript API for spaCy with Python REST API
Stars: ✭ 123 (-0.81%)
Mutual labels:  natural-language-processing

Keita: A PyTorch Toolkit

Description

A collection of PyTorch utilities, dataset loaders, and layers suitable for natural language processing, computer vision, meta-learning, etc., which I'm opening up to the community.

I can't guarantee that I'll fix every bug you may find, but if you'd like to report any, feel free to file an issue or pull request and I'll take a crack at it. Feedback and suggestions are definitely appreciated!

In terms of code organization, I am not a fan of huge repositories of unmaintained, tightly coupled code, and thus intend to keep this repository as modular as possible. For any module you wish to use in your project, copy-pasting the module alongside a few utility methods should be all you need to do to incorporate it.

I intend to keep the code as clean and well-documented as possible, with a consistent, developer-friendly style (clear variable names, simple references between modules within the toolkit, etc.).

Dependencies

PyTorch, TorchVision, TQDM, and a bleeding-edge build of TorchText are required if you wish to use all the modules within this toolkit.
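
A quick way to confirm your environment is set up (a minimal sketch; it only checks that the dependencies import, and assumes each package exposes __version__):

# Minimal dependency check; assumes the packages are installed and that
# torchtext is a recent/bleeding-edge build.
import torch
import torchvision
import tqdm
import torchtext

print(torch.__version__, torchvision.__version__, tqdm.__version__, torchtext.__version__)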

Contents

  • Deep metric learning losses. (Mahalanobis-distance hard negative mining)
  • Probabilistic/non-linear models. (Gaussian mixture models, conditional random fields)
  • Meta-learning models. (temporal convolution meta-learner)
  • Activation unit layers. (gated activation unit for PixelCNN; see the sketch after this list)
  • Extended convolution layer support. (separable convolutions, causal convolutions)
  • Convolution/recurrent-based inter-attention layers. (additive, dot-product, concat, bidirectional, bilinear)
  • Convolution/recurrent-based text classification models.
  • Convolution/recurrent-based sentence embedding models.
  • TorchText extensions for training (test/validation dataset split, word embeddings)
  • Text/vision dataset loaders. (Omniglot, normal <-> simple Wikipedia)
  • Modular PyTorch model training utilities w/ model checkpoints and validation loss/accuracy checks.
  • How-to example PyTorch code snippets.
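
As referenced in the activation-unit item above, the gated activation unit used in PixelCNN computes tanh(W_f * x) * sigmoid(W_g * x). A minimal sketch of such a layer (class name, kernel size, and shapes are illustrative assumptions, not necessarily Keita's API):

import torch
from torch import nn

class GatedActivationSketch(nn.Module):
    """Hypothetical gated activation unit: tanh(W_f * x) * sigmoid(W_g * x)."""

    def __init__(self, channels, kernel_size=3):
        super().__init__()
        padding = kernel_size // 2  # keep the sequence length unchanged
        self.filter_conv = nn.Conv1d(channels, channels, kernel_size, padding=padding)
        self.gate_conv = nn.Conv1d(channels, channels, kernel_size, padding=padding)

    def forward(self, x):
        # x: (batch, channels, length)
        return torch.tanh(self.filter_conv(x)) * torch.sigmoid(self.gate_conv(x))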

Papers I've Implemented w/ Keita

  • A Deep Reinforced Model for Abstractive Summarization
  • Meta-Learning with Temporal Convolutions
  • Conditional Image Generation with PixelCNN Decoders
  • WaveNet: A Generative Model for Raw Audio
  • Deep Metric Learning via Lifted Structured Feature Embedding
  • Max-Margin Object Detection
  • Neural Machine Translation by Jointly Learning to Align and Translate
  • Effective Approaches to Attention-based Neural Machine Translation
  • DeXpression: Deep Convolutional Neural Network for Expression Recognition
  • Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
  • YOLO9000: Better, Faster, Stronger
  • Bidirectional LSTM-CRF Models for Sequence Tagging
  • Discriminative Deep Metric Learning for Face Verification in the Wild
  • Supervised Learning of Universal Sentence Representations from Natural Language Inference Data
  • A Neural Representation of Sketch Drawings
  • Hierarchical Attention Networks for Document Classification

Example Snippets

"""
Create a PyTorch trainer which handles model checkpointing/loss/accuracy tracking given
training and validation dataset iterators.
"""

from text.models import classifiers
from text.models.cnn import encoders
from datasets import text
from torchtext import data
from torch import nn, optim
from train.utils import train_epoch, TrainingProgress
import torch

batch_size = 32
embed_size = 300

model = classifiers.LinearNet(embed_dim=embed_size, hidden_dim=64,
                              encoder=encoders.HierarchialNetwork1D,
                              num_classes=2)
if torch.cuda.is_available(): model = model.cuda()

train, valid, vocab = text.simple_wikipedia(split_factor=0.9)
vocab.vectors = vocab.vectors.cpu()

sort_key = lambda batch: data.interleave_keys(len(batch.normal), len(batch.simple))
train_iterator = data.iterator.Iterator(train, batch_size, shuffle=True, device=-1, repeat=False, sort_key=sort_key)
valid_iterator = data.iterator.Iterator(valid, batch_size, device=-1, train=False, sort_key=sort_key)

optimizer = optim.Adam(model.parameters())
criterion = nn.CrossEntropyLoss()

progress = TrainingProgress()

def training_process(batch, train):
    # Process the batch here and return torch.autograd.Variables representing
    # the loss and accuracy (see the filled-in sketch below).
    return loss, acc

for epoch in range(100):
    train_epoch(epoch, model, train_iterator, valid_iterator, processor=training_process, progress=progress)
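
The processor above is only a stub; a minimal sketch of what a filled-in body might look like (the batch fields, the batch.label target, and the embed_sentences helper are assumptions for illustration, not Keita's exact contract):

from text import utils  # assumed helper module, as used in the next snippet

def training_process(batch, train):
    # Illustrative only: embed one field of the batch, classify it, and score
    # it against a hypothetical batch.label target. The `train` flag is unused here.
    sentences, _ = batch.normal
    inputs = utils.embed_sentences(sentences, vocab.vectors)
    logits = model(inputs)
    loss = criterion(logits, batch.label)
    acc = (logits.max(dim=1)[1] == batch.label).float().mean()
    return loss, acc
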
"""
Load a text dataset (padded, embedded w/ GloVe word vectors, and sorted by sentence
length for direct use with PyTorch's pad packing for RNN modules) and print some statistics.
"""

from text import utils
from torchtext.data.iterator import Iterator
from datasets.text import simple_wikipedia
from torchtext import data

train, valid, vocab = simple_wikipedia()

sort_key = lambda batch: data.interleave_keys(len(batch.normal), len(batch.simple))
train_iterator = Iterator(train, 32, shuffle=True, device=-1, repeat=False, sort_key=sort_key)
valid_iterator = Iterator(valid, 32, device=-1, train=False, sort_key=sort_key)

train_batch = next(iter(train_iterator))
valid_batch = next(iter(valid_iterator))

normal_sentences, normal_sentence_lengths = train_batch.normal
normal_sentences = utils.embed_sentences(normal_sentences, vocab.vectors)

print("A normal batch looks like %s. " % str(normal_sentences.size()))
print("The dataset contains %d train samples, %d validation samples w/ a vocabulary size of %d. " % (
    len(train), len(valid), len(vocab)))
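
Because the batch comes back with lengths (and the iterator sorts by length), it can be fed to PyTorch's pad packing directly; a minimal sketch (the LSTM and its dimensions are assumptions, and lengths are assumed sorted in decreasing order, as pack_padded_sequence requires):

from torch import nn
from torch.nn.utils.rnn import pack_padded_sequence

# Hypothetical encoder; input_size matches the 300-dim GloVe vectors above.
rnn = nn.LSTM(input_size=300, hidden_size=128)

# normal_sentences: (seq_len, batch, embed_dim)
packed = pack_padded_sequence(normal_sentences, normal_sentence_lengths.tolist())
outputs, (hidden, cell) = rnn(packed)
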
"""
Paulus et al. encoder/decoder attention layer example usage for the paper
"A Deep Reinforced Model for Abstractive Summarization"

https://arxiv.org/abs/1705.04304
"""

from layers.attention import BilinearAttention
import torch

decoder_state = torch.autograd.Variable(torch.rand(32, 128))
decoder_states = torch.autograd.Variable(torch.rand(3, 32, 128))

decoder_attention = BilinearAttention(hidden_size=128)
decoder_attention_weights = decoder_attention(decoder_state, decoder_states)
print("Paulus et al. attended decoder size:", decoder_attention_weights.size())

encoder_states = torch.autograd.Variable(torch.rand(100, 32, 99))

encoder_attention = BilinearAttention(hidden_size=128, encoder_dim=99)
encoder_attention_weights = encoder_attention(decoder_state, encoder_states)
print("Paulus et al. attended encoder size:", encoder_attention_weights.size())

encoder_attention_weights = encoder_attention_weights.expand(*decoder_state.size())
decoder_attention_weights = decoder_attention_weights.expand(*decoder_state.size())

final_context_vector = torch.cat(
    [decoder_state, decoder_attention_weights * decoder_state, encoder_attention_weights * decoder_state],
    dim=1)  # concatenate along the feature dimension, as in Paulus et al.
print("Paulus et al. final context vector size:", final_context_vector.size())
"""
1D dilated causal convolutions for models like WaveNet and the Temporal Convolution Meta-Learner (TCML).

WaveNet: https://deepmind.com/blog/wavenet-generative-model-raw-audio/
TCML: https://arxiv.org/abs/1707.03141
"""

from layers.convolution import CausalConv1d
import torch

image = torch.arange(0, 4).float().unsqueeze(0).unsqueeze(0)  # shape: (1, 1, 4)
image = torch.autograd.Variable(image)

layer = CausalConv1d(in_channels=1, out_channels=1, kernel_size=2, dilation=1)
layer.weight.data.fill_(1)
layer.bias.data.fill_(0)

print(image.data.numpy())
print(layer(image).round().data.numpy())
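
A causal convolution like this can be implemented by left-padding a standard Conv1d by (kernel_size - 1) * dilation so that no output sees future timesteps; a minimal sketch (not necessarily Keita's implementation of CausalConv1d):

import torch
from torch import nn

class CausalConv1dSketch(nn.Module):
    """Hypothetical causal convolution: output at time t depends only on inputs at times <= t."""

    def __init__(self, in_channels, out_channels, kernel_size, dilation=1):
        super().__init__()
        self.left_padding = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(in_channels, out_channels, kernel_size, dilation=dilation)

    def forward(self, x):
        # x: (batch, channels, length); pad on the left along the time axis only.
        x = nn.functional.pad(x, (self.left_padding, 0))
        return self.conv(x)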