
lucidrains / n-grammer-pytorch

License: MIT License
Implementation of N-Grammer, augmenting Transformers with latent n-grams, in Pytorch

Programming Languages

Python
139,335 projects - #7 most used programming language

Projects that are alternatives to or similar to n-grammer-pytorch

text2keywords
Trained T5 and T5-large models for creating keywords from text
Stars: ✭ 53 (+6%)
Mutual labels:  transformers
TorchBlocks
A PyTorch-based toolkit for natural language processing
Stars: ✭ 85 (+70%)
Mutual labels:  transformers
sensu-plugins-memory-checks
This plugin provides native memory instrumentation for monitoring and metrics collection, including memory usage via `free` and `vmstat`. Note that this plugin may have cross-platform issues.
Stars: ✭ 15 (-70%)
Mutual labels:  memory
ttt
A package for fine-tuning Transformers with TPUs, written in TensorFlow 2.0+
Stars: ✭ 35 (-30%)
Mutual labels:  transformers
robo-vln
Pytorch code for ICRA'21 paper: "Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation"
Stars: ✭ 34 (-32%)
Mutual labels:  transformers
bluerain
BlueRain is a fully-featured, managed memory manipulation library written in C#
Stars: ✭ 36 (-28%)
Mutual labels:  memory
eve-bot
EVE bot, a customer service chatbot to enhance virtual engagement for Twitter Apple Support
Stars: ✭ 31 (-38%)
Mutual labels:  transformers
smaller-transformers
Load What You Need: Smaller Multilingual Transformers for Pytorch and TensorFlow 2.0.
Stars: ✭ 66 (+32%)
Mutual labels:  transformers
nuwa-pytorch
Implementation of NÜWA, a state-of-the-art attention network for text-to-video synthesis, in Pytorch
Stars: ✭ 347 (+594%)
Mutual labels:  transformers
redis-key-dashboard
This tool gives you a quick analysis of the number of keys and the memory you use in Redis, so you can spot overlooked keys and notice overuse.
Stars: ✭ 42 (-16%)
Mutual labels:  memory
Product-Categorization-NLP
Multi-Class Text Classification for products based on their description with Machine Learning algorithms and Neural Networks (MLP, CNN, Distilbert).
Stars: ✭ 30 (-40%)
Mutual labels:  transformers
spark-transformers
Spark-Transformers: Library for exporting Apache Spark MLLIB models to use them in any Java application with no other dependencies.
Stars: ✭ 39 (-22%)
Mutual labels:  transformers
minicons
Utility for analyzing Transformer based representations of language.
Stars: ✭ 28 (-44%)
Mutual labels:  transformers
golgotha
Contextualised Embeddings and Language Modelling using BERT and friends, in R
Stars: ✭ 39 (-22%)
Mutual labels:  transformers
text2class
Multi-class text categorization using state-of-the-art pre-trained contextualized language models, e.g. BERT
Stars: ✭ 15 (-70%)
Mutual labels:  transformers
small-text
Active Learning for Text Classification in Python
Stars: ✭ 241 (+382%)
Mutual labels:  transformers
CPU-MEM-monitor
A simple script to log Linux CPU and memory usage (using the top or pidstat command) over time and output an Excel- or OpenOffice Calc-friendly report
Stars: ✭ 41 (-18%)
Mutual labels:  memory
text2text
Text2Text: Cross-lingual natural language processing and generation toolkit
Stars: ✭ 188 (+276%)
Mutual labels:  transformers
serverless-transformers-on-aws-lambda
Deploy transformers serverless on AWS Lambda
Stars: ✭ 100 (+100%)
Mutual labels:  transformers
Text and Audio classification with Bert
Text Classification in Turkish Texts with Bert
Stars: ✭ 34 (-32%)
Mutual labels:  transformers

N-Grammer - Pytorch

Implementation of N-Grammer, augmenting Transformers with latent n-grams, in Pytorch

Install

$ pip install n-grammer-pytorch

Usage

import torch
from n_grammer_pytorch import VQNgrammer

vq_ngram = VQNgrammer(
    num_clusters = 1024,             # number of clusters
    dim_per_head = 32,               # dimension per head
    num_heads = 16,                  # number of heads
    ngram_vocab_size = 768 * 256,    # ngram vocab size
    ngram_emb_dim = 16,              # ngram embedding dimension
    decay = 0.999                    # exponential moving average decay
)

x = torch.randn(1, 1024, 32 * 16)
vq_ngram(x) # (1, 1024, 32 * 16)
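
For orientation, here is a minimal sketch of how the layer could sit between a token embedding and a Transformer stack (in the paper, the N-Grammer layer augments the token embeddings before they enter the attention layers). The ToyNgrammerTransformer wrapper, the embedding table, and the vanilla nn.TransformerEncoder below are illustrative placeholders, not part of this library.

import torch
from torch import nn
from n_grammer_pytorch import VQNgrammer

class ToyNgrammerTransformer(nn.Module):
    def __init__(self, num_tokens = 20000, dim = 512, num_heads = 16):
        super().__init__()
        self.embed = nn.Embedding(num_tokens, dim)

        # same hyperparameters as the usage example above
        self.vq_ngram = VQNgrammer(
            num_clusters = 1024,
            dim_per_head = dim // num_heads,
            num_heads = num_heads,
            ngram_vocab_size = 768 * 256,
            ngram_emb_dim = 16,
            decay = 0.999
        )

        # stand-in for a real Transformer stack
        layer = nn.TransformerEncoderLayer(d_model = dim, nhead = num_heads, batch_first = True)
        self.encoder = nn.TransformerEncoder(layer, num_layers = 6)

    def forward(self, token_ids):
        x = self.embed(token_ids)   # (batch, seq, dim)
        x = self.vq_ngram(x)        # same shape, now augmented with latent n-gram information
        return self.encoder(x)      # (batch, seq, dim)

transformer = ToyNgrammerTransformer()
tokens = torch.randint(0, 20000, (1, 1024))
out = transformer(tokens) # (1, 1024, 512)

The transformer instance built here is also the kind of root module the parameter-group helpers in the next section expect.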

Learning Rates

Like product key memories, Ngrammer parameters need to have a higher learning rate (1e-2 was recommended in the paper). The repository offers an easy way to generate the parameter groups.

from torch.optim import Adam
from n_grammer_pytorch import get_ngrammer_parameters

# this helper function, for your root model, finds all the VQNgrammer models and the embedding parameters
ngrammer_parameters, other_parameters = get_ngrammer_parameters(transformer)

optim = Adam([
    {'params': other_parameters},
    {'params': ngrammer_parameters, 'lr': 1e-2}
], lr = 3e-4)

Or, even more simply

from torch.optim import Adam
from n_grammer_pytorch import get_ngrammer_param_groups

param_groups = get_ngrammer_param_groups(model) # automatically creates parameter groups, with the learning rate set to 1e-2 for the ngrammer parameters
optim = Adam(param_groups, lr = 3e-4)
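
If you want to confirm what the helper produced, standard PyTorch optimizer introspection works; the snippet below is just an optional sanity check, not part of the library API.

# optional: inspect the learning rate assigned to each parameter group
for group in optim.param_groups:
    num_params = sum(p.numel() for p in group['params'])
    print(f"lr = {group['lr']}, parameters = {num_params}")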

Citations

@inproceedings{thai2020using,
    title   = {N-grammer: Augmenting Transformers with latent n-grams},
    author  = {Anonymous},
    year    = {2021},
    url     = {https://openreview.net/forum?id=GxjCYmQAody}
}