
lilianweng / Transformer Tensorflow

Implementation of the Transformer model in TensorFlow


Projects that are alternatives to or similar to Transformer Tensorflow

galerkin-transformer
[NeurIPS 2021] Galerkin Transformer: a linear attention without softmax
Stars: ✭ 111 (-61.19%)
Mutual labels:  transformer
Deep Learning In Production
In this repository, I will share some useful notes and references about deploying deep learning-based models in production.
Stars: ✭ 3,104 (+985.31%)
Mutual labels:  tensorflow-models
Transformer
Implementation of Transformer model (originally from Attention is All You Need) applied to Time Series.
Stars: ✭ 273 (-4.55%)
Mutual labels:  transformer
Swin-Transformer-Tensorflow
Unofficial implementation of "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows" (https://arxiv.org/abs/2103.14030)
Stars: ✭ 45 (-84.27%)
Mutual labels:  transformer
bert in a flask
A dockerized flask API, serving ALBERT and BERT predictions using TensorFlow 2.0.
Stars: ✭ 32 (-88.81%)
Mutual labels:  transformer
Multiple Relations Extraction Only Look Once
Multiple-Relations-Extraction-Only-Look-Once. Look at the sentence only once and extract the multiple pairs of entities and their corresponding relations. An end-to-end joint multi-relation extraction model, usable for information extraction at http://lic2019.ccf.org.cn/kg.
Stars: ✭ 269 (-5.94%)
Mutual labels:  tensorflow-models
TextPruner
A PyTorch-based model pruning toolkit for pre-trained language models
Stars: ✭ 94 (-67.13%)
Mutual labels:  transformer
Transformer
Easy Attributed String Creator
Stars: ✭ 278 (-2.8%)
Mutual labels:  transformer
ai challenger 2018 sentiment analysis
Fine-grained Sentiment Analysis of User Reviews --- AI CHALLENGER 2018
Stars: ✭ 16 (-94.41%)
Mutual labels:  transformer
Keras Transformer
Transformer implemented in Keras
Stars: ✭ 273 (-4.55%)
Mutual labels:  transformer
uformer-pytorch
Implementation of Uformer, Attention-based Unet, in Pytorch
Stars: ✭ 54 (-81.12%)
Mutual labels:  transformer
AITQA
resources for the IBM Airlines Table-Question-Answering Benchmark
Stars: ✭ 12 (-95.8%)
Mutual labels:  transformer
Allrank
allRank is a framework for training learning-to-rank neural models based on PyTorch.
Stars: ✭ 269 (-5.94%)
Mutual labels:  transformer
SwinIR
SwinIR: Image Restoration Using Swin Transformer (official repository)
Stars: ✭ 1,260 (+340.56%)
Mutual labels:  transformer
Demo Chinese Text Binary Classification With Bert
Stars: ✭ 276 (-3.5%)
Mutual labels:  transformer
SIGIR2021 Conure
One Person, One Model, One World: Learning Continual User Representation without Forgetting
Stars: ✭ 23 (-91.96%)
Mutual labels:  transformer
Nlp Interview Notes
Study notes and resources for natural language processing (NLP) interview preparation, compiled by the authors from their own interviews and experience; it currently collects interview questions from across the NLP subfields.
Stars: ✭ 207 (-27.62%)
Mutual labels:  transformer
Viewpagertransition
viewpager with parallax pages, together with vertical sliding (or click) and activity transition
Stars: ✭ 3,017 (+954.9%)
Mutual labels:  transformer
Bmw Tensorflow Inference Api Gpu
This is a repository for an object detection inference API using the Tensorflow framework.
Stars: ✭ 277 (-3.15%)
Mutual labels:  tensorflow-models
Remi
"Pop Music Transformer: Beat-based Modeling and Generation of Expressive Pop Piano Compositions", ACM Multimedia 2020
Stars: ✭ 273 (-4.55%)
Mutual labels:  transformer

Transformer

Implementation of the Transformer model in the paper:

Ashish Vaswani, et al. "Attention is all you need." NIPS 2017.

[Figure: Transformer model architecture]

Check my blog post on attention and the transformer for background.

Several existing open-source implementations of the Transformer helped me along the way.

Setup

$ git clone https://github.com/lilianweng/transformer-tensorflow.git
$ cd transformer-tensorflow
$ pip install -r requirements.txt

Train a Model

# Check the help message:

$ python train.py --help

Usage: train.py [OPTIONS]

Options:
  --seq-len INTEGER               Input sequence length.  [default: 20]
  --d-model INTEGER               d_model  [default: 512]
  --d-ff INTEGER                  d_ff  [default: 2048]
  --n-head INTEGER                n_head  [default: 8]
  --batch-size INTEGER            Batch size  [default: 128]
  --max-steps INTEGER             Max train steps.  [default: 300000]
  --dataset [iwslt15|wmt14|wmt15]
                                  Which translation dataset to use.  [default:
                                  iwslt15]
  --help                          Show this message and exit.

# Train a model on dataset WMT14:

$ python train.py --dataset wmt14
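
The hyperparameter flags listed in the help message above can be combined in a single call; for example (the values below are arbitrary and only for illustration):

$ python train.py --dataset iwslt15 --seq-len 30 --d-model 256 --n-head 4 --batch-size 64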

Evaluate a Trained Model

Suppose the model is saved in the folder transformer-wmt14-seq20-d512-head8-1541573730 inside the checkpoints directory.

$ python eval.py transformer-wmt14-seq20-d512-head8-1541573730

With the default config, this implementation reaches a BLEU score of roughly 20 on the WMT14 test set.

Implementation Notes

[WIP] A couple of tricky points in the implementation; rough sketches follow the list below.

  • How to construct the mask correctly?
  • How to correctly shift decoder input (as training input) and decoder target (as ground truth in the loss function)?
  • How to make the prediction in an autoregressive way?
  • Keeping the embedding of <pad> as a constant zero vector is quite important.
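
A minimal NumPy sketch of the first, second, and last points: building the padding and look-ahead masks, shifting the decoder input/target by one position, and zeroing the <pad> embedding row. This is illustrative only, not the repo's actual TensorFlow code, and it assumes PAD_ID = 0 and a start-of-sequence id.

# masks_and_shift_sketch.py -- illustrative only; assumes <pad> has id 0
import numpy as np

PAD_ID = 0

def padding_mask(seq):
    # 1.0 where a real token is present, 0.0 at <pad> positions.
    # Shape [batch, 1, 1, seq_len] so it broadcasts over heads and query positions.
    return (seq != PAD_ID).astype(np.float32)[:, np.newaxis, np.newaxis, :]

def look_ahead_mask(seq_len):
    # Lower-triangular matrix: position i may only attend to positions <= i.
    return np.tril(np.ones((seq_len, seq_len), dtype=np.float32))

def shift_decoder_sequences(target_ids, start_id):
    # Decoder input is the target shifted right with the start token prepended;
    # the loss is computed against the unshifted target.
    decoder_input = np.concatenate(
        [np.full((target_ids.shape[0], 1), start_id), target_ids[:, :-1]], axis=1)
    decoder_target = target_ids
    return decoder_input, decoder_target

def zero_pad_embedding(embedding_matrix):
    # Keep the <pad> row a constant zero vector so padding contributes nothing.
    embedding_matrix[PAD_ID] = 0.0
    return embedding_matrix

ids = np.array([[1, 5, 6, 0, 0]])        # one sentence, padded with PAD_ID
print(padding_mask(ids).squeeze())       # -> [1. 1. 1. 0. 0.]
print(look_ahead_mask(3))                # -> 3x3 lower-triangular matrix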
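
And a sketch of the autoregressive prediction point: at inference time there is no ground-truth target, so the decoder is fed its own previous outputs one token at a time. The function next_token_logits below is a hypothetical stand-in for a forward pass of the trained model, and START_ID, EOS_ID, and VOCAB_SIZE are assumed values.

# greedy_decode_sketch.py -- illustrative only; next_token_logits is a placeholder
import numpy as np

START_ID, EOS_ID, VOCAB_SIZE = 1, 2, 32000

def next_token_logits(src_ids, decoded_ids):
    # Placeholder for the real model's forward pass; returns random logits here.
    return np.random.randn(VOCAB_SIZE)

def greedy_decode(src_ids, max_len=20):
    decoded = [START_ID]
    for _ in range(max_len):
        logits = next_token_logits(src_ids, decoded)
        next_id = int(np.argmax(logits))  # greedy pick; beam search is a common alternative
        decoded.append(next_id)
        if next_id == EOS_ID:
            break
    return decoded[1:]  # drop the start token

print(greedy_decode(np.array([5, 6, 7, EOS_ID])))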