soskek / efficient_softmax

Licence: other
BlackOut and Adaptive Softmax for language models by Chainer

Programming Languages

  • python
  • shell

Projects that are alternatives to or similar to efficient_softmax

chainer-param-monitor
Monitor parameter and gradient statistics during neural network training with Chainer
Stars: ✭ 13 (+8.33%)
Mutual labels:  chainer
chainer-ClariNet
A Chainer implementation of ClariNet.
Stars: ✭ 45 (+275%)
Mutual labels:  chainer
chainer-Fast-WaveNet
A Chainer implementation of Fast WaveNet (mel-spectrogram vocoder).
Stars: ✭ 33 (+175%)
Mutual labels:  chainer
chainer-sort
Simple, Online, Realtime Tracking of Multiple Objects (SORT) implementation for Chainer and ChainerCV.
Stars: ✭ 20 (+66.67%)
Mutual labels:  chainer
tutorials
Introduction to Deep Learning: Chainer Tutorials
Stars: ✭ 68 (+466.67%)
Mutual labels:  chainer
Deep-Learning-Mahjong---
Reinforcement learning (RL) implementation of the imperfect-information game Mahjong, using Markov decision processes to predict future game states
Stars: ✭ 45 (+275%)
Mutual labels:  softmax
chainer-fcis
[This project has moved to ChainerCV] Chainer Implementation of Fully Convolutional Instance-aware Semantic Segmentation
Stars: ✭ 45 (+275%)
Mutual labels:  chainer
lda2vec
Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec from this paper https://arxiv.org/abs/1605.02019
Stars: ✭ 27 (+125%)
Mutual labels:  chainer
Multi-task-Conditional-Attention-Networks
A prototype version of our submitted paper: Conversion Prediction Using Multi-task Conditional Attention Networks to Support the Creation of Effective Ad Creatives.
Stars: ✭ 21 (+75%)
Mutual labels:  chainer
deep-learning-platforms
Deep-learning platforms, frameworks, and resources
Stars: ✭ 17 (+41.67%)
Mutual labels:  chainer
chainer-graph-cnn
Chainer implementation of 'Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering' (https://arxiv.org/abs/1606.09375)
Stars: ✭ 67 (+458.33%)
Mutual labels:  chainer
chainer-ResDrop
Deep Networks with Stochastic Depth implementation by Chainer
Stars: ✭ 40 (+233.33%)
Mutual labels:  chainer
chainer-LSGAN
Least Squares Generative Adversarial Network implemented in Chainer
Stars: ✭ 16 (+33.33%)
Mutual labels:  chainer
rocgan
Chainer implementation of the paper Robust Conditional Generative Adversarial Networks
Stars: ✭ 15 (+25%)
Mutual labels:  chainer
chainer-notebooks
Jupyter notebooks for Chainer hands-on
Stars: ✭ 23 (+91.67%)
Mutual labels:  chainer
char-rnnlm-tensorflow
Char RNN language model based on TensorFlow
Stars: ✭ 14 (+16.67%)
Mutual labels:  rnn-language-model
sp2cp
Imageboard bot with a recurrent neural network (RNN, GRU)
Stars: ✭ 23 (+91.67%)
Mutual labels:  rnn-language-model
kaggle-champs-scalar-coupling
19th place solution in "Predicting Molecular Properties"
Stars: ✭ 26 (+116.67%)
Mutual labels:  chainer
NCE-loss
TensorFlow NCE loss in Keras
Stars: ✭ 30 (+150%)
Mutual labels:  softmax
char-rnn-text-generation
Character Embeddings Recurrent Neural Network Text Generation Models
Stars: ✭ 64 (+433.33%)
Mutual labels:  chainer

Efficient Softmax Approximation

Implementations of BlackOut and adaptive softmax for efficiently computing word distributions in language models with very large vocabularies.

The LSTM language models are derived from rnnlm_chainer.

Available output layers are as follows:

  • Linear + softmax with cross-entropy loss: the usual output layer.
  • --share-embedding: a variant whose output layer shares the word embedding matrix with the input layer.
  • --adaptive-softmax: adaptive softmax.
  • --blackout: BlackOut. (Note that BlackOut is not faster on GPU.)
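The first two options compute the same softmax cross-entropy loss and differ only in which matrix projects the hidden state to vocabulary logits. A minimal NumPy sketch of the distinction (all names and sizes here are illustrative, not the repository's actual code):

```python
import numpy as np

def softmax_cross_entropy(logits, target):
    # Numerically stable log-softmax, then negative log-likelihood of the target id.
    z = logits - logits.max()
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[target]

rng = np.random.default_rng(0)
vocab, dim = 10, 4
embed = rng.standard_normal((vocab, dim))   # input embedding matrix
W_out = rng.standard_normal((vocab, dim))   # separate output projection
h = rng.standard_normal(dim)                # hidden state from the LSTM

# Usual output layer: its own projection matrix.
loss_full = softmax_cross_entropy(W_out @ h, target=3)

# Shared-embedding variant: reuse the input embedding as the output projection.
loss_tied = softmax_cross_entropy(embed @ h, target=3)
```

Tying the matrices saves one vocab × dim parameter block, which dominates model size at large vocabularies.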

Adaptive Softmax

  • Efficient softmax approximation for GPUs
  • Edouard Grave, Armand Joulin, Moustapha Cissé, David Grangier, Hervé Jégou, ICML 2017
  • paper
  • authors' Lua code
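Adaptive softmax exploits the skewed word distribution: frequent "head" words get a full softmax, while rare words are folded into tail clusters that are evaluated only when needed, with a reduced projection dimension. A minimal two-cluster NumPy sketch of the idea (split sizes and matrix names are illustrative, not taken from the paper's or the repository's code):

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8
head_size, tail_size = 6, 14          # frequent vs rare words (illustrative split)

# The head predicts the frequent words plus one extra "tail cluster" logit.
W_head = rng.standard_normal((head_size + 1, dim))
# The tail cluster projects to a smaller dimension to save compute.
P_tail = rng.standard_normal((dim // 2, dim))
W_tail = rng.standard_normal((tail_size, dim // 2))

def log_softmax(x):
    z = x - x.max()
    return z - np.log(np.exp(z).sum())

def log_prob(h, word):
    head_lp = log_softmax(W_head @ h)
    if word < head_size:
        return head_lp[word]
    # P(word) = P(tail cluster) * P(word | tail cluster)
    tail_lp = log_softmax(W_tail @ (P_tail @ h))
    return head_lp[head_size] + tail_lp[word - head_size]

h = rng.standard_normal(dim)
probs = np.exp([log_prob(h, w) for w in range(head_size + tail_size)])
```

Because the cluster probability multiplies a softmax over the tail, the factorized model is still a proper distribution over the full vocabulary.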

BlackOut

  • BlackOut: Speeding up Recurrent Neural Network Language Models With Very Large Vocabularies
  • Shihao Ji, S. V. N. Vishwanathan, Nadathur Satish, Michael J. Anderson, Pradeep Dubey, ICLR 2016
  • paper
  • authors' C++ code
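BlackOut trains the output layer discriminatively on the target word plus a few negatives sampled from a proposal distribution, with importance weighting to correct for the sampling. A rough NumPy sketch of this idea (the uniform proposal and all names are illustrative; the paper uses a power-raised unigram proposal):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, dim, K = 20, 8, 5

W = rng.standard_normal((vocab, dim))   # output word vectors
h = rng.standard_normal(dim)            # hidden state from the LSTM
q = np.full(vocab, 1.0 / vocab)         # proposal distribution (uniform here)

target = 3
samples = rng.choice(vocab, size=K, replace=False, p=q)
samples = samples[samples != target]

idx = np.concatenate(([target], samples))
scores = np.exp(W[idx] @ h) / q[idx]    # importance-weighted scores
p = scores / scores.sum()               # weighted softmax over target + negatives only

# Discriminative objective: push up the target, push down the sampled negatives.
loss = -(np.log(p[0]) + np.log(1.0 - p[1:]).sum())
```

Only K + 1 rows of W are touched per step instead of all `vocab` rows, which is where the speedup on CPU comes from; on GPU the full matrix multiply is already fast, consistent with the note above.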

How to Run

python -u train.py -g 0

To use one of the approximations, add the corresponding flag, e.g. python -u train.py -g 0 --adaptive-softmax.

Datasets

  • PennTreeBank
  • Wikitext-2
  • Wikitext-103

For the WikiText datasets, run prepare_wikitext.sh to download them.
