
ishita-dg / ScrambleTests

Licence: other
Running compositionality tests on InferSent embeddings on SNLI

Programming Languages

Jupyter Notebook
11667 projects
python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to ScrambleTests

chainer-notebooks
Jupyter notebooks for Chainer hands-on
Stars: ✭ 23 (+43.75%)
Mutual labels:  rnn
mango
Question-Answering NLP model with character-level RNN (TensorFlow).
Stars: ✭ 15 (-6.25%)
Mutual labels:  rnn
CDRP TF
CNN Event Detection & RNN Phase Picking (in Tensorflow)
Stars: ✭ 20 (+25%)
Mutual labels:  rnn
sequence-rnn-py
Sequence analyzing using Recurrent Neural Networks (RNN) based on Keras
Stars: ✭ 28 (+75%)
Mutual labels:  rnn
yunyi
2018 "Yunyi Cup": scenic-spot reputation review score prediction
Stars: ✭ 29 (+81.25%)
Mutual labels:  rnn
automatic-personality-prediction
[AAAI 2020] Modeling Personality with Attentive Networks and Contextual Embeddings
Stars: ✭ 43 (+168.75%)
Mutual labels:  rnn
SpeakerDiarization RNN CNN LSTM
Speaker diarization is the problem of separating speakers in an audio recording. There could be any number of speakers, and the final result should state when each speaker starts and ends. In this project, we analyze a given audio file with 2 channels and 2 speakers (on separate channels).
Stars: ✭ 56 (+250%)
Mutual labels:  rnn
NoiseReductionUsingGRU
This is my graduation project in BIT. Title: Noise Reduction Using GRU.
Stars: ✭ 25 (+56.25%)
Mutual labels:  rnn
Tensorflow-RNN-Tutorial
Tensorflow RNN Tutorial
Stars: ✭ 24 (+50%)
Mutual labels:  rnn
GestureAI
RNN (Recurrent Neural Network) model which recognizes hand gestures drawing 5 figures.
Stars: ✭ 20 (+25%)
Mutual labels:  rnn
lstm-electric-load-forecast
Electric load forecast using Long-Short-Term-Memory (LSTM) recurrent neural network
Stars: ✭ 56 (+250%)
Mutual labels:  rnn
theano-recurrence
Recurrent Neural Networks (RNN, GRU, LSTM) and their Bidirectional versions (BiRNN, BiGRU, BiLSTM) for word & character level language modelling in Theano
Stars: ✭ 40 (+150%)
Mutual labels:  rnn
char-VAE
Inspired by the neural style algorithm in the computer vision field, we propose a high-level language model with the aim of adapting the linguistic style.
Stars: ✭ 18 (+12.5%)
Mutual labels:  rnn
DeepLearning-Lab
Code lab for deep learning, including RNN, seq2seq, word2vec, cross entropy, bidirectional RNN, convolution operation, pooling operation, InceptionV3, and transfer learning.
Stars: ✭ 83 (+418.75%)
Mutual labels:  rnn
myDL
Deep Learning
Stars: ✭ 18 (+12.5%)
Mutual labels:  rnn
Deep-Learning-Coursera
Projects from the Deep Learning Specialization from deeplearning.ai provided by Coursera
Stars: ✭ 123 (+668.75%)
Mutual labels:  rnn
Speech-Recognition
End-to-end Automatic Speech Recognition for Mandarin and English in Tensorflow
Stars: ✭ 21 (+31.25%)
Mutual labels:  rnn
GAN-RNN Timeseries-imputation
Recurrent GAN for imputation of time series data. Implemented in TensorFlow 2 on Wikipedia Web Traffic Forecast dataset from Kaggle.
Stars: ✭ 107 (+568.75%)
Mutual labels:  rnn
Base-On-Relation-Method-Extract-News-DA-RNN-Model-For-Stock-Prediction--Pytorch
A dual-stage attention mechanism model based on a relational news extraction method, for stock prediction
Stars: ✭ 33 (+106.25%)
Mutual labels:  rnn
ConvLSTM-PyTorch
ConvLSTM/ConvGRU (Encoder-Decoder) with PyTorch on Moving-MNIST
Stars: ✭ 202 (+1162.5%)
Mutual labels:  rnn

Evaluating compositionality in sentence embeddings

Codebase for Dasgupta et al. 2018, https://arxiv.org/abs/1802.04302

An updated version of this work is Dasgupta, Ishita, et al. "Analyzing machine-learned representations: A natural language case study." arXiv preprint arXiv:1909.05885 (2019). Linked here: https://arxiv.org/abs/1909.05885

The code to generate the compositional dataset based on comparisons, the SNLI data analysis, and the scripts for augmented training are in the training-experiments branch.

Dataset used in the paper is here: https://github.com/ishita-dg/ScrambleTests/tree/training-experiment/testData/new
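To illustrate why a comparison-based dataset can probe compositionality, here is a minimal sketch showing that swapping the two arguments of a comparison changes the meaning but leaves the bag of words unchanged. It is not the repository's generator: the example sentence and the helpers bag_of_words and swap_comparison are hypothetical and were chosen only for illustration.

```python
from collections import Counter


def bag_of_words(sentence: str) -> Counter:
    """Return word counts, ignoring word order and case."""
    return Counter(sentence.lower().split())


def swap_comparison(sentence: str, a: str, b: str) -> str:
    """Swap two noun phrases in a sentence (hypothetical helper).

    A placeholder character keeps the two replacements from interfering.
    """
    placeholder = "\0"
    return sentence.replace(a, placeholder).replace(b, a).replace(placeholder, b)


# Hypothetical example sentence, not taken from the dataset.
original = "the woman is more cheerful than the man"
swapped = swap_comparison(original, "the woman", "the man")

# The two sentences differ, but an order-insensitive model sees the same input.
same_bag = bag_of_words(original) == bag_of_words(swapped)
print(swapped)
print("identical bag of words:", same_bag)
```

A model that relies only on word counts cannot tell such a pair apart, which is what makes pairs like this useful as a test of whether an embedding encodes word order.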

The code in the main branch generates a smaller but more general dataset, sets up the classifiers, and downloads and tokenizes the data.

Instructions

Getting data

In the Downloads folder, run: ./get_data.bash

This requires 7za to unzip the downloaded files; download and install it from https://sourceforge.net/projects/p7zip/files/p7zip/

The path to the sed tokenizer might need to be adjusted.
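Before running the download script, it can help to confirm that the external tools are installed. The following is a minimal sketch, not part of the repository: the helper find_missing_tools is hypothetical, and checking for a sed executable in addition to 7za is an assumption based on the mention of a sed tokenizer.

```python
import shutil


def find_missing_tools(tools):
    """Return the tools from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


# Assumption: get_data.bash needs 7za (stated above) and sed (for the tokenizer).
missing = find_missing_tools(["7za", "sed"])
if missing:
    print("Install before running get_data.bash:", ", ".join(missing))
else:
    print("All required tools found.")
```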

Run-through with toy

Run python main.py with toy = True. This runs through the classifier training and test code on the provided toy data sets.

Setting toy = False runs the full classifier, which takes a long time and requires very high memory (~150+ GB) for the InferSent embeddings.

GPU for classifier training

Set useCudaReg = True in main.py

Analysing tests

The logistic regression models (in ./models/) and their outputs on the true scramble-test results (in ./regout/) are provided, so you can run the analysis script directly.

In AnalyseTests.ipynb, setting Scram = True runs the tests on the Scramble test data; Scram = False runs them on the SNLI test and/or dev sets. The notebook produces the plots (inline as well as in ./figures/) and displays high-margin BOW misclassifications (inline).
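As a rough illustration of what "high-margin misclassifications" means, the sketch below ranks the wrong predictions of a classifier by how confidently they were made. It is simplified to a binary classifier with signed decision scores and does not reproduce the notebook's code; the function high_margin_errors and the sample scores and labels are hypothetical.

```python
def high_margin_errors(scores, labels, top_k=3):
    """Return indices of misclassified examples, most confident errors first.

    scores: signed decision values (positive means predicted class 1).
    labels: true labels, each 0 or 1.
    """
    errors = [
        (abs(score), index)
        for index, (score, label) in enumerate(zip(scores, labels))
        if (score > 0) != (label == 1)  # prediction disagrees with the label
    ]
    errors.sort(reverse=True)  # largest margin first
    return [index for _, index in errors[:top_k]]


# Made-up decision scores and labels, for illustration only.
scores = [2.5, -0.1, -3.0, 0.4, 1.2]
labels = [1, 1, 1, 0, 0]
worst = high_margin_errors(scores, labels, top_k=2)
print(worst)
```

Errors made with a large margin are the most informative to inspect, because the model was far from the decision boundary when it got them wrong.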
