MahmoudWahdan / Siamese-Sentence-Similarity

Licence: other

Keras and Tensorflow implementation of Siamese Recurrent Architectures for Learning Sentence Similarity

Programming Languages

python


Projects that are alternatives of or similar to Siamese-Sentence-Similarity

Manhattan-LSTM

Keras and PyTorch implementations of the MaLSTM model for computing Semantic Similarity.

Stars: ✭ 28 (-40.43%)

Mutual labels: semantic-similarity, siamese-recurrent-architectures

Siamese-Recurrent-Architectures

Usage of Siamese Recurrent Neural network architectures for semantic textual similarity

Stars: ✭ 19 (-59.57%)

Mutual labels: siamese-network, siamese-recurrent-architectures

TorchBlocks

A PyTorch-based toolkit for natural language processing

Stars: ✭ 85 (+80.85%)

Mutual labels: siamese-network

DOSE

😷 Disease Ontology Semantic and Enrichment analysis

Stars: ✭ 86 (+82.98%)

Mutual labels: semantic-similarity

Siamese Triplet

Siamese and triplet networks with online pair/triplet mining in PyTorch

Stars: ✭ 2,564 (+5355.32%)

Mutual labels: siamese-network

finetuner

Finetuning any DNN for better embedding on neural search tasks

Stars: ✭ 442 (+840.43%)

Mutual labels: siamese-network

stripnet

STriP Net: Semantic Similarity of Scientific Papers (S3P) Network

Stars: ✭ 82 (+74.47%)

Mutual labels: semantic-similarity

pedestrian recognition

A simple human recognition api for re-ID usage, power by paper https://arxiv.org/abs/1703.07737

Stars: ✭ 29 (-38.3%)

Mutual labels: siamese-network

SentenceSimilarity

The enhanced RCNN model used for sentence similarity classification

Stars: ✭ 41 (-12.77%)

Mutual labels: semantic-similarity

Pysot

SenseTime Research platform for single object tracking, implementing algorithms like SiamRPN and SiamMask.

Stars: ✭ 3,898 (+8193.62%)

Mutual labels: siamese-network

OfflineSignatureVerification

Writer independent offline signature verification using convolutional siamese networks

Stars: ✭ 49 (+4.26%)

Mutual labels: siamese-network

SiamFC-tf

A TensorFlow implementation of the SiamFC tracker, use with your own camera and video, or integrate to your own project 实时物体追踪，封装API，可整合到自己的项目中

Stars: ✭ 22 (-53.19%)

Mutual labels: siamese-network

nxontology

NetworkX-based Python library for representing ontologies

Stars: ✭ 45 (-4.26%)

Mutual labels: semantic-similarity

image triplet loss

Image similarity using Triplet Loss

Stars: ✭ 76 (+61.7%)

Mutual labels: siamese-network

deep-char-cnn-lstm

Deep Character CNN LSTM Encoder with Classification and Similarity Models

Stars: ✭ 20 (-57.45%)

Mutual labels: semantic-similarity

farm-animal-tracking

Farm Animal Tracking (FAT)

Stars: ✭ 19 (-59.57%)

Mutual labels: siamese-network

FDCNN

The implementation of FDCNN in paper - A Feature Difference Convolutional Neural Network-Based Change Detection Method

Stars: ✭ 54 (+14.89%)

Mutual labels: siamese-network

russe

RUSSE: Russian Semantic Evaluation.

Stars: ✭ 11 (-76.6%)

Mutual labels: semantic-similarity

awesome-semantic-search

A curated list of awesome resources related to Semantic Search🔎 and Semantic Similarity tasks.

Stars: ✭ 161 (+242.55%)

Mutual labels: semantic-similarity

LSCDetection

Data Sets and Models for Evaluation of Lexical Semantic Change Detection

Stars: ✭ 17 (-63.83%)

Mutual labels: semantic-similarity


Keras and Tensorflow implementation of Siamese Recurrent Architectures for Learning Sentence Similarity

A Keras implementation of the paper Siamese Recurrent Architectures for Learning Sentence Similarity, which uses a Siamese architecture built on an LSTM to provide a state-of-the-art yet simpler model for the Semantic Textual Similarity (STS) task.

Architecture:

Input: Two sentences.
Output: Semantic similarity between the input two sentences.
Sentences are encoded using pre-trained Word2Vec vectors.
Siamese network.
Use one LSTM.
Distance: Manhattan distance.
Both left LSTM and right LSTM have the same weights.
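As a rough illustration of the distance step: the paper turns the Manhattan (L1) distance between the two sentence encodings into a similarity score via exp(-L1). The sketch below assumes this repository follows the same formula; the function name and the toy vectors are made up for illustration and are not taken from the project code.

```python
import math


def manhattan_similarity(h_left, h_right):
    """Return exp(-L1 distance) between two encodings, a value in (0, 1]."""
    if len(h_left) != len(h_right):
        raise ValueError("encodings must have the same length")
    l1_distance = sum(abs(a - b) for a, b in zip(h_left, h_right))
    return math.exp(-l1_distance)


# Toy 3-dimensional encodings (hypothetical values); the model described
# here would produce 50-dimensional LSTM states instead.
same = manhattan_similarity([0.2, -0.1, 0.5], [0.2, -0.1, 0.5])
far = manhattan_similarity([0.2, -0.1, 0.5], [-0.8, 0.9, -0.5])
print(same, far)
```

Identical encodings give a similarity of exactly 1, and the score decays toward 0 as the encodings move apart, which matches training targets rescaled to [0, 1].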

Implementation Details:

The LSTM learns a mapping from the space of variable-length sequences of 300-dimensional vectors into a 50-dimensional representation space.
Optimization of the parameters using Adadelta.
Use L1 (Manhattan distance).
The LSTM takes 300-dimensional word2vec embeddings as input.
This method does not require extensive manual feature generation beyond the separately trained word2vec vectors.
The siamese network is trained using backpropagation-through-time under the mean squared error (MSE) loss function (after rescaling the training-set relatedness labels to lie in [0, 1]).
LSTM weights are initialized with small random Gaussian entries.
Pre-training uses separate sentence-pair data provided for the earlier SemEval 2013 Semantic Textual Similarity task.
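The label rescaling and loss mentioned above can be sketched in a few lines. This assumes relatedness scores in the range 1 to 5, as in the SICK dataset; the helper names and the sample predictions are hypothetical and do not come from the repository.

```python
def rescale_relatedness(score, low=1.0, high=5.0):
    """Map a relatedness score from [low, high] onto [0, 1]."""
    return (score - low) / (high - low)


def mean_squared_error(targets, predictions):
    """Average squared difference between targets and predictions."""
    if len(targets) != len(predictions):
        raise ValueError("targets and predictions must have the same length")
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


raw_scores = [1.0, 3.0, 5.0]          # assumed 1-5 relatedness labels
targets = [rescale_relatedness(s) for s in raw_scores]
predictions = [0.1, 0.5, 0.8]         # made-up model outputs in [0, 1]
loss = mean_squared_error(targets, predictions)
print(targets, loss)
```

Rescaling keeps the targets in the same range as a similarity score bounded by 0 and 1, so the MSE compares like with like.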

TODO:

Dataset thesaurus-based augmentation.
Learned weights visualization as provided in the paper.
PyTorch implementation (planned).

Keras Implementation Notes:

Although we provide a correct Keras-backend implementation of pearson_correlation, you shouldn't rely on the pearson_correlation value returned by the evaluate function unless you specify a batch_size >= the test set size. This is because Keras applies metrics per batch and does not compute the metric over the whole set.
We implemented pearson_correlation with the Keras backend in order to visualize the learning curves. It gives only an indication, not the exact Pearson correlation.
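To see why the per-batch value can mislead, the following plain-Python sketch compares the Pearson correlation over a whole set with the average of per-batch correlations. It only simulates batch averaging and is not Keras's internal code; the data and function names are invented for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def batch_averaged_pearson(xs, ys, batch_size):
    """Average the per-batch Pearson values, as a batch-wise metric would."""
    scores = [
        pearson(xs[i:i + batch_size], ys[i:i + batch_size])
        for i in range(0, len(xs), batch_size)
    ]
    return sum(scores) / len(scores)


y_true = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_pred = [1.0, 2.0, 3.0, 6.0, 5.0, 4.0]   # made-up predictions

full = pearson(y_true, y_pred)                                  # about 0.77
batched = batch_averaged_pearson(y_true, y_pred, batch_size=3)  # 0.0
whole = batch_averaged_pearson(y_true, y_pred, batch_size=6)    # equals full
print(full, batched, whole)
```

With two batches of three, the per-batch correlations are +1 and -1 and average to 0, while the true correlation over all six pairs is about 0.77. Only a batch that covers the whole set reproduces the correct value.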

Training the model with SICK-like data:

Required Parameters:

--word2vec or -w Path to word2vec .bin file with 300 dims.
--data or -d Path to SICK data used for training.

Optional Parameters:

--pretrained or -p Path to pre-trained weights.
--epochs or -e Number of epochs.
--save or -s Folder path to save both the trained model and its weights.
--cudnnlstm or -c Use CUDNN LSTM for fast training. This requires GPU and CUDA.

python train.py --word2vec=/path/to/word2vec/GoogleNews-vectors-negative300.bin --data=/path/to/sick/SICK.txt --epochs=50 --cudnnlstm=true
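For readers who want to see how such flags might be parsed, here is a minimal argparse sketch that accepts the options listed above. It is a hypothetical reconstruction, not the repository's actual train.py: the types, the defaults, and the str_to_bool helper are assumptions.

```python
import argparse


def str_to_bool(value):
    """Interpret strings such as 'true' or '1' as True (assumed convention)."""
    return value.lower() in ("true", "1", "yes")


def build_train_parser():
    parser = argparse.ArgumentParser(description="Train a Siamese LSTM (sketch).")
    # Required parameters, as documented above.
    parser.add_argument("--word2vec", "-w", required=True,
                        help="Path to word2vec .bin file with 300 dims.")
    parser.add_argument("--data", "-d", required=True,
                        help="Path to SICK data used for training.")
    # Optional parameters; the real defaults are not documented, so None/False
    # are placeholders chosen for this sketch.
    parser.add_argument("--pretrained", "-p", default=None,
                        help="Path to pre-trained weights.")
    parser.add_argument("--epochs", "-e", type=int, default=None,
                        help="Number of epochs.")
    parser.add_argument("--save", "-s", default=None,
                        help="Folder to save the trained model and weights.")
    parser.add_argument("--cudnnlstm", "-c", type=str_to_bool, default=False,
                        help="Use CUDNN LSTM (requires GPU and CUDA).")
    return parser


# Parse the same arguments as the example command above.
args = build_train_parser().parse_args([
    "--word2vec=/path/to/word2vec/GoogleNews-vectors-negative300.bin",
    "--data=/path/to/sick/SICK.txt",
    "--epochs=50",
    "--cudnnlstm=true",
])
print(args)
```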

Testing the model with SICK-like data:

Required Parameters:

--model or -p Path to trained model.
--word2vec or -w Path to word2vec .bin file with 300 dims.
--data or -d Path to SICK data used for testing.

Optional Parameters:

--save or -s CSV file path to save test output.

python test.py --model=/path/to/model/model.h5 --word2vec=/path/to/word2vec/GoogleNews-vectors-negative300.bin --data=/path/to/sick/SICK.txt --save=/path/to/save/location/test.csv
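Both scripts take a SICK-style file through --data. The sketch below shows one way such a file could be read, assuming a tab-separated layout with a header row containing sentence_A, sentence_B, and relatedness_score columns, as in the published SICK.txt. The function name and the two sample rows are made up, and the repository may load its data differently.

```python
import csv
import io


def read_sick_pairs(handle):
    """Read (sentence_A, sentence_B, relatedness_score) triples from a TSV handle."""
    # QUOTE_NONE keeps quotation marks inside sentences as literal characters.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [
        (row["sentence_A"], row["sentence_B"], float(row["relatedness_score"]))
        for row in reader
    ]


# Made-up rows standing in for a real SICK file.
sample = io.StringIO(
    "pair_ID\tsentence_A\tsentence_B\trelatedness_score\n"
    "1\tA man is playing a guitar\tA man is playing an instrument\t4.5\n"
    "2\tA dog is running\tA woman is cooking\t1.2\n"
)
pairs = read_sick_pairs(sample)
print(pairs)
```

For a file on disk, the same function can be called with open(path, newline="", encoding="utf-8") in place of the in-memory sample.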
