Huffon / Sentence Similarity

This repository contains various ways to calculate sentence vector similarity using NLP models

Sentence Similarity Calculator

This repo contains various ways to calculate the similarity between source and target sentences. You can choose which pre-trained model to use: ELMo, BERT, or Universal Sentence Encoder (USE).

You can also choose the method used to compute the similarity:

1. Cosine similarity
2. Manhattan distance
3. Euclidean distance
4. Angular distance
5. Inner product
6. TS-SS score
7. Pairwise-cosine similarity
8. Pairwise-cosine similarity + IDF
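
As a rough illustration of methods 1 to 5, the sketch below computes them with NumPy on two toy vectors. The function names and the toy vectors are made up for this example and are not taken from sensim.py. The angular distance uses one common definition (arccos of the cosine similarity, divided by π), which may differ from the one used in this repo.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between a and b (1.0 means same direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def manhattan_distance(a, b):
    """Sum of absolute coordinate differences (L1 distance)."""
    return float(np.sum(np.abs(a - b)))


def euclidean_distance(a, b):
    """Straight-line (L2) distance between a and b."""
    return float(np.linalg.norm(a - b))


def angular_distance(a, b):
    """Angle between a and b, scaled to [0, 1] (assumed definition)."""
    cos = np.clip(cosine_similarity(a, b), -1.0, 1.0)  # guard against rounding
    return float(np.arccos(cos) / np.pi)


def inner_product(a, b):
    """Plain dot product; unlike cosine, it depends on vector length."""
    return float(np.dot(a, b))


# Toy sentence vectors (hypothetical values, not real model output).
source = np.array([1.0, 0.0])
target = np.array([0.0, 1.0])

for name, fn in [
    ("cosine", cosine_similarity),
    ("manhattan", manhattan_distance),
    ("euclidean", euclidean_distance),
    ("angular", angular_distance),
    ("inner", inner_product),
]:
    print(name, fn(source, target))
```

Note that cosine and inner product are similarities (higher means closer), while Manhattan, Euclidean, and angular are distances (lower means closer).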

You can experiment with every model and method combination: 3 models x 8 methods gives 24 combinations!
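
This README does not define methods 7 and 8 in the list above. One plausible reading, assumed here, is token-level greedy matching in the style of BERTScore: every token vector in one sentence is matched to its most similar token vector in the other, and the matches are averaged, optionally weighted by IDF. The sketch below follows that assumption; all function names are hypothetical and the actual implementation in this repo may differ.

```python
import math
from collections import Counter

import numpy as np


def idf_weights(tokenized_corpus):
    """Smoothed IDF per token: log((N + 1) / (df + 1))."""
    n_docs = len(tokenized_corpus)
    doc_freq = Counter(tok for sent in tokenized_corpus for tok in set(sent))
    return {tok: math.log((n_docs + 1) / (df + 1)) for tok, df in doc_freq.items()}


def _weighted_mean(values, weights):
    """Weighted average; falls back to a plain mean if weights are absent or all zero."""
    if weights is None:
        return float(values.mean())
    weights = np.asarray(weights, dtype=float)
    total = weights.sum()
    if total == 0:
        return float(values.mean())
    return float((values * weights).sum() / total)


def pairwise_cosine(src_vecs, tgt_vecs, src_weights=None, tgt_weights=None):
    """Greedy token matching between two (n_tokens, dim) arrays, combined as an F1 score."""
    src = src_vecs / np.linalg.norm(src_vecs, axis=1, keepdims=True)
    tgt = tgt_vecs / np.linalg.norm(tgt_vecs, axis=1, keepdims=True)
    sim = src @ tgt.T  # cosine similarity for every (source, target) token pair
    src_score = _weighted_mean(sim.max(axis=1), src_weights)  # best match per source token
    tgt_score = _weighted_mean(sim.max(axis=0), tgt_weights)  # best match per target token
    if src_score + tgt_score == 0:
        return 0.0
    return 2 * src_score * tgt_score / (src_score + tgt_score)


# Toy token vectors (hypothetical values, not real model output).
src_tokens = np.array([[1.0, 0.0], [0.0, 1.0]])
tgt_tokens = np.array([[1.0, 0.0], [0.0, 1.0]])
print(pairwise_cosine(src_tokens, tgt_tokens))
```

With IDF weighting, tokens that appear in every sentence of the corpus contribute little to the score, so rare content words dominate the comparison.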


Installation

  • This project was developed in a conda environment
  • After cloning this repository, you can install all the dependencies listed in requirements.txt by running bash install.sh
conda create -n sensim python=3.7
conda activate sensim
git clone https://github.com/Huffon/sentence-similarity.git
cd sentence-similarity
bash install.sh

Usage

  • To test your own sentences, you should fill out corpus.txt with sentences as below:
I ate an apple.
I went to the Apple.
I ate an orange.
...
  • Then, choose the model and method to be used to calculate the similarity between source and target sentences
python sensim.py
    --model    MODEL_NAME  [use, bert, elmo]
    --method   METHOD_NAME [cosine, manhattan, euclidean, inner,
                            ts-ss, angular, pairwise, pairwise-idf]
    --verbose  LOG_OPTION (bool)
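
For readers who want to see how options like these are typically wired up, here is a minimal argparse sketch. It is not the actual sensim.py: the default values, the str2bool helper, and the read_corpus helper are assumptions made for this example. Only the flag names and the allowed choices come from the usage text above.

```python
import argparse

MODELS = ["use", "bert", "elmo"]
METHODS = ["cosine", "manhattan", "euclidean", "inner",
           "ts-ss", "angular", "pairwise", "pairwise-idf"]


def str2bool(value):
    """Parse common spellings of true/false from the command line (hypothetical helper)."""
    lowered = value.lower()
    if lowered in ("true", "t", "yes", "y", "1"):
        return True
    if lowered in ("false", "f", "no", "n", "0"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    """Build a parser exposing --model, --method and --verbose."""
    parser = argparse.ArgumentParser(description="Sentence similarity calculator (sketch)")
    parser.add_argument("--model", choices=MODELS, default="use")        # default is assumed
    parser.add_argument("--method", choices=METHODS, default="cosine")   # default is assumed
    parser.add_argument("--verbose", type=str2bool, default=False)       # default is assumed
    return parser


def read_corpus(path):
    """Read non-empty lines from a corpus file, assuming one sentence per line."""
    with open(path, encoding="utf-8") as corpus_file:
        return [line.strip() for line in corpus_file if line.strip()]


# Example: parse an explicit argument list instead of sys.argv.
args = build_parser().parse_args(["--model", "bert", "--method", "ts-ss", "--verbose", "true"])
print(args.model, args.method, args.verbose)
```

Using choices makes argparse reject unknown model or method names with a usage message before any model is loaded.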

Examples

  • This section shows example results from sentence-similarity
  • There is no silver bullet that computes a perfect similarity between sentences
  • You should run experiments on your own dataset
    • Caution: the TS-SS score might not suit the sentence similarity task, since this method was originally devised to calculate the similarity between long documents

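To make the TS-SS caution in this section more concrete, the sketch below computes a TS-SS score (triangle area multiplied by sector area) for two vectors. The formula is written from memory of the commonly cited formulation, with 10 degrees added to the angle, and should be treated as an assumption rather than the exact code used in this repo. The function name is hypothetical.

```python
import numpy as np


def ts_ss(a, b):
    """TS-SS dissimilarity: triangle area times sector area (assumed formulation)."""
    norm_a, norm_b = np.linalg.norm(a), np.linalg.norm(b)
    cos = np.clip(np.dot(a, b) / (norm_a * norm_b), -1.0, 1.0)
    theta_deg = np.degrees(np.arccos(cos)) + 10.0  # angle plus 10 degrees

    # Triangle spanned by the two vectors, using the widened angle.
    triangle = norm_a * norm_b * np.sin(np.radians(theta_deg)) / 2.0

    # Sector whose radius combines Euclidean distance and magnitude difference.
    euclidean = np.linalg.norm(a - b)
    magnitude_diff = abs(norm_a - norm_b)
    sector = np.pi * (euclidean + magnitude_diff) ** 2 * theta_deg / 360.0

    return float(triangle * sector)


# Toy vectors (hypothetical values, not real model output).
a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
print(ts_ss(a, b))        # same lengths, orthogonal directions
print(ts_ss(a, 2.0 * b))  # same directions as above, but b is twice as long
```

Under this formulation, lower values mean more similar vectors, and the score depends on vector magnitudes as well as on the angle, so it is not directly comparable to cosine similarity.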
