natsheh / sensim

License: BSD-3-Clause
Sentence Similarity Estimator (SenSim)

Programming languages: Python, Perl

SenSim

Dependencies

This repository currently supports Python 2.7.
The default values used in sts.py and sts_light.py require the following packages:
sklearn==0.18
polyglot==16.07.04
	System dependencies: python-numpy, libicu-dev
	On Ubuntu/Debian: sudo apt-get install python-numpy libicu-dev
beard==0.2
digify==0.2
enchant==1.6.8
spacy==0.100.5
	Required models: python -m spacy.en.download glove

Usage to reproduce the results in the paper

After cloning the repository, run sts.py with its documented arguments.

Usage to reproduce the results against the STS Benchmark

After cloning the repository, run sts_benchmark.py with its default parameters.
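
STS results, including the score reported in the paper, are expressed as a Pearson correlation between predicted and human-annotated similarity values. The sketch below shows how that score can be computed. It is a standalone Python 3 illustration using only the standard library, not code from this repository; the function name `pearson` and the sample values are hypothetical.

```python
import math


def pearson(predicted, gold):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(predicted) != len(gold) or len(predicted) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_p = sum(predicted) / len(predicted)
    mean_g = sum(gold) / len(gold)
    # Covariance and variances, all without the 1/n factor (it cancels out).
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(predicted, gold))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_g = sum((g - mean_g) ** 2 for g in gold)
    if var_p == 0 or var_g == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_g)


# Hypothetical similarity scores on the 0 to 5 STS scale.
system_scores = [4.2, 1.0, 3.5, 0.5]
human_scores = [4.5, 0.8, 3.0, 1.0]
print(round(pearson(system_scores, human_scores), 4))
```

A value close to 1 means the system ranks and spaces the sentence pairs much like the human annotators do.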

Access to the paper

http://www.aclweb.org/anthology/S17-2013

Please cite using the following BibTeX entry

@InProceedings{alnatsheh-EtAl:2017:SemEval,
  author    = {Al-Natsheh, Hussein T.  and  Martinet, Lucie  and  Muhlenbach, Fabrice  and  ZIGHED, Djamel Abdelkader},
  title     = {UdL at SemEval-2017 Task 1: Semantic Textual Similarity Estimation of English Sentence Pairs Using Regression Model over Pairwise Features},
  booktitle = {Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)},
  month     = {August},
  year      = {2017},
  address   = {Vancouver, Canada},
  publisher = {Association for Computational Linguistics},
  pages     = {115--119},
  abstract  = {This paper describes the model UdL we proposed to solve the semantic textual
	similarity task of SemEval 2017 workshop. The track we participated in was
	estimating the semantics relatedness of a given set of sentence pairs in
	English. The best run out of three submitted runs of our model achieved a
	Pearson correlation score of 0.8004 compared to a hidden human annotation of
	250~pairs. We used random forest ensemble learning to map an expandable set of
	extracted pairwise features into a semantic similarity estimated value bounded
	between 0 and 5. Most of these features were calculated using word embedding
	vectors similarity to align Part of Speech (PoS) and Name Entities (NE) tagged
	tokens of each sentence pair. Among other pairwise features, we experimented a
	classical tf-idf weighted Bag of Words (BoW) vector model but with
	character-based range of n-grams instead of words. This sentence vector
	BoW-based feature gave a relatively high importance value percentage in the
	feature importances analysis of the ensemble learning.},
  url       = {http://www.aclweb.org/anthology/S17-2013}
}
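
The abstract above mentions a tf-idf weighted Bag of Words feature built from character n-grams instead of words. The following is a minimal sketch of that idea, written as standalone Python 3 with the standard library only. It is not the implementation in sts.py: the function names, the n-gram range (2 to 3), and the smoothed idf formula are assumptions chosen for illustration, and the idf is fitted on the two input sentences alone, whereas a real system would fit it on a larger corpus.

```python
import math
from collections import Counter


def char_ngrams(text, n_min=2, n_max=3):
    """Return all character n-grams of `text` for n in [n_min, n_max]."""
    text = text.lower()
    return [text[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(text) - n + 1)]


def tfidf_vectors(sentences, n_min=2, n_max=3):
    """Build one sparse tf-idf vector (a dict) per sentence."""
    counts = [Counter(char_ngrams(s, n_min, n_max)) for s in sentences]
    n_docs = len(sentences)
    # Document frequency: number of sentences containing each n-gram.
    doc_freq = Counter(gram for c in counts for gram in c)
    # Smoothed idf (an assumption), so shared n-grams keep a non-zero weight.
    idf = {gram: math.log((1 + n_docs) / (1 + df)) + 1
           for gram, df in doc_freq.items()}
    return [{gram: tf * idf[gram] for gram, tf in c.items()} for c in counts]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(gram, 0.0) for gram, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def pair_similarity(a, b, n_min=2, n_max=3):
    """Character n-gram tf-idf cosine similarity of one sentence pair."""
    vec_a, vec_b = tfidf_vectors([a, b], n_min, n_max)
    return cosine(vec_a, vec_b)


close = pair_similarity("A man is playing a guitar.", "A man plays the guitar.")
far = pair_similarity("A man is playing a guitar.", "The stock market fell sharply.")
print(close > far)
```

In the approach the abstract describes, a value like this would be one pairwise feature among several passed to the random forest regressor, rather than the final similarity score.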
