
gentom / sentences-similarity-cluster

Licence: MIT License

Calculate similarity of sentences & Cluster the result.


Labels

docker docker-compose levenshtein-distance scipy matplotlib hierarchical-clustering sentence-similarity


sentences-similarity-cluster

sensim_cluster calculates the similarity of text data (read from a file) using Levenshtein distance and groups the result with hierarchical clustering. The clustering result is displayed as a dendrogram.
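As a rough illustration of the distance measure named above, here is a minimal pure-Python sketch of Levenshtein distance. The function name `levenshtein` is hypothetical and is not taken from this project, which may well rely on an existing library instead.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, and substitutions needed to turn `a` into `b`."""
    # previous[j] is the distance between the processed prefix of `a` and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
```

Only two rows of the table are kept in memory at a time, so the sketch uses space proportional to the length of `b`.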

Usage

1. Prepare your data file

2. Run the program below

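# -*- coding: utf-8 -*-
from sensim_cluster.sensim_cluster import SensimCluster
from matplotlib import pyplot as plt
from scipy.cluster.hierarchy import dendrogram

# Load the data file and run Ward clustering
cluster = SensimCluster('YOUR_DATAFILE_PATH')
ids = cluster.get_ids()
result = cluster.ward()

# Use the last six characters of each ID as the leaf label
mod_ids = [i[-6:] for i in ids]
r = dendrogram(result, p=100, truncate_mode='lastp', labels=mod_ids, leaf_rotation=90)
print(r['leaves'])  # leaf indices in left-to-right order
print(r['ivl'])     # leaf labels in the same order

# `bottom` replaces the `ymin` keyword, which newer matplotlib releases reject
plt.ylim(bottom=-10.0)
plt.show()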

Docker-Compose

# build from docker-compose.yml
docker-compose build

# run container "app"
docker-compose run app

# kill container
docker-compose kill

# delete container
docker-compose rm

sentences-similarity-cluster (Old Version)

sim_cluster.py calculates the similarity of text data (read from a file) using Levenshtein distance and groups the result with hierarchical clustering. The clustering result is displayed as a dendrogram.

Usage

1. Prepare your data file

2. Execute

python sim_cluster.py your_file

Example

1. Prepare the data file

./data/dummydata.csv

A,helloworld
B,hallawerld
C,helldwoody
D,hallowarld
E,galloworld
F,herroworld
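The data file pairs an ID with a sentence on each line. A minimal sketch of reading such a file into separate ID and sentence lists might look like the following; `load_rows` is a hypothetical helper, not the project's own loader, and an in-memory string stands in for the file on disk.

```python
import csv
import io


def load_rows(handle):
    """Read `id,sentence` rows and return (rows, ids, sentences)."""
    rows = [row for row in csv.reader(handle) if row]  # skip blank lines
    ids = [row[0] for row in rows]
    sentences = [row[1] for row in rows]
    return rows, ids, sentences


# In-memory stand-in for the first rows of ./data/dummydata.csv
sample = io.StringIO("A,helloworld\nB,hallawerld\nC,helldwoody\n")
rows, ids, sentences = load_rows(sample)
print(rows)
print(ids)
print(sentences)
```

To read a real file, pass an open file handle instead, for example `open(path, newline="")`.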

2. Execute

python sim_cluster.py ./data/dummydata.csv

3. Result

[['A', 'helloworld'], ['B', 'hallawerld'], ['C', 'helldwoody'], ['D', 'hallowarld'], ['E', 'galloworld'], ['F', 'herroworld']]
['A', 'B', 'C', 'D', 'E', 'F']
['helloworld', 'hallawerld', 'helldwoody', 'hallowarld', 'galloworld', 'herroworld']
n0, n0 : 0
n0, n1 : 3
n0, n2 : 4
n0, n3 : 2
n0, n4 : 2
n0, n5 : 2
n1, n0 : 3
n1, n1 : 0
n1, n2 : 6
n1, n3 : 2
n1, n4 : 3
n1, n5 : 5
n2, n0 : 4
n2, n1 : 6
n2, n2 : 0
n2, n3 : 6
n2, n4 : 6
n2, n5 : 6
n3, n0 : 2
n3, n1 : 2
n3, n2 : 6
n3, n3 : 0
n3, n4 : 2
n3, n5 : 4
n4, n0 : 2
n4, n1 : 3
n4, n2 : 6
n4, n3 : 2
n4, n4 : 0
n4, n5 : 4
n5, n0 : 2
n5, n1 : 5
n5, n2 : 6
n5, n3 : 4
n5, n4 : 4
n5, n5 : 0
-------------------------
matrix: [[0, 3, 4, 2, 2, 2], [3, 0, 6, 2, 3, 5], [4, 6, 0, 6, 6, 6], [2, 2, 6, 0, 2, 4], [2, 3, 6, 2, 0, 4], [2, 5, 6, 4, 4, 0]]
-------------------------
[[  3.           4.           3.           2.        ]
 [  1.           6.           4.2031734    3.        ]
 [  0.           5.           4.89897949   2.        ]
 [  7.           8.           7.57187779   5.        ]
 [  2.           9.          12.05542755   6.        ]]
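The final array has the shape of a SciPy linkage matrix: each row lists the two clusters merged, the merge height, and the size of the new cluster. The sketch below feeds the printed distance matrix to SciPy's Ward linkage. It assumes the script passes the square matrix directly, so that each row is treated as an observation vector; that assumption would explain why the first merge (n3 and n4) has height 3, the Euclidean distance between their rows, rather than their Levenshtein distance of 2.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Pairwise Levenshtein distances copied from the "matrix:" line of the output
matrix = np.array([
    [0, 3, 4, 2, 2, 2],
    [3, 0, 6, 2, 3, 5],
    [4, 6, 0, 6, 6, 6],
    [2, 2, 6, 0, 2, 4],
    [2, 3, 6, 2, 0, 4],
    [2, 5, 6, 4, 4, 0],
], dtype=float)

# A 2-D input is interpreted as one observation per row, so Ward merges are
# measured between whole rows rather than between individual matrix entries.
Z = linkage(matrix, method="ward")
print(Z)
```

SciPy may emit a warning here because the input looks like an uncondensed distance matrix. Converting it with `scipy.spatial.distance.squareform` would cluster on the Levenshtein distances themselves and give different merge heights.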


Stars: ✭ 14
