Content-based-Recommender-SystemA content-based recommender system that uses TF-IDF and cosine similarity to find the N most similar items in a dataset
Stars: ✭ 64 (+204.76%)
bns-short-text-similarity📖 Uses Bi-Normal Separation to build document vectors, which are then used to compute similarity between short sentences.
Stars: ✭ 24 (+14.29%)
NewsSearchMainly uses Python and the Scrapy framework to crawl news websites
Stars: ✭ 23 (+9.52%)
Recommender-SystemsImplementing content-based and collaborative filtering (with KNN, matrix factorization and neural networks) in Python
Stars: ✭ 46 (+119.05%)
NlpSelected Machine Learning algorithms for natural language processing and semantic analysis in Golang
Stars: ✭ 304 (+1347.62%)
Nepali-News-ClassifierText classification of Nepali-language documents. This mini project was done in partial fulfillment of the NLP course COMP 473.
Stars: ✭ 13 (-38.1%)
SoqalArabic Open Domain Question Answering System using Neural Reading Comprehension
Stars: ✭ 72 (+242.86%)
pygramsExtracts key terminology (n-grams) from any large collection of documents (>1000) and forecasts emergence
Stars: ✭ 52 (+147.62%)
lorcaNatural Language Processing for Spanish in Node.js. Stemmer, sentiment analysis, readability, tf-idf with batteries, concordance and more!
Stars: ✭ 95 (+352.38%)
Nlp In PracticeStarter code to solve real world text data problems. Includes: Gensim Word2Vec, phrase embeddings, Text Classification with Logistic Regression, word count with pyspark, simple text preprocessing, pre-trained embeddings and more.
Stars: ✭ 790 (+3661.9%)
devsearchA web search engine built with Python which uses TF-IDF and PageRank to sort search results.
Stars: ✭ 52 (+147.62%)
clusterixVisual exploration of clustered data.
Stars: ✭ 44 (+109.52%)
VntkVietnamese NLP Toolkit for Node
Stars: ✭ 170 (+709.52%)
ResumeRiseAn NLP tool which classifies and summarizes resumes
Stars: ✭ 29 (+38.1%)
iresearchIResearch is a cross-platform, high-performance, document-oriented search engine library written entirely in C++, with a focus on pluggability of different ranking/similarity models
Stars: ✭ 121 (+476.19%)
keras-knnCode for the blog post Nearest Neighbors with Keras and CoreML
Stars: ✭ 25 (+19.05%)
GreynirThe greynir.is natural language processing website for Icelandic
Stars: ✭ 47 (+123.81%)
set-sketch-paperSetSketch: Filling the Gap between MinHash and HyperLogLog
Stars: ✭ 23 (+9.52%)
weibo-summaryChinese Microblog (Weibo) Automatic Summary System
Stars: ✭ 28 (+33.33%)
Naive-Resume-MatchingText similarity applied to resumes: compares resumes with job descriptions and produces a score to rank them, similar to an ATS.
Stars: ✭ 27 (+28.57%)
soanSocial analysis based on WhatsApp data
Stars: ✭ 106 (+404.76%)
SentimentAnalysis(BOW, TF-IDF, Word2Vec, BERT) Word Embeddings + (SVM, Naive Bayes, Decision Tree, Random Forest) Base Classifiers + Pre-trained BERT on Tensorflow Hub + 1-D CNN and Bi-Directional LSTM on IMDB Movie Reviews Dataset
Stars: ✭ 40 (+90.48%)
VtextSimple NLP in Rust with Python bindings
Stars: ✭ 108 (+414.29%)
tf-idf-pythonTerm frequency–inverse document frequency for Chinese novel/documents implemented in python.
Stars: ✭ 98 (+366.67%)
MovieboxMachine learning movie recommending system
Stars: ✭ 504 (+2300%)
TextAuditImplementation approach and demo of a text moderation module for a short-video app
Stars: ✭ 63 (+200%)
CadmiumNatural Language Processing (NLP) library for Crystal
Stars: ✭ 172 (+719.05%)
PolyfuzzFuzzy string matching, grouping, and evaluation.
Stars: ✭ 292 (+1290.48%)
Keyword-ExtracterProblem statement: given a PDF or text document, how can keywords be extracted and ranked by weight using Python?
Stars: ✭ 17 (-19.05%)
StringlifierStringlifier is an open-source ML library for detecting random strings in raw text. It can be used for sanitising logs, detecting accidentally exposed credentials, and as a pre-processing step in unsupervised ML-based analysis of application text data.
Stars: ✭ 85 (+304.76%)
TextminingPython text mining system (Research of Text Mining System)
Stars: ✭ 268 (+1176.19%)
Python Tf IdfAn extremely simple Python library to perform TF-IDF document comparison.
Stars: ✭ 214 (+919.05%)
text2textText2Text: Cross-lingual natural language processing and generation toolkit
Stars: ✭ 188 (+795.24%)
tika-similarityTika-Similarity uses the Tika-Python package (Python port of Apache Tika) to compute file similarity based on Metadata features.
Stars: ✭ 92 (+338.1%)
KeywordExtractionImplementation of keyword extraction algorithms, including TextRank, TF-IDF, and a combination of both
Stars: ✭ 95 (+352.38%)
Java String SimilarityImplementation of various string similarity and distance algorithms: Levenshtein, Jaro-Winkler, n-Gram, Q-Gram, Jaccard index, Longest Common Subsequence edit distance, cosine similarity ...
Stars: ✭ 2,403 (+11342.86%)
lucillaFast, efficient, in-memory Full Text Search for Kotlin
Stars: ✭ 102 (+385.71%)
Simple-Plagiarism-CheckerWeb Application for checking the similarity between query and document using the concept of Cosine Similarity.
Stars: ✭ 47 (+123.81%)
TextvecText vectorization tool to outperform TFIDF for classification tasks
Stars: ✭ 167 (+695.24%)
AI-for-Trading📈This repo contains detailed notes and multiple projects implemented in Python related to AI and Finance. Follow the blog here: https://purvasingh.medium.com
Stars: ✭ 59 (+180.95%)
watchmanWatchman: An open-source social-media event-detection system
Stars: ✭ 18 (-14.29%)
live-cctvDetects any meaningful change in a live CCTV feed to avoid storing large amounts of data. Once a change is noticed, the goal is to track the object or person causing it, using computer vision concepts. The major focus is on deep learning, with as many features as possible added along the way.
Stars: ✭ 23 (+9.52%)
occupationcoderGiven a job title and job description, the algorithm assigns a standard occupational classification (SOC) code to the job.
Stars: ✭ 30 (+42.86%)
SnowballImplementation, with some extensions, of the paper "Snowball: Extracting Relations from Large Plain-Text Collections" (Agichtein and Gravano, 2000)
Stars: ✭ 131 (+523.81%)
DefactonlpDeFactoNLP: An Automated Fact-checking System that uses Named Entity Recognition, TF-IDF vector comparison and Decomposable Attention models.
Stars: ✭ 30 (+42.86%)
fb scraperFBLYZE is a Facebook scraping and analysis system.
Stars: ✭ 61 (+190.48%)