Stringlifier is an open-source ML library for detecting random strings in raw text. It can be used for sanitising logs, detecting accidentally exposed credentials, and as a pre-processing step in unsupervised ML-based analysis of application text data.

Stars: ✭ 85 (-10.53%)

Mutual labels: tf-idf

tf-idf-python

Term frequency–inverse document frequency for Chinese novels/documents, implemented in Python.

Stars: ✭ 98 (+3.16%)

Mutual labels: tf-idf

Polyfuzz

Fuzzy string matching, grouping, and evaluation.

Stars: ✭ 292 (+207.37%)

Mutual labels: tf-idf

Cadmium

Natural Language Processing (NLP) library for Crystal

Stars: ✭ 172 (+81.05%)

Mutual labels: tf-idf

text2text

Text2Text: Cross-lingual natural language processing and generation toolkit

Stars: ✭ 188 (+97.89%)

Mutual labels: tf-idf

TextSummarizer

TextRank implementation for C#

Stars: ✭ 29 (-69.47%)

Mutual labels: textrank

watchman

Watchman: An open-source social-media event-detection system

Stars: ✭ 18 (-81.05%)

Mutual labels: tf-idf

Vtext

Simple NLP in Rust with Python bindings

Stars: ✭ 108 (+13.68%)

Mutual labels: tf-idf

SentimentAnalysis

(BOW, TF-IDF, Word2Vec, BERT) Word Embeddings + (SVM, Naive Bayes, Decision Tree, Random Forest) Base Classifiers + Pre-trained BERT on Tensorflow Hub + 1-D CNN and Bi-Directional LSTM on IMDB Movie Reviews Dataset

Stars: ✭ 40 (-57.89%)

Mutual labels: tf-idf

KeywordAnalysis

Word analysis, by domain, on the Common Crawl data set for the purpose of finding industry trends

Stars: ✭ 49 (-48.42%)

Mutual labels: keyword-extraction

How To Mine Newsfeed Data And Extract Interactive Insights In Python

A practical guide to topic mining and interactive visualizations

Stars: ✭ 61 (-35.79%)

Mutual labels: tf-idf

TextAudit

Implementation approach and demo of a text moderation module for a short-video app

Stars: ✭ 63 (-33.68%)

Mutual labels: tf-idf

text-classification-cn

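Chinese text classification in practice, based on the Sogou news corpus, using traditional machine learning methods as well as pre-trained models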

Stars: ✭ 81 (-14.74%)

Mutual labels: tf-idf

text-classification-baseline

Pipeline for fast building text classification TF-IDF + LogReg baselines.

Stars: ✭ 55 (-42.11%)

Mutual labels: tf-idf

NLP-paper

🎨 🎨 NLP natural language processing tutorial 🎨🎨 https://dataxujing.github.io/NLP-paper/

Stars: ✭ 23 (-75.79%)

Mutual labels: textrank

Predicting Myers Briggs Type Indicator With Recurrent Neural Networks

Stars: ✭ 43 (-54.74%)

Mutual labels: tf-idf

topic modelling financial news

Topic modelling on financial news with Natural Language Processing

Stars: ✭ 51 (-46.32%)

Mutual labels: tf-idf

Nlp

Selected Machine Learning algorithms for natural language processing and semantic analysis in Golang

Stars: ✭ 304 (+220%)

Mutual labels: tf-idf

Textclassification

Several methods for text classification

Stars: ✭ 180 (+89.47%)

Mutual labels: tf-idf

2018 Machinelearning Lectures Esa

Machine Learning Lectures at the European Space Agency (ESA) in 2018

Stars: ✭ 280 (+194.74%)

Mutual labels: tf-idf

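A Nonebot-based QQ group chatbot 🤖️ whose signature feature is using machine learning algorithms to generate a daily summary from each day's chat logs. Runs on the CoolQ/Mirai platforms.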

Stars: ✭ 74 (-22.11%)

Mutual labels: textrank

NewsSearch

Mainly uses Python and the Scrapy framework to crawl news websites

Stars: ✭ 23 (-75.79%)

Mutual labels: tf-idf

Vntk

Vietnamese NLP Toolkit for Node

Stars: ✭ 170 (+78.95%)

Mutual labels: tf-idf

iresearch

IResearch is a cross-platform, high-performance, document-oriented search engine library written entirely in C++, with a focus on pluggability of different ranking/similarity models

Stars: ✭ 121 (+27.37%)

Mutual labels: tf-idf

Document-Classification-using-LSA

Document classification using latent semantic analysis in Python

Stars: ✭ 16 (-83.16%)

Mutual labels: tf-idf

lorca

Natural Language Processing for Spanish in Node.js. Stemmer, sentiment analysis, readability, tf-idf with batteries, concordance and more!

Stars: ✭ 95 (+0%)

Mutual labels: tf-idf

Snowball

Implementation with some extensions of the paper "Snowball: Extracting Relations from Large Plain-Text Collections" (Agichtein and Gravano, 2000)

Stars: ✭ 131 (+37.89%)

Mutual labels: tf-idf

occupationcoder

Given a job title and job description, the algorithm assigns a standard occupational classification (SOC) code to the job.

Stars: ✭ 30 (-68.42%)

Mutual labels: tf-idf

TextRankPlus

Deep-learning-based Chinese NLP toolkit

Stars: ✭ 36 (-62.11%)

Mutual labels: textrank

soan

Social Analysis based on Whatsapp data

Stars: ✭ 106 (+11.58%)

Mutual labels: tf-idf

Textclustering

Stars: ✭ 89 (-6.32%)

Mutual labels: tf-idf

Content-based-Recommender-System

A content-based recommender system that uses tf-idf and cosine similarity to find the N most similar items in a dataset

Stars: ✭ 64 (-32.63%)

Mutual labels: tf-idf

Coursera Uw Machine Learning Clustering Retrieval

Stars: ✭ 25 (-73.68%)

Mutual labels: tf-idf

ResumeRise

An NLP tool which classifies and summarizes resumes

Stars: ✭ 29 (-69.47%)

Mutual labels: tf-idf

devsearch

A web search engine built with Python which uses TF-IDF and PageRank to sort search results.

Stars: ✭ 52 (-45.26%)

Mutual labels: tf-idf

Soqal

Arabic Open Domain Question Answering System using Neural Reading Comprehension

Stars: ✭ 72 (-24.21%)

Mutual labels: tf-idf

minimal-search-engine

A minimal search engine/PageRank/tf-idf

Stars: ✭ 18 (-81.05%)

Mutual labels: tf-idf

Greynir

The greynir.is natural language processing website for Icelandic

Stars: ✭ 47 (-50.53%)

Mutual labels: tf-idf

bns-short-text-similarity

📖 Use Bi-normal Separation to find document vectors, which are used to compute similarity for shorter sentences.

Stars: ✭ 24 (-74.74%)

Mutual labels: tf-idf

Nepali-News-Classifier

Text classification of Nepali-language documents. This mini project was done in partial fulfillment of the NLP course COMP 473.

Stars: ✭ 13 (-86.32%)

Mutual labels: tf-idf

Pytextrank

Python implementation of TextRank for phrase extraction and summarization of text documents

Stars: ✭ 1,675 (+1663.16%)

Mutual labels: textrank

clusterix

Visual exploration of clustered data.

Stars: ✭ 44 (-53.68%)

Mutual labels: tf-idf

Defactonlp

DeFactoNLP: An Automated Fact-checking System that uses Named Entity Recognition, TF-IDF vector comparison and Decomposable Attention models.

Stars: ✭ 30 (-68.42%)

Mutual labels: tf-idf

Recommender-Systems

Implementing content-based and collaborative filtering (with KNN, Matrix Factorization and Neural Networks) in Python

Stars: ✭ 46 (-51.58%)

Mutual labels: tf-idf

textrank-js

TextRank algorithm implementation in JavaScript

Stars: ✭ 35 (-63.16%)

Mutual labels: textrank

Nlp In Practice

Starter code to solve real world text data problems. Includes: Gensim Word2Vec, phrase embeddings, Text Classification with Logistic Regression, word count with pyspark, simple text preprocessing, pre-trained embeddings and more.

Stars: ✭ 790 (+731.58%)

Mutual labels: tf-idf

pygrams

Extracts key terminology (n-grams) from any large collection of documents (>1000) and forecasts emergence

Stars: ✭ 52 (-45.26%)

Mutual labels: tf-idf

koolsla

Food recommendation tool with Machine learning.

Stars: ✭ 21 (-77.89%)

Mutual labels: tf-idf

TextRank-node