DE-LIMIT
DeEpLearning models for MultIlingual haTespeech (DELIMIT): Benchmarking multilingual models across 9 languages and 16 datasets.
Stars: ✭ 90 (+172.73%)
Pyod
A Python Toolbox for Scalable Outlier Detection (Anomaly Detection)
Stars: ✭ 5,083 (+15303.03%)
nlp-lt
Natural Language Processing for Lithuanian language
Stars: ✭ 17 (-48.48%)
Lightlda
fast sampling algorithm based on CGS
Stars: ✭ 49 (+48.48%)
Corex topic
Hierarchical unsupervised and semi-supervised topic models for sparse count data with CorEx
Stars: ✭ 439 (+1230.3%)
Sttm
Short Text Topic Modeling, JAVA
Stars: ✭ 100 (+203.03%)
Pytorch-NLU
Pytorch-NLU, a Chinese text classification and sequence labeling toolkit. It supports multi-class and multi-label classification of long and short Chinese text, as well as sequence labeling tasks such as Chinese named entity recognition, part-of-speech tagging, and word segmentation.
Stars: ✭ 151 (+357.58%)
Spark Nlp
State of the Art Natural Language Processing
Stars: ✭ 2,518 (+7530.3%)
ML2017FALL
Machine Learning (EE 5184) in NTU
Stars: ✭ 66 (+100%)
Nlp chinese corpus
Large Scale Chinese Corpus for NLP
Stars: ✭ 6,656 (+20069.7%)
Doctopics
Various examples of topic modeling and other text analysis
Stars: ✭ 32 (-3.03%)
Kashgari
Kashgari is a production-level NLP Transfer learning framework built on top of tf.keras for text-labeling and text-classification, includes Word2Vec, BERT, and GPT2 Language Embedding.
Stars: ✭ 2,235 (+6672.73%)
Ldavis
R package for web-based interactive topic model visualization.
Stars: ✭ 466 (+1312.12%)
text2class
Multi-class text categorization using state-of-the-art pre-trained contextualized language models, e.g. BERT
Stars: ✭ 15 (-54.55%)
Open Semantic Search
Open Source research tool to search, browse, analyze and explore large document collections by Semantic Search Engine and Open Source Text Mining & Text Analytics platform (Integrates ETL for document processing, OCR for images & PDF, named entity recognition for persons, organizations & locations, metadata management by thesaurus & ontologies, search user interface & search apps for fulltext search, faceted search & knowledge graph)
Stars: ✭ 386 (+1069.7%)
mirror-bert
[EMNLP 2021] Mirror-BERT: Converting Pretrained Language Models to universal text encoders without labels.
Stars: ✭ 56 (+69.7%)
Learning Social Media Analytics With R
This repository contains code and bonus content which will be added from time to time for the book "Learning Social Media Analytics with R" by Packt
Stars: ✭ 102 (+209.09%)
nlpbuddy
A text analysis application for performing common NLP tasks through a web dashboard interface and an API
Stars: ✭ 115 (+248.48%)
R Text Data
List of textual data sources to be used for text mining in R
Stars: ✭ 85 (+157.58%)
NSP-BERT
The code for our paper "NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original Pre-training Task: Next Sentence Prediction"
Stars: ✭ 166 (+403.03%)
TRUNAJOD2.0
An easy-to-use library to extract indices from texts.
Stars: ✭ 18 (-45.45%)
ganbert-pytorch
Enhancing the BERT training with Semi-supervised Generative Adversarial Networks in Pytorch/HuggingFace
Stars: ✭ 60 (+81.82%)
KGE-LDA
Knowledge Graph Embedding LDA. AAAI 2017
Stars: ✭ 35 (+6.06%)
converse
Conversational text Analysis using various NLP techniques
Stars: ✭ 147 (+345.45%)
deepvis
machine learning algorithms in Swift
Stars: ✭ 54 (+63.64%)
clustext
Easy, fast clustering of texts
Stars: ✭ 18 (-45.45%)
Ask2Transformers
A Framework for Textual Entailment based Zero Shot text classification
Stars: ✭ 102 (+209.09%)
teanaps
An open-source Python library for natural language processing and text analysis.
Stars: ✭ 91 (+175.76%)
tagify
Tagify produces a set of tags from a given source. Source can be either an HTML page, a Markdown document or a plain text. Supports English, Russian, Chinese, Hindi, Spanish, Arabic, Japanese, German, Hebrew, French and Korean languages.
Stars: ✭ 24 (-27.27%)
WSDM-Cup-2019
[ACM-WSDM] 3rd place solution at WSDM Cup 2019, Fake News Classification on Kaggle.
Stars: ✭ 62 (+87.88%)
KAREN
KAREN: Unifying Hatespeech Detection and Benchmarking
Stars: ✭ 18 (-45.45%)
NLP-paper
🎨 NLP (natural language processing) tutorial 🎨 https://dataxujing.github.io/NLP-paper/
Stars: ✭ 23 (-30.3%)
hlda
Gibbs sampler for the Hierarchical Latent Dirichlet Allocation topic model
Stars: ✭ 138 (+318.18%)
textgo
Text preprocessing, representation, similarity calculation, text search and classification. Let's go and play with text!
Stars: ✭ 33 (+0%)
JoSH
[KDD 2020] Hierarchical Topic Mining via Joint Spherical Tree and Text Embedding
Stars: ✭ 55 (+66.67%)
amazon-reviews
Sentiment Analysis & Topic Modeling with Amazon Reviews
Stars: ✭ 26 (-21.21%)
learning-stm
Learning structural topic modeling using the stm R package.
Stars: ✭ 103 (+212.12%)
Naive-Resume-Matching
Text similarity applied to resumes: compares resumes with job descriptions and creates a score to rank them. Similar to an ATS.
Stars: ✭ 27 (-18.18%)
TorchBlocks
A PyTorch-based toolkit for natural language processing
Stars: ✭ 85 (+157.58%)
Mengzi
Mengzi Pretrained Models
Stars: ✭ 238 (+621.21%)
LSX
A word embeddings-based semi-supervised model for document scaling
Stars: ✭ 42 (+27.27%)
deep-INFOMAX
Chainer implementation of deep-INFOMAX
Stars: ✭ 32 (-3.03%)
sherlock
🔎 Find usernames across social networks.
Stars: ✭ 47 (+42.42%)
OpenDialog
An Open-Source Package for Chinese Open-domain Conversational Chatbot (a Chinese chit-chat dialogue system with one-click deployment of a WeChat chit-chat bot)
Stars: ✭ 94 (+184.85%)
classy
classy is a simple-to-use library for building high-performance Machine Learning models in NLP.
Stars: ✭ 61 (+84.85%)