Transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Stars: ✭ 55,742 (+154738.89%)
classy
classy is a simple-to-use library for building high-performance Machine Learning models in NLP.
Stars: ✭ 61 (+69.44%)
Pytorch Sentiment Analysis
Tutorials on getting started with PyTorch and TorchText for sentiment analysis.
Stars: ✭ 3,209 (+8813.89%)
Atr4s
Toolkit with state-of-the-art Automatic Terms Recognition methods in Scala
Stars: ✭ 23 (-36.11%)
Tika Python
Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.
Stars: ✭ 997 (+2669.44%)
Urduhack
An NLP library for the Urdu language. It comes with many batteries-included features to help you process Urdu data as easily as possible.
Stars: ✭ 200 (+455.56%)
Pythainlp
Thai Natural Language Processing in Python.
Stars: ✭ 582 (+1516.67%)
Awesome Pytorch List
A comprehensive list of PyTorch-related content on GitHub, such as different models, implementations, helper libraries, tutorials, etc.
Stars: ✭ 12,475 (+34552.78%)
Ekphrasis
Ekphrasis is a text processing tool geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashtags), and spell correction, using word statistics from two large corpora (English Wikipedia and 330 million English tweets from Twitter).
Stars: ✭ 433 (+1102.78%)
Giveme5w1h
Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how?
Stars: ✭ 316 (+777.78%)
Farm
🏡 Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.
Stars: ✭ 1,140 (+3066.67%)
Fnlp
Toolkit for Chinese natural language processing
Stars: ✭ 2,468 (+6755.56%)
Kuromoji
Kuromoji is a self-contained and very easy to use Japanese morphological analyzer designed for search
Stars: ✭ 745 (+1969.44%)
Nlp profiler
A simple NLP library that allows profiling datasets with one or more text columns. When given a dataset and a column name containing text data, NLP Profiler will return either high-level insights or low-level/granular statistical information about the text in that column.
Stars: ✭ 181 (+402.78%)
Sudachi
A Japanese Tokenizer for Business
Stars: ✭ 496 (+1277.78%)
pn-summary
A well-structured summarization dataset for the Persian language!
Stars: ✭ 29 (-19.44%)
Lingua
👄 The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike
Stars: ✭ 341 (+847.22%)
Lingo
package lingo provides the data structures and algorithms required for natural language processing
Stars: ✭ 113 (+213.89%)
Cool-NLPCV
Some Cool NLP and CV Repositories and Solutions (a collection of open-source solutions, datasets, tools, and learning materials for common NLP tasks)
Stars: ✭ 143 (+297.22%)
Chatbot ner
chatbot_ner: Named Entity Recognition for chatbots.
Stars: ✭ 273 (+658.33%)
Turkish Deasciifier
Turkish deasciifier in Python based on Deniz Yüret's turkish-mode for Emacs
Stars: ✭ 108 (+200%)
NLP-tools
Useful Python NLP tools (evaluation, GUI, tokenization)
Stars: ✭ 39 (+8.33%)
clj-duckling
Language, engine, and tooling for expressing, testing, and evaluating composable language rules on input strings. (a duckling clojure fork)
Stars: ✭ 15 (-58.33%)
Simstring
A Python implementation of SimString, a simple and efficient algorithm for approximate string matching.
Stars: ✭ 79 (+119.44%)
Multi Task Nlp
multi_task_NLP is a utility toolkit enabling NLP developers to easily train a single model for multiple tasks and run inference with it.
Stars: ✭ 221 (+513.89%)
Node Opennlp
Apache OpenNLP wrapper for Node.js
Stars: ✭ 55 (+52.78%)
npo classifier
Automated coding using machine-learning and remapping the U.S. nonprofit sector: A guide and benchmark
Stars: ✭ 18 (-50%)
Sudachipy
Python version of Sudachi, a Japanese tokenizer.
Stars: ✭ 207 (+475%)
Natas
Python 3 library for processing historical English
Stars: ✭ 28 (-22.22%)
TradeTheEvent
Implementation of "Trade the Event: Corporate Events Detection for News-Based Event-Driven Trading" (Findings of ACL 2021).
Stars: ✭ 64 (+77.78%)
Underthesea
Underthesea - Vietnamese NLP Toolkit
Stars: ✭ 823 (+2186.11%)
Pyarabic
pyarabic
Stars: ✭ 183 (+408.33%)
Janome
Japanese morphological analysis engine written in pure Python
Stars: ✭ 630 (+1650%)
AiSpace
AiSpace: Better practices for deep learning model development and deployment for TensorFlow 2.0
Stars: ✭ 28 (-22.22%)
Kagome
Self-contained Japanese Morphological Analyzer written in pure Go
Stars: ✭ 554 (+1438.89%)
Fastnlp
fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.
Stars: ✭ 2,441 (+6680.56%)
Spacy
💫 Industrial-strength Natural Language Processing (NLP) in Python
Stars: ✭ 21,978 (+60950%)
vietnamese-roberta
A Robustly Optimized BERT Pretraining Approach for Vietnamese
Stars: ✭ 22 (-38.89%)
Pynlpl
PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build simple language models. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).
Stars: ✭ 426 (+1083.33%)
Camel tools
A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.
Stars: ✭ 124 (+244.44%)
Contextualized Topic Models
A Python package to run contextualized topic modeling. CTMs combine BERT with topic models to get coherent topics. Also supports multilingual tasks. Cross-lingual Zero-shot model published at EACL 2021.
Stars: ✭ 318 (+783.33%)
Quick Nlp
PyTorch NLP library based on FastAI
Stars: ✭ 279 (+675%)
Danlp
DaNLP is a repository for Natural Language Processing resources for the Danish Language.
Stars: ✭ 111 (+208.33%)
Nagisa
A Japanese tokenizer based on recurrent neural networks
Stars: ✭ 260 (+622.22%)
FinBERT-QA
Financial Domain Question Answering with pre-trained BERT Language Model
Stars: ✭ 70 (+94.44%)
NLP Toolkit
Library of state-of-the-art models (PyTorch) for NLP tasks
Stars: ✭ 92 (+155.56%)
Giveme5W
Extraction of the five journalistic W-questions (5W) from news articles
Stars: ✭ 16 (-55.56%)
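Nuts
A testbed of solutions for common natural language processing tasks (mainly text classification, sequence labeling, automatic question answering, etc.)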
Stars: ✭ 21 (-41.67%)
Toiro
A comparison tool of Japanese tokenizers
Stars: ✭ 95 (+163.89%)
bert-sentiment
Fine-grained Sentiment Classification Using BERT
Stars: ✭ 49 (+36.11%)
spaczz
Fuzzy matching and more functionality for spaCy.
Stars: ✭ 215 (+497.22%)
DrFAQ
DrFAQ is a plug-and-play question answering NLP chatbot that can be generally applied to any organisation's text corpora.
Stars: ✭ 29 (-19.44%)
Cn2an
📦 Quickly converts between Chinese numerals and Arabic numerals (latest features: conversion of fractions, dates, temperatures, and more)
Stars: ✭ 249 (+591.67%)