Nlp profiler
A simple NLP library that allows profiling datasets with one or more text columns. When given a dataset and a column name containing text data, NLP Profiler will return either high-level insights or low-level/granular statistical information about the text in that column.
Stars: ✭ 181 (+60.18%)
Transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Stars: ✭ 55,742 (+49229.2%)
Lingua
👄 The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike
Stars: ✭ 341 (+201.77%)
Codesearchnet
Datasets, tools, and benchmarks for representation learning of code.
Stars: ✭ 1,378 (+1119.47%)
Hntitlenator
Test your HN title against a neural network
Stars: ✭ 184 (+62.83%)
schrutepy
The Entire Transcript from The Office in Tidy Format
Stars: ✭ 22 (-80.53%)
Chatbot ner
chatbot_ner: Named Entity Recognition for chatbots.
Stars: ✭ 273 (+141.59%)
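Articutapi
API of Articut, a Chinese word segmenter with semantic part-of-speech tagging. Word segmentation is the foundation of Chinese information processing. Articut uses no machine learning and no data model; relying only on the grammar rules of modern vernacular Chinese, it achieves an F1-measure above 94% and a recall above 96% on SIGHAN 2005.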
Stars: ✭ 252 (+123.01%)
Contextualized Topic Models
A Python package to run contextualized topic modeling. CTMs combine BERT with topic models to get coherent topics. Also supports multilingual tasks. Cross-lingual Zero-shot model published at EACL 2021.
Stars: ✭ 318 (+181.42%)
Dl Nlp Readings
My Reading Lists of Deep Learning and Natural Language Processing
Stars: ✭ 656 (+480.53%)
Repo 2016
R, Python and Mathematica Code for Machine Learning, Deep Learning, Artificial Intelligence, NLP and Geolocation
Stars: ✭ 103 (-8.85%)
Bert Sklearn
A sklearn wrapper for Google's BERT model
Stars: ✭ 182 (+61.06%)
Machine Learning Resources
A curated list of awesome machine learning frameworks, libraries, courses, books, and much more.
Stars: ✭ 226 (+100%)
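Pyhanlp
Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, new word discovery, keyword and phrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional conversion, natural language processing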
Stars: ✭ 2,564 (+2169.03%)
OpenPrompt
An Open-Source Framework for Prompt-Learning.
Stars: ✭ 1,769 (+1465.49%)
Trankit
Trankit is a Light-Weight Transformer-based Python Toolkit for Multilingual Natural Language Processing
Stars: ✭ 311 (+175.22%)
empythy
Automated NLP sentiment predictions: batteries included, or use your own data
Stars: ✭ 17 (-84.96%)
Easy Bert
A Dead Simple BERT API for Python and Java (https://github.com/google-research/bert)
Stars: ✭ 106 (-6.19%)
Awesome Bert Nlp
A curated list of NLP resources focused on BERT, attention mechanisms, Transformer networks, and transfer learning.
Stars: ✭ 567 (+401.77%)
Kuromoji
Kuromoji is a self-contained and very easy-to-use Japanese morphological analyzer designed for search
Stars: ✭ 745 (+559.29%)
Spago
Self-contained Machine Learning and Natural Language Processing library in Go
Stars: ✭ 854 (+655.75%)
Pytorch Pos Tagging
A tutorial on how to implement models for part-of-speech tagging using PyTorch and TorchText.
Stars: ✭ 96 (-15.04%)
Gpt2
PyTorch Implementation of OpenAI GPT-2
Stars: ✭ 64 (-43.36%)
Toiro
A comparison tool for Japanese tokenizers
Stars: ✭ 95 (-15.93%)
Fastnlp
fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.
Stars: ✭ 2,441 (+2060.18%)
Pynlp
A Pythonic wrapper for Stanford CoreNLP.
Stars: ✭ 103 (-8.85%)
Spark Nlp
State of the Art Natural Language Processing
Stars: ✭ 2,518 (+2128.32%)
Character Based Cnn
Implementation of a character-based convolutional neural network
Stars: ✭ 205 (+81.42%)
Attention Mechanisms
Implementations for a family of attention mechanisms, suitable for all kinds of natural language processing tasks and compatible with TensorFlow 2.0 and Keras.
Stars: ✭ 203 (+79.65%)
Lazynlp
Library to scrape and clean web pages to create massive datasets.
Stars: ✭ 1,985 (+1656.64%)
TextFeatureSelection
Python library for feature selection on text features. It provides a filter method, a genetic algorithm, and TextFeatureSelectionEnsemble for improving text classification models.
Stars: ✭ 42 (-62.83%)
GrammarEngine
Grammatical Dictionary of the Russian Language (+ English, Japanese, etc.)
Stars: ✭ 68 (-39.82%)
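Nuts
A testbed of solutions for common natural language processing tasks (mainly text classification, sequence labeling, automatic question answering, etc.)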
Stars: ✭ 21 (-81.42%)
mlconjug3
A Python library to conjugate verbs in French, English, Spanish, Italian, Portuguese and Romanian (more soon) using Machine Learning techniques.
Stars: ✭ 47 (-58.41%)
Ner
Named Entity Recognition
Stars: ✭ 288 (+154.87%)
Bluebert
BlueBERT, pre-trained on PubMed abstracts and clinical notes (MIMIC-III).
Stars: ✭ 273 (+141.59%)
Lda Topic Modeling
A PureScript, browser-based implementation of LDA topic modeling.
Stars: ✭ 91 (-19.47%)
Awesome Pytorch List
A comprehensive list of PyTorch-related content on GitHub, such as different models, implementations, helper libraries, tutorials, etc.
Stars: ✭ 12,475 (+10939.82%)
Tokenizers
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
Stars: ✭ 5,077 (+4392.92%)
Awesome Persian Nlp Ir
Curated List of Persian Natural Language Processing and Information Retrieval Tools and Resources
Stars: ✭ 460 (+307.08%)
Pythainlp
Thai Natural Language Processing in Python.
Stars: ✭ 582 (+415.04%)
Spacy
💫 Industrial-strength Natural Language Processing (NLP) in Python
Stars: ✭ 21,978 (+19349.56%)
Spacy Transformers
🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy
Stars: ✭ 919 (+713.27%)
Underthesea
Underthesea - Vietnamese NLP Toolkit
Stars: ✭ 823 (+628.32%)
Textaugmentation Gpt2
Fine-tuned pre-trained GPT-2 for custom topic-specific text generation. Such a system can be used for text augmentation.
Stars: ✭ 104 (-7.96%)
Pynlpl
PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build simple language models. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).
Stars: ✭ 426 (+276.99%)
Textblob Ar
Arabic support for TextBlob
Stars: ✭ 60 (-46.9%)
Text mining resources
Resources for learning about Text Mining and Natural Language Processing
Stars: ✭ 358 (+216.81%)
Tika Python
Tika-Python is a Python binding to the Apache Tika™ REST services, allowing Tika to be called natively from Python.
Stars: ✭ 997 (+782.3%)
Greek Bert
A Greek edition of the BERT pre-trained language model
Stars: ✭ 84 (-25.66%)
Danlp
DaNLP is a repository for Natural Language Processing resources for the Danish Language.
Stars: ✭ 111 (-1.77%)