Nagisa
A Japanese tokenizer based on recurrent neural networks
Stars: ✭ 260 (+173.68%)
Pythainlp
Thai Natural Language Processing in Python.
Stars: ✭ 582 (+512.63%)
Danlp
DaNLP is a repository for Natural Language Processing resources for the Danish Language.
Stars: ✭ 111 (+16.84%)
Spacy
💫 Industrial-strength Natural Language Processing (NLP) in Python
Stars: ✭ 21,978 (+23034.74%)
Ekphrasis
Ekphrasis is a text processing tool geared towards text from social networks such as Twitter or Facebook. It performs tokenization, word normalization, word segmentation (for splitting hashtags), and spell correction, using word statistics from two large corpora (English Wikipedia and 330 million English tweets from Twitter).
Stars: ✭ 433 (+355.79%)
Underthesea
Underthesea - Vietnamese NLP Toolkit
Stars: ✭ 823 (+766.32%)
Jumanpp
Juman++ (a Morphological Analyzer Toolkit)
Stars: ✭ 254 (+167.37%)
Fastnlp
fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.
Stars: ✭ 2,441 (+2469.47%)
Pykakasi
NLP: Convert Japanese Kana-kanji sentences into Kana-Roman with a simple algorithm.
Stars: ✭ 238 (+150.53%)
Nlp profiler
A simple NLP library that allows profiling datasets with one or more text columns. When given a dataset and a column name containing text data, NLP Profiler will return either high-level insights or low-level/granular statistical information about the text in that column.
Stars: ✭ 181 (+90.53%)
Chatbot ner
chatbot_ner: Named Entity Recognition for chatbots.
Stars: ✭ 273 (+187.37%)
Lingua
👄 The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike
Stars: ✭ 341 (+258.95%)
Kagome
Self-contained Japanese Morphological Analyzer written in pure Go
Stars: ✭ 554 (+483.16%)
Sentencepiece
Unsupervised text tokenizer for Neural Network-based text generation.
Stars: ✭ 5,540 (+5731.58%)
Youtokentome
Unsupervised text tokenizer focused on computational efficiency
Stars: ✭ 728 (+666.32%)
Transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Stars: ✭ 55,742 (+58575.79%)
Lingo
package lingo provides the data structures and algorithms required for natural language processing
Stars: ✭ 113 (+18.95%)
Pynlpl
PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build a simple language model. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).
Stars: ✭ 426 (+348.42%)
Konoha
🌿 An easy-to-use Japanese Text Processing tool, which makes it possible to switch tokenizers with small changes of code.
Stars: ✭ 130 (+36.84%)
Pycantonese
Cantonese Linguistics and NLP in Python
Stars: ✭ 147 (+54.74%)
Awesome Pytorch List
A comprehensive list of PyTorch-related content on GitHub, such as different models, implementations, helper libraries, tutorials, etc.
Stars: ✭ 12,475 (+13031.58%)
Awesome Bert Japanese
📝 A list of pre-trained BERT models for Japanese with word/subword tokenization + vocabulary construction algorithm information
Stars: ✭ 76 (-20%)
Vncorenlp
A Vietnamese natural language processing toolkit (NAACL 2018)
Stars: ✭ 354 (+272.63%)
Kuromoji
Kuromoji is a self-contained and very easy to use Japanese morphological analyzer designed for search
Stars: ✭ 745 (+684.21%)
Scanrefer
[ECCV 2020] ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language
Stars: ✭ 84 (-11.58%)
Practical Open
Oxford Deep NLP 2017 course - Open practical
Stars: ✭ 84 (-11.58%)
Uer Py
Open Source Pre-training Model Framework in PyTorch & Pre-trained Model Zoo
Stars: ✭ 1,295 (+1263.16%)
Abydos
Abydos NLP/IR library for Python
Stars: ✭ 91 (-4.21%)
Greek Bert
A Greek edition of BERT pre-trained language model
Stars: ✭ 84 (-11.58%)
Meprop
meProp: Sparsified Back Propagation for Accelerated Deep Learning (ICML 2017)
Stars: ✭ 90 (-5.26%)
Qolibri
Continuation of the qolibri EPWING dictionary/book reader
Stars: ✭ 82 (-13.68%)
Momdo.github.io
Japanese translation of the W3C/WHATWG specification(s).
Stars: ✭ 81 (-14.74%)
Nlp
This is where I put all my work in Natural Language Processing
Stars: ✭ 90 (-5.26%)
Cws
Source code for an ACL 2016 paper on Chinese word segmentation
Stars: ✭ 81 (-14.74%)
Simplednn
SimpleDNN is a lightweight open-source machine learning library written in Kotlin, designed to support relevant neural network architectures in natural language processing tasks
Stars: ✭ 81 (-14.74%)
Dexter
Let your talking do the code
Stars: ✭ 93 (-2.11%)
Geotext
Geotext extracts country and city mentions from text
Stars: ✭ 91 (-4.21%)
Bible text gcn
PyTorch implementation of "Graph Convolutional Networks for Text Classification"
Stars: ✭ 90 (-5.26%)
Spacy Graphql
🤹‍♀️ Query spaCy's linguistic annotations using GraphQL
Stars: ✭ 81 (-14.74%)
Typenovel
A simple markup language to write novels with types.
Stars: ✭ 80 (-15.79%)
Textattack
TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP
Stars: ✭ 1,291 (+1258.95%)
Simstring
A Python implementation of SimString, a simple and efficient algorithm for approximate string matching.
Stars: ✭ 79 (-16.84%)
Opennmt Tf
Neural machine translation and sequence learning using TensorFlow
Stars: ✭ 1,223 (+1187.37%)
Multiffn Nli
Implementation of the multi feed-forward network architecture by Parikh et al. (2016) for Natural Language Inference.
Stars: ✭ 89 (-6.32%)
Ja.text8
Japanese text8 corpus for word embedding.
Stars: ✭ 79 (-16.84%)
Deepmoji
State-of-the-art deep learning model for analyzing sentiment, emotion, sarcasm etc.
Stars: ✭ 1,215 (+1178.95%)
Practical 3
Oxford Deep NLP 2017 course - Practical 3: Text Classification with RNNs
Stars: ✭ 78 (-17.89%)
Pytreebank
😡😇 Stanford Sentiment Treebank loader in Python
Stars: ✭ 93 (-2.11%)
Tageditor
🏖 TagEditor - Annotation tool for spaCy
Stars: ✭ 92 (-3.16%)
Lda Topic Modeling
A PureScript, browser-based implementation of LDA topic modeling.
Stars: ✭ 91 (-4.21%)
Character Mining
Mining individual characters in multiparty dialogue
Stars: ✭ 89 (-6.32%)
Chinese Xlnet
Pre-Trained Chinese XLNet
Stars: ✭ 1,213 (+1176.84%)
Multimodal Toolkit
Multimodal model for text and tabular data, with HuggingFace transformers as the building block for text data
Stars: ✭ 78 (-17.89%)
Punkt Segmenter
Ruby port of the NLTK Punkt sentence segmentation algorithm
Stars: ✭ 88 (-7.37%)
Abigsurvey
A collection of 500+ survey papers on Natural Language Processing (NLP) and Machine Learning (ML)
Stars: ✭ 1,203 (+1166.32%)
Dialogue Understanding
This repository contains the PyTorch implementation of the baseline models from the paper "Utterance-level Dialogue Understanding: An Empirical Study"
Stars: ✭ 77 (-18.95%)