Ekphrasis: A text processing tool geared towards text from social networks such as Twitter or Facebook. It performs tokenization, word normalization, word segmentation (for splitting hashtags), and spell correction, using word statistics from two large corpora (English Wikipedia and Twitter, 330 million English tweets).
Stars: ✭ 433 (-1.14%)
Char Rnn Tensorflow: Multi-layer recurrent neural networks for character-level language models, implemented in TensorFlow
Stars: ✭ 58 (-86.76%)
Kagome: Self-contained Japanese Morphological Analyzer written in pure Go
Stars: ✭ 554 (+26.48%)
Py Nltools: A collection of basic Python modules for spoken natural language processing
Stars: ✭ 46 (-89.5%)
Tokenizer: Fast and customizable text tokenization library with BPE and SentencePiece support
Stars: ✭ 132 (-69.86%)
Cogcomp Nlpy: CogComp's light-weight Python NLP annotators
Stars: ✭ 115 (-73.74%)
Nlpre: Python library for Natural Language Preprocessing (NLPre)
Stars: ✭ 158 (-63.93%)
Stringi: THE String Processing Package for R (with ICU)
Stars: ✭ 204 (-53.42%)
Textvec: Text vectorization tool to outperform TFIDF for classification tasks
Stars: ✭ 167 (-61.87%)
Thot: Thot toolkit for statistical machine translation
Stars: ✭ 53 (-87.9%)
Text-Classification-LSTMs-PyTorch: This repository shows a baseline model for text classification by implementing an LSTM-based model in PyTorch. To make the model easier to understand, it is applied to a Tweets dataset provided by Kaggle.
Stars: ✭ 45 (-89.73%)
python-mecab: A repository that binds MeCab for Python 3.5+. Uses neither SWIG nor pybind. (No longer maintained)
Stars: ✭ 27 (-93.84%)
ArabicProcessingCog: A Python package that does stemming, tokenization, sentence breaking, segmentation, normalization, and POS tagging for the Arabic language.
Stars: ✭ 19 (-95.66%)
Kor2vec: Library for Korean morpheme and word vector representation
Stars: ✭ 64 (-85.39%)
Pytorch Bert Crf Ner: Korean named entity recognizer built with KoBERT and CRF (BERT+CRF based Named Entity Recognition model for Korean)
Stars: ✭ 236 (-46.12%)
Lingua Franca: Mycroft's multilingual text parsing and formatting library
Stars: ✭ 51 (-88.36%)
Konoha: 🌿 An easy-to-use Japanese text processing tool that makes it possible to switch tokenizers with small code changes.
Stars: ✭ 130 (-70.32%)
Kadot: Kadot, the unsupervised natural language processing library.
Stars: ✭ 108 (-75.34%)
Hunspell Dict Ko: Korean spellchecking dictionary for Hunspell
Stars: ✭ 187 (-57.31%)
Prenlp: Preprocessing Library for Natural Language Processing
Stars: ✭ 130 (-70.32%)
Stanza Old: Stanford NLP group's shared Python tools.
Stars: ✭ 142 (-67.58%)
Greynir: The greynir.is natural language processing website for Icelandic
Stars: ✭ 47 (-89.27%)
Fastnlp: fastNLP, a modularized and extensible NLP framework. Currently still in incubation.
Stars: ✭ 2,441 (+457.31%)
Udpipe: R package for Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing Based on the UDPipe Natural Language Processing Toolkit
Stars: ✭ 160 (-63.47%)
hama-py: 🦛 Python Hangul processing library. Python Korean Morphological Analyzer
Stars: ✭ 16 (-96.35%)
Pynlpl: PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as extracting n-grams and frequency lists, and for building simple language models. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL), as well as clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).
Stars: ✭ 426 (-2.74%)
Mlinterview: A curated awesome list of AI Startups in India & Machine Learning Interview Guide. Feel free to contribute!
Stars: ✭ 410 (-6.39%)
Transformers Tutorials: GitHub repo with tutorials on fine-tuning transformers for different NLP tasks
Stars: ✭ 384 (-12.33%)
Multiwoz: Source code for end-to-end dialogue model from the MultiWOZ paper (Budzianowski et al. 2018, EMNLP)
Stars: ✭ 384 (-12.33%)
Jflex: The fast scanner generator for Java™ with full Unicode support
Stars: ✭ 380 (-13.24%)
Ernie: Official implementations for various pre-training models of ERNIE-family, covering topics of Language Understanding & Generation, Multimodal Understanding & Generation, and beyond.
Stars: ✭ 4,659 (+963.7%)
Cogcomp Nlp: CogComp's Natural Language Processing libraries and demos
Stars: ✭ 410 (-6.39%)
Nlpnet: A neural network architecture for NLP tasks, using Cython for fast performance. Currently, it can perform POS tagging, SRL and dependency parsing.
Stars: ✭ 379 (-13.47%)
Nlp Progress: Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.
Stars: ✭ 19,518 (+4356.16%)
Natural Language Processing: Programming Assignments and Lectures for Stanford's CS 224: Natural Language Processing with Deep Learning
Stars: ✭ 377 (-13.93%)
Aho Corasick: A fast implementation of Aho-Corasick in Rust.
Stars: ✭ 424 (-3.2%)
Reductio: Automatic text summarizer in Swift
Stars: ✭ 406 (-7.31%)
Beginner nlp: A curated list of beginner resources in Natural Language Processing
Stars: ✭ 376 (-14.16%)
Gnn4nlp Papers: A list of recent papers about Graph Neural Network methods applied in NLP areas.
Stars: ✭ 405 (-7.53%)
Data Science: Collection of useful data science topics along with code and articles
Stars: ✭ 315 (-28.08%)
Bert Embedding: 🔡 Token level embeddings from BERT model on mxnet and gluonnlp
Stars: ✭ 424 (-3.2%)
Ln2sql: A tool to query a database in natural language
Stars: ✭ 403 (-7.99%)
Southkorea Maps: South Korea administrative divisions in ESRI Shapefile, GeoJSON and TopoJSON formats.
Stars: ✭ 367 (-16.21%)
Nlp: [UNMAINTAINED] Extract values from strings and fill your structs with nlp.
Stars: ✭ 367 (-16.21%)
D2l Vn: An interactive deep learning book with source code, math, and discussion. Covers many popular frameworks (TensorFlow, PyTorch & MXNet) and is used at 175 universities.
Stars: ✭ 402 (-8.22%)
Matchzoo Py: Facilitating the design, comparison and sharing of deep text matching models.
Stars: ✭ 362 (-17.35%)
Code search: Code for the Medium article "How To Create Natural Language Semantic Search for Arbitrary Objects With Deep Learning"
Stars: ✭ 436 (-0.46%)
Moo: Optimised tokenizer/lexer generator! 🐄 Uses /y for performance. Moo.
Stars: ✭ 434 (-0.91%)
Deep Learning Nlp: 📡 Organized Resources for Deep Learning in Natural Language Processing
Stars: ✭ 421 (-3.88%)
Anlp19: Course repo for Applied Natural Language Processing (Spring 2019)
Stars: ✭ 402 (-8.22%)
Awesome Search: All about (e-commerce) search and its awesomeness
Stars: ✭ 361 (-17.58%)
Spacy Streamlit: 👑 spaCy building blocks and visualizers for Streamlit apps
Stars: ✭ 360 (-17.81%)
Projects: 🪐 End-to-end NLP workflows from prototype to production
Stars: ✭ 397 (-9.36%)
Text mining resources: Resources for learning about Text Mining and Natural Language Processing
Stars: ✭ 358 (-18.26%)