SynThai: Thai Word Segmentation and Part-of-Speech Tagging with Deep Learning
Stars: ✭ 41 (+64%)
Mutual labels: word-segmentation
customized-symspell: Java port of SymSpell: 1 million times faster through Symmetric Delete spelling correction algorithm
Stars: ✭ 51 (+104%)
Mutual labels: word-segmentation
spell: Spelling correction and string segmentation written in Go
Stars: ✭ 24 (-4%)
Mutual labels: word-segmentation
skt: Sanskrit compound segmentation using a seq2seq model
Stars: ✭ 21 (-16%)
Mutual labels: word-segmentation
codeprep: A toolkit for pre-processing large source code corpora
Stars: ✭ 39 (+56%)
Mutual labels: word-segmentation
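Monpa: MONPA 罔拍 is a multi-task model that provides Traditional Chinese word segmentation, part-of-speech tagging, and named entity recognition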
Stars: ✭ 203 (+712%)
Mutual labels: word-segmentation
youtokentome-ruby: High performance unsupervised text tokenization for Ruby
Stars: ✭ 17 (-32%)
Mutual labels: word-segmentation
sentencepiece-jni: Java JNI wrapper for SentencePiece: unsupervised text tokenizer for Neural Network-based text generation.
Stars: ✭ 26 (+4%)
Mutual labels: word-segmentation
hanzi-tools: Converts from Chinese characters to pinyin, between simplified and traditional, and does word segmentation.
Stars: ✭ 69 (+176%)
Mutual labels: word-segmentation
sentencepiece: R package for Byte Pair Encoding / Unigram modelling based on Sentencepiece
Stars: ✭ 22 (-12%)
Mutual labels: word-segmentation
ckipnlp: CKIP CoreNLP Toolkits
Stars: ✭ 92 (+268%)
Mutual labels: word-segmentation
dnn-lstm-word-segment: Chinese Word Segmentation Based on Deep Learning and an LSTM Neural Network
Stars: ✭ 24 (-4%)
Mutual labels: word-segmentation
word tokenize: Vietnamese Word Tokenize
Stars: ✭ 45 (+80%)
Mutual labels: word-segmentation
SymSpellCppPy: Fast SymSpell written in C++ and exposed to Python via pybind11
Stars: ✭ 28 (+12%)
Mutual labels: word-segmentation
esapp: An unsupervised Chinese word segmentation tool.
Stars: ✭ 13 (-48%)
Mutual labels: word-segmentation
sylbreak: Syllable segmentation tool for Myanmar language (Burmese) by Ye.
Stars: ✭ 44 (+76%)
Mutual labels: word-segmentation
rakutenma-python: Rakuten MA (Python version)
Stars: ✭ 15 (-40%)
Mutual labels: word-segmentation
UETsegmenter: A toolkit for Vietnamese word segmentation
Stars: ✭ 60 (+140%)
Mutual labels: word-segmentation
Pytorch-NLU: Pytorch-NLU, a Chinese text classification and sequence labeling toolkit. It supports multi-class and multi-label classification of long and short Chinese texts, as well as sequence labeling tasks such as Chinese named entity recognition, part-of-speech tagging, and word segmentation.
Stars: ✭ 151 (+504%)
Mutual labels: word-segmentation