codeprep: A toolkit for pre-processing large source code corpora
Stars: ✭ 39 (+85.71%)
sylbreak: Syllable segmentation tool for the Myanmar language (Burmese) by Ye.
Stars: ✭ 44 (+109.52%)
sentencepiece-jni: Java JNI wrapper for SentencePiece, an unsupervised text tokenizer for neural network-based text generation.
Stars: ✭ 26 (+23.81%)