Sudachi: A Japanese Tokenizer for Business
Stars: ✭ 496 (-10.47%)
Sudachipy: Python version of Sudachi, a Japanese tokenizer.
Stars: ✭ 207 (-62.64%)
Jumanpp: Juman++ (a Morphological Analyzer Toolkit)
Stars: ✭ 254 (-54.15%)
Nagisa: A Japanese tokenizer based on recurrent neural networks
Stars: ✭ 260 (-53.07%)
Sudachidict: A lexicon for Sudachi
Stars: ✭ 127 (-77.08%)
Languagepod101 Scraper: Python scraper for Language Pods such as Japanesepod101.com 👹 🗾 🍣 Compatible with Japanese, Chinese, French, German, Italian, Korean, Portuguese, Russian, Spanish and many more! ✨
Stars: ✭ 104 (-81.23%)
Genki Study Resources: A collection of exercises for practicing what is taught in Genki: An Integrated Course in Elementary Japanese.
Stars: ✭ 232 (-58.12%)
Konlpy: Python package for Korean natural language processing.
Stars: ✭ 1,098 (+98.19%)
jmdict-simplified: JMdict, JMnedict, Kanjidic, KRADFILE/RADKFILE in JSON format
Stars: ✭ 96 (-82.67%)
nippon: Japanese N5-N2 grammar notes 🍻
Stars: ✭ 84 (-84.84%)
suika: Suika 🍉 is a Japanese morphological analyzer written in pure Ruby
Stars: ✭ 31 (-94.4%)
google-news-scraper: Google News Scraper for languages like Japanese, Chinese... [VPN Support]
Stars: ✭ 88 (-84.12%)
Owasp Masvs: The Mobile Application Security Verification Standard (MASVS) is a standard for mobile app security.
Stars: ✭ 1,030 (+85.92%)
The Tab Of Words: A minimal Chrome / Firefox extension to help you learn Japanese words in each new tab.
Stars: ✭ 94 (-83.03%)
Kanji Data Media: Japanese language data on kanji and radicals, media files, fonts and related resources from Kanji alive
Stars: ✭ 186 (-66.43%)
Topokanji: Topologically ordered lists of kanji for effective learning
Stars: ✭ 108 (-80.51%)
Rikaikun: Translate Japanese by hovering over words.
Stars: ✭ 200 (-63.9%)
kanji-frequency: Kanji usage frequency data collected from various sources
Stars: ✭ 92 (-83.39%)
Hibi: [No Active Development] An Android app for learning Japanese by keeping a journal.
Stars: ✭ 37 (-93.32%)
KWDLC: Kyoto University Web Document Leads Corpus
Stars: ✭ 64 (-88.45%)
udar: UDAR Does Accented Russian: A finite-state morphological analyzer of Russian that handles stressed wordforms.
Stars: ✭ 15 (-97.29%)
Kuromoji: Kuromoji is a self-contained and very easy to use Japanese morphological analyzer designed for search
Stars: ✭ 745 (+34.48%)
Sejong Corpus: Korean Sejong corpus download and simple analysis
Stars: ✭ 116 (-79.06%)
Toiro: A comparison tool of Japanese tokenizers
Stars: ✭ 95 (-82.85%)
Kiwi: Kiwi (an intelligent Korean morphological analyzer)
Stars: ✭ 107 (-80.69%)
Fugashi: A Cython MeCab wrapper for fast, pythonic Japanese tokenization and morphological analysis.
Stars: ✭ 125 (-77.44%)
Ichiran: Linguistic tools for texts in the Japanese language
Stars: ✭ 120 (-78.34%)
Pythainlp: Thai Natural Language Processing in Python.
Stars: ✭ 582 (+5.05%)
Nlp profiler: A simple NLP library that allows profiling datasets with one or more text columns. When given a dataset and a column name containing text data, NLP Profiler will return either high-level insights or low-level/granular statistical information about the text in that column.
Stars: ✭ 181 (-67.33%)
Just News: A userscript project that parses Korean news sites and renders a more readable view
Stars: ✭ 173 (-68.77%)
Matrixprofile: A Python 3 library making time series data mining tasks, utilizing matrix profile algorithms, accessible to everyone.
Stars: ✭ 141 (-74.55%)
ark-pixel-font: Open source Pan-CJK pixel font
Stars: ✭ 1,767 (+218.95%)
sinling: A collection of NLP tools for Sinhalese (සිංහල).
Stars: ✭ 38 (-93.14%)
Nihonoari-App: A small, minimalist Japanese kana training app
Stars: ✭ 66 (-88.09%)
Janome: Japanese morphological analysis engine written in pure Python
Stars: ✭ 630 (+13.72%)
kotoba: A Discord bot for helping with learning Japanese.
Stars: ✭ 118 (-78.7%)
limelight: A PHP Japanese language text analyzer and parser.
Stars: ✭ 76 (-86.28%)
unihandecode: unihandecode is a transliteration library that converts all Unicode characters/words into the ASCII alphabet and is aware of language preference priorities
Stars: ✭ 71 (-87.18%)
kanji-web-app: Angular.js kanji web application
Stars: ✭ 45 (-91.88%)
simplemma: Simple multilingual lemmatizer for Python, especially useful for speed and efficiency
Stars: ✭ 32 (-94.22%)
GrammarEngine: Grammatical dictionary of the Russian language (+ English, Japanese, etc.)
Stars: ✭ 68 (-87.73%)
ArabicProcessingCog: A Python package that does stemming, tokenization, sentence breaking, segmentation, normalization, and POS tagging for the Arabic language.
Stars: ✭ 19 (-96.57%)
TALPCo: TUFS Asian Language Parallel Corpus
Stars: ✭ 32 (-94.22%)
unofficial-jisho-api: Encapsulates the official Jisho.org API and also provides kanji, example, and stroke diagram search.
Stars: ✭ 88 (-84.12%)
Charlescd: CharlesCD is an open source tool that makes deployments more agile, continuous and safe, which allows development teams to perform hypothesis validations with a specific group of users, simultaneously.
Stars: ✭ 275 (-50.36%)
Ekphrasis: Ekphrasis is a text processing tool, geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashtags) and spell correction, using word statistics from 2 big corpora (English Wikipedia and Twitter, the latter with 330 million English tweets).
Stars: ✭ 433 (-21.84%)
Syntok: Text tokenization and sentence segmentation (segtok v2)
Stars: ✭ 123 (-77.8%)
Udpipe: R package for Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing Based on the UDPipe Natural Language Processing Toolkit
Stars: ✭ 160 (-71.12%)
FCH-TTS: A fast Text-to-Speech (TTS) model. Works well for English, Mandarin Chinese, Japanese, Korean, Russian and Tibetan (tested so far).
Stars: ✭ 154 (-72.2%)