megs: A merged version of multiple open-source German speech datasets.
Stars: ✭ 21 (-77.17%)
nerus: Large silver-standard Russian corpus with NER, morphology, and syntax markup
Stars: ✭ 47 (-48.91%)
Vimdoc Ja: A project that translates Vim documentation into Japanese.
Stars: ✭ 245 (+166.3%)
thaigov-corpus: A project to collect news from Thai government websites
Stars: ✭ 19 (-79.35%)
Core: 🔞 JAVClub, so that your big sisters never get lost again
Stars: ✭ 2,728 (+2865.22%)
Nadesiko3: Japanese Programming Language Nadesiko v3 (JavaScript)
Stars: ✭ 125 (+35.87%)
malay-dataset: Text corpus for Bahasa Malaysia, https://malaya.readthedocs.io/en/latest/Dataset.html
Stars: ✭ 189 (+105.43%)
trafilatura: Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
Stars: ✭ 711 (+672.83%)
kage-editor: The graphical KAGE glyph editor
Stars: ✭ 27 (-70.65%)
Cutlet: Japanese to romaji converter in Python
Stars: ✭ 124 (+34.78%)
Safarikai: Safari extension for translating Japanese words.
Stars: ✭ 177 (+92.39%)
Gse: Efficient multilingual NLP and text segmentation in Go; supports English, Chinese, Japanese, and other languages.
Stars: ✭ 1,695 (+1742.39%)
RcppMeCab: Rcpp interface to the CJK morpheme analyzer MeCab
Stars: ✭ 24 (-73.91%)
nihongo: Japanese Dictionary
Stars: ✭ 77 (-16.3%)
Posuto: 🏣📮〠 Japanese postal code data.
Stars: ✭ 109 (+18.48%)
Japanese.js: Utility collection for Japanese text processing. Hiraganize, Katakanize, and Romanize.
Stars: ✭ 150 (+63.04%)
tvsub: TVsub, the DCU-Tencent Chinese-English Dialogue Corpus
Stars: ✭ 40 (-56.52%)
Musubii: Simple CSS Framework for JP
Stars: ✭ 138 (+50%)
Konoha: 🌿 An easy-to-use Japanese text processing tool that makes it possible to switch tokenizers with small code changes.
Stars: ✭ 130 (+41.3%)
sample-ui-vue-pages: Bootstrap + Vue.js [ Scss / Babel ] (Multi-Page/SSR Model)
Stars: ✭ 20 (-78.26%)
Fugashi: A Cython MeCab wrapper for fast, pythonic Japanese tokenization and morphological analysis.
Stars: ✭ 125 (+35.87%)
jaco-js: Japanese character optimizer for JavaScript
Stars: ✭ 72 (-21.74%)
proiel-treebank: Official releases of the PROIEL treebank of ancient Indo-European languages
Stars: ✭ 30 (-67.39%)
migemojs: A JavaScript implementation of Migemo
Stars: ✭ 29 (-68.48%)
Nodejs Ja: Node.js Japanese localization
Stars: ✭ 98 (+6.52%)
CJK-character-count: Program that counts the number of CJK characters in a font based on Unicode ranges and Chinese encoding standards
Stars: ✭ 195 (+111.96%)
next-qrcode: React hooks for generating QR codes for your next React apps.
Stars: ✭ 87 (-5.43%)
Toiro: A comparison tool for Japanese tokenizers
Stars: ✭ 95 (+3.26%)
Jconv: Pure-JavaScript converter for Japanese character encodings.
Stars: ✭ 91 (-1.09%)
nytwit: New York Times Word Innovation Types dataset
Stars: ✭ 21 (-77.17%)
lang-ja: Manages the Japanese language files that are distributed with Vim.
Stars: ✭ 20 (-78.26%)
DANeS: An open-source e-newspaper dataset created in collaboration between DATASET JSC (dataset.vn) and AIV Group (aivgroup.vn)
Stars: ✭ 64 (-30.43%)
Qolibri: Continuation of the qolibri EPWING dictionary/book reader
Stars: ✭ 82 (-10.87%)
Momdo.github.io: Japanese translation of the W3C/WHATWG specification(s).
Stars: ✭ 81 (-11.96%)
jmdict-kindle: Japanese-English dictionary for Kindle based on the JMdict / EDICT database
Stars: ✭ 151 (+64.13%)
Awesome Bert Japanese: 📝 A list of pre-trained BERT models for Japanese, with information on word/subword tokenization and vocabulary construction algorithms
Stars: ✭ 76 (-17.39%)
Risingstars2016: A complete overview of the JavaScript landscape in 2016: trends in front-end and Node.js frameworks, tooling... Available in English, Japanese and Chinese.
Stars: ✭ 75 (-18.48%)
Memorize: 🚀 Japanese-English-Mongolian dictionary. It lets you find words, kanji, and more, quickly and easily
Stars: ✭ 72 (-21.74%)
rclc: Rich Context leaderboard competition, including the corpus and current SOTA for required tasks.
Stars: ✭ 20 (-78.26%)
Kana: Golang library for conversion between Japanese hiragana, katakana and romaji
Stars: ✭ 68 (-26.09%)
Japanesetab: A Chrome extension that helps you learn Japanese with every new tab 🔴
Stars: ✭ 55 (-40.22%)
german-nouns: A list of ~100,000 German nouns and their grammatical properties, compiled from WiktionaryDE as a CSV file, plus a module to look up the data and parse compound words.
Stars: ✭ 101 (+9.78%)
Vanilla Autokana: A vanilla JavaScript library to complete furigana automatically.
Stars: ✭ 48 (-47.83%)