Nagisa: A Japanese tokenizer based on recurrent neural networks
Stars: ✭ 260 (-65.1%)
Toiro: A comparison tool of Japanese tokenizers
Stars: ✭ 95 (-87.25%)
Kagome: Self-contained Japanese Morphological Analyzer written in pure Go
Stars: ✭ 554 (-25.64%)
Jumanpp: Juman++ (a Morphological Analyzer Toolkit)
Stars: ✭ 254 (-65.91%)
Lingo: package lingo provides the data structures and algorithms required for natural language processing
Stars: ✭ 113 (-84.83%)
GrammarEngine: Grammatical Dictionary of the Russian Language (+ English, Japanese, etc.)
Stars: ✭ 68 (-90.87%)
sembei: 🍘 Word embeddings that bypass word segmentation 🍘
Stars: ✭ 14 (-98.12%)
Yakuhanjp: Yakumono-Hankaku Only Web Fonts
Stars: ✭ 288 (-61.34%)
Ekphrasis: Ekphrasis is a text processing tool geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashtags) and spell correction, using word statistics from 2 big corpora (English Wikipedia, Twitter - 330 million English tweets).
Stars: ✭ 433 (-41.88%)
wana kana rust: Utility library for checking and converting between Japanese characters - Hiragana, Katakana - and Romaji
Stars: ✭ 46 (-93.83%)
gazou: Japanese OCR for Linux & Windows
Stars: ✭ 32 (-95.7%)
clj-duckling: Language, engine, and tooling for expressing, testing, and evaluating composable language rules on input strings. (a duckling clojure fork)
Stars: ✭ 15 (-97.99%)
Contextualized Topic Models: A Python package to run contextualized topic modeling. CTMs combine BERT with topic models to get coherent topics. Also supports multilingual tasks. Cross-lingual Zero-shot model published at EACL 2021.
Stars: ✭ 318 (-57.32%)
unidic-py: Unidic packaged for installation via pip.
Stars: ✭ 17 (-97.72%)
Spacy: 💫 Industrial-strength Natural Language Processing (NLP) in Python
Stars: ✭ 21,978 (+2850.07%)
classy: classy is a simple-to-use library for building high-performance Machine Learning models in NLP.
Stars: ✭ 61 (-91.81%)
Site: Markdown source of the cpprefjp site
Stars: ✭ 275 (-63.09%)
Wanikani For Android: An Android client application for the awesome kanji learning website wanikani.com
Stars: ✭ 506 (-32.08%)
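Articutapi: API of Articut Chinese word segmentation (with semantic part-of-speech tagging). Word segmentation (斷詞, also called 分詞) is the foundation of Chinese information processing. Articut uses no machine learning and needs no data model; relying only on the grammar rules of modern vernacular Chinese, it achieves an F1-measure above 94% and a recall above 96% on SIGHAN 2005.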
Stars: ✭ 252 (-66.17%)
jp-ocr-prunned-cnn: Attempting feature map pruning on a CNN trained for Japanese OCR
Stars: ✭ 15 (-97.99%)
kanji: Haskell suite for determining what 級 (level) of the 漢字検定 (national Kanji exam) a given Kanji belongs to.
Stars: ✭ 19 (-97.45%)
Kuroshiro: Japanese language library for converting Japanese sentences to Hiragana, Katakana or Romaji, with furigana and okurigana modes supported.
Stars: ✭ 386 (-48.19%)
Hibi: [No Active Development] An Android app for learning Japanese by keeping a journal.
Stars: ✭ 37 (-95.03%)
Zipangu: A library for compatibility about Japan.
Stars: ✭ 27 (-96.38%)
Lingua: 👄 The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike
Stars: ✭ 341 (-54.23%)
unofficial-jisho-api: Encapsulates the official Jisho.org API and also provides kanji, example, and stroke diagram search.
Stars: ✭ 88 (-88.19%)
Awesome Persian Nlp Ir: Curated List of Persian Natural Language Processing and Information Retrieval Tools and Resources
Stars: ✭ 460 (-38.26%)
Giveme5w1h: Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how?
Stars: ✭ 316 (-57.58%)
sample-ui-react: Material-UI + React.js + Redux [ Pug / Scss / Babel ]
Stars: ✭ 15 (-97.99%)
Quick Nlp: PyTorch NLP library based on FastAI
Stars: ✭ 279 (-62.55%)
Pokemon Font: GAME BOY font from Pokémon R/G/B/Y/G/S/C, Unicode extended.
Stars: ✭ 437 (-41.34%)
Giveme5W: Extraction of the five journalistic W-questions (5W) from news articles
Stars: ✭ 16 (-97.85%)
Chatbot ner: chatbot_ner, Named Entity Recognition for chatbots.
Stars: ✭ 273 (-63.36%)
Janome: Japanese morphological analysis engine written in pure Python
Stars: ✭ 630 (-15.44%)
kanji-web-app: Angular.js kanji web application
Stars: ✭ 45 (-93.96%)
sakubun: A tool that helps you improve your Japanese vocabulary and kanji skills with practice that's customized to your needs.
Stars: ✭ 20 (-97.32%)
KWDLC: Kyoto University Web Document Leads Corpus
Stars: ✭ 64 (-91.41%)
Pynlpl: PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build simple language models. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).
Stars: ✭ 426 (-42.82%)
textlint-ja: textlint Japanese community / rule ideas
Stars: ✭ 41 (-94.5%)
TALPCo: TUFS Asian Language Parallel Corpus
Stars: ✭ 32 (-95.7%)
NLP Toolkit: Library of state-of-the-art models (PyTorch) for NLP tasks
Stars: ✭ 92 (-87.65%)
Sudachi: A Japanese Tokenizer for Business
Stars: ✭ 496 (-33.42%)
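Nuts: A testing ground for solutions to common natural language processing tasks (mainly text classification, sequence labeling, automatic question answering, etc.)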
Stars: ✭ 21 (-97.18%)
scoop-for-jp: Scoop bucket for ALL Japanese users.
Stars: ✭ 17 (-97.72%)
Nlp Cube: Natural Language Processing Pipeline - Sentence Splitting, Tokenization, Lemmatization, Part-of-speech Tagging and Dependency Parsing
Stars: ✭ 353 (-52.62%)
extra-model: Code to run the ExtRA algorithm for unsupervised topic/aspect extraction on English texts.
Stars: ✭ 43 (-94.23%)
knp: A Japanese Parser
Stars: ✭ 16 (-97.85%)
simple NER: simple rule-based named entity recognition
Stars: ✭ 29 (-96.11%)
YuzuMarker: 🍋 [WIP] Manga Translation Tool
Stars: ✭ 76 (-89.8%)
HebPipe: An NLP pipeline for Hebrew
Stars: ✭ 15 (-97.99%)
Mouse Dictionary: 📘 A super fast dictionary for Chrome/Firefox
Stars: ✭ 670 (-10.07%)
Pythainlp: Thai Natural Language Processing in Python.
Stars: ✭ 582 (-21.88%)