136 open source projects that are alternatives to or similar to Japanesetokenizers

Kagome
Self-contained Japanese Morphological Analyzer written in pure Go
Stars: ✭ 554 (+361.67%)
Mutual labels:  japanese-language, tokenizer
Lfuzzer
Fuzzing Parsers with Tokens
Stars: ✭ 28 (-76.67%)
Mutual labels:  tokenizer
Moo
Optimised tokenizer/lexer generator! 🐄 Uses /y for performance. Moo.
Stars: ✭ 434 (+261.67%)
Mutual labels:  tokenizer
pascal-interpreter
A simple interpreter for a large subset of the Pascal language, written for educational purposes
Stars: ✭ 21 (-82.5%)
Mutual labels:  tokenizer
Yomichan
Japanese pop-up dictionary extension for Chrome and Firefox.
Stars: ✭ 464 (+286.67%)
Mutual labels:  japanese-language
Talismane
NLP framework: sentence detector, tokeniser, pos-tagger and dependency parser
Stars: ✭ 38 (-68.33%)
Mutual labels:  tokenizer
Friso
High-performance Chinese tokenizer with both GBK and UTF-8 charset support, based on the MMSEG algorithm and written in ANSI C. Fully modular and easily embedded in other programs such as MySQL, PostgreSQL, and PHP.
Stars: ✭ 313 (+160.83%)
Mutual labels:  tokenizer
Sentence Splitter
Text-to-sentence splitter using a heuristic algorithm by Philipp Koehn and Josh Schroeder.
Stars: ✭ 82 (-31.67%)
Mutual labels:  tokenizer
Snl Compiler
SNL (Small Nested Language) Compiler. Maven, JUnit, tokenizer, lexer, syntax parser. Compiler principles, lexical analysis, syntax analysis.
Stars: ✭ 19 (-84.17%)
Mutual labels:  tokenizer
rakutenma-python
Rakuten MA (Python version)
Stars: ✭ 15 (-87.5%)
Mutual labels:  japanese-language
japanese-pitch-accent-resources
Trying to consolidate Japanese phonetic resources, and pitch accent resources in particular, into one list
Stars: ✭ 64 (-46.67%)
Mutual labels:  japanese-language
Awesome Japanese
Awesome Japanese learning resource
Stars: ✭ 563 (+369.17%)
Mutual labels:  japanese-language
Greynir
The greynir.is natural language processing website for Icelandic
Stars: ✭ 47 (-60.83%)
Mutual labels:  tokenizer
Smoothnlp
An NLP Toolset With A Focus on Explainable Inference
Stars: ✭ 435 (+262.5%)
Mutual labels:  tokenizer
Djurl
Simple yet helpful library for writing Django URLs in an easy, short, and intuitive way.
Stars: ✭ 85 (-29.17%)
Mutual labels:  tokenizer
Jflex
The fast scanner generator for Java™ with full Unicode support
Stars: ✭ 380 (+216.67%)
Mutual labels:  tokenizer
Nlp Js Tools French
POS tagger, lemmatizer, and stemmer for the French language in JavaScript
Stars: ✭ 32 (-73.33%)
Mutual labels:  tokenizer
Sacremoses
Python port of Moses tokenizer, truecaser and normalizer
Stars: ✭ 293 (+144.17%)
Mutual labels:  tokenizer
Languagepod101 Scraper
Python scraper for Language Pods such as Japanesepod101.com 👹 🗾 🍣 Compatible with Japanese, Chinese, French, German, Italian, Korean, Portuguese, Russian, Spanish and many more! ✨
Stars: ✭ 104 (-13.33%)
Mutual labels:  japanese-language
KanaQuiz
A simple app to quiz the user on identifying Japanese characters.
Stars: ✭ 19 (-84.17%)
Mutual labels:  japanese-language
React Input Tags
React component for tagging inputs.
Stars: ✭ 10 (-91.67%)
Mutual labels:  tokenizer
cang-jie
Chinese tokenizer for tantivy, based on jieba-rs
Stars: ✭ 48 (-60%)
Mutual labels:  tokenizer
Wirb
Ruby Object Inspection for IRB
Stars: ✭ 69 (-42.5%)
Mutual labels:  tokenizer
Mustard
🌭 Mustard is a Swift library for tokenizing strings when splitting by whitespace doesn't cut it.
Stars: ✭ 689 (+474.17%)
Mutual labels:  tokenizer
KanjiRecognitionDictionary
Perfect for those who forget kanji pronunciation
Stars: ✭ 14 (-88.33%)
Mutual labels:  japanese-language
PaddleTokenizer
A deep-neural-network Chinese word segmentation engine implemented with PaddlePaddle
Stars: ✭ 14 (-88.33%)
Mutual labels:  tokenizer
Oseti
Dictionary based Sentiment Analysis for Japanese
Stars: ✭ 49 (-59.17%)
Mutual labels:  japanese-language
text2text
Text2Text: Cross-lingual natural language processing and generation toolkit
Stars: ✭ 188 (+56.67%)
Mutual labels:  tokenizer
Tokenizer
A small library for converting tokenized PHP source code into XML (and potentially other formats)
Stars: ✭ 4,770 (+3875%)
Mutual labels:  tokenizer
Somajo
A tokenizer and sentence splitter for German and English web and social media texts.
Stars: ✭ 85 (-29.17%)
Mutual labels:  tokenizer
Open Korean Text
Open Korean Text Processor - An Open-source Korean Text Processor
Stars: ✭ 438 (+265%)
Mutual labels:  tokenizer
Py Nltools
A collection of basic python modules for spoken natural language processing
Stars: ✭ 46 (-61.67%)
Mutual labels:  tokenizer
Ekphrasis
Ekphrasis is a text processing tool, geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashtags) and spell correction, using word statistics from 2 big corpora (english Wikipedia, twitter - 330mil english tweets).
Stars: ✭ 433 (+260.83%)
Mutual labels:  tokenizer
Kadot
Kadot, the unsupervised natural language processing library.
Stars: ✭ 108 (-10%)
Mutual labels:  tokenizer
Php Parser
🌿 NodeJS PHP Parser - extract AST or tokens (PHP5 and PHP7)
Stars: ✭ 400 (+233.33%)
Mutual labels:  tokenizer
Sharpmath
A small .NET math library.
Stars: ✭ 36 (-70%)
Mutual labels:  tokenizer
Lexmachine
Lex machinery for Go.
Stars: ✭ 335 (+179.17%)
Mutual labels:  tokenizer
Hippo
PHP standards checker.
Stars: ✭ 82 (-31.67%)
Mutual labels:  tokenizer
Sentences
A multilingual command line sentence tokenizer in Golang
Stars: ✭ 293 (+144.17%)
Mutual labels:  tokenizer
Omnicat Bayes
Naive Bayes text classification implementation as an OmniCat classifier strategy. (#ruby #naivebayes)
Stars: ✭ 30 (-75%)
Mutual labels:  tokenizer
Jumanpp
Juman++ (a Morphological Analyzer Toolkit)
Stars: ✭ 254 (+111.67%)
Mutual labels:  tokenizer
Tokenizer
Source code tokenizer
Stars: ✭ 119 (-0.83%)
Mutual labels:  tokenizer
ArabicProcessingCog
A Python package that does stemming, tokenization, sentence breaking, segmentation, normalization, and POS tagging for the Arabic language.
Stars: ✭ 19 (-84.17%)
Mutual labels:  tokenizer
Laravel Token
Laravel token management
Stars: ✭ 10 (-91.67%)
Mutual labels:  tokenizer
unofficial-jisho-api
Encapsulates the official Jisho.org API and also provides kanji, example, and stroke diagram search.
Stars: ✭ 88 (-26.67%)
Mutual labels:  japanese-language
Cols Agent Tasks
Colin's ALM Corner Custom Build Tasks
Stars: ✭ 70 (-41.67%)
Mutual labels:  tokenizer
Hebrew-Tokenizer
A very simple python tokenizer for Hebrew text.
Stars: ✭ 16 (-86.67%)
Mutual labels:  tokenizer
Lisp Esque Language
💠The Lel programming language
Stars: ✭ 24 (-80%)
Mutual labels:  tokenizer
ebe-dataset
Evidence-based Explanation Dataset (AACL-IJCNLP 2020)
Stars: ✭ 16 (-86.67%)
Mutual labels:  japanese-language
Megamark
😻 Markdown with easy tokenization, a fast highlighter, and a lean HTML sanitizer
Stars: ✭ 100 (-16.67%)
Mutual labels:  tokenizer
madomagiOOP
👨‍💻♐ Learn object-oriented programming with anime magical girls. 🧙
Stars: ✭ 17 (-85.83%)
Mutual labels:  japanese-language
Natasha
Solves basic Russian NLP tasks, API for lower level Natasha projects
Stars: ✭ 788 (+556.67%)
Mutual labels:  tokenizer
pandas-cheat-sheet-ja
Unofficial translation of the official pandas cheat sheet
Stars: ✭ 74 (-38.33%)
Mutual labels:  japanese-language
String Calc
PHP calculator library for mathematical terms (expressions) passed as strings
Stars: ✭ 60 (-50%)
Mutual labels:  tokenizer
Janome
Japanese morphological analysis engine written in pure Python
Stars: ✭ 630 (+425%)
Mutual labels:  japanese-language
Scattertext
Beautiful visualizations of how language differs among document types.
Stars: ✭ 1,722 (+1335%)
Mutual labels:  japanese-language
Topokanji
Topologically ordered lists of kanji for effective learning
Stars: ✭ 108 (-10%)
Mutual labels:  japanese-language
The Tab Of Words
A minimal Chrome / Firefox extension to help you learn Japanese words in each new tab.
Stars: ✭ 94 (-21.67%)
Mutual labels:  japanese-language
Thot
Thot toolkit for statistical machine translation
Stars: ✭ 53 (-55.83%)
Mutual labels:  tokenizer
Soynlp
A Python library for Korean natural language processing. Provides word extraction, tokenization, part-of-speech tagging, and preprocessing.
Stars: ✭ 613 (+410.83%)
Mutual labels:  tokenizer