ikegami-yukino / Jaconv

Licence: mit
Pure-Python Japanese character interconverter for Hiragana, Katakana, Hankaku and Zenkaku


jaconv

|travis| |coveralls| |pyversion| |version| |license|

jaconv (Japanese Converter) is an interconverter for Hiragana, Katakana, Hankaku (half-width characters), and Zenkaku (full-width characters).

A `Japanese README <https://github.com/ikegami-yukino/jaconv/blob/master/README_JP.rst>`_ is available.

INSTALLATION

::

    $ pip install jaconv

USAGE

See also the `documentation <http://ikegami-yukino.github.io/jaconv/jaconv.html>`_.

.. code:: python

    import jaconv

    # Hiragana to Katakana
    jaconv.hira2kata(u'ともえまみ')
    # => u'トモエマミ'

    # Hiragana to half-width Katakana
    jaconv.hira2hkata(u'ともえまみ')
    # => u'ﾄﾓｴﾏﾐ'

    # Katakana to Hiragana
    jaconv.kata2hira(u'巴マミ')
    # => u'巴まみ'

    # half-width character to full-width character
    jaconv.h2z(u'ﾃｨﾛ･ﾌｨﾅｰﾚ')
    # => u'ティロ・フィナーレ'

    # half-width character to full-width character
    # but only ascii characters
    jaconv.h2z(u'abc', kana=False, ascii=True, digit=False)
    # => u'ａｂｃ'

    # half-width character to full-width character
    # but only digit characters
    jaconv.h2z(u'123', kana=False, ascii=False, digit=True)
    # => u'１２３'

    # half-width character to full-width character
    # except half-width Katakana
    jaconv.h2z(u'ｱabc123', kana=False, digit=True, ascii=True)
    # => u'ｱａｂｃ１２３'

    # full-width character to half-width character
    jaconv.z2h(u'ティロ・フィナーレ')
    # => u'ﾃｨﾛ･ﾌｨﾅｰﾚ'

    # full-width character to half-width character
    # but only ascii characters
    jaconv.z2h(u'ａｂｃ', kana=False, ascii=True, digit=False)
    # => u'abc'

    # full-width character to half-width character
    # but only digit characters
    jaconv.z2h(u'１２３', kana=False, ascii=False, digit=True)
    # => u'123'

    # full-width character to half-width character
    # except full-width Katakana
    jaconv.z2h(u'アａｂｃ１２３', kana=False, digit=True, ascii=True)
    # => u'アabc123'

    # normalize
    jaconv.normalize(u'ティロ・フィナ〜レ', 'NFKC')
    # => u'ティロ・フィナーレ'

    # Hiragana to alphabet
    jaconv.kana2alphabet(u'じゃぱん')
    # => japan

    # Alphabet to Hiragana
    jaconv.alphabet2kana(u'japan')
    # => じゃぱん
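
The conversions above can be understood from how Unicode lays out the scripts. The following is a minimal standard-library sketch, not jaconv's actual implementation; the function names ``hira_to_kata``, ``kata_to_hira`` and ``ascii_to_fullwidth`` are hypothetical and exist only for illustration. It assumes the parallel layout of the Hiragana and Katakana blocks (0x60 code points apart) and of ASCII and its full-width forms (0xFEE0 apart).

.. code:: python

    # Illustrative sketch only; jaconv itself may work differently.
    KANA_OFFSET = 0x60     # distance from Hiragana to Katakana code points
    WIDTH_OFFSET = 0xFEE0  # distance from ASCII '!'..'~' to full-width forms


    def hira_to_kata(text):
        """Shift Hiragana (U+3041..U+3096) to the matching Katakana."""
        return ''.join(
            chr(ord(ch) + KANA_OFFSET) if '\u3041' <= ch <= '\u3096' else ch
            for ch in text
        )


    def kata_to_hira(text):
        """Shift Katakana (U+30A1..U+30F6) to the matching Hiragana."""
        return ''.join(
            chr(ord(ch) - KANA_OFFSET) if '\u30a1' <= ch <= '\u30f6' else ch
            for ch in text
        )


    def ascii_to_fullwidth(text):
        """Shift printable ASCII (U+0021..U+007E) to full-width forms."""
        return ''.join(
            chr(ord(ch) + WIDTH_OFFSET) if '!' <= ch <= '~' else ch
            for ch in text
        )

Characters outside the handled ranges, such as kanji, pass through unchanged. Half-width Katakana cannot be handled by a fixed offset, because its voiced and semi-voiced marks are separate characters; a lookup table is needed for that case.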

NOTE

The jaconv.normalize function extends unicodedata.normalize for Japanese language processing. In addition to the standard Unicode normalization, it applies the following replacements:

.. code::

    '〜' => 'ー'
    '～' => 'ー'
    "’" => "'"
    '”' => '"'
    '“' => '``'
    '―' => '-'
    '‐' => '-'
    '˗' => '-'
    '֊' => '-'
    '‐' => '-'
    '‑' => '-'
    '‒' => '-'
    '–' => '-'
    '⁃' => '-'
    '⁻' => '-'
    '₋' => '-'
    '−' => '-'
    '﹣' => 'ー'
    '－' => 'ー'
    '—' => 'ー'
    '―' => 'ー'
    '━' => 'ー'
    '─' => 'ー'
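
As a rough illustration of this kind of wrapper, the sketch below applies a small subset of the replacements listed above and then calls ``unicodedata.normalize``. It is not jaconv's source code: the name ``normalize_sketch``, the choice of subset, and the assumption that replacements run before the Unicode normalization are all made up for this example.

.. code:: python

    import unicodedata

    # Hypothetical subset of the replacement table, written with escapes
    # so the exact code points are unambiguous.
    PRE_REPLACEMENTS = {
        '\u301c': '\u30fc',  # WAVE DASH -> Katakana long vowel mark
        '\u2019': "'",       # RIGHT SINGLE QUOTATION MARK -> apostrophe
        '\u201d': '"',       # RIGHT DOUBLE QUOTATION MARK -> double quote
        '\u2212': '-',       # MINUS SIGN -> hyphen-minus
        '\u2014': '\u30fc',  # EM DASH -> Katakana long vowel mark
    }
    _TABLE = str.maketrans(PRE_REPLACEMENTS)


    def normalize_sketch(text, mode='NFKC'):
        """Apply the custom replacements, then standard Unicode normalization."""
        return unicodedata.normalize(mode, text.translate(_TABLE))

Running the custom replacements first means the standard normalization sees characters that already have the intended form, so the two steps do not interfere with each other.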

.. |travis| image:: https://travis-ci.org/ikegami-yukino/jaconv.svg?branch=master
    :target: https://travis-ci.org/ikegami-yukino/jaconv
    :alt: travis-ci.org

.. |coveralls| image:: https://coveralls.io/repos/ikegami-yukino/jaconv/badge.svg?branch=master&service=github
    :target: https://coveralls.io/github/ikegami-yukino/jaconv?branch=master
    :alt: coveralls.io

.. |pyversion| image:: https://img.shields.io/pypi/pyversions/jaconv.svg

.. |version| image:: https://img.shields.io/pypi/v/jaconv.svg
    :target: http://pypi.python.org/pypi/jaconv/
    :alt: latest version

.. |license| image:: https://img.shields.io/pypi/l/jaconv.svg
    :target: http://pypi.python.org/pypi/jaconv/
    :alt: license
