nihongodera / limelight

Licence: MIT license
A php Japanese language text analyzer and parser.

Programming Languages

PHP
23972 projects - #3 most used programming language

Projects that are alternatives of or similar to limelight

Kanji Data Media
Japanese language data on kanji and radicals, media files, fonts and related resources from Kanji alive
Stars: ✭ 186 (+144.74%)
Mutual labels:  japanese, japanese-language
nippon
Japanese N5-N2 grammar notes~ 🍻
Stars: ✭ 84 (+10.53%)
Mutual labels:  japanese, japanese-language
Genki Study Resources
A collection of exercises for practicing what is taught in Genki: An Integrated Course in Elementary Japanese.
Stars: ✭ 232 (+205.26%)
Mutual labels:  japanese, japanese-language
Languagepod101 Scraper
Python scraper for Language Pods such as Japanesepod101.com 👹 🗾 🍣 Compatible with Japanese, Chinese, French, German, Italian, Korean, Portuguese, Russian, Spanish and many more! ✨
Stars: ✭ 104 (+36.84%)
Mutual labels:  japanese, japanese-language
kanji-frequency
Kanji usage frequency data collected from various sources
Stars: ✭ 92 (+21.05%)
Mutual labels:  japanese, japanese-language
Topokanji
Topologically ordered lists of kanji for effective learning
Stars: ✭ 108 (+42.11%)
Mutual labels:  japanese, japanese-language
jmdict-simplified
JMdict, JMnedict, Kanjidic, KRADFILE/RADKFILE in JSON format
Stars: ✭ 96 (+26.32%)
Mutual labels:  japanese, japanese-language
Yomichan
Japanese pop-up dictionary extension for Chrome and Firefox.
Stars: ✭ 464 (+510.53%)
Mutual labels:  japanese, japanese-language
Domino-English-Translation
🌏 Let's translate Domino, a Japanese MIDI editor!
Stars: ✭ 29 (-61.84%)
Mutual labels:  japanese, japanese-language
Kawazu
A C# library for converting Japanese sentence to Hiragana, Katakana or Romaji with furigana and okurigana modes supported. Inspired by project Kuroshiro.
Stars: ✭ 33 (-56.58%)
Mutual labels:  japanese, mecab
kotoba
A Discord bot for helping with learning Japanese.
Stars: ✭ 118 (+55.26%)
Mutual labels:  japanese, japanese-language
google-news-scraper
Google News Scraper for languages like Japanese, Chinese... [VPN Support]
Stars: ✭ 88 (+15.79%)
Mutual labels:  japanese, japanese-language
The Tab Of Words
A minimal Chrome / Firefox extension to help you learn Japanese words in each new tab.
Stars: ✭ 94 (+23.68%)
Mutual labels:  japanese, japanese-language
Ichiran
Linguistic tools for texts in Japanese language
Stars: ✭ 120 (+57.89%)
Mutual labels:  japanese, japanese-language
Kagome
Self-contained Japanese Morphological Analyzer written in pure Go
Stars: ✭ 554 (+628.95%)
Mutual labels:  japanese, japanese-language
python-doc-ja
Python documentation Japanese translation project
Stars: ✭ 130 (+71.05%)
Mutual labels:  japanese, japanese-language
japanese-pitch-accent-resources
Trying to consolidate japanese phonetic, and in particular pitch accent resources into one list
Stars: ✭ 64 (-15.79%)
Mutual labels:  japanese, japanese-language
unofficial-jisho-api
Encapsulates the official Jisho.org API and also provides kanji, example, and stroke diagram search.
Stars: ✭ 88 (+15.79%)
Mutual labels:  japanese, japanese-language
Convert-Numbers-to-Japanese
Converts Arabic numerals, or 'western' style numbers, to a Japanese context.
Stars: ✭ 33 (-56.58%)
Mutual labels:  japanese, japanese-language
Nihonoari-App
A little and minimalist Japanese Kana training
Stars: ✭ 66 (-13.16%)
Mutual labels:  japanese, japanese-language

Limelight

A php Japanese language analyzer and parser.
  • Split Japanese text into individual, full words
  • Find parts of speech for words
  • Find dictionary entries (lemmas) for conjugated words
  • Get readings and pronunciations for words
  • Build furigana for words
  • Convert Japanese to romaji (English lettering)

Quick Guide

Version Notes

  • April 25, 2016: The Limelight API changed in Version 1.6.0. The new API uses collection methods to give developers better control of Limelight parse results. Please see the wiki for the updated documentation.
  • April 11, 2016: php-mecab, the MeCab bindings Limelight uses, were updated to version 0.6.0 in Dec. 2015 for php 7 support. The pre-0.6.0 bindings no longer work with the master branch of Limelight. If you are using an older version of php-mecab, please update your bindings or use the php-mecab_pre_0.6.0 version.

Install Limelight

Using Docker

From the project root, build the image:

docker build -f docker/Dockerfile -t limelight .

Once it is built, run the container:

docker run --name limelight -v /host/path/to/limelight:/usr/limelight -d --rm limelight

Access the project in the container:

docker exec -it limelight bash

Install composer dependencies from within the container:

composer install

Without Docker

Requirements
  • php > 5.6
Dependencies

Before installing Limelight, you must install both mecab and the php extension php-mecab on your system.

Linux Ubuntu Users

Use the install script included in this repository. The script only works with php7. Download the script:

curl -O https://raw.githubusercontent.com/nihongodera/limelight/master/install_mecab_php-mecab.sh

Make the file executable:

chmod +x install_mecab_php-mecab.sh

Execute the script:

./install_mecab_php-mecab.sh

You may need to restart your server to complete the process.

For information about what the script does, see here.

Other Systems

Please see this page to learn more about installing on your system.
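Once the dependencies are in place, it can help to confirm that PHP actually sees the extension before installing Limelight. The following is a minimal sketch, not part of Limelight: mecabStatus() is a hypothetical helper, and it assumes the php-mecab extension registers itself under the name mecab. The 0.6.0 threshold comes from the version notes above.

```php
<?php
// Hypothetical helper (not part of Limelight): reports whether the
// php-mecab extension is visible to this PHP installation.
// Assumption: the extension registers under the name "mecab".
function mecabStatus(string $minVersion = '0.6.0'): string
{
    if (!extension_loaded('mecab')) {
        return 'missing';
    }

    // phpversion() returns false when the extension reports no version.
    $version = phpversion('mecab');
    if ($version === false || version_compare($version, $minVersion, '<')) {
        return 'outdated';
    }

    return 'ok';
}

echo 'PHP ' . PHP_VERSION . ', php-mecab: ' . mecabStatus() . "\n";
```

Run it with the same PHP binary your server uses, since the command line and the web server can load different configuration files.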

Install Limelight

Install Limelight through composer.

composer require nihongodera/limelight
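A quick way to check that the package resolves is to load Composer's autoloader and ask PHP whether the class exists. This is a sketch under one assumption: the script sits next to the vendor/ directory that Composer creates.

```php
<?php
// Assumes this file sits next to the vendor/ directory created by Composer.
$autoload = __DIR__ . '/vendor/autoload.php';

if (is_file($autoload)) {
    require_once $autoload;
}

// class_exists() triggers autoloading, so this confirms the class resolves.
$available = class_exists('Limelight\\Limelight');

echo $available
    ? "Limelight is installed and autoloadable.\n"
    : "Limelight was not found; run composer require first.\n";
```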

Parse Text

Make a new instance of Limelight\Limelight. Limelight takes no arguments.

use Limelight\Limelight;

$limelight = new Limelight();

Use the parse() method on the Limelight object to parse Japanese text.

$results = $limelight->parse('庭でライムを育てています。');

The returned object is an instance of Limelight\Classes\LimelightResults.

Get Results

Get results for the entire text using methods available on LimelightResults.

$results = $limelight->parse('庭でライムを育てています。');

echo 'Words: ' . $results->string('word') . "\n";
echo 'Readings: ' . $results->string('reading') . "\n";
echo 'Pronunciations: ' . $results->string('pronunciation') . "\n";
echo 'Lemmas: ' . $results->string('lemma') . "\n";
echo 'Parts of speech: ' . $results->string('partOfSpeech') . "\n";
echo 'Hiragana: ' . $results->toHiragana()->string('word') . "\n";
echo 'Katakana: ' . $results->toKatakana()->string('word') . "\n";
echo 'Romaji: ' . $results->string('romaji', ' ') . "\n";
echo 'Furigana: ' . $results->string('furigana') . "\n";

Output:
Words: 庭でライムを育てています。
Readings: ニワデライムヲソダテテイマス。
Pronunciations: ニワデライムヲソダテテイマス。
Lemmas: 庭でライムを育てる。
Parts of speech: noun postposition noun postposition verb symbol
Hiragana: にわでらいむをそだてています。
Katakana: ニワデライムヲソダテテイマス。
Romaji: niwa de raimu o sodateteimasu.
Furigana: (にわ)でライムを(そだ)てています。
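The Hiragana and Katakana lines in the output above are the same reading written in two scripts. The sketch below shows that relationship using PHP's standard mb_convert_kana() function. It is not how Limelight implements toHiragana() or toKatakana(); it is only an illustration, and it assumes the mbstring extension is enabled.

```php
<?php
// Illustration only: uses PHP's mbstring extension, not Limelight itself.
$reading = 'ニワデライムヲソダテテイマス。';

// Mode 'c' converts full-width katakana to full-width hiragana.
$hiragana = mb_convert_kana($reading, 'c', 'UTF-8');

// Mode 'C' converts full-width hiragana back to full-width katakana.
$katakana = mb_convert_kana($hiragana, 'C', 'UTF-8');

echo $hiragana . "\n"; // にわでらいむをそだてています。
echo $katakana . "\n"; // ニワデライムヲソダテテイマス。
```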

Alter the collection of words however you like using the library of collection methods.

Get individual words off the LimelightResults object by using one of several applicable collection methods. Use methods available on the returned LimelightWord object.

$results = $limelight->parse('庭でライムを育てています。');

$word1 = $results->pull(2);

$word2 = $results->where('word', '庭');

echo $word1->string('romaji') . "\n";

echo $word2->string('furigana') . "\n";

Output:
raimu
にわ

Methods on the LimelightResults object and the LimelightWord object follow the same conventions, but LimelightResults methods are plural (words()) while LimelightWord methods are singular (word()).

Alternatively, loop through all the words on the LimelightResults object.

$results = $limelight->parse('庭でライムを育てています。');

foreach ($results as $word) {
    echo $word->word() . ' is a ' . $word->partOfSpeech() . ' read like ' . $word->reading() . "\n";
}

Output:
庭 is a noun read like ニワ
で is a postposition read like デ
ライム is a noun read like ライム
を is a postposition read like ヲ
育てています is a verb read like ソダテテイマス
。 is a symbol read like 。
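The same loop can filter words as well as print them. The sketch below collects every word with a given part of speech. wordsByPartOfSpeech() and StubWord are hypothetical names introduced for illustration; the stub stands in for LimelightWord so the example runs without MeCab installed, and it mimics only the word() and partOfSpeech() methods shown above.

```php
<?php
// Hypothetical stand-in for LimelightWord, exposing only the two methods
// used in the loop above.
final class StubWord
{
    private $word;
    private $partOfSpeech;

    public function __construct($word, $partOfSpeech)
    {
        $this->word = $word;
        $this->partOfSpeech = $partOfSpeech;
    }

    public function word()
    {
        return $this->word;
    }

    public function partOfSpeech()
    {
        return $this->partOfSpeech;
    }
}

// Hypothetical helper: $words is anything usable in foreach whose items
// expose word() and partOfSpeech(), such as the result of parse().
function wordsByPartOfSpeech($words, $partOfSpeech)
{
    $matches = [];
    foreach ($words as $word) {
        if ($word->partOfSpeech() === $partOfSpeech) {
            $matches[] = $word->word();
        }
    }

    return $matches;
}

// Stub data mirroring the first four words of the output above.
$stubs = [
    new StubWord('庭', 'noun'),
    new StubWord('で', 'postposition'),
    new StubWord('ライム', 'noun'),
    new StubWord('を', 'postposition'),
];

echo implode(', ', wordsByPartOfSpeech($stubs, 'noun')) . "\n"; // 庭, ライム
```

With Limelight installed, passing the object returned by parse() in place of $stubs should behave the same way, since the loop relies only on the methods used in the example above.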

Full Documentation

Full documentation for Limelight can be found on the Limelight Wiki page.

Sources, Contributions, and Contributing

The Japanese parsing logic used in Limelight was adapted from Kimtaro's excellent Ruby program Ve. A big thank you to him and all the others who contributed to that project.

Limelight relies heavily on both MeCab and php-mecab.

Collection methods and methods in the Arr class were derived from Laravel's collection methods.

Contributors are more than welcome.