
b00f / Lilak

Licence: other
Persian Spell Checking Dictionary

Projects that are alternatives of or similar to Lilak

Zidian
A very large 28 GB dictionary
Stars: ✭ 38 (-55.81%)
Mutual labels:  dictionary
Buckets Js
A complete, fully tested and documented data structure library written in pure JavaScript.
Stars: ✭ 1,128 (+1211.63%)
Mutual labels:  dictionary
Dyno
Package dyno is a utility to work with dynamic objects at ease.
Stars: ✭ 81 (-5.81%)
Mutual labels:  dictionary
Dictionary Builder
Real world example to demonstrate advanced techniques to unmarshall very large xml document with very low memory footprint.
Stars: ✭ 40 (-53.49%)
Mutual labels:  dictionary
Dictionarydata
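High-quality English dictionary: 400+ word books and 60,000+ words. The word books cover primary school, middle school, high school, postgraduate and doctoral entrance exams, study abroad (GRE, TOEFL, etc.), and more. What's not to like?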
Stars: ✭ 40 (-53.49%)
Mutual labels:  dictionary
Memorize
🚀 Japanese-English-Mongolian dictionary. It lets you find words, kanji and more quickly and easily
Stars: ✭ 72 (-16.28%)
Mutual labels:  dictionary
Probable Wordlists
Version 2 is live! Wordlists sorted by probability originally created for password generation and testing - make sure your passwords aren't popular!
Stars: ✭ 7,312 (+8402.33%)
Mutual labels:  dictionary
Pyglossary
A tool for converting dictionary files aka glossaries. The primary purpose is to be able to use our offline glossaries in any Open Source dictionary we like on any OS/device.
Stars: ✭ 1,257 (+1361.63%)
Mutual labels:  dictionary
Orthrus
A tool to manage, conduct, and assess dictionary-based fuzz testing
Stars: ✭ 61 (-29.07%)
Mutual labels:  dictionary
Dictionary
Dictionary of Ukrainian counterparts for technical terms
Stars: ✭ 79 (-8.14%)
Mutual labels:  dictionary
Ieml
IEML semantic language - a meaning-representation system based on semantic primitives and a regular grammar. Basic semantic relationships between concepts are automatically computed from syntactic similarities.
Stars: ✭ 41 (-52.33%)
Mutual labels:  dictionary
Phpcollections
A set of collections for PHP.
Stars: ✭ 53 (-38.37%)
Mutual labels:  dictionary
Color Names
Large list of handpicked color names 🌈
Stars: ✭ 1,198 (+1293.02%)
Mutual labels:  dictionary
Slackword
Dictionary in your slack....additionally, you can get random words.
Stars: ✭ 39 (-54.65%)
Mutual labels:  dictionary
Qolibri
Continuation of the qolibri EPWING dictionary/book reader
Stars: ✭ 82 (-4.65%)
Mutual labels:  dictionary
Google Ime Dictionary
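Add-on IME dictionary for Japanese-to-English conversion and English abbreviation expansion 📙 An IME extension dictionary that enables Japanese-to-English conversion and English abbreviation expansion in Google Japanese Input, ATOK, and similar input methods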
Stars: ✭ 30 (-65.12%)
Mutual labels:  dictionary
Mjextension
A fast, convenient and nonintrusive conversion framework between JSON and model. Your model class doesn't need to extend any base class. You don't need to modify any model file.
Stars: ✭ 8,458 (+9734.88%)
Mutual labels:  dictionary
Imtools
Fast and memory-efficient immutable collections and helper data structures
Stars: ✭ 85 (-1.16%)
Mutual labels:  dictionary
Awesome Pronunciation
💬 How to pronounce Programming words?
Stars: ✭ 84 (-2.33%)
Mutual labels:  dictionary
Dev Terms
A list of generic terminology used by developers
Stars: ✭ 76 (-11.63%)
Mutual labels:  dictionary

Lilak, Persian Spell Checking Dictionary


Lilak is an open source project that generates a Persian dictionary for the hunspell spell checker, based on Persian morphology.

In Persian, affixes can change the meaning of a word. Some suffixes attach to a word as short forms of verbs. Part-of-speech plays an important role, and in some cases the pronunciation of a word determines which form of a suffix it takes. Check the code for more information.

Lilak has a lexicon of Persian words with part-of-speech tags. From this lexicon, Lilak builds a hunspell dictionary that predicts the best form of compound words based on morphological rules.
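
As a rough illustration of the idea, the sketch below turns a part-of-speech tagged word list into the text of a hunspell .dic file (an entry count on the first line, then one `word/FLAGS` entry per line). The `POS_FLAGS` mapping, the flag letters, the `build_dic` function, and the sample words are all assumptions made for this example; they are not taken from lilak.py, whose real rules live in src/data/affixes.

```python
# Hypothetical mapping from POS tag to a hunspell affix flag.
# These flag letters are invented for illustration; Lilak's real flags may differ.
POS_FLAGS = {"noun": "N", "adjective": "A", "verb": "V"}


def build_dic(lexicon):
    """Build the text of a hunspell .dic file from (word, pos) pairs.

    Words with a known POS tag get the matching affix flag appended
    after a slash; words with an unknown tag are written without flags.
    """
    entries = set()
    for word, pos in lexicon:
        flag = POS_FLAGS.get(pos, "")
        entries.add(f"{word}/{flag}" if flag else word)
    ordered = sorted(entries)
    # The first line of a .dic file is the (approximate) number of entries.
    return "\n".join([str(len(ordered))] + ordered) + "\n"


# Made-up sample lexicon: "book" (noun), "good" (adjective), "hello" (untagged class).
sample = [("کتاب", "noun"), ("خوب", "adjective"), ("سلام", "interjection")]
dic_text = build_dic(sample)
print(dic_text)
```

In a real dictionary, each flag would be defined in the accompanying .aff file, which lists the prefixes and suffixes that words carrying that flag may take.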

Content

lilak
  |-- build           : Build folder. Compiled dictionary goes here.
  |
  |-- src
  |   |-- data
  |   |   |-- lexicon       : Lexicon of Persian words with part-of-speech tags
  |   |   |-- affixes       : Affix (prefix or suffix) rules
  |   |   |-- dic_users     : List of words without POS tag.
  |   |   \-- verbs.htm     : List of Persian verbs (unstemmed)
  |   |
  |   |-- lilak.py    : Python script for building lilak dictionary
  |   \-- test.py     : Python script to test lilak accuracy
  |
  |-- test
  |   |-- text1       : "Farsi(Persian) is Sugar", A short story by Mohammad-Ali Jamalzadeh
  |   |-- text2       : "A Hekayat" By Saadi
  |   |-- text3       : "A Ghazal" By Hafez
  |   |-- text4       : "Yazdgerd Kingdom" By Ferdowsi
  |   |-- text5       : "A Ghazal" By Muhammad Husayn Tabataba'i
  |   |-- text6       : "Have a Safe Trip" A poem by Shafii Kadkani
  |   |-- text7       : "Se Tar" A short story by Jalal Al-e-Ahmad
  |   |-- text8       : "End of Shahname" By Mehdi Akhavan-Sales
  |   |-- text9       : "The Water's Footsteps" By Sohrab Sepehri
  |   |-- text10      : "Nei Name" By Rumi
  |   \-- verbs       : Some inflected verbs
  |
  |-- README.md       :
  \-- LICENCE         : License file

Building Dictionary

Before using lilak, please make sure you have installed Python 3.x.

To build the lilak dictionary with lilak.py (in the src folder) and then test it, run:

make build
make test

You can find the compiled dictionary in the build folder.

Check result.log for the test results.

How to contribute

The best way to contribute to this project is to collect words with correct part-of-speech tags. Part-of-speech information is essential for building Lilak. Each word should be classified into a main type such as verb, noun, or adjective. Other tags are also useful, such as verb tense or singular versus plural. Check src/data/lexicon for more information.
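
Before submitting new words, a quick sanity check can catch entries that are missing a tag. The sketch below assumes a tab-separated `word<TAB>pos` layout and a three-tag set taken from the categories named above; both are assumptions for illustration only, and `check_entries` is a hypothetical helper. The actual format is whatever src/data/lexicon uses.

```python
# Assumed tag set, based only on the main types named above.
# The real lexicon may use different or additional tags.
KNOWN_POS = {"verb", "noun", "adjective"}


def check_entries(lines):
    """Return (line_number, problem) pairs for malformed entries.

    Assumes one entry per line in the form 'word<TAB>pos'.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 2 or not parts[0].strip():
            problems.append((number, "expected 'word<TAB>pos'"))
        elif parts[1] not in KNOWN_POS:
            problems.append((number, f"unknown POS tag: {parts[1]}"))
    return problems


# Made-up sample: the third line has no tag, the fourth has an unknown tag.
sample_lines = ["کتاب\tnoun", "خوب\tadjective", "رفتن", "سلام\tgreeting"]
problems = check_entries(sample_lines)
for number, problem in problems:
    print(f"line {number}: {problem}")
```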

Please open an issue if you find any mistakes while using lilak.

Using Lilak

  • You can find compiled dictionaries here.
  • Mozilla Firefox: Install the lilak extension from here.
  • Google Chrome: Go to Settings, find the Language and input settings, add the Persian language, and make sure the spell checker option is enabled.

Supporting Lilak

If you like this project, please donate or consider becoming a patron.

License

Lilak is published under the Apache licence. You may freely use, reproduce, modify, or distribute it. If you find lilak useful, please support it.

About the Name

The English word "lilac" came from French lilac ("shrub of genus Syringa with mauve flowers"), from Spanish lilac, from Arabic lilak, from Persian lilak, a variant of nilak ("bluish").

In Memory of Abolhassan Najafi

Abolhassan Najafi was an associate member of Iran's Academy of Persian Language and Literature. His most famous book is "Ghalat Nanevisim" ("Let's Not Write Incorrectly").

Thanks

Special thanks to
