srix / pytamil

License: MIT
The பைந்தமிழ் (pytamil) library is intended for analyzing Tamil literary works. A wealth of knowledge is hidden in old literature; these texts are time machines to the past. Ever wondered what the popular color or food was in the Tamil-speaking world in 500 AD? The answer is hidden in literature. With the right computer tools, it becomes possible for us to dig…

Programming languages: Python, ANTLR

பைந்தமிழ் (pytamil)

A library that can do the following:

Tamil letters and sandhi (புணர்ச்சி)

எழுத்து.மெல்லினம்

['ங்', 'ஞ்', 'ண்', 'ந்', 'ம்', 'ன்']

எழுத்து.குறில்

['அ', 'இ', 'உ', 'எ', 'ஒ']

புணர்ச்சி.தனிமொழி_ஆக்கு('விருந்தோம்பல்')

['விருந்து', 'ஓம்பல்']

புணர்ச்சி.தொடர்மொழி_ஆக்கு('விருந்து', 'ஓம்பல்' )

விருந்தோம்பல்

மாத்திரை.மாத்திரை_கொடு('பைந்தமிழ்')

[2, 0.5, 1, 1, 0.5]
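
The durations above follow the textbook assignment of மாத்திரை: 1 for a short vowel, 2 for a long vowel, and 0.5 for a pure consonant. Below is a minimal sketch of that assignment, not the library's implementation. The function name `maathirai_sketch` and the helper `is_consonant` are hypothetical, and the sketch ignores refinements such as shortened vowels (குற்றியலுகரம், ஐகாரக்குறுக்கம்).

```python
import unicodedata

PULLI = "\u0bcd"   # dot that marks a pure consonant
AYTHAM = "\u0b83"  # the letter aytham
SHORT_VOWELS = set("அஇஉஎஒ")
LONG_VOWELS = set("ஆஈஊஏஐஓஔ")
# Dependent vowel signs, written as escapes: i, u, e, o
SHORT_SIGNS = {"\u0bbf", "\u0bc1", "\u0bc6", "\u0bca"}
# Dependent vowel signs: aa, ii, uu, ee, ai, oo, au
LONG_SIGNS = {"\u0bbe", "\u0bc0", "\u0bc2", "\u0bc7", "\u0bc8", "\u0bcb", "\u0bcc"}


def is_consonant(ch):
    # Tamil consonant letters are encoded in the range U+0B95..U+0BB9.
    return "\u0b95" <= ch <= "\u0bb9"


def maathirai_sketch(word):
    """Return one duration per letter (hypothetical helper, simplified rules)."""
    # NFC joins two-part vowel signs (for example e + aa) into one code point.
    text = unicodedata.normalize("NFC", word)
    durations = []
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch in SHORT_VOWELS:
            durations.append(1)
            i += 1
        elif ch in LONG_VOWELS:
            durations.append(2)
            i += 1
        elif ch == AYTHAM:
            durations.append(0.5)
            i += 1
        elif is_consonant(ch):
            if nxt == PULLI:
                durations.append(0.5)  # pure consonant
                i += 2
            elif nxt in LONG_SIGNS:
                durations.append(2)    # consonant + long vowel
                i += 2
            elif nxt in SHORT_SIGNS:
                durations.append(1)    # consonant + short vowel
                i += 2
            else:
                durations.append(1)    # consonant with inherent short a
                i += 1
        else:
            raise ValueError(f"unexpected character: {ch!r}")
    return durations


print(maathirai_sketch("பைந்தமிழ்"))
```

Walking the text one base letter at a time, and looking ahead for a sign, keeps the sketch independent of any grapheme-splitting library.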

Converting modern letters to ancient letters

தமிழ்.பிரம்மி('வணக்கம்')

𑀯𑀡𑀓𑀓𑀫
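
One way to picture this conversion is as a lookup table from Tamil consonants to code points in the Unicode Brahmi block. The sketch below assumes exactly that and is not the library's implementation: `to_brahmi_sketch` and the partial table are hypothetical, only ten consonants are covered, and the pulli is dropped to mirror the sample output above.

```python
import unicodedata

PULLI = "\u0bcd"  # dot that marks a pure consonant

# Partial, illustrative table: Tamil consonant -> Unicode Brahmi letter name.
_BRAHMI_NAMES = {
    "க": "KA", "ண": "NNA", "த": "TA", "ந": "NA", "ப": "PA",
    "ம": "MA", "ய": "YA", "ர": "RA", "ல": "LA", "வ": "VA",
}
TAMIL_TO_BRAHMI = {
    tamil: unicodedata.lookup("BRAHMI LETTER " + name)
    for tamil, name in _BRAHMI_NAMES.items()
}


def to_brahmi_sketch(word):
    """Map each consonant to Brahmi and skip the pulli (hypothetical helper)."""
    out = []
    for ch in unicodedata.normalize("NFC", word):
        if ch == PULLI:
            continue  # the sample output shows no mark for pure consonants
        if ch not in TAMIL_TO_BRAHMI:
            raise ValueError(f"no mapping in this sketch for {ch!r}")
        out.append(TAMIL_TO_BRAHMI[ch])
    return "".join(out)


print(to_brahmi_sketch("வணக்கம்"))
```

Using `unicodedata.lookup` with character names avoids hard-coding code points, at the cost of building the table at import time.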

தமிழ்.பண்டைய_வாக்கியம்_ஆக்கு(வாக்கியம் = 'வணக்கம்', வருடம் = 300 )

Analyzing prosody (யாப்பு)

Thirukkural: Porutpal: Kural 467

எண்ணித் துணிகக் கருமம் துணிந்தபின்
எண்ணுவம் என்பது இழுக்கு

[Figure: kural_parse_tree]

Nalavenba by Pugazhendi Pulavar: 1

ஆதித் தனிக்கோல மானா னடியவற்காச்
சோதித் திருத்தூணிற் றோன்றினான் வேதத்தின்
முன்னின்றான் வேழம் முதலே யெனவழைப்ப
என்னென்றா னெங்கட் கிறை

[Figure: nerisai_parse_tree]

Why Pytamil

The பைந்தமிழ் (pytamil) library is intended for analyzing Tamil literary works. A wealth of knowledge is hidden in old literature; these texts are time machines to the past. Ever wondered what the popular color or food was in the Tamil-speaking world in 500 AD? The answer is hidden in literature. With the right computer tools, it becomes possible for us to dig into this wealth of knowledge.

The core philosophy of the பைந்தமிழ் (pytamil) library is to clearly separate Tamil language concepts from the programming language. For example, the Tamil புணர்ச்சி (sandhi) rules are captured in a human-readable YAML text file, புணர்ச்சிவிதிகள்.yaml. This approach has three major benefits:

  1. People with no prior knowledge of computer programming can contribute to the project and have more meaningful and natural discussions about the language concepts.
  2. A similar approach can be used to implement libraries for other human languages such as Sanskrit, Telugu, and Kannada.
  3. Developers can use the core Tamil language files to port this library to other programming languages such as JavaScript and C#.
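
To illustrate the separation of rules from code, here is a minimal sketch in which the joining logic reads its rules from a plain data table. Everything in it is an assumption made for illustration: the rule fields, the `RULES` table (standing in for what a parsed rules file might look like), and `join_words_sketch` are hypothetical and do not reflect the actual format of புணர்ச்சிவிதிகள்.yaml. Only two simplified rules are shown, and real sandhi has many more conditions.

```python
import unicodedata

# Independent vowel -> dependent vowel sign (the short a is inherent: no sign).
VOWEL_TO_SIGN = {
    "அ": "", "ஆ": "\u0bbe", "இ": "\u0bbf", "ஈ": "\u0bc0",
    "உ": "\u0bc1", "ஊ": "\u0bc2", "எ": "\u0bc6", "ஏ": "\u0bc7",
    "ஐ": "\u0bc8", "ஒ": "\u0bca", "ஓ": "\u0bcb", "ஔ": "\u0bcc",
}

# Hypothetical rule table. In a data-driven design this would come from a
# human-editable file rather than being written in the source code.
RULES = [
    {
        "name": "final short u elides before a vowel",
        "first_ends": "\u0bc1",   # vowel sign u
        "second_begins": "vowel",
        "drop_from_first": 1,
    },
    {
        "name": "final pure consonant takes the following vowel",
        "first_ends": "\u0bcd",   # pulli
        "second_begins": "vowel",
        "drop_from_first": 1,
    },
]


def join_words_sketch(first, second, rules=RULES):
    """Join two words with the first matching rule (hypothetical helper)."""
    first = unicodedata.normalize("NFC", first)
    second = unicodedata.normalize("NFC", second)
    for rule in rules:
        begins_with_vowel = second[:1] in VOWEL_TO_SIGN
        if (first.endswith(rule["first_ends"])
                and rule["second_begins"] == "vowel"
                and begins_with_vowel):
            stem = first[: len(first) - rule["drop_from_first"]]
            merged = stem + VOWEL_TO_SIGN[second[0]] + second[1:]
            return unicodedata.normalize("NFC", merged)
    return first + second  # no rule matched: plain concatenation


print(join_words_sketch("விருந்து", "ஓம்பல்"))
```

Because the function only interprets the table, a contributor could add or correct a rule without touching the joining code, which is the benefit the list above describes.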

List of core Tamil language files

TODO

If you have a feature in mind, please add a feature request here with the label enhancement.

  • return the original words when a combined word is presented, by predictive decomposition using புணர்ச்சி விதிகள் (sandhi rules)
  • build a pip module
  • and many more

For Developers

Getting started
