masnormen / carakanjs

Licence: MIT license
Convert/transliterate Latin script into Javanese script, also known as Aksara Jawa or Carakan.

Programming Languages: TypeScript, JavaScript

Carakan.js

Carakan.js is a small library for converting/transliterating Latin script into Javanese script, also known as Aksara Jawa or Carakan.

👀 Why this library?

Yes, I know there are already many Javanese script transliteration libraries out there, but they are not accurate, at least for words with a complicated syllable structure, which is common in Javanese. Examples include "ngglembyar" and "nggrambyang".

The complexity of the Javanese script writing rules makes things difficult. Therefore, I want to create a library that transliterates more accurately from Latin into Javanese script and vice versa, with both linguistic complexity and ease of use in mind, so we can just input the regular Javanese text we usually read and write in Latin script in everyday conversation.

Carakan.js is also fast, taking less than 2 milliseconds to convert a simple sentence. The library is also extensively tested against various sentences and use cases. The tests are available in the project repository.

🚀 Features

Currently, Carakan.js can handle:

  • Basic Hanacaraka (20 basic characters) and its Pasangan
  • Sandhangan Swara (wulu, taling, pepet, suku, taling tarung)
  • Sandhangan Wyanjana (cakra, wignyan, etc) and Panjingan
  • Angka
  • Aksara Swara
  • Aksara Rekan
  • Aksara Murda
  • Aksara Ganten
  • Pada (Punctuations)
  • Accented input (as used on Wikipedia basa Jawa)
  • ...and many more (see the code yourself!)

📦 Installation

NPM:

$ npm install carakanjs

Yarn:

$ yarn add carakanjs

⌨️ Usage

Example with default options

import { toJavanese } from "carakanjs";

const x = toJavanese("blumbang gxmblundhung kxmambang");

// equivalent call with the default config spelled out (optional)
const y = toJavanese("blumbang gxmblundhung kxmambang", {useAccents: false, useSwara: true, useMurda: true});

console.log(x);

// => ꦧ꧀ꦭꦸꦩ꧀ꦧꦁꦒꦼꦩ꧀ꦧ꧀ꦭꦸꦤ꧀ꦝꦸꦁꦏꦼꦩꦩ꧀ꦧꦁ

Writing Pepet and Taling sounds (with default config)

// pepet is "x"
// taling is "e"

toJavanese("es dawxt");
// => ꦲꦺꦱ꧀ꦢꦮꦼꦠ꧀

Writing Pepet and Taling sounds (with useAccents = true)

// pepet is "e"
// taling is "é", "è", or "e`" (e + backtick)


toJavanese("e`s dawet", {useAccents: true});
// or
toJavanese("és dawet", {useAccents: true});
// => ꦲꦺꦱ꧀ꦢꦮꦼꦠ꧀

// example text from Wikipedia basa Jawa
toJavanese(
  "référèndhum menika mutusaken Timor Wétan pisah",
  {useAccents: true}
);
// => ꦫꦺꦥ꦳ꦺꦫꦺꦤ꧀ꦝꦸꦩ꧀ꦩꦼꦤꦶꦏꦩꦸꦠꦸꦱꦏꦼꦤ꧀ꦡꦶꦩꦺꦴꦂꦮꦺꦠꦤ꧀ꦥꦶꦱꦃ

Writing Aksara Swara, Murda, and Rekan

toJavanese("GUSTI ALLAH YA KHALIK");

// => ꦓꦸꦯ꧀ꦡꦶꦄꦭ꧀ꦭꦃꦪꦏ꦳ꦭꦶꦑ꧀

Writing Angka (Numbers)

// pada pangkat (꧇) will be automatically added around numbers
toJavanese("tanggal 17 bulan 8 taun 1945");

// => ꦠꦁꦒꦭ꧀꧇꧑꧗꧇ꦧꦸꦭꦤ꧀꧇꧘꧇ꦠꦲꦸꦤ꧀꧇꧑꧙꧔꧕꧇
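
The number handling can be pictured with a small standalone sketch. It is not taken from the Carakan.js source: the helper names toAngka and convertNumbers are hypothetical, and the only assumptions are the Unicode code points of the Javanese digits (U+A9D0 to U+A9D9) and of pada pangkat (U+A9C7).

```typescript
// Hypothetical helpers for illustration; not part of the Carakan.js API.
const PADA_PANGKAT = "\uA9C7"; // ꧇
const ANGKA_ZERO = 0xa9d0; // code point of the Javanese digit zero (꧐)

// Convert one run of ASCII digits into Javanese digits wrapped in pada pangkat.
function toAngka(digits: string): string {
  const converted = Array.from(digits, (d) =>
    String.fromCodePoint(ANGKA_ZERO + Number(d))
  ).join("");
  return PADA_PANGKAT + converted + PADA_PANGKAT;
}

// Replace every digit run in a string, leaving all other text untouched.
function convertNumbers(text: string): string {
  return text.replace(/[0-9]+/g, (run) => toAngka(run));
}

console.log(convertNumbers("17")); // ꧇꧑꧗꧇
```

This sketch only covers the digits themselves; the surrounding Latin text would still have to go through the transliterator.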

Writing Pada (Punctuations)

toJavanese("{<||,:.'\":()>}");

// => ꧁꧌꧋꧋꧈꧇꧉꧊꧊꧇꧊꧊꧍꧂

Writing Aksara Ganten & Panjingan

toJavanese("kreta krxtxg, lxmah rxgxd");

// => ꦏꦿꦺꦠꦏꦽꦠꦼꦒ꧀‌ꦊꦩꦃꦉꦒꦼꦢ꧀

*️⃣ Table of Punctuations

Name                  Input               Output
Pada lingsa *         ,                   ꧈
Pada lungsi *         .                   ꧉
Pada pangkat          :                   ꧇
Pada adeg             " or ' or ( or )    ꧊
Pada adeg-adeg        |                   ꧋
Pada piseleh          <                   ꧌ ......
Pada piseleh walik    >                   ...... ꧍
Rerenggan kiwa        {                   ꧁ ...
Rerenggan tengen      }                   ... ꧂

*) Pada Lingsa (comma) will not be rendered if a Pangkon is next to it. Pada Lungsi (period) will be reduced to Pada Lingsa if a Pangkon is next to it. This behavior adheres to the rules of Javanese writing.
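
The Pangkon rule in the footnote can be sketched as a post-processing step on already transliterated text. This is only an illustration, not the library's implementation: the function name applyPangkonRule is hypothetical, and it assumes that "next to" means the Pangkon (U+A9C0) comes immediately before the punctuation mark.

```typescript
// Hypothetical helper for illustration; not part of the Carakan.js API.
const PANGKON = "\uA9C0"; // ꧀
const PADA_LINGSA = "\uA9C8"; // ꧈ (comma)
const PADA_LUNGSI = "\uA9C9"; // ꧉ (period)

// Assumption: the rule applies when a Pangkon directly precedes the mark.
// A single pass with a callback avoids turning a reduced Pada Lungsi
// into nothing on a second replacement.
function applyPangkonRule(javanese: string): string {
  return javanese.replace(
    /\uA9C0([\uA9C8\uA9C9])/g,
    (_match: string, pada: string) =>
      pada === PADA_LINGSA ? PANGKON : PANGKON + PADA_LINGSA
  );
}
```

With this sketch, a Pada Lingsa after a Pangkon disappears, a Pada Lungsi after a Pangkon becomes a Pada Lingsa, and punctuation after any other character is left unchanged.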

🔥 API

The Carakan.js package exports two things: the toJavanese() function and the CarakanHelper namespace, which contains various helpers.

toJavanese(input, config?)

Returns a string of Javanese script converted from input, using the given config.

input

Type: string

A string of Latin characters which will be transliterated into Javanese script.

config.useAccents

Type: boolean, default: false

A boolean indicating whether Carakan.js should convert the input string with accents. There are two modes of input:

  • Non-accented mode (default): In this mode, Carakan.js will treat the letter "x" as Pepet (schwa sound) and "e" as Taling (see examples above).
  • Accented mode: The "formally and academically correct" way to write Javanese in Latin, typically used in Wikipedia basa Jawa texts. In this mode, Carakan.js will treat the letter "e" as Pepet and "é"/"è"/"e`" as Taling. "x" will still be treated as Pepet (see examples above).

Internally, the transliterator engine can only read strings in non-accented mode. When useAccents is set to true, Carakan.js first converts the accented input into non-accented form and then converts the result into Javanese script.
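
The conversion from accented to non-accented input might look roughly like the sketch below. It is an assumption about how such a step could work, not the actual Carakan.js code: the function name toNonAccented is hypothetical, and only lowercase letters are handled.

```typescript
// Hypothetical helper for illustration; not part of the Carakan.js API.
function toNonAccented(input: string): string {
  // Temporary marker for Taling, chosen because it should not occur in input.
  const TALING_MARK = "\u0000";
  return input
    .normalize("NFC") // make sure "é" and "è" are single code points
    .replace(/e`|[éè]/g, TALING_MARK) // mark every Taling spelling
    .replace(/e/g, "x") // any remaining plain "e" is Pepet
    .replace(/\u0000/g, "e"); // Taling is written "e" in non-accented mode
}

console.log(toNonAccented("és dawet")); // es dawxt
```

The placeholder is needed because the two replacements would otherwise interfere: turning "é" into "e" first would cause the next step to turn it into "x" as well.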

config.useSwara

Type: boolean, default: true

A boolean indicating whether Carakan.js should convert uppercase vowels (A, I, U, E, O) into Aksara Swara. If set to false, Carakan.js will render them as regular vowel sounds written with the letter "ha".

config.useMurda

Type: boolean, default: true

A boolean indicating whether Carakan.js should convert some uppercase consonants (N, K, T, S, P, NY, G, B) into Aksara Murda. If set to false, Carakan.js will render them with their regular Javanese script character.
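
The three options and their documented defaults can be summarized in a short sketch. The option names and default values come from the descriptions above; the type name CarakanConfig and the helper resolveConfig are hypothetical and only show how a partial config could be merged with the defaults.

```typescript
// Hypothetical type and helper for illustration; not part of the Carakan.js API.
interface CarakanConfig {
  useAccents: boolean;
  useSwara: boolean;
  useMurda: boolean;
}

// Defaults as documented above.
const DEFAULT_CONFIG: CarakanConfig = {
  useAccents: false,
  useSwara: true,
  useMurda: true,
};

// Options that the caller leaves out fall back to the defaults.
function resolveConfig(config: Partial<CarakanConfig> = {}): CarakanConfig {
  return { ...DEFAULT_CONFIG, ...config };
}

console.log(resolveConfig({ useAccents: true }));
// { useAccents: true, useSwara: true, useMurda: true }
```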

CarakanHelper

A namespace which contains various helpers used by the engine to convert Latin letters into Javanese script.

🧰 TODO

  • support transliteration of Javanese script back to Latin
  • support more Sandhangan: Swara Dirga (for long vowels, typically used to write Sanskrit)
  • support more punctuations: Pangrangkep, Pada Luhur, Pada Windu, Purwa Pada, Madya Pada, Wasana Pada
