All Projects → deniskyashif → ssfst

deniskyashif / ssfst

Licence: MIT license
📜 Rewrite text in linear time.

Programming Languages

javascript
184084 projects - #8 most used programming language

Projects that are alternatives of or similar to ssfst

State
Finite state machine for TypeScript and JavaScript
Stars: ✭ 118 (+51.28%)
Mutual labels:  finite-state-machine
Sm
🚀 SM – a static State Machine library
Stars: ✭ 149 (+91.03%)
Mutual labels:  finite-state-machine
Awesome Fsm
🤖 A curated list of awesome resources related to finite state machines and statecharts.
Stars: ✭ 189 (+142.31%)
Mutual labels:  finite-state-machine
Automata
A Python library for simulating finite automata, pushdown automata, and Turing machines
Stars: ✭ 121 (+55.13%)
Mutual labels:  finite-state-machine
Arduino Fsm
Arduino library for implementing a finite state machine.
Stars: ✭ 142 (+82.05%)
Mutual labels:  finite-state-machine
Dasync
Every developer deserves the right of creating microservices without using any framework 🤍
Stars: ✭ 154 (+97.44%)
Mutual labels:  finite-state-machine
Hsm
Finite state machine library based on the boost hana meta programming library. It follows the principles of the boost msm and boost sml libraries, but tries to reduce own complex meta programming code to a minimum.
Stars: ✭ 106 (+35.9%)
Mutual labels:  finite-state-machine
regexjs
A fast and minimal regular expression engine.
Stars: ✭ 64 (-17.95%)
Mutual labels:  finite-state-machine
Django Fsm
Django friendly finite state machine support
Stars: ✭ 1,898 (+2333.33%)
Mutual labels:  finite-state-machine
Python Statemachine
Python Finite State Machines made easy.
Stars: ✭ 184 (+135.9%)
Mutual labels:  finite-state-machine
Tilakone
Minimalistic finite state machine (FSM) in Clojure
Stars: ✭ 128 (+64.1%)
Mutual labels:  finite-state-machine
Statelin
A finite state machine for Kotlin and Android
Stars: ✭ 134 (+71.79%)
Mutual labels:  finite-state-machine
Entity Controller
Entity and lighting controller for managing devices via timers, scripts, and sun-based time restrictions.
Stars: ✭ 156 (+100%)
Mutual labels:  finite-state-machine
Microwf
A simple finite state machine (FSM) with workflow character where you define your workflows in code.
Stars: ✭ 122 (+56.41%)
Mutual labels:  finite-state-machine
Fluent State Machine
Fluent API for creating state machines in C#
Stars: ✭ 195 (+150%)
Mutual labels:  finite-state-machine
Afsm
C++14 Finite State Machine library
Stars: ✭ 113 (+44.87%)
Mutual labels:  finite-state-machine
Nanostate
🚦- Small Finite State Machines
Stars: ✭ 151 (+93.59%)
Mutual labels:  finite-state-machine
datafsm
Machine Learning Finite State Machine Models from Data with Genetic Algorithms
Stars: ✭ 14 (-82.05%)
Mutual labels:  finite-state-machine
Pluck
Pluck text in a fast and intuitive way 🐓
Stars: ✭ 202 (+158.97%)
Mutual labels:  finite-state-machine
Easy States
The simple, stupid state machine for Java
Stars: ✭ 167 (+114.1%)
Mutual labels:  finite-state-machine

Subsequential Finite State Transducer

Build Status Coverage Status Code Climate Dependencies

Given an input text, produces a new text by applying a fixed set of rewrite rules. The algorithm builds a minimal subsequential transducer and uses the "leftmost largest match" replacement strategy with skips. No overlap between the replaced parts is possible. The time needed to compute the transducer is linear in the size of the input dictionary. For any text t of length |t| the time it takes to perform a rewrite is also linear O(|t|+|t'|) where t' denotes the resulting output string.
Check out the Online Sandbox.

Usage

npm i --save ssfst

Example: Text Rewriting

const ssfst = require('ssfst');

const spellingCorrector = ssfst.init([
    { input: 'acheive', output: 'achieve'},
    { input: 'arguement', output: 'argument'},
    { input: 'independant', output: 'independent'},
    { input: 'posession', output: 'possession'},
    { input: 'mercy less', output: 'merciless' }
]);

spellingCorrector.process('independant'); // => "independent"
spellingCorrector.process('mercy less arguement'); // => "merciless argument"
spellingCorrector.process('they acheived a lot'); // => "they achieved a lot"

The init factory function takes a collection of pairs and returns a transducer. The transducer can be initialized by any iterable object.

function* dictGen() {
    yield { input: 'dog', output: '<a href="https://en.wikipedia.org/wiki/Dog">dog</a>' };
    yield { input: 'fox', output: '<a href="https://en.wikipedia.org/wiki/Fox">fox</a>' };
}

const transducer = ssfst.init(dictGen());
transducer.process('The quick brown fox jumped over the lazy dog.');
/* => The quick brown <a href="https://en.wikipedia.org/wiki/Fox">fox</a> jumped over the lazy <a href="https://en.wikipedia.org/wiki/Dog">dog</a>. */

Working with large datasets

Loading the full rewrite dictionary in memory is not optimal when working with large datasets. In this case we want to build the transducer by adding the entries asynchronously one at a time. This is achieved by using an async iterable.

For example, if our dataset is stored in a file, we can read its contents one line at a time.

Berlin,Germany
Buenos Aires,Argentina
London,United Kingdom
Sofia,Bulgaria
Tokyo,Japan

This is the dictionary text file. Each line contains an entry and its input and output values are separated by a comma. We implement a generator function which reads it asynchronously line by line and yields an object which is consumed by the initialization of the transducer.

const fs = require('fs');
const readline = require('readline');
const ssfst = require('ssfst');

async function* readLinesGenAsync() {
    const lineReader = readline.createInterface({
        input: fs.createReadStream(__dirname + '/capitals.txt')
    });

    for await (const line of lineReader) {
        const [input, output] = line.split(',');
        yield { input, output };
    }
}

We pass the async iterable to the initAsync factory function.

const transducer = await ssfst.initAsync(readLinesGenAsync());

Example: Key-Value Store

The subsequential transducer can also be used to efficiently store key-value pairs.

const val = transducer.process('Sofia'); // => Bulgaria
const invalid = transducer.process('Unknown Key'); // => Unknown Key

If there's no value for a given key, it will return the key itself, which simply reduces to processing a text without applying any rewrite rules.

Use with TypeScript

import * as ssfst from 'ssfst';

Run Locally

git clone https://github.com/deniskyashif/ssfst.git
cd ssfst
npm i

Sample implementations can be found in examples/.

Run the Tests

npm t

References

This implementation follows the construction presented in "Efficient Dictionary-Based Text Rewriting using Subsequential Transducers" by S. Mihov, K. Schulz

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].