
jedld / multi_string_replace

Licence: MIT license
A fast multiple string replace library for Ruby. It uses a C implementation of the Aho–Corasick algorithm based on https://github.com/morenice/ahocorasick and adds support for on-the-fly multiple string replacement. It is a faster alternative to String#gsub for non-regex (exact match) use cases.

Gem

MultiStringReplace

A fast multiple string replace library for Ruby. It uses a C implementation of the Aho–Corasick algorithm based on https://github.com/morenice/ahocorasick, with a few performance enhancements and support for on-the-fly multiple string replacement.

If regular expressions are not needed, this library offers significant performance advantages over String#gsub for large strings and large numbers of tokens.
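To illustrate why Aho–Corasick suits this workload, here is a minimal pure-Ruby sketch of the algorithm. It is not the gem's C code; the class name `TinyAhoCorasick` and its internals are hypothetical and exist only for illustration. All tokens are compiled into one trie with failure links, so the text is scanned once regardless of how many tokens there are.

```ruby
# Illustrative pure-Ruby Aho-Corasick matcher (hypothetical; not the gem's C implementation).
class TinyAhoCorasick
  def initialize(patterns)
    @patterns = patterns
    @goto = [{}]  # @goto[state][char] -> next state (the trie edges)
    @fail = [0]   # @fail[state] -> state for the longest proper suffix that is also a trie prefix
    @out  = [[]]  # @out[state] -> indexes of patterns that end at this state
    build_trie
    build_failure_links
  end

  # Scans text once and returns { pattern_index => [start offsets] }.
  def match(text)
    hits = Hash.new { |h, k| h[k] = [] }
    state = 0
    text.each_char.with_index do |ch, i|
      # Follow failure links until some state accepts ch (or we reach the root).
      state = @fail[state] while state != 0 && !@goto[state].key?(ch)
      state = @goto[state].fetch(ch, 0)
      @out[state].each { |p| hits[p] << i - @patterns[p].length + 1 }
    end
    hits
  end

  private

  def build_trie
    @patterns.each_with_index do |pattern, index|
      state = 0
      pattern.each_char do |ch|
        unless @goto[state].key?(ch)
          @goto << {}
          @fail << 0
          @out << []
          @goto[state][ch] = @goto.length - 1
        end
        state = @goto[state][ch]
      end
      @out[state] << index
    end
  end

  # Breadth-first pass: each state's failure link is derived from its parent's.
  def build_failure_links
    queue = @goto[0].values # depth-1 states keep their failure link at the root
    until queue.empty?
      state = queue.shift
      @goto[state].each do |ch, nxt|
        queue << nxt
        f = @fail[state]
        f = @fail[f] while f != 0 && !@goto[f].key?(ch)
        @fail[nxt] = @goto[f].fetch(ch, 0)
        @out[nxt] += @out[@fail[nxt]] # inherit matches that end at the suffix state
      end
    end
  end
end

matcher = TinyAhoCorasick.new(%w[brown fox])
p matcher.match("The quick brown fox jumps over the lazy dog brown")
```

The scan cost depends on the length of the text plus the number of matches, not on the number of tokens, whereas calling gsub once per token rescans the text each time.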

Installation

Add this line to your application's Gemfile:

gem 'multi_string_replace'

And then execute:

$ bundle

Or install it yourself as:

$ gem install multi_string_replace

Usage

MultiStringReplace.match("The quick brown fox jumps over the lazy dog brown", ['brown', 'fox'])
# { 0 => [10, 44], 1 => [16] }
MultiStringReplace.replace("The quick brown fox jumps over the lazy dog brown", {'brown' => 'black', 'fox' => 'wolf'})
# The quick black wolf jumps over the lazy dog black
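For comparison, the same results can be reproduced with the Ruby standard library alone. The sketch below is a reference for the behaviour shown above, not the gem's implementation; the helper names `reference_match` and `reference_replace` are hypothetical, and omitting tokens that never occur from the match result is an assumption.

```ruby
# Stdlib-only reference helpers (hypothetical names; not part of the gem).

# Returns { token_index => [start offsets] } for every token found in text.
def reference_match(text, tokens)
  result = {}
  tokens.each_with_index do |token, index|
    positions = []
    offset = 0
    while (found = text.index(token, offset))
      positions << found
      offset = found + 1
    end
    # Assumption: tokens that never occur are left out of the result.
    result[index] = positions unless positions.empty?
  end
  result
end

# Replaces every key of replacements with its value in a single gsub pass.
def reference_replace(text, replacements)
  # Regexp.union escapes string keys, so they are matched literally.
  text.gsub(Regexp.union(replacements.keys), replacements)
end

text = "The quick brown fox jumps over the lazy dog brown"
p reference_match(text, ['brown', 'fox'])
puts reference_replace(text, { 'brown' => 'black', 'fox' => 'wolf' })
```

This is the kind of gsub-based code the gem is meant to replace when the token list or the input text grows large.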

You can also pass in a Proc as a replacement value. It is evaluated only when its token is encountered.

MultiStringReplace.replace("The quick brown fox jumps over the lazy dog brown", {'brown' => 'black', 'fox' => ->() { "cat" }})
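The lazy-evaluation idea can be sketched with the standard library as follows. This is an illustration of the semantics described above, not the gem's code; `lazy_replace` is a hypothetical helper, and the `'owl'` entry is added only to show that a Proc for an absent token is never called.

```ruby
# Hypothetical stdlib-only helper showing lazily evaluated replacement values.
def lazy_replace(text, replacements)
  text.gsub(Regexp.union(replacements.keys)) do |token|
    value = replacements[token]
    # Call the value only if it is callable (Proc/lambda); otherwise use it as-is.
    value.respond_to?(:call) ? value.call : value
  end
end

calls = 0
result = lazy_replace(
  "The quick brown fox jumps over the lazy dog brown",
  {
    'brown' => 'black',
    'fox'   => -> { calls += 1; "cat" },
    'owl'   => -> { calls += 1; "bat" } # never encountered, so never evaluated
  }
)

puts result
puts "procs evaluated: #{calls}"
```

Deferring the call is useful when computing a replacement is expensive, for example a lookup that should run only if its token actually appears in the text.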

The gem also adds an mreplace method to String, which does the same thing:

"The quick brown fox jumps over the lazy dog brown".mreplace({'brown' => 'black', 'fox' => ->() { "cat" }})

Performance

Token replacement on a 200K text file, repeated 100 times (times in seconds):

                         user     system      total        real
multi gsub           1.322510   0.000000   1.322510 (  1.344405)
MultiStringReplace   0.196823   0.007979   0.204802 (  0.207219)
mreplace             0.200593   0.004031   0.204624 (  0.205379)

Benchmark sources can be found here: https://github.com/jedld/multi_word_replace/blob/master/bin/benchmark.rb
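A comparable measurement can be set up with Ruby's Benchmark module. The script below is a rough sketch, not the linked benchmark: the sample text, the token list, the iteration count, and the helper names `chained_gsub` and `union_gsub` are all assumptions, and the gem is timed only if it is installed.

```ruby
require 'benchmark'

# Assumed sample data: about 200 KB of text (4,000 lines of 50 bytes) and three tokens.
replacements = { 'brown' => 'black', 'fox' => 'wolf', 'lazy' => 'sleepy' }
text = "The quick brown fox jumps over the lazy dog brown\n" * 4_000

# One gsub call per token: the text is rescanned for every token.
def chained_gsub(text, replacements)
  replacements.reduce(text) { |acc, (from, to)| acc.gsub(from, to) }
end

# A single gsub call with a combined pattern and a hash of replacements.
def union_gsub(text, replacements)
  text.gsub(Regexp.union(replacements.keys), replacements)
end

# Time the gem only when it is available.
gem_loaded = begin
  require 'multi_string_replace'
  true
rescue LoadError
  false
end

Benchmark.bm(20) do |x|
  x.report('chained gsub') { 10.times { chained_gsub(text, replacements) } }
  x.report('union gsub')   { 10.times { union_gsub(text, replacements) } }
  if gem_loaded
    x.report('MultiStringReplace') { 10.times { MultiStringReplace.replace(text, replacements) } }
  end
end
```

Note that chained gsub calls can rewrite the output of an earlier replacement when tokens and replacement values overlap, so the approaches are only interchangeable for token sets like the one assumed here.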

Development

After checking out the repo, run bin/setup to install dependencies. Then, run rake spec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/[USERNAME]/multi_string_replace. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the Contributor Covenant code of conduct.

License

The gem is available as open source under the terms of the MIT License.

Code of Conduct

Everyone interacting in the MultiStringReplace project’s codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.
