wolfgarbe / Symspellcompound
SymSpellCompound: compound aware automatic spelling correction
Stars: ✭ 61
Projects that are alternatives or similar to Symspellcompound
LinSpell
Fast approximate strings search & spelling correction
Stars: ✭ 52 (-14.75%)
Mutual labels: spellcheck, fuzzy-search, levenshtein, spell-check
Symspellpy
Python port of SymSpell
Stars: ✭ 420 (+588.52%)
Mutual labels: fuzzy-search, spellcheck, levenshtein, spell-check
Symspell
SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm
Stars: ✭ 1,976 (+3139.34%)
Mutual labels: fuzzy-search, spellcheck, levenshtein, spell-check
spellchecker-wasm
SpellcheckerWasm is an extremely fast spellchecker for WebAssembly based on SymSpell
Stars: ✭ 46 (-24.59%)
Mutual labels: spellcheck, levenshtein, spell-check
SymSpellCppPy
Fast SymSpell written in C++ and exposed to Python via pybind11
Stars: ✭ 28 (-54.1%)
Mutual labels: spellcheck, fuzzy-search, spell-check
Did you mean
The gem that has been saving people from typos since 2014
Stars: ✭ 1,786 (+2827.87%)
Mutual labels: spellcheck, spell-check
Dspellcheck
Notepad++ Spell-checking Plug-in
Stars: ✭ 144 (+136.07%)
Mutual labels: spellcheck, spell-check
Dictionaries
Hunspell dictionaries in UTF-8
Stars: ✭ 591 (+868.85%)
Mutual labels: spellcheck, spell-check
Fuzzball.js
Easy to use and powerful fuzzy string matching, port of fuzzywuzzy.
Stars: ✭ 225 (+268.85%)
Mutual labels: fuzzy-search, levenshtein
Wecantspell.hunspell
A port of Hunspell v1 for .NET and .NET Standard
Stars: ✭ 61 (+0%)
Mutual labels: spellcheck, spell-check
Jellyfish
🎐 a python library for doing approximate and phonetic matching of strings.
Stars: ✭ 1,571 (+2475.41%)
Mutual labels: fuzzy-search, levenshtein
levenshtein.c
Levenshtein algorithm in C
Stars: ✭ 77 (+26.23%)
Mutual labels: fuzzy-search, levenshtein
Hunspell
The most popular spellchecking library.
Stars: ✭ 1,196 (+1860.66%)
Mutual labels: spellcheck, spell-check
Misspell Fixer
Simple tool for fixing common misspellings, typos in source code
Stars: ✭ 154 (+152.46%)
Mutual labels: spellcheck, spell-check
Pylanguagetool
Python Library and CLI for the LanguageTool JSON API
Stars: ✭ 62 (+1.64%)
Mutual labels: spellcheck, spell-check
WordSegmentationDP
Word Segmentation with Dynamic Programming
Stars: ✭ 18 (-70.49%)
Mutual labels: spellcheck, spell-check
spell
Spelling correction and string segmentation written in Go
Stars: ✭ 24 (-60.66%)
Mutual labels: spellcheck, spell-check
Ugrep
🔍NEW ugrep v3.1: ultra fast grep with interactive query UI and fuzzy search: search file systems, source code, text, binary files, archives (cpio/tar/pax/zip), compressed files (gz/Z/bz2/lzma/xz/lz4), documents and more. A faster, user-friendly and compatible grep replacement.
Stars: ✭ 626 (+926.23%)
Mutual labels: fuzzy-search
SymSpellCompound has been integrated into SymSpell. Please visit the SymSpell repository!
Compound aware automatic spelling correction
SymSpellCompound supports compound aware automatic spelling correction of multi-word input strings.
It is built on top of SymSpell's 1 million times faster spelling correction algorithm.
1. Compound splitting & decompounding
SymSpell treated every input string as a single term. SymSpellCompound supports compound splitting / decompounding and handles three cases:
- a space mistakenly inserted within a correct word, producing two incorrect terms
- a space mistakenly omitted between two correct words, producing one incorrect combined term
- multiple input terms with/without spelling errors
Splitting errors, concatenation errors, substitution errors, transposition errors, deletion errors and insertion errors can be mixed within the same word.
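The three cases can be illustrated with a minimal Python sketch. This is not the SymSpellCompound implementation: the word list, the counts, and the helper names (`lookup`, `split`, `correct`) are hypothetical, a brute-force Levenshtein scan stands in for SymSpell's Symmetric Delete lookup, and transpositions are not treated as a single edit.

```python
# Toy sketch only: WORDS, its counts and all helper names are hypothetical.
# A brute-force scan replaces SymSpell's Symmetric Delete index.
WORDS = {"where": 50, "is": 90, "the": 100, "love": 40, "he": 80,
         "had": 70, "in": 95, "inspired": 10, "him": 60}
MAX_DIST = 2  # largest edit distance accepted for a single-term correction


def edit_distance(a, b):
    """Plain Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete from a
                           cur[j - 1] + 1,             # insert into a
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def lookup(term):
    """Closest word within MAX_DIST (ties: higher count), else the term itself."""
    dist, _, word = min((edit_distance(term, w), -c, w) for w, c in WORDS.items())
    return (dist, word) if dist <= MAX_DIST else (MAX_DIST + 1, term)


def split(term):
    """Omitted space: find a split point where both halves are known words."""
    pairs = [(term[:i], term[i:]) for i in range(1, len(term))
             if term[:i] in WORDS and term[i:] in WORDS]
    if not pairs:
        return None
    return " ".join(max(pairs, key=lambda p: WORDS[p[0]] + WORDS[p[1]]))


def correct(text):
    tokens = text.lower().split()
    out, i = [], 0
    while i < len(tokens):
        cost, word = lookup(tokens[i])
        if i + 1 < len(tokens):
            # Inserted space: merge with the next token if that is cheaper
            # than correcting both tokens separately (+1 for the space).
            m_cost, m_word = lookup(tokens[i] + tokens[i + 1])
            n_cost, _ = lookup(tokens[i + 1])
            if m_cost <= MAX_DIST and m_cost + 1 < cost + n_cost:
                out.append(m_word)
                i += 2
                continue
        # A split costs one edit, so only try it when the best
        # single-term correction costs more than that.
        parts = split(tokens[i]) if cost > 1 else None
        out.append(parts or word)
        i += 1
    return " ".join(out)


print(correct("whereis th elove hehad ins pired him"))
```

With this toy dictionary, "whereis" and "hehad" are split, "ins pired" is merged, and "th" and "elove" are corrected as single terms. The choice between candidates is made by the code itself, by edit cost first and word count second, rather than by a user picking from a suggestion list.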
2. Automatic spelling correction
- Large document collections make manual correction infeasible and require unsupervised, fully-automatic spelling correction.
- In conventional spelling correction of a single token, the user is presented with spelling correction suggestions.
For automatic spelling correction of long multi-word text, the algorithm itself has to make an educated choice.
Examples:
- whereis th elove hehad dated forImuch of thepast who couqdn'tread in sixthgrade and ins pired him
+ where is the love he had dated for much of the past who couldn't read in sixth grade and inspired him (9 edits)
- in te dhird qarter oflast jear he hadlearned ofca sekretplan y iran
+ in the third quarter of last year he had learned of a secret plan by iran (10 edits)
- the bigjest playrs in te strogsommer film slatew ith plety of funn
+ the biggest players in the strong summer film slate with plenty of fun (9 edits)
- Can yu readthis messa ge despite thehorible sppelingmsitakes
+ can you read this message despite the horrible spelling mistakes (9 edits)
Performance
0.2 milliseconds / word
5000 words / second (single core on a 2012 MacBook Pro)
Applications
- Query correction (10–15% of queries contain misspelled terms),
- Chatbots,
- OCR post-processing,
- Automated proofreading.