
HoldOffHunger / convert-british-to-american-spellings

License: BSD-3-Clause
Convert text so that British spellings are swapped with their Americanized form or vice versa.


US/UK Spelling Converter

You provide the text, in either US or UK spelling.

We return the same text, converted to the other system.

We have you covered -- for about 20,000 words.

TOC

  1. TOC
  2. Online Demos
  3. Features
  4. Functionality
  5. Example Usage
  6. Code Structure and Design

Online Demos

Check out the code in an online demo...

Simple Demo Hosted by Us

Editable, Online Sandbox Demo (at IDEone.com)

Note: Since there are text limits to online compilers, we reduced the actual list of words covered to make this demo run.

Features

Regularly updated! Please submit corrections, additions, fixes, anything!

How many words are covered?

  • Total of 20,000 words covered, with multiple sources.
    • Source: VarCon/ISpell (18,000 words).
    • Source: WordsWorldWide (8,000 words).
    • Source: Our own personal list.
      • BtA List: Literary and archaic British variants (1500s to 1900s): (~500 words).
      • BtA List: Alternative Latinized spellings of Russian and French names: (~1,500 words).
      • BtA List: Alternative dashed-form words ("hundredfold" versus "hundred-fold"): (~2,000 words).
    • These lists were used to cross-check each other, correct errors, and remove duplicates.
    • Letter-sorted lists for easy updating and checking of words: A (1,314 words), B (687 words), C (1,807 words), D (1,427 words), E (948 words), F (678 words), G (654 words), H (1,066 words), I (590 words), J (149 words), K (264 words), L (641 words), M (1,312 words), N (716 words), O (532 words), P (2,273 words), Q (57 words), R (1,071 words), S (2,024 words), T (800 words), U (1,259 words), V (450 words), W (177 words), X (0 words), Y (75 words), Z (63 words).
  • Variants for British words.
    • For example, "unrealisable" and "unrealiseable".
  • Words are defined with a simple associative array, making for a quick transfer to Perl, C++, Java, etc.
    • For example, the syntax somekey=>"somevalue" is widely used in many languages, or is easily converted to their equivalent syntax.
  • Permissively-licensed
    • Do whatever you want with the code!
    • For example, see what others are doing with their personal, commercial, and legal rights as endowed by BSD-3-clause-licensed software.
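
As a minimal sketch of the associative-array idea, here are a few sample entries chosen for illustration. The variable names `$british_to_american` and `$american_to_british` are hypothetical and not taken from the project's files. Because each key in this sample maps to exactly one value, `array_flip()` is enough to produce the reverse mapping:

```php
<?php
// Hypothetical sample entries; the real project stores about 20,000 of these.
$british_to_american = [
    'colour'    => 'color',
    'neighbour' => 'neighbor',
    'theatre'   => 'theater',
];

// Reverse the mapping. This is only safe when every value is unique.
$american_to_british = array_flip($british_to_american);

print($american_to_british['color'] . "\n");   // colour
```

The same key/value shape carries over almost directly to a Perl hash, a C++ `std::map`, or a Java `Map`.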

Functionality

General Behavior

How in general does it work?

  • Exact / Error-Resistant
    • British/American Spelling Converter uses regular expression checking with /\b$word\b/, so only whole words are matched and a replacement cannot corrupt a longer word that merely contains the search term.
    • For example, "Ax" becomes "Axe", but "Axiomatic" will remain as "Axiomatic", and cannot become "Axeiomatic", which would be incorrect.
  • Fast / Efficient
    • Every mass-replace is done within a single preg_replace() call, using arrays as arguments.
    • This avoids a separate function call for each word, so the script finishes sooner.
  • Reliable / Atomic / Deterministic
    • American-ize/British-ify will not corrupt meaning.
    • For example, 'discus' and 'diskus' have reversed meanings in US and UK usage, so swapping them would cause the text to change each time you "Americanize" or "Britishify" it. So, we don't do these types of swaps.
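
The whole-word matching and the single `preg_replace()` call described above can be sketched as follows. This is an illustration under assumptions, not the project's actual code: the helper name `swap_whole_words()` and the two-entry map are hypothetical, and the "ax" to "axe" direction simply follows the example in the list above.

```php
<?php
// Hypothetical helper: replace whole words only, in one preg_replace() call.
function swap_whole_words(string $text, array $map): string
{
    $patterns = [];
    $replacements = [];
    foreach ($map as $from => $to) {
        // preg_quote() escapes regex metacharacters; \b anchors at word boundaries.
        $patterns[] = '/\b' . preg_quote($from, '/') . '\b/';
        $replacements[] = $to;
    }
    // Passing arrays applies each pattern/replacement pair in order.
    return preg_replace($patterns, $replacements, $text);
}

$map = ['colour' => 'color', 'ax' => 'axe'];
print(swap_whole_words('ax the colour, not the axiomatic', $map) . "\n");
// prints: axe the color, not the axiomatic
```

Note that `preg_replace()` applies the pairs one after another, so a word list must not contain a replacement that a later pattern would match again.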

Precise Behavior - Use Cases

How exactly does it work?

  • Only all lower case, all upper case, or first letter capitalized versions are converted.
    • Example: American=>English, "axe"=>"ax", "AXE" would be converted to "AX" or vice versa, but "AxE" would not be converted to "Ax".
  • Apostrophes are treated as word boundaries.
    • Example: American=>English, "axe"=>"ax", "the ax's handle" would be converted to "the axe's handle."
  • Only precisely whole, known words are converted.
    • Example: American=>English, "axe"=>"ax", this will not convert "axed" to "axd", because the concluding "-d" indicates that it is an entirely different word.
  • Dashes are treated as word boundaries only when the term is not itself preceded or followed by a dash.
    • Example: American=>English, "affecteffect"=>"affect-effect", this will convert "the affect-effect of it" to "the affecteffect of it", but it will not convert "these every-night-affect-effect-happenings are" to "these every-night-affecteffect-happenings are", as the dashes there give the term a different meaning than it has when standing alone.
  • British alternates are handled.
    • Example: American=>English, "amoebas"=>["amoebae", "amebas", "amebae",]. When converting to English, "amoebas" will be replaced with "amoebae", the most contemporary term; when converting to American, "amoebae", "amebas", etc., will all be converted to the single American equivalent.
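
One way to read the casing, apostrophe, and dash rules above is sketched below. The helper names `case_variants()` and `swap_precise()` are hypothetical, and the lookaround pattern is an assumed interpretation of the dash rule rather than the project's actual regex. The replacement directions follow the conversions shown in the examples above.

```php
<?php
// Hypothetical helper: expand one lower-case pair into the three supported casings.
function case_variants(string $from, string $to): array
{
    return [
        $from             => $to,               // ax => axe
        ucfirst($from)    => ucfirst($to),      // Ax => Axe
        strtoupper($from) => strtoupper($to),   // AX => AXE
    ];
}

// Hypothetical helper: case-sensitive whole-word swap that leaves a term
// untouched when it sits inside a longer hyphenated compound.
function swap_precise(string $text, array $pairs): string
{
    $patterns = [];
    $replacements = [];
    foreach ($pairs as $from => $to) {
        foreach (case_variants($from, $to) as $search => $replace) {
            // The lookarounds reject a neighbouring word character or hyphen,
            // while an apostrophe still counts as a boundary.
            $patterns[] = '/(?<![\w-])' . preg_quote($search, '/') . '(?![\w-])/';
            $replacements[] = $replace;
        }
    }
    return preg_replace($patterns, $replacements, $text);
}

$pairs = ['ax' => 'axe', 'affect-effect' => 'affecteffect'];
print(swap_precise("AX the ax's handle, not AxE or axed", $pairs) . "\n");
// prints: AXE the axe's handle, not AxE or axed
```

Under this assumed pattern, a listed word that appears as one part of any hyphenated compound is also left alone; whether the project behaves that way for every compound is not stated above.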

Some test sentences...

The neighbour walked to the theatre's centre, manoeuvred about the sabre, and proceeded to reconnoitre the sepulchre in ochre.

The rumour spread that splendour and flavour were affected by our behaviour, so walk a metre in my mitre while carrying a litre of nitre.

The connexion with industrialisation remains with the municipalisation of the calibre of the fibre of the spectre, not with the meagre and sombre saltpetre with all its colour and honour.

Example Usage

How do I use the British/American Spelling Converter?

Americanize Text Example

How do I convert British-spelling text to American-spelling text?

require('AmericanBritishSpellings.php');
$american_british_spellings = new AmericanBritishSpellings([]);

$text = "Axiomatically ax that door, would you, my neighbour?";     // British input text source

$americanized = $american_british_spellings->SwapBritishSpellingsForAmericanSpellings(['text'=>$text]);

print($americanized);   // output: Axiomatically axe that door, would you, my neighbor?

Britishize Text Example

How do I convert American-spelling text to British-spelling text?

require('AmericanBritishSpellings.php');
$american_british_spellings = new AmericanBritishSpellings([]);

$text = "Axiomatically axe that door, would you, my neighbor?";     // American input text source

$britishized = $american_british_spellings->SwapAmericanSpellingsForBritishSpellings(['text'=>$text]);

print($britishized);   // output: Axiomatically ax that door, would you, my neighbour?

Code Structure and Design

Coding Languages

What coding languages are used in the British/American Spelling Converter?

The entire project is coded in the following...

  • PHP - For processing the text and storing the US/UK words.

Exclude List

How do you avoid adding words that would break the deterministic / atomistic model of functionality?

We do this with an exclude list, which also details the conflict in the words themselves.

Check it out: Exclude List.
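
As a rough illustration of how such conflicts could be detected automatically, the sketch below flags any word that appears on both sides of a candidate mapping, since such a word would flip back and forth on repeated conversions. The function `find_conflicts()` and the sample entries are hypothetical; the project's exclude list may well be curated by hand.

```php
<?php
// Hypothetical check: a word present as both a key and a value would be
// swapped again on the next pass, so conversions would not be stable.
function find_conflicts(array $map): array
{
    $sources = array_keys($map);
    $targets = array_values($map);
    // Words present in both columns are candidates for an exclude list.
    return array_values(array_unique(array_intersect($sources, $targets)));
}

$candidate_map = [
    'colour' => 'color',
    'discus' => 'diskus',   // sample conflicting pair, directions assumed
    'diskus' => 'discus',
];

$conflicts = find_conflicts($candidate_map);
// Drop the conflicting entries from the mapping that will actually be used.
$safe_map = array_diff_key($candidate_map, array_flip($conflicts));

print_r($conflicts);
print_r($safe_map);
```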

AmericanBritishSpellings.php - Technical Overview

What are the functions in the source code files for?

AmericanBritishSpellings.php

Class for converting text between US and UK spellings.

  • __construct($args)
    • Constructor.
    • Load the words into the converter class for ready use.
  • SwapBritishSpellingsForAmericanSpellings($args)
    • Convert text with British spellings to text with American spellings.
  • SwapAmericanSpellingsForBritishSpellings($args)
    • Convert text with American spellings to text with British spellings.
  • GetSpellingsAndReplacements($args)
    • Get spellings and replacements based on the desired end language.
  • BuildSpellingAlternates($args)
    • Build spelling alternates for the British and American dialects.
  • BuildSpellingAlternatesForLanguage($args)
    • Build spelling alternates for a single particular dialect of a language (either British or American, in our case).
  • BuildSearchRegex($args)
    • Build an array of search regexes when given an array of search terms.
  • BuildSearchRegex($args)
    • Build a single search regex for a single search term.
  • BuildSpellingReplacements()
    • Build the replacements to be used for the search terms.
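
The alternates-building step listed above might look roughly like the following. This is a sketch under assumptions: the entry format (a value that is either one spelling or a list whose first element is preferred) is inferred from the "amoebas" example earlier in this README, and the function names `to_british_map()` and `to_american_map()` are hypothetical.

```php
<?php
// Assumed entry format: American spelling => one British spelling, or a list
// of alternates whose first element is the preferred form.
$american_to_british = [
    'color'   => 'colour',
    'amoebas' => ['amoebae', 'amebas', 'amebae'],
];

// American => British: use the first (preferred) alternate.
function to_british_map(array $entries): array
{
    $map = [];
    foreach ($entries as $american => $british) {
        $map[$american] = is_array($british) ? $british[0] : $british;
    }
    return $map;
}

// British => American: every alternate points back to the single American form.
function to_american_map(array $entries): array
{
    $map = [];
    foreach ($entries as $american => $british) {
        foreach ((array) $british as $alternate) {
            $map[$alternate] = $american;
        }
    }
    return $map;
}

print_r(to_british_map($american_to_british));
print_r(to_american_map($american_to_british));
```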

AmericanBritishSpellings_Words.php

Class for building word lists for converting UK/US English dialects.

  • __construct($args)
    • Constructor.
    • Nothing to do here.
  • GetBritishToAmericanSpellings()
    • Build a mapping of British to American spellings.
  • GetAmericanToBritishSpellings()
    • Build a mapping of American to British spellings from the /Language/Words/AmericanBritish/ classes.

AmericanBritishWords_A.php ... AmericanBritishWords_Z.php

  • __construct($args)
    • Constructor.
    • Load the words into the converter class for ready use.
  • AmericanBritishWords()
    • List of US/UK spellings for words starting with A...Z.
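
Combining the per-letter lists into one mapping could be as simple as the sketch below. The functions `words_a()` and `words_c()` are hypothetical stand-ins for the per-letter classes, and the American => British direction of the sample entries is an assumption.

```php
<?php
// Hypothetical stand-ins for the per-letter word classes.
function words_a(): array
{
    return ['analyze' => 'analyse'];
}

function words_c(): array
{
    return ['color' => 'colour', 'center' => 'centre'];
}

// Merge the per-letter lists into a single mapping, then sort by key so
// entries are easy to scan and check.
$american_to_british = array_merge(words_a(), words_c());
ksort($american_to_british);

print_r(array_keys($american_to_british));
```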