
vasilevp / sam

Licence: other
SAM: Software Automatic Mouth (Ported from https://github.com/vidarh/SAM)

Programming Languages

go
shell

Projects that are alternatives of or similar to sam

persian-tts
🔊 A simple human-based text-to-speech synthesiser and ReactNative app for Persian language.
Stars: ✭ 18 (-45.45%)
Mutual labels:  text-to-speech, tts, tts-engine
Tacotron2-PyTorch
Yet another PyTorch implementation of Tacotron 2 with reduction factor and faster training speed.
Stars: ✭ 118 (+257.58%)
Mutual labels:  text-to-speech, tts
TTS tf
WIP Tensorflow implementation of https://github.com/mozilla/TTS
Stars: ✭ 14 (-57.58%)
Mutual labels:  text-to-speech, tts
ttslearn
ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)
Stars: ✭ 158 (+378.79%)
Mutual labels:  text-to-speech, tts
STYLER
Official repository of STYLER: Style Factor Modeling with Rapidity and Robustness via Speech Decomposition for Expressive and Controllable Neural Text to Speech, INTERSPEECH 2021
Stars: ✭ 105 (+218.18%)
Mutual labels:  text-to-speech, tts
FastSpeech2
PyTorch Implementation of FastSpeech 2 : Fast and High-Quality End-to-End Text to Speech
Stars: ✭ 163 (+393.94%)
Mutual labels:  text-to-speech, tts
text-to-speech
⚡️ Capacitor plugin for synthesizing speech from text.
Stars: ✭ 50 (+51.52%)
Mutual labels:  text-to-speech, tts
golang-tts
Text-to-Speech golang package based on the Amazon Polly service
Stars: ✭ 19 (-42.42%)
Mutual labels:  text-to-speech, tts
Parallel-Tacotron2
PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling
Stars: ✭ 149 (+351.52%)
Mutual labels:  text-to-speech, tts
dctts-pytorch
The pytorch implementation of DC-TTS
Stars: ✭ 73 (+121.21%)
Mutual labels:  text-to-speech, tts
spokestack-android
Extensible Android mobile voice framework: wakeword, ASR, NLU, and TTS. Easily add voice to any Android app!
Stars: ✭ 52 (+57.58%)
Mutual labels:  text-to-speech, tts
SpeakIt Vietnamese TTS
Vietnamese Text-to-Speech on Windows Project (zalo-speech)
Stars: ✭ 81 (+145.45%)
Mutual labels:  text-to-speech, tts
WaveGrad2
PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis
Stars: ✭ 55 (+66.67%)
Mutual labels:  text-to-speech, tts
laravel-text-to-speech
💬 A wrapper for popular TTS services to create a more simple & uniform API. Currently, only AWS Polly is supported.
Stars: ✭ 26 (-21.21%)
Mutual labels:  text-to-speech, tts
Daft-Exprt
PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis
Stars: ✭ 41 (+24.24%)
Mutual labels:  text-to-speech, tts
bingspeech-api-client
Microsoft Bing Speech API client in node.js
Stars: ✭ 32 (-3.03%)
Mutual labels:  text-to-speech, tts
open-speech-corpora
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
Stars: ✭ 841 (+2448.48%)
Mutual labels:  text-to-speech, tts
voices
macOS CLI for changing the default TTS (text-to-speech) voice and printing information about and speaking text with multiple voices.
Stars: ✭ 53 (+60.61%)
Mutual labels:  text-to-speech, tts
tts dataset maker
A gui to help make a text to speech dataset.
Stars: ✭ 20 (-39.39%)
Mutual labels:  text-to-speech, tts
LVCNet
LVCNet: Efficient Condition-Dependent Modeling Network for Waveform Generation
Stars: ✭ 67 (+103.03%)
Mutual labels:  text-to-speech, tts

SAM

Software Automatic Mouth - Tiny Speech Synthesizer

This is a Go port of the great SAM speech synthesizer. It's basically a semi-automatic rewrite from C to Go of what was, in turn, a semi-automatic rewrite from Assembly to C :). Consequently, this is not meant to be readable.

Original repo: https://github.com/vidarh/SAM. Based on this commit.

Why port this to Go?

As a challenge and just for fun.


Original README

What is SAM?

SAM is a very small Text-To-Speech (TTS) program written in C that runs on most popular platforms. It is an adaptation to C of the speech software SAM (Software Automatic Mouth) for the Commodore C64, published in 1982 by Don't Ask Software (now SoftVoice, Inc.). It includes a Text-To-Phoneme converter called reciter and a Phoneme-To-Speech routine for the final output. It is small enough to work on embedded computers as well. On my computer it takes less than 39KB of disk space (much less on embedded devices, where the executable overhead is not necessary) and is a fully standalone program. For immediate output it uses the SDL library; otherwise it can save .wav files.

An online version and executables for Windows can be found on the web site: http://simulationcorner.net/index.php?page=sam

Compile

Simply type "make" in your command prompt. To compile without SDL, remove the SDL statements from the CFLAGS and LFLAGS variables in the file "Makefile".

It should compile on every UNIX-like operating system. For Windows you need Cygwin or MinGW (+ libsdl).

Usage

Type

./sam I am Sam

for the first output.

If you have disabled SDL, try

./sam -wav i_am_sam.wav I am Sam

to get a wav file. This file can be played by many media players available for the PC.

You can try other options, such as -pitch number, -speed number, -throat number and -mouth number.

Some typical values written in the original manual are:

DESCRIPTION          SPEED     PITCH     THROAT    MOUTH
Elf                   72        64        110       160
Little Robot          92        60        190       190
Stuffy Guy            82        72        110       105
Little Old Lady       82        32        145       145
Extra-Terrestrial    100        64        150       200
SAM                   72        64        128       128
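As a convenience, the table above can be kept as data and turned into command lines. The following is only a sketch: the Voice type and the CommandLine helper are made up for illustration, and it assumes the binary is called ./sam and accepts the four options named above before the text.

```go
package main

import "fmt"

// Voice holds the four tuning parameters from the table above.
type Voice struct {
	Name                        string
	Speed, Pitch, Throat, Mouth int
}

// presets copies the values listed in the table.
var presets = []Voice{
	{"Elf", 72, 64, 110, 160},
	{"Little Robot", 92, 60, 190, 190},
	{"Stuffy Guy", 82, 72, 110, 105},
	{"Little Old Lady", 82, 32, 145, 145},
	{"Extra-Terrestrial", 100, 64, 150, 200},
	{"SAM", 72, 64, 128, 128},
}

// CommandLine is a hypothetical helper that formats a ./sam invocation
// for one voice; it only builds the string and does not run anything.
func CommandLine(v Voice, text string) string {
	return fmt.Sprintf("./sam -speed %d -pitch %d -throat %d -mouth %d %s",
		v.Speed, v.Pitch, v.Throat, v.Mouth, text)
}

func main() {
	for _, v := range presets {
		fmt.Printf("%-18s %s\n", v.Name, CommandLine(v, "I am Sam"))
	}
}
```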

It can even sing; look at the file "sing" for a small example.

For the phoneme input table look in the Wiki.

A description of additional features can be found in the original manual at http://www.retrobits.net/atari/sam.shtml or in the manual of the equivalent Apple II program http://www.apple-iigs.info/newdoc/sam.pdf

Adaptation To C

This program was converted semi-automatically into C by converting each assembler opcode, e.g.

lda 56		=>	A = mem[56];
jmp 38018  	=>	goto pos38018;
inc 38		=>	mem[38]++;
.			.
.			.

Then it was manually rewritten to remove most of the jumps and register variables in the code and rename the variables to proper names. Most of the description below is a result of this rewriting process.

Unfortunately it is still not very readable. But you should see where I started :)

Short description

First of all, I will limit myself here to a very coarse description. The source code defines a great many exceptions that I will not explain. A lot of the code is also still unknown to me, e.g. Code47503. A complete understanding of the code needs more time and, especially, more eyes looking at it.

Reciter

It converts the English text to phonemes using a ruleset shown in the wiki.

The rule " ANT(I)", "AY" means that if it finds an "I" preceded by the letters " ANT", it replaces the I with the phoneme "AY".

There are some special signs in these rules, like # & @ ^ + : %, which can mean, e.g., that there must be a vowel or a consonant or something else.

With the -debug option you will get the corresponding rules and the resulting phonemes.
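The rule format can be illustrated with a small Go sketch. This is an assumption-laden simplification, not the reciter code: the Rule type, parseRule and MatchAt are hypothetical names, all characters are matched literally, and the special signs (# & @ ^ + : %) are not interpreted.

```go
package main

import (
	"fmt"
	"strings"
)

// Rule is a literal-only view of a reciter rule such as " ANT(I)" => "AY".
type Rule struct {
	Prefix, Match, Suffix, Phonemes string
}

// parseRule splits "prefix(match)suffix" at the parentheses.
func parseRule(pattern, phonemes string) (Rule, error) {
	start := strings.Index(pattern, "(")
	end := strings.Index(pattern, ")")
	if start < 0 || end < start {
		return Rule{}, fmt.Errorf("malformed rule %q", pattern)
	}
	return Rule{pattern[:start], pattern[start+1 : end], pattern[end+1:], phonemes}, nil
}

// MatchAt reports whether the rule applies to text at byte offset pos:
// the text before pos must end with Prefix, and the text from pos on
// must start with Match followed by Suffix.
func (r Rule) MatchAt(text string, pos int) bool {
	if pos < len(r.Prefix) || pos > len(text) {
		return false
	}
	return strings.HasSuffix(text[:pos], r.Prefix) &&
		strings.HasPrefix(text[pos:], r.Match+r.Suffix)
}

func main() {
	r, err := parseRule(" ANT(I)", "AY")
	if err != nil {
		panic(err)
	}
	text := " ANTI"
	if r.MatchAt(text, 4) {
		fmt.Printf("%q at offset 4 becomes phoneme %s\n", r.Match, r.Phonemes)
	}
}
```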

Output

Here is the full tree of subroutine calls:

SAMMain()
	Parser1()
	Parser2()
		Insert()
	CopyStress()
	SetPhonemeLength()
	Code48619()
	Code41240()
		Insert()
	Code48431()
		Insert()

	Code48547
		Code47574
			Special1
			Code47503
			Code48227

SAMMain() is the entry routine and calls all further routines. Parser1 takes the phoneme input and transforms it into three tables:

	phonemeindex[]
	stress[]
	phonemelength[] (zero at this moment)

These tables are now changed:

	Parser2 exchanges some phonemes for others and inserts new ones.
	CopyStress adds 1 to the stress under some circumstances.
	SetPhonemeLength sets phoneme lengths.
	Code48619 changes the phoneme lengths.
	Code41240 adds some additional phonemes.
	Code48431 has some extra rules.

The wiki shows all possible phonemes and some flag fields.
The final content of these tables can be seen with the -debug option.

In the function PrepareOutput() these tables are partly copied into the small tables phonemeindexOutput[], stressOutput[] and phonemelengthOutput[] for output.

Final Output

Except for some special phonemes, the output is built as a linear combination:

A =   A1 * sin ( f1 * t ) +
      A2 * sin ( f2 * t ) +
      A3 * rect( f3 * t )

where rect is a rectangular function with the same periodicity as sin. It seems really strange, but this is really enough for most types of phonemes.
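The formula can be written directly in floating point. The Go sketch below is only an illustration of the linear combination; the real routine works on lookup tables and 4-bit values. The polarity of rect (+1 on the first half of the period) and the example frequencies and amplitudes are assumptions.

```go
package main

import (
	"fmt"
	"math"
)

// rect is a square wave with the same period as sin:
// +1 where sin(x) is non-negative, -1 otherwise (polarity assumed).
func rect(x float64) float64 {
	if math.Sin(x) >= 0 {
		return 1
	}
	return -1
}

// sample evaluates A1*sin(f1*t) + A2*sin(f2*t) + A3*rect(f3*t).
func sample(t, a1, f1, a2, f2, a3, f3 float64) float64 {
	return a1*math.Sin(f1*t) + a2*math.Sin(f2*t) + a3*rect(f3*t)
}

func main() {
	// Arbitrary example values, not taken from the SAM tables.
	for i := 0; i < 8; i++ {
		t := float64(i) * 0.001
		fmt.Printf("t=%.3f A=%+.4f\n", t, sample(t, 1.0, 500, 0.5, 1500, 0.25, 2500))
	}
}
```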

Therefore the above phonemes are converted with some tables to:

	pitches[]
	frequency1[] = f1
	frequency2[] = f2
	frequency3[] = f3
	amplitude1[] = A1
	amplitude2[] = A2
	amplitude3[] = A3

The above formula is calculated in one very well optimized routine. It consists of only 26 instructions:

48087: 	LDX 43		; get phase	
CLC		
LDA 42240,x	; load sine value (high 4 bits)
ORA TabAmpl1,y	; get amplitude (in low 4 bits)
TAX		
LDA 42752,x	; multiplication table
STA 56		; store 

LDX 42		; get phase
LDA 42240,x	; load sine value (high 4 bits)
ORA TabAmpl2,y	; get amplitude (in low 4 bits)
TAX		
LDA 42752,x	; multiplication table
ADC Var56	; add with previous values
STA 56		; and store

LDX 41		; get phase
LDA 42496,x	; load rect value (high 4 bits)
ORA TabAmpl3,y	; get amplitude (in low 4 bits)
TAX		
LDA 42752,x	; multiplication table
ADC 56		; add with previous values

ADC #136		
LSR A		; get highest 4 bits
LSR A		
LSR A		
LSR A		
STA 54296	;SID   main output command
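The trick in this routine is that the multiplication is replaced by a table lookup: the wave value sits in the high 4 bits, the amplitude in the low 4 bits, and the combined byte indexes a 256-entry multiplication table. The Go sketch below mirrors that structure. The table contents are hypothetical stand-ins generated in init, not the original data at 42240, 42496 and 42752, and the 6502 carry behaviour of ADC is ignored.

```go
package main

import (
	"fmt"
	"math"
)

var (
	sineTab [256]uint8 // sine as a signed 4-bit value in the high nibble
	rectTab [256]uint8 // rectangular wave, same layout
	multTab [256]uint8 // indexed by (wave nibble << 4) | amplitude nibble
)

func init() {
	for i := range sineTab {
		s := int(math.Round(7 * math.Sin(2*math.Pi*float64(i)/256)))
		sineTab[i] = uint8(s&0x0F) << 4
		r := 7
		if i >= 128 {
			r = -7
		}
		rectTab[i] = uint8(r&0x0F) << 4
	}
	for i := range multTab {
		s := i >> 4
		if s >= 8 {
			s -= 16 // interpret the high nibble as signed
		}
		a := i & 0x0F
		// Scaled by 1/3 (an assumption) so three terms fit in a signed byte.
		multTab[i] = uint8(s * a / 3)
	}
}

// mix combines three phases and amplitudes into one 4-bit output value,
// following the lookup, add, offset by 136 and shift steps of the listing.
func mix(p1, p2, p3, a1, a2, a3 uint8) uint8 {
	sum := multTab[sineTab[p1]|(a1&0x0F)]
	sum += multTab[sineTab[p2]|(a2&0x0F)]
	sum += multTab[rectTab[p3]|(a3&0x0F)]
	return (sum + 136) >> 4
}

func main() {
	for p := 0; p < 256; p += 32 {
		fmt.Printf("phase %3d -> %2d\n", p, mix(uint8(p), 0, 0, 15, 0, 0))
	}
}
```

With all amplitudes at zero the result is 8, the midpoint of the 4-bit range, which is what the constant 136 (8 << 4, plus 8 for rounding) achieves in the sketch.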

The rest is handled in a special way. At the moment I cannot figure out in which way. But it seems that it uses some noise (e. g. for "s") using a table with random values.

License

The software is a reverse-engineered version of a commercial software published more than 30 years ago. The current copyright holder is SoftVoice, Inc. (www.text2speech.com)

Any attempt to contact the company failed. The website was last updated in the year 2009. The status of the original software can therefore best be described as Abandonware (http://en.wikipedia.org/wiki/Abandonware)

As long as this is the case I cannot put my code under any specific open source software license. Use it at your own risk.

Contact

If you have questions, don't hesitate to ask me. If you have discovered something new about the code, please mail me.

Sebastian Macke Email: [email protected]
