asticode / Go Astibob

Licence: MIT


Golang framework to build an AI that can understand and speak back to you, and everything else you want.

WARNING: for readability purposes, the code below doesn't handle errors. However, you SHOULD!

Demos

Here's a list of AIs built with astibob (if you're using astibob and want your project to be listed here, please submit a PR):

How it works

Overview


  • humans operate the AI through the Web UI
  • the Web UI interacts with the AI through the Index
  • the Index keeps an up-to-date list of all Workers and forwards Web UI messages to Workers and vice versa
  • Workers have one or more Abilities and are usually located on different machines
  • Abilities run simple tasks such as reading an audio input (e.g. a microphone), executing speech-to-text analyses or performing speech synthesis
  • Abilities can communicate directly with each other, even when running on different Workers
  • all communication is done via JSON messages exchanged over HTTP or WebSocket

FAQ

  • Why split abilities between several workers?

    Because abilities may need to run on different machines located in different parts of the world. The simplest example is wanting to read microphone inputs located in several rooms of your house: each microphone is an ability, whereas each room of your house is a worker.

Install the project

Run the following command:

$ go get -u github.com/asticode/go-astibob/...

I want to see some code

Index

// Create index
i, _ := index.New(index.Options{
    Server: astibob.ServerOptions{
        Addr:     "127.0.0.1:4000",
        Password: "admin",
        Username: "admin",
    },
})

// Make sure to properly close the index
defer i.Close()

// Handle signals
i.HandleSignals()

// Serve
i.Serve()

// Blocking pattern
i.Wait()

Worker

// Create worker
w := worker.New("Worker #1", worker.Options{
    Index: astibob.ServerOptions{
        Addr:     "127.0.0.1:4000",
        Password: "admin",
        Username: "admin",
    },
    Server: astibob.ServerOptions{Addr: "127.0.0.1:4001"},
})

// Make sure to properly close the worker
defer w.Close()

// Create runnables
r1 := pkg1.NewRunnable("Runnable #1")
r2 := pkg2.NewRunnable("Runnable #2")

// Register runnables
w.RegisterRunnables(
    worker.Runnable{
        AutoStart: true,
        Runnable:  r1,
    },
    worker.Runnable{
        Runnable: r2,
    },
)

// Create listenables
l1 := pkg3.NewListenable(pkg3.ListenableOptions{
    OnEvent1: func(arg1 string) { log.Println(arg1) },
})
l2 := pkg4.NewListenable(pkg4.ListenableOptions{
    OnEvent2: func(arg2 string) { log.Println(arg2) },
})

// Register listenables
w.RegisterListenables(
    worker.Listenable{
        Listenable: l1,
        Runnable:   "Runnable #1",
        Worker:     "Worker #1",
    },
    worker.Listenable{
        Listenable: l2,
        Runnable:   "Runnable #3",
        Worker:     "Worker #2",
    },
)

// Handle an event and send a message to one of the runnables
w.On(astibob.DispatchConditions{
    From: astibob.NewRunnableIdentifier("Runnable #1", "Worker #1"),
    Name: astikit.StrPtr("Event #1"),
}, func(m *astibob.Message) (err error) {
    // Send message
    if err = w.SendMessages("Worker #1", "Runnable #1", pkg2.NewMessage1("Hello world")); err != nil {
        err = errors.Wrap(err, "main: sending message failed")
        return
    }
    return
})

// Handle signals
w.HandleSignals()

// Serve
w.Serve()

// Register to index
w.RegisterToIndex()

// Blocking pattern
w.Wait()

Abilities

The framework comes with a few abilities located in the abilities folder:

Audio input

This ability allows you to read from an audio stream, e.g. a microphone.

Dependencies

It's strongly recommended to use PortAudio and its astibob wrapper.

To know which devices are available on the machine run:

$ go run abilities/audio_input/portaudio/cmd/main.go

Runnable and Operatable

// Create portaudio
p := portaudio.New()

// Initialize portaudio
p.Initialize()

// Make sure to close portaudio
defer p.Close()

// Create default stream
s, _ := p.NewDefaultStream(portaudio.StreamOptions{
    BitDepth:             32,
    BufferLength:         5000,
    MaxSilenceLevel:      5 * 1e6,
    NumInputChannels:     2,
    SampleRate:           44100,
})

// Create runnable
r := audio_input.NewRunnable("Audio input", s)

// Register runnables
w.RegisterRunnables(worker.Runnable{
    AutoStart: true,
    Runnable:  r,
})

// Register listenables
// This is mandatory for the Web UI to work properly
w.RegisterListenables(worker.Listenable{
    Listenable: r,
    Runnable:   "Audio input",
    Worker:     "Worker #1",
})

Listenable

// Register listenables
w.RegisterListenables(
    worker.Listenable{
        Listenable: audio_input.NewListenable(audio_input.ListenableOptions{
            OnSamples: func(from astibob.Identifier, samples []int, bitDepth, numChannels, sampleRate int, maxSilenceLevel float64) (err error) {
                // TODO Do something with the samples
                return
            },
        }),
        Runnable: "Audio input",
        Worker:   "Worker #1",
    },
)
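
For instance, instead of the TODO above, the OnSamples callback could forward the samples to the Speech to Text ability described below, reusing the message constructor and the worker/runnable names shown in that section (a sketch, assuming a worker w and a Speech to Text runnable registered on "Worker #3"):

// Register listenables
w.RegisterListenables(
    worker.Listenable{
        Listenable: audio_input.NewListenable(audio_input.ListenableOptions{
            OnSamples: func(from astibob.Identifier, samples []int, bitDepth, numChannels, sampleRate int, maxSilenceLevel float64) (err error) {
                // Forward the samples to the Speech to Text runnable
                w.SendMessage(worker.MessageOptions{
                    Message: speech_to_text.NewSamplesMessage(
                        from,
                        samples,
                        bitDepth,
                        numChannels,
                        sampleRate,
                        maxSilenceLevel,
                    ),
                    Runnable: "Speech to Text",
                    Worker:   "Worker #3",
                })
                return
            },
        }),
        Runnable: "Audio input",
        Worker:   "Worker #1",
    },
)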

Speech to Text

This ability allows you to execute speech-to-text analyses.

Dependencies

It's strongly recommended to install DeepSpeech and its astibob wrapper.

I don't want to train a new model

  • create a working directory (for simplicity, we'll assume its absolute path is /path/to/deepspeech)

  • download the client native_client.<your system>.tar.xz matching your system at the bottom of the DeepSpeech releases page

  • create the /path/to/deepspeech/lib directory and extract the client content inside it

  • create the /path/to/deepspeech/include directory and download deepspeech.h inside it

  • create the /path/to/deepspeech/model/en directory, and download and extract the English model inside it

  • whenever you run a worker that needs deepspeech, make sure the following environment variables are set:

      CGO_CXXFLAGS="-I/path/to/deepspeech/include"
      LIBRARY_PATH=/path/to/deepspeech/lib:$LIBRARY_PATH
      LD_LIBRARY_PATH=/path/to/deepspeech/lib:$LD_LIBRARY_PATH
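
    For example, a worker using this ability could then be started like this (the main.go path is just a placeholder for your own worker's entry point):

      $ CGO_CXXFLAGS="-I/path/to/deepspeech/include" \
        LIBRARY_PATH=/path/to/deepspeech/lib:$LIBRARY_PATH \
        LD_LIBRARY_PATH=/path/to/deepspeech/lib:$LD_LIBRARY_PATH \
        go run ./worker/main.go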
    

I want to train a new model

In addition to the steps above:

  • create the /path/to/deepspeech/model/custom directory
  • run git clone https://github.com/mozilla/DeepSpeech inside /path/to/deepspeech
  • install the dependencies

Runnable and Operatable

// Create deepspeech
mp := "/path/to/deepspeech/model/en"
d := deepspeech.New(deepspeech.Options{
    AlphabetPath:   mp + "/alphabet.txt",
    BeamWidth:      1024,
    ClientPath:     "/path/to/deepspeech/DeepSpeech/DeepSpeech.py",
    LMPath:         mp + "/lm.binary",
    LMWeight:       0.75,
    ModelPath:      mp + "/output_graph.pb",
    PrepareDirPath: "/path/to/deepspeech/prepare",
    TrainingArgs: map[string]string{
        "checkpoint_dir":   "/path/to/deepspeech/model/custom/checkpoints",
        "dev_batch_size":   "4",
        "export_dir":       "/path/to/deepspeech/model/custom",
        "noearly_stop":     "",
        "test_batch_size":  "4",
        "train_batch_size": "20",

        // Mozilla values
        "learning_rate": "0.0001",
        "dropout_rate":  "0.15",
        "lm_alpha":      "0.75",
        "lm_beta":       "1.85",
    },
    TriePath:             mp + "/trie",
    ValidWordCountWeight: 1.85,
})

// Make sure to close deepspeech
defer d.Close()

// Initialize deepspeech
d.Init()

// Create runnable
r := speech_to_text.NewRunnable("Speech to Text", d, speech_to_text.RunnableOptions{
    SpeechesDirPath: "/path/to/speech_to_text/speeches",
})

// Initialize runnable
r.Init()

// Make sure to close the runnable
defer r.Close()

// Register runnables
w.RegisterRunnables(worker.Runnable{
    AutoStart: true,
    Runnable:  r,
})

// Send samples
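// NOTE: from, samples, bitDepth, numChannels, sampleRate and maxSilenceLevel typically come from the Audio input ability's OnSamples callback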
w.SendMessage(worker.MessageOptions{
    Message:  speech_to_text.NewSamplesMessage(
        from,
        samples,
        bitDepth,
        numChannels,
        sampleRate,
        maxSilenceLevel,
    ),
    Runnable: "Speech to Text",
    Worker:   "Worker #3",
})

Listenable

// Register listenables
w.RegisterListenables(
    worker.Listenable{
        Listenable: speech_to_text.NewListenable(speech_to_text.ListenableOptions{
            OnText: func(from astibob.Identifier, text string) (err error) {
                // TODO Do something with the text
                return
            },
        }),
        Runnable: "Speech to Text",
        Worker:   "Worker #3",
    },
)

Text to Speech

This ability allows you to run speech synthesis.

Dependencies

It's strongly recommended to use the astibob wrapper.

If you're using Linux, it's strongly recommended to use ESpeak.

Runnable

// Create speaker
s := speak.New(speak.Options{})

// Initialize speaker
s.Initialize()

// Make sure to close speaker
defer s.Close()

// Register runnables
w.RegisterRunnables(worker.Runnable{
    AutoStart: true,
    Runnable:  text_to_speech.NewRunnable("Text to Speech", s),
})

// Say something
w.SendMessage(worker.MessageOptions{
    Message:  text_to_speech.NewSayMessage("Hello world"),
    Runnable: "Text to Speech",
    Worker:   "Worker #1",
})
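
As a sketch of how abilities can be chained, the Speech to Text listenable shown earlier could forward each recognized text to this runnable so that the AI repeats what it hears (worker and runnable names are the ones used in the previous examples):

// Register listenables
w.RegisterListenables(
    worker.Listenable{
        Listenable: speech_to_text.NewListenable(speech_to_text.ListenableOptions{
            OnText: func(from astibob.Identifier, text string) (err error) {
                // Say the recognized text out loud
                w.SendMessage(worker.MessageOptions{
                    Message:  text_to_speech.NewSayMessage(text),
                    Runnable: "Text to Speech",
                    Worker:   "Worker #1",
                })
                return
            },
        }),
        Runnable: "Speech to Text",
        Worker:   "Worker #3",
    },
)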

Create your own ability

Creating your own ability is pretty straightforward: you need to create an object that implements the astibob.Runnable interface. Optionally, it can implement the astibob.Operatable interface as well.

If you want other abilities to be able to interact with it, you'll need to create another object that implements the astibob.Listenable interface.

I strongly recommend checking out how the provided abilities are built and trying to copy them first.

Runnable

The quickest way to implement the astibob.Runnable interface is to add an embedded astibob.BaseRunnable attribute to your object.

You can then use astibob.NewBaseRunnable to initialize it, which allows you to provide the proper options.
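
Here's a minimal sketch of what that can look like. The exact option fields (metadata, lifecycle callbacks, etc.) are assumptions made for illustration purposes: check the astibob package for the actual astibob.BaseRunnableOptions definition.

// Runnable is a custom ability embedding astibob.BaseRunnable
type Runnable struct {
    *astibob.BaseRunnable
}

// NewRunnable creates a new runnable
// NOTE: the option fields below are assumptions, check the astibob package for
// the actual astibob.BaseRunnableOptions definition
func NewRunnable(name string) *Runnable {
    r := &Runnable{}
    r.BaseRunnable = astibob.NewBaseRunnable(astibob.BaseRunnableOptions{
        Metadata: astibob.Metadata{
            Description: "Does something awesome",
            Name:        name,
        },
        OnStart: func(ctx context.Context) (err error) {
            // TODO Run the ability's main loop until the context is cancelled
            <-ctx.Done()
            return
        },
    })
    return r
}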

Operatable

The quickest way to implement the astibob.Operatable interface is to add an embedded astibob.BaseOperatable attribute to your object.

You can then use the cmd/operatable command to generate an operatable.go file binding your resources folder, which contains your static and template files. Finally, you can manually add custom routes to the astibob.BaseOperatable using the AddRoute method.

Listenable

No shortcut here: you need to create an object that implements the astibob.Listenable interface yourself.
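
Here's a rough sketch of what such an object can look like, modeled on the pkg3.NewListenable example above. It assumes the astibob.Listenable interface boils down to declaring the message names you're interested in and handling incoming messages, and that payloads can be JSON-unmarshaled; both are assumptions, so check the astibob package for the actual interface.

// ListenableOptions represents the callbacks provided by the user
type ListenableOptions struct {
    OnEvent1 func(arg1 string)
}

// Listenable reacts to messages dispatched by the matching runnable
type Listenable struct {
    o ListenableOptions
}

// NewListenable creates a new listenable
func NewListenable(o ListenableOptions) *Listenable {
    return &Listenable{o: o}
}

// MessageNames returns the names of the messages this listenable is interested in
// NOTE: this method set is an assumption, check the astibob package for the
// actual astibob.Listenable interface
func (l *Listenable) MessageNames() []string {
    return []string{"Event #1"}
}

// OnMessage decodes the payload and invokes the proper callback
func (l *Listenable) OnMessage(m *astibob.Message) (err error) {
    switch m.Name {
    case "Event #1":
        if l.o.OnEvent1 != nil {
            var arg1 string
            if err = json.Unmarshal(m.Payload, &arg1); err != nil {
                return
            }
            l.o.OnEvent1(arg1)
        }
    }
    return
}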

Contribute

If you've created an awesome Ability and you feel it could be of interest to the community, create a PR here.
