healzer / Discordspeechbot

Licence: MIT
A speech-to-text bot for Discord with music commands and more, built with Node.js. Ideal for controlling your Discord server with voice commands; it can also be useful for hearing-impaired people.

Programming Languages

javascript

Projects that are alternatives of or similar to Discordspeechbot

KeenASR-Android-PoC
A proof-of-concept app using KeenASR SDK on Android. WE ARE HIRING: https://keenresearch.com/careers.html
Stars: ✭ 21 (-40%)
Mutual labels:  voice-commands, speech, speech-recognition, speech-to-text
kaldi ag training
Docker image and scripts for training finetuned or completely personal Kaldi speech models. Particularly for use with kaldi-active-grammar.
Stars: ✭ 14 (-60%)
Mutual labels:  speech, speech-recognition, speech-to-text
web-voice-processor
A library for real-time voice processing in web browsers
Stars: ✭ 69 (+97.14%)
Mutual labels:  voice-commands, speech-recognition, speech-to-text
Sonus
💬 /so.nus/ STT (speech to text) for Node with offline hotword detection
Stars: ✭ 532 (+1420%)
Mutual labels:  speech-recognition, speech, speech-to-text
ASR-Audio-Data-Links
A list of publically available audio data that anyone can download for ASR or other speech activities
Stars: ✭ 179 (+411.43%)
Mutual labels:  speech, speech-recognition, speech-to-text
react-native-spokestack
Spokestack: give your React Native app a voice interface!
Stars: ✭ 53 (+51.43%)
Mutual labels:  voice-commands, speech-recognition, speech-to-text
simple-obs-stt
Speech-to-text and keyboard input captions for OBS.
Stars: ✭ 89 (+154.29%)
Mutual labels:  speech, speech-recognition, speech-to-text
musicologist
Music advice from a conversational interface powered by Algolia
Stars: ✭ 19 (-45.71%)
Mutual labels:  voice-commands, speech-recognition, speech-to-text
Annyang
💬 Speech recognition for your site
Stars: ✭ 6,216 (+17660%)
Mutual labels:  speech-recognition, speech, speech-to-text
speech to text
how to use the Google Cloud Speech API to transcribe audio/video files.
Stars: ✭ 35 (+0%)
Mutual labels:  speech, speech-recognition, speech-to-text
wav2vec2-live
A live speech recognition using Facebooks wav2vec 2.0 model.
Stars: ✭ 205 (+485.71%)
Mutual labels:  speech, speech-recognition, speech-to-text
Awesome Kaldi
This is a list of features, scripts, blogs and resources for better using Kaldi ( http://kaldi-asr.org/ )
Stars: ✭ 393 (+1022.86%)
Mutual labels:  speech-recognition, speech, speech-to-text
anycontrol
Voice control for your websites and applications
Stars: ✭ 53 (+51.43%)
Mutual labels:  speech, speech-recognition, speech-to-text
Java Speech Api
The J.A.R.V.I.S. Speech API is designed to be simple and efficient, using the speech engines created by Google to provide functionality for parts of the API. Essentially, it is an API written in Java, including a recognizer, synthesizer, and a microphone capture utility. The project uses Google services for the synthesizer and recognizer. While this requires an Internet connection, it provides a complete, modern, and fully functional speech API in Java.
Stars: ✭ 490 (+1300%)
Mutual labels:  speech-recognition, speech, speech-to-text
Artyom.js
A voice control - voice commands - speech recognition and speech synthesis javascript library. Create your own siri,google now or cortana with Google Chrome within your website.
Stars: ✭ 1,011 (+2788.57%)
Mutual labels:  voice-commands, speech-recognition, speech-to-text
Speechbrain.github.io
The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain users can easily create speech processing systems, ranging from speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many others.
Stars: ✭ 242 (+591.43%)
Mutual labels:  speech-recognition, speech, speech-to-text
Lingvo
Lingvo
Stars: ✭ 2,361 (+6645.71%)
Mutual labels:  speech-recognition, speech, speech-to-text
Edgedict
Working online speech recognition based on RNN Transducer. ( Trained model release available in release )
Stars: ✭ 205 (+485.71%)
Mutual labels:  speech-recognition, speech, speech-to-text
deepspeech.mxnet
A MXNet implementation of Baidu's DeepSpeech architecture
Stars: ✭ 82 (+134.29%)
Mutual labels:  speech, speech-recognition, speech-to-text
sova-asr
SOVA ASR (Automatic Speech Recognition)
Stars: ✭ 123 (+251.43%)
Mutual labels:  speech, speech-recognition, speech-to-text

DiscordSpeechBot

A speech-to-text bot for Discord with music commands and more, written in Node.js.

Demo:

Discord Speech Bot Demo

Try the bot for yourself on our Discord server: https://discord.gg/ApdTMG9

You can follow the steps below to get this bot up and running.

Heroku

If you don't have a Linux server or machine, you can use Heroku to host your bot 24/7, and it's free. Under the "Resources" tab, use the "worker" dyno type, not the "web" one. You will also need to configure the "Config Vars" under the "Settings" tab; these are the environment variables described in the Settings section below.

Tutorial: https://dev.to/codr/discord-ears-bot-on-heroku-4606

Docker

If you prefer Docker to a manual installation, copy Dockerfile.sample to Dockerfile and edit it. Near the bottom you have to provide the API credentials, either through the settings.json file or by setting the ENV variables. Refer to the Settings section below for details. Once you've configured the Dockerfile, you can build and run it:

  1. Run docker build -t discordspeechbot . (this may take a minute or two).
  2. Run docker run -it discordspeechbot
  3. Proceed to the Usage section below.

Installation

You need Node.js version 12 or later, with npm, on your machine. In a shell or command prompt, execute the following:

git clone https://github.com/healzer/DiscordSpeechBot.git
cd DiscordSpeechBot
npm install

Settings

Create a (free) Discord bot and obtain the API credentials (Bot Token). Here's an easy tutorial: https://www.writebots.com/discord-bot-token/ Note: give your bot enough permissions, or simply grant it Administrator rights.

Create a (free) Spotify developers account to obtain the API credentials (Client Id and Client Secret): https://developer.spotify.com/dashboard/

Create a (free) WitAI account and obtain the API credentials (Server Access Token): https://wit.ai/

Rename the file settings-sample.json to settings.json and enter the obtained API credentials:

{
    "discord_token": "your_token",
    "spotify_token_id": "your_token_id",
    "spotify_token_secret": "your_token_secret",
    "wit_ai_token": "your_token"
}

If you are using DigitalOcean Apps, Heroku or another hosting service, you can use environment variables instead of a settings file. Configure these with the appropriate values:

DISCORD_TOK
WITAPIKEY
SPOTIFY_TOKEN_ID
SPOTIFY_TOKEN_SECRET
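
As a rough illustration of how the two configuration sources could be combined, the sketch below reads settings.json when it exists and otherwise falls back to the environment variables listed above. The helper name loadSettings and the mapping between JSON keys and variable names are assumptions made for this example; the bot's actual loading code may differ.

```javascript
const fs = require("fs");

// Assumed mapping from settings.json keys to the environment variables above.
const ENV_NAMES = {
  discord_token: "DISCORD_TOK",
  wit_ai_token: "WITAPIKEY",
  spotify_token_id: "SPOTIFY_TOKEN_ID",
  spotify_token_secret: "SPOTIFY_TOKEN_SECRET",
};

// Hypothetical helper: prefer values from settings.json, otherwise use the environment.
function loadSettings(path = "settings.json", env = process.env) {
  const fromFile = fs.existsSync(path)
    ? JSON.parse(fs.readFileSync(path, "utf8"))
    : {};
  const settings = {};
  for (const [key, envName] of Object.entries(ENV_NAMES)) {
    settings[key] = fromFile[key] || env[envName];
    if (!settings[key]) {
      throw new Error(`Missing credential: set "${key}" in ${path} or ${envName}`);
    }
  }
  return settings;
}
```

Failing early on a missing credential makes a misconfigured deployment easier to spot than an authentication error later on.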

Running

Execute the following in your shell or prompt:

node index.js

Use PM2 to keep the bot running 24/7. It also restarts the bot after a crash or when the memory limit is reached (2GB by default):

pm2 start ecosystem.config.js

Usage

By now you should have a Discord server with DiscordSpeechBot running and added to it. Make sure the server has at least one text channel and one voice channel.

  1. Enter one of your voice channels.
  2. In one of your text channels type: !join
  3. Type !help for a list of commands.

Examples:

!play https://www.youtube.com/watch?v=vK1YiArMDfg
!play red hot chili peppers californication
!list
!skip
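
The examples show that !play accepts either a URL or free-text search terms. One way to tell the two apart is sketched below; parseTextCommand is a hypothetical helper written for illustration, not the bot's actual message handler.

```javascript
// Hypothetical parser for chat messages such as "!play <url or search terms>".
function parseTextCommand(message, prefix = "!") {
  const text = message.trim();
  if (!text.startsWith(prefix)) return null; // not a bot command
  const [name, ...rest] = text.slice(prefix.length).split(/\s+/);
  const argument = rest.join(" ");
  let isUrl = false;
  try {
    // The WHATWG URL constructor throws on anything that is not an absolute URL.
    const url = new URL(argument);
    isUrl = url.protocol === "http:" || url.protocol === "https:";
  } catch (err) {
    isUrl = false; // treat the argument as search terms
  }
  return { name: name.toLowerCase(), argument, isUrl };
}
```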

Voice commands

When the bot is inside a voice channel it listens to all speech and tries to detect commands.

Try saying:

music play 'the chemical brothers'
music skip
music play random
music list
music clear list
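
The phrases above share a simple shape: the keyword music, then an action, then optional arguments. A minimal sketch of how a transcript could be mapped onto that shape follows; parseVoiceCommand and the returned field names are assumptions for this example, and the bot's own matching logic may work differently.

```javascript
// Hypothetical parser for a transcript returned by the speech-to-text service.
function parseVoiceCommand(transcript, keyword = "music") {
  // Lower-case the text and drop punctuation or quotes a transcriber may add.
  const words = transcript
    .toLowerCase()
    .replace(/[^\p{L}\p{N}\s]/gu, "")
    .trim()
    .split(/\s+/);
  // A command must start with the keyword and name an action.
  if (words[0] !== keyword || words.length < 2) return null;
  const [, action, ...rest] = words;
  return { action, query: rest.join(" ") };
}
```

Requiring the keyword as the very first word means ordinary conversation that merely mentions "music" is ignored.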

A successful voice command looks like this:

<long pause> music play 'justin timberlake cry river' <long pause>

Notes:

  • Each voice command starts with music.
  • Each user's audio arrives on a separate stream, so the bot hears every user separately.
  • Only when your user picture turns green in the voice channel will the bot receive your audio.
  • A long pause interrupts the audio input.
  • (WitAI only) The duration of a single audio input is limited to 20 seconds; longer audio is not transcribed.
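
To make the 20-second limit concrete, the sketch below estimates the length of a raw PCM clip from its size in bytes. The audio format (48 kHz, stereo, 16-bit samples) and both function names are assumptions for this example; adjust the defaults to whatever format the bot actually records.

```javascript
// Assumed PCM format: 48 kHz, 2 channels, 16-bit samples (2 bytes each).
function pcmDurationSeconds(byteLength, sampleRate = 48000, channels = 2, bytesPerSample = 2) {
  return byteLength / (sampleRate * channels * bytesPerSample);
}

// Hypothetical guard: true when the clip fits within the transcription limit.
function isWithinLimit(pcmBuffer, maxSeconds = 20) {
  return pcmDurationSeconds(pcmBuffer.length) <= maxSeconds;
}
```

Under these assumptions one second of audio occupies 48000 × 2 × 2 = 192000 bytes, so a 20-second clip is about 3.84 MB.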

Here are some examples which may not work (properly):

<talking> music skip
music skip <talking>
<talking> music skip <talking>
...

music play 'the chemical brothers' <talking>

music <long silence>  play  <long silence> 'the chemical brothers'

Notes:

  • A successful voice command should contain as little noise as possible before and after the command.
  • A successful voice command should not contain too many or too long periods of silence; otherwise the bot will receive separate words instead of the whole sentence.
  • <long pause> is usually between 1 and 2 seconds, long enough for Discord to stop processing your audio input.
  • If you have a very sensitive microphone or a lot of (background) noise, then voice commands may not work properly for you.

For developers

Music lagging or stuttering? Try this

Using Mozilla DeepSpeech for speech recognition, tutorial.

Language

WitAI supports over 120 languages (https://wit.ai/faq), but only one language can be used at a time. If you're not speaking English on Discord, change the default language of your app on WitAI under "settings".

You can also change the language using the following bot command:

!lang <code>

!lang en     for English
!lang es     for Spanish
!lang ru     for Russian
...

The bot should reply with a success message.

<code> should be an ISO 639-1 language code (two letters):
https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes
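
A small sketch of how the argument to !lang could be checked before it is used. parseLangCommand is a hypothetical helper, and it only verifies the two-letter shape of the code, not that the code is actually assigned in ISO 639-1 or supported by WitAI.

```javascript
// Hypothetical parser for "!lang <code>"; returns the lower-cased code or null.
function parseLangCommand(message) {
  // Accept exactly two ASCII letters after the command, in any letter case.
  const match = /^!lang\s+([a-z]{2})$/i.exec(message.trim());
  return match ? match[1].toLowerCase() : null;
}
```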

Speech-To-Text

By default, WitAI's free API is used for speech recognition and transcription, but you can integrate any other API into the bot. For example, to use Google's Speech-to-Text API:

  1. Open index.js and, inside the function transcribe(file), make sure that transcribe_gspeech is used and the other option(s) are disabled.
  2. You may want to adjust the languageCode value if you're speaking a non-English language.
  3. Enable Google Speech API here: https://console.cloud.google.com/apis/library/speech.googleapis.com
  4. Create a new Service Account (or use your existing one): https://console.cloud.google.com/apis/credentials
  5. Create a new Service Account Key (or use an existing one) and download the JSON file.
  6. Put the JSON file inside your bot directory and rename it to gspeech_key.json.
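
To show where languageCode and the key file fit in, here is a sketch of a request for Google's Speech-to-Text API. buildRecognizeRequest is a hypothetical helper, not the repository's transcribe_gspeech, and the audio format (16-bit mono PCM at 48 kHz) is an assumption; the field names follow the API's RecognitionConfig.

```javascript
// Hypothetical helper that builds a recognize request for Google Speech-to-Text.
// Assumes 16-bit mono PCM at 48 kHz; adjust to match the audio the bot records.
function buildRecognizeRequest(pcmBuffer, languageCode = "en-US") {
  return {
    config: {
      encoding: "LINEAR16",
      sampleRateHertz: 48000,
      languageCode, // change this for non-English speech, e.g. "es-ES"
    },
    audio: { content: pcmBuffer.toString("base64") },
  };
}

// With the official @google-cloud/speech package installed, the request could
// be sent roughly like this (inside an async function):
//   const speech = require("@google-cloud/speech");
//   const client = new speech.SpeechClient({ keyFilename: "gspeech_key.json" });
//   const [response] = await client.recognize(buildRecognizeRequest(buffer));
//   const text = response.results.map((r) => r.alternatives[0].transcript).join("\n");
```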

Contact

For enquiries or issues get in touch with me:

Name: Ilya Nevolin

Email: [email protected]

Discord: https://discord.gg/ApdTMG9
