Picovoice / octopus

License: Apache-2.0
On-device speech-to-index engine powered by deep learning.


Octopus


Made in Vancouver, Canada by Picovoice

Octopus is Picovoice's Speech-to-Index engine. It indexes speech directly, without relying on a text representation. This acoustic-only approach boosts accuracy by removing the out-of-vocabulary limitation and eliminating the problem of competing hypotheses (e.g., homophones).
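To see why competing hypotheses are a problem for text-based search, consider homophones: words that sound identical but are spelled differently. The toy sketch below is purely illustrative (it is not the Octopus algorithm, and `toy_phonetic_key` is a made-up helper): it collapses words onto a crude phonetic key to show that acoustically identical words are indistinguishable until a system commits to one spelling.

```python
# Toy illustration of the homophone problem: a speech-to-text system must
# commit to one spelling, even when several words sound the same.

def toy_phonetic_key(word: str) -> str:
    """Crude phonetic key: keep the first letter, then drop vowels, 'y', 'h',
    and apostrophes. Real phonetic algorithms (e.g. Soundex) are more refined."""
    word = word.lower()
    tail = [c for c in word[1:] if c not in "aeiouyh'"]
    return word[0] + "".join(tail)

# "there", "their", and "they're" all collapse onto the same key, so a
# text-based index has to pick one transcription and lose the others.
homophones = ["there", "their", "they're"]
keys = {w: toy_phonetic_key(w) for w in homophones}
print(keys)
assert len(set(keys.values())) == 1
```

Indexing the acoustics directly sidesteps this choice, which is the limitation the paragraph above refers to.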

Demos

Python Demos

Install the demo package:

sudo pip3 install pvoctopusdemo

Run the following in the terminal:

octopus_demo --access_key ${ACCESS_KEY} --audio_paths ${AUDIO_PATHS}

Replace ${ACCESS_KEY} with your AccessKey obtained from Picovoice Console and ${AUDIO_PATHS} with a space-separated list of audio files. Octopus processes the audio files, then interactively prompts for search phrases and displays the matches.

For more information about the Python demos go to demo/python.

C Demos

Build the demo:

cmake -S demo/c/ -B demo/c/build && cmake --build demo/c/build

Index a given audio file:

./demo/c/build/octopus_index_demo ${LIBRARY_PATH} ${ACCESS_KEY} ${AUDIO_PATH} ${INDEX_PATH}

Then search the index for a given phrase:

./demo/c/build/octopus_search_demo ${LIBRARY_PATH} ${MODEL_PATH} ${ACCESS_KEY} ${INDEX_PATH} ${SEARCH_PHRASE}

Replace ${LIBRARY_PATH} with the path to the appropriate library under lib, ${MODEL_PATH} with the path to the model file under lib/common, ${ACCESS_KEY} with your AccessKey obtained from Picovoice Console, ${AUDIO_PATH} with the path to the audio file to index, ${INDEX_PATH} with the path to the cached index file, and ${SEARCH_PHRASE} with a search phrase.
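To index a whole directory rather than a single file, the index binary has to be invoked once per audio file. A small driver sketch can generate the command lines; it assumes the demo binaries built above, and both the helper name and the `.oif` index extension are made up for illustration.

```python
from pathlib import Path

def index_commands(library_path: str, access_key: str,
                   audio_dir: str, index_dir: str) -> list:
    """Build one octopus_index_demo invocation per .wav file in audio_dir.
    Returns argument lists suitable for subprocess.run; nothing is executed here."""
    cmds = []
    for wav in sorted(Path(audio_dir).glob("*.wav")):
        index_path = Path(index_dir) / (wav.stem + ".oif")  # hypothetical extension
        cmds.append([
            "./demo/c/build/octopus_index_demo",
            library_path,
            access_key,
            str(wav),
            str(index_path),
        ])
    return cmds
```

Each resulting argument list mirrors the ${LIBRARY_PATH} ${ACCESS_KEY} ${AUDIO_PATH} ${INDEX_PATH} order expected by octopus_index_demo.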

For more information about C demos go to demo/c.

Android Demos

Using Android Studio, open demo/android/OctopusDemo as an Android project.

Replace "${YOUR_ACCESS_KEY_HERE}" inside MainActivity.java with your AccessKey obtained from Picovoice Console. Then run the demo.

For more information about Android demos go to demo/android.

iOS Demos

From the demo/ios/OctopusDemo directory, run the following to install the Octopus CocoaPod:

pod install

Replace "{YOUR_ACCESS_KEY_HERE}" inside ViewModel.swift with your AccessKey obtained from Picovoice Console. Then, using Xcode, open the generated OctopusDemo.xcworkspace and run the application.

For more information about iOS demos go to demo/ios.

Web Demos

From demo/web run the following in the terminal:

yarn
yarn start

(or)

npm install
npm run start

Open http://localhost:5000 in your browser to try the demo.

SDKs

Python

Create an instance of the engine:

import pvoctopus

# AccessKey obtained from Picovoice Console (https://console.picovoice.ai/)
access_key = "${ACCESS_KEY}"

handle = pvoctopus.create(access_key=access_key)

Index your raw audio data or file:

import pvoctopus

# AccessKey obtained from Picovoice Console (https://console.picovoice.ai/)
access_key = "${ACCESS_KEY}"

handle = pvoctopus.create(access_key=access_key)

audio_data = [...]
metadata = handle.index_audio_data(audio_data)

# or

audio_file_path = "/path/to/my/audiofile.wav"
metadata = handle.index_audio_file(audio_file_path)

# Then search the metadata for phrases

matches = handle.search(metadata, phrases=['avocado'])

avocado_matches = matches['avocado']
for match in avocado_matches:
    print(f"Match for `avocado`: {match.start_sec} -> {match.end_sec} ({match.probability})")

When done, the handle's resources must be released explicitly by calling handle.delete().
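Each match reports start_sec, end_sec, and probability. A small formatting helper, sketched here with a stand-in namedtuple that mirrors those fields (so it runs without pvoctopus installed), can turn search results into a readable report:

```python
from collections import namedtuple

# Stand-in mirroring the fields of the match objects returned by handle.search().
Match = namedtuple("Match", ["start_sec", "end_sec", "probability"])

def to_timestamp(seconds: float) -> str:
    """Format seconds as M:SS.s, e.g. 71.0 -> '1:11.0'."""
    minutes, secs = divmod(seconds, 60)
    return f"{int(minutes)}:{secs:04.1f}"

def report(phrase: str, matches: list) -> list:
    """One human-readable line per match, with the probability as a percentage."""
    return [
        f"{phrase}: {to_timestamp(m.start_sec)} -> {to_timestamp(m.end_sec)}"
        f" ({m.probability:.0%})"
        for m in matches
    ]

matches = [Match(3.5, 4.2, 0.92), Match(71.0, 71.8, 0.64)]
for line in report("avocado", matches):
    print(line)
```

With real pvoctopus matches, the same helper works unchanged, since the attribute names match.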

C

The pv_octopus.h header file contains the relevant API. Build an instance of the object:

const char *model_path = "..."; // absolute path to the model file available at `lib/common/octopus_params.pv`
const char *access_key = "..."; // AccessKey provided by Picovoice Console (https://console.picovoice.ai/)
pv_octopus_t *handle = NULL;
pv_status_t status = pv_octopus_init(access_key, model_path, &handle);
if (status != PV_STATUS_SUCCESS) {
    // error handling logic
}

Index audio data using constructed object:

const char *audio_path = "..."; // absolute path to the audio file to be indexed
void *indices = NULL;
int32_t num_indices_bytes = 0;
pv_status_t status = pv_octopus_index_file(handle, audio_path, &indices, &num_indices_bytes);
if (status != PV_STATUS_SUCCESS) {
    // error handling logic
}

Search the indexed data:

const char *phrase = "...";
pv_octopus_match_t *matches = NULL;
int32_t num_matches = 0;
pv_status_t status = pv_octopus_search(handle, indices, num_indices_bytes, phrase, &matches, &num_matches);
if (status != PV_STATUS_SUCCESS) {
    // error handling logic
}

When done be sure to release the acquired resources:

pv_octopus_delete(handle);

Android

Create an instance of the engine:

import ai.picovoice.octopus.*;

final String accessKey = "..."; // AccessKey provided by Picovoice Console (https://console.picovoice.ai/)
try {
    Octopus handle = new Octopus.Builder().setAccessKey(accessKey).build(appContext);
} catch (OctopusException ex) { }

Index audio data using constructed object:

final String audioFilePath = "/path/to/my/audiofile.wav";
try {
    OctopusMetadata metadata = handle.indexAudioFile(audioFilePath);
} catch (OctopusException ex) { }

Search the indexed data:

HashMap<String, OctopusMatch[]> matches = handle.search(metadata, phrases);

for (Map.Entry<String, OctopusMatch[]> entry : matches.entrySet()) {
    final String phrase = entry.getKey();
    for (OctopusMatch phraseMatch : entry.getValue()){
        final float startSec = phraseMatch.getStartSec();
        final float endSec = phraseMatch.getEndSec();
        final float probability = phraseMatch.getProbability();
    }
}

When done be sure to release the acquired resources:

metadata.delete();
handle.delete();

iOS

Create an instance of the engine:

import Octopus

let accessKey : String = // .. AccessKey provided by Picovoice Console (https://console.picovoice.ai/)
do {
    let handle = try Octopus(accessKey: accessKey)
} catch { }

Index audio data using constructed object:

let audioFilePath = "/path/to/my/audiofile.wav"
do {
    let metadata = try handle.indexAudioFile(path: audioFilePath)
} catch { }

Search the indexed data:

let matches: Dictionary<String, [OctopusMatch]> = try handle.search(metadata: metadata, phrases: phrases)
for (phrase, phraseMatches) in matches {
    for phraseMatch in phraseMatches {
        let startSec = phraseMatch.startSec
        let endSec = phraseMatch.endSec
        let probability = phraseMatch.probability
    }
}

When done be sure to release the acquired resources:

handle.delete()

Web

Install the web SDK using yarn:

yarn add @picovoice/octopus-web

or using npm:

npm install --save @picovoice/octopus-web

Create an instance of the engine using OctopusWorker and index an audio file:

import { OctopusWorker } from "@picovoice/octopus-web";
import octopusParams from "${PATH_TO_BASE64_OCTOPUS_PARAMS}";

function getAudioData(): Int16Array {
  // ... get audio samples from a file or microphone
  return new Int16Array();
}

const octopus = await OctopusWorker.create(
  "${ACCESS_KEY}",
  { base64: octopusParams }
);

const octopusMetadata = await octopus.index(getAudioData());
const searchResult = await octopus.search(octopusMetadata, "${SEARCH_PHRASE}");
console.log(searchResult);

Replace ${ACCESS_KEY} with your AccessKey obtained from Picovoice Console. Finally, when done, release the resources using octopus.release().

Releases

v1.2.0 August 11th, 2022

  • added language support for French, German, Spanish, Japanese, Korean, Italian, and Portuguese
  • improved testing infrastructure

v1.1.0 May 12th, 2022

  • various bug fixes and improvements

v1.0.0 October 8th, 2021

  • initial release