mobvoi / speech_sdk

Licence: other
Chinese Speech SDK for Android, iOS and embedded Linux platforms. http://ai.mobvoi.com

Mobvoi SDK for Speech Recognition and Text-to-Speech Synthesis

SDK for the development of Automatic Speech Recognition (ASR) and Text-to-Speech Synthesis (TTS) applications

Features

  • Hotword Wakeup
  • Online speech interaction, which includes
    • ASR
    • Natural Language Understanding
    • Dialogue Management
    • Vertical Search
    • TTS
  • Offline ASR
  • Online/Offline mixed ASR
  • Multi-Keywords Activation
  • Online TTS
  • Offline TTS
  • Online/Offline mixed TTS

Language support

  • Mandarin Chinese
  • American English (Coming soon)
  • Cantonese (Coming soon)

Supported platforms

The SDK is validated on the following platforms:

  • Ubuntu Linux on x86_64
  • Linux on ARMv7 (Raspberry Pi)
  • Linux on ARMv8 (Raspberry Pi)

Directory hierarchy

File/Directory  Purpose
doc             Contains the SDK documentation
include         Contains the SDK header file (speech_sdk.h)
lib             Contains the library (libmobvoisdk.so) for each supported platform
.mobvoi         Contains the SDK configuration (hidden directory)
samples         Contains sample code and prebuilt binaries based on the SDK

Usage

  • The hidden .mobvoi directory holds the data the SDK needs at run time, and the SDK also writes to it, so install it in a writable location
  • Pass the parent directory of .mobvoi to mobvoi_sdk_init() in your program
  • Write your program following the SDK documentation and the sample code in samples/src/
  • When building your program, link against the libmobvoisdk.so provided in lib/{arch}/
  • When running your program, add the directory containing libmobvoisdk.so to the LD_LIBRARY_PATH environment variable
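
Because the SDK both reads from and writes to .mobvoi, it can help to verify the install location before initializing. The following is a minimal sketch, not part of the SDK: mobvoi_dir_is_usable() is a hypothetical helper name, and it only uses POSIX calls (stat, access). It does not call mobvoi_sdk_init(), whose exact signature is defined in speech_sdk.h.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not provided by the SDK).
 * Returns 1 if <parent>/.mobvoi exists, is a directory, and is readable,
 * writable and searchable by the current process; returns 0 otherwise.
 */
int mobvoi_dir_is_usable(const char *parent)
{
    char path[4096];
    struct stat st;
    int n;

    if (parent == NULL) {
        return 0;
    }

    /* Build "<parent>/.mobvoi" and reject paths that do not fit. */
    n = snprintf(path, sizeof(path), "%s/.mobvoi", parent);
    if (n < 0 || (size_t)n >= sizeof(path)) {
        return 0;
    }

    /* The path must exist and be a directory. */
    if (stat(path, &st) != 0 || !S_ISDIR(st.st_mode)) {
        return 0;
    }

    /* The SDK reads and writes here, so all three permissions are needed. */
    return access(path, R_OK | W_OK | X_OK) == 0;
}
```

A program would typically run this check on the chosen location and, if it succeeds, pass that same parent directory to mobvoi_sdk_init().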

Samples

Several sample programs are provided in the samples/ directory:

Program         Purpose
asr             Shows how to do hotword wakeup and speech recognition
mix_tts         Shows how to use the TTS function
multi_keywords  Shows how to use the multi-keywords activation function

Prebuilt binaries for the supported platforms are also provided.

To run a binary, add the directory containing libmobvoisdk.so to LD_LIBRARY_PATH. The following shows how to run the x86_64 build of asr:

cd samples/bin
LD_LIBRARY_PATH=../../lib/x86_64 ./x86_64_asr online

Note:

  • The wakeup word is "Ni Hao Wen Wen" (你好问问)

Troubleshooting

Hints for troubleshooting the SDK:

  • The SDK generates logs while it runs, so examine them first for clues
  • For more detailed logs, call mobvoi_set_vlog_level()
    • Calling mobvoi_set_vlog_level(3) also saves the received PCM audio streams to .mobvoi/audio_dump/record.pcm
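
A quick sanity check on the dumped audio is to see whether its length matches how long you actually spoke. The sketch below is an illustration under stated assumptions: this README does not specify the dump format, so the 16 kHz, 16-bit, mono values are assumptions to adjust against the SDK documentation, and both helper names are hypothetical.

```c
#include <stdio.h>

/* ASSUMED raw PCM format; the README does not state it. Adjust as needed. */
#define ASSUMED_SAMPLE_RATE_HZ   16000
#define ASSUMED_BYTES_PER_SAMPLE 2
#define ASSUMED_CHANNELS         1

/* Hypothetical helper: size of a file in bytes, or -1 if it cannot be read. */
long file_size_bytes(const char *path)
{
    FILE *f = fopen(path, "rb");
    long size;

    if (f == NULL) {
        return -1;
    }
    if (fseek(f, 0, SEEK_END) != 0) {
        fclose(f);
        return -1;
    }
    size = ftell(f);
    fclose(f);
    return size;
}

/* Hypothetical helper: duration in seconds of a headerless PCM buffer
 * of the given size, under the assumed format above. */
double pcm_duration_seconds(long size_bytes)
{
    const double bytes_per_second = (double)ASSUMED_SAMPLE_RATE_HZ *
                                    ASSUMED_BYTES_PER_SAMPLE *
                                    ASSUMED_CHANNELS;

    if (size_bytes <= 0) {
        return 0.0;
    }
    return (double)size_bytes / bytes_per_second;
}
```

For example, passing the result of file_size_bytes(".mobvoi/audio_dump/record.pcm") to pcm_duration_seconds() gives an estimated recording length. A duration far shorter than expected suggests that audio is not reaching the SDK.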

Documentation

Please refer to the online SDK documentation.