Alternatives to and detailed information about pie

HMS ML Demo provides an example of integrating the Huawei ML Kit service into applications. The example demonstrates how to integrate services provided by ML Kit, such as face detection, text recognition, image segmentation, ASR, and TTS.

Stars: ✭ 187 (+201.61%)

Mutual labels: asr

Edgedict

Working online speech recognition based on RNN Transducer. (Trained model available in the releases.)

Stars: ✭ 205 (+230.65%)

Mutual labels: asr

Asr audio data links

A list of publicly available audio data that anyone can download for ASR or other speech activities

Stars: ✭ 128 (+106.45%)

Mutual labels: asr

megs

A merged version of multiple open-source German speech datasets.

Stars: ✭ 21 (-66.13%)

Mutual labels: asr

Asr Evaluation

Python module for evaluating ASR hypotheses (e.g. word error rate, word recognition rate).

Stars: ✭ 190 (+206.45%)

Mutual labels: asr

Zeroth

Kaldi-based Korean ASR (한국어 음성인식) open-source project

Stars: ✭ 248 (+300%)

Mutual labels: asr

Mrcp Plugin With Freeswitch

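Uses FreeSWITCH to receive calls from users' mobile phones, runs speech recognition (ASR) on the caller's speech through a UniMRCP Server integrated with an iFLYTEK Open Platform (xfyun) plugin, and invokes speech synthesis (TTS) according to custom business logic, building a simple end-to-end voice call center.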

Stars: ✭ 168 (+170.97%)

Mutual labels: asr

End2end Asr Pytorch

End-to-End Automatic Speech Recognition on PyTorch

Stars: ✭ 175 (+182.26%)

Mutual labels: asr

Chinese text normalization

Chinese text normalization for speech processing

Stars: ✭ 242 (+290.32%)

Mutual labels: asr

Speecht

Open-source speech-to-text software written in TensorFlow

Stars: ✭ 152 (+145.16%)

Mutual labels: asr

Wukong Robot

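🤖 wukong-robot is a simple, flexible, and elegant Chinese voice-dialogue robot / smart speaker project, and possibly the first open-source smart speaker project to support brain-computer interaction.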

Stars: ✭ 3,110 (+4916.13%)

Mutual labels: asr

Listen Attend Spell

A PyTorch implementation of Listen, Attend and Spell (LAS), an End-to-End ASR framework.

Stars: ✭ 147 (+137.1%)

Mutual labels: asr

Lingvo

Stars: ✭ 2,361 (+3708.06%)

Mutual labels: asr

leopard

On-device speech-to-text engine powered by deep learning

Stars: ✭ 354 (+470.97%)

Mutual labels: asr

rasr

The RWTH ASR Toolkit.

Stars: ✭ 43 (-30.65%)

Mutual labels: asr

Kerasdeepspeech

A Keras CTC implementation of Baidu's DeepSpeech for model experimentation

Stars: ✭ 245 (+295.16%)

Mutual labels: asr


Baidu Cloud streaming speech recognition client

Project structure

audio-streaming-server-cpp: C/C++ client SDK
audio-streaming-server-java: Java client SDK
audio-streaming-server-python: Python client SDK
audio-streaming-server-c#: C# client SDK (beta version)
audio-streaming-server-go: Go client SDK
android-demo: Android demo app for real-time audio stream recognition, built on the Java SDK
ios-demo: iOS demo app for real-time audio stream recognition, built on the proto definitions, with its own iOS gRPC client implemented internally
java-demo: demos of different approaches, built on the Java SDK
windows c++: C++ client SDK for Windows

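Features

This code is the ASR streaming client. It supports the following scenarios:

Recognition of large audio files
Recognition of audio stream URLs
Recognition of piped audio streams
Recognition of real-time audio streams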

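Common parameters

The commonly used parameters are listed below. For the full set of parameters, see the client in the corresponding directory.

url: IP address of the ASR streaming server (contact Baidu staff to obtain it)
port: port of the service on the ASR streaming server
enable_flush_data: whether to output results continuously; False means that only the recognition result of each complete utterance is output, once per utterance
product_id: each product ID corresponds to one backend decoder model
send_per_seconds: interval between packets sent to the server; the recommended value is 0.02, i.e. 20 ms. The packet size is computed from this value: packet size = send_per_seconds * audio sample rate * bytes per sample. For 8 kHz audio the packet size is 320; for 16 kHz audio it is 640.
sleep_ratio: defaults to 1. When send_per_seconds and the packet size both use the recommended values, it represents the processing rate of the real-time audio stream. To speed up processing, reduce sleep_ratio as appropriate; for example, sleep_ratio=0.5 means processing at twice the real-time rate. Processing too fast may cause words to be dropped. Under normal conditions, the recommended setting is 1.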
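
As a rough illustration of the send_per_seconds formula above, the following Python sketch computes the packet size and splits a buffer of raw audio into packets. It is not taken from the SDK: the function names packet_size, sleep_interval, and iter_packets are hypothetical, the 2-byte sample width assumes 16-bit PCM audio (which is consistent with the 320 and 640 figures above), and scaling the pause between packets by sleep_ratio is an assumption about how the client paces its sends.

```python
from typing import Iterator


def packet_size(send_per_seconds: float, sample_rate: int,
                bytes_per_sample: int = 2) -> int:
    """packet size = send_per_seconds * audio sample rate * bytes per sample.

    bytes_per_sample=2 assumes 16-bit PCM (an assumption, not stated in the README).
    """
    return int(round(send_per_seconds * sample_rate * bytes_per_sample))


def sleep_interval(send_per_seconds: float, sleep_ratio: float = 1.0) -> float:
    """Assumed pause between packets: sleep_ratio=0.5 halves it (twice real time)."""
    return send_per_seconds * sleep_ratio


def iter_packets(audio: bytes, size: int) -> Iterator[bytes]:
    """Yield consecutive packets of `size` bytes; the last one may be shorter."""
    for start in range(0, len(audio), size):
        yield audio[start:start + size]


if __name__ == "__main__":
    print(packet_size(0.02, 8000))    # 320, matching the 8 kHz figure above
    print(packet_size(0.02, 16000))   # 640, matching the 16 kHz figure above
    one_second_of_silence = b"\x00" * (16000 * 2)
    packets = list(iter_packets(one_second_of_silence, packet_size(0.02, 16000)))
    print(len(packets), sleep_interval(0.02, sleep_ratio=1.0))
```

A real client would send each packet to the server and pause for roughly sleep_interval(...) seconds between sends; the actual pacing logic is in the client of the corresponding directory.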

Examples

Find the demo client in the corresponding directory and run it.

Issues

For related problems, submit an issue directly or report them to Baidu staff.

Contact Us

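Individual users can try the service at ai.baidu.com. Enterprise customers should contact Baidu staff before trying it, to obtain the streaming server's IP and port and the corresponding product ID, and to be added to the whitelist (this requires the client's egress IP, which can be obtained with curl cip.cc).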

baidubce / pie