zw76859420 / Asr_syllable

Research on acoustic models for speech recognition based on convolutional neural networks

ASR_Syllable

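======================= Research on acoustic models for speech recognition based on convolutional neural networks ========================

This project summarizes what I learned about DCNN-CTC between my first year and the first semester of my second year of graduate study. It proposes the MCNN-CTC and DenseNet-CTC acoustic models. The final experimental results are shown below: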

1) Thchs30_TrainingResults

Figure: Thchs30 training and fine-tuning curves

2) Thchs30_Results

Figure: Thchs30 experimental results

3) Stcmds_Results

Figure: Stcmds experimental results

Acoustic models

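1) DCNN-CTC acoustic model

This model is a modification of speech_model-05 and builds the speech recognition acoustic model with DCNN-CTC. The model for the STcmds dataset was adapted from the same script. The final results are shown in the figures above.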
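For readers new to CTC, the sketch below shows how per-frame network outputs are usually turned into a label sequence with best-path (greedy) decoding: take the highest-scoring label in each frame, merge consecutive repeats, then drop blanks. It is a framework-free illustration, not code from this repository. The function name `ctc_greedy_decode`, the score matrix, and the choice of index 0 as the blank are all assumptions (some toolkits reserve the last index for the blank instead).

```python
from itertools import groupby

BLANK = 0  # assumption: index 0 is the CTC blank; some toolkits use the last index


def ctc_greedy_decode(frame_scores, blank=BLANK):
    """Best-path decoding: arg-max per frame, merge repeats, drop blanks."""
    best_path = [max(range(len(frame)), key=frame.__getitem__)
                 for frame in frame_scores]
    collapsed = [label for label, _ in groupby(best_path)]
    return [label for label in collapsed if label != blank]


# Hypothetical per-frame scores over 4 classes (blank + 3 syllable ids).
scores = [
    [0.10, 0.70, 0.10, 0.10],  # label 1
    [0.10, 0.60, 0.20, 0.10],  # label 1 again, merged with the previous frame
    [0.80, 0.10, 0.05, 0.05],  # blank
    [0.10, 0.60, 0.20, 0.10],  # label 1, kept because a blank separates it
    [0.10, 0.10, 0.10, 0.70],  # label 3
]
decoded = ctc_greedy_decode(scores)
print(decoded)
```

The blank between the second and fourth frames is what lets the same syllable appear twice in a row in the output.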

2) MCNN-CTC acoustic model

This model was tested mainly with the speech_model_10 script. The results are shown in figure 2) above. Overall, MCNN-CTC performs better than DCNN-CTC.

3) DenseNet-CTC acoustic model

This model is based on DenseNet. On the Thchs30 dataset it reaches a CER of roughly 30%. You are welcome to try the experiment yourself.
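The README does not say how the CER is computed. A minimal sketch, assuming the usual definition (total edit distance between reference and hypothesis, divided by total reference length), is shown below. The helper names `edit_distance` and `error_rate` and the sample transcripts are hypothetical and are not taken from the project's scripts or data.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(refs, hyps):
    """Total edit distance divided by total reference length."""
    total_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total_length = sum(len(r) for r in refs)
    return total_errors / total_length


# Hypothetical pinyin-syllable transcripts, for illustration only.
refs = [["jin1", "tian1", "tian1", "qi4"], ["ni3", "hao3"]]
hyps = [["jin1", "tian1", "qi4"], ["ni3", "hao4"]]
rate = error_rate(refs, hyps)
print(f"{rate:.2%}")
```

Because the sequences here are lists of syllables, the same function gives a syllable-level rate; passing strings of characters would give a character-level rate.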

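4) Attention-CTC acoustic model

This model builds on DCNN-CTC and applies attention at the fully connected layer. The results may improve on DCNN-CTC; see the speech_model_06 script for details. The core of the experiment is shown below: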
NN(Attention)-CTC:
# dense1 = Dense(units=512, activation='relu', use_bias=True, kernel_initializer='he_normal')(reshape)
# attention_prob = Dense(units=512, activation='softmax', name='attention_vec')(dense1)
# attention_mul = multiply([dense1, attention_prob])
#
# dense1 = BatchNormalization(epsilon=0.0002)(attention_mul)
# dense1 = Dropout(0.3)(dense1)
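The arithmetic behind the snippet above can be sketched without Keras: a second dense layer with a softmax produces one weight per unit, and those weights scale the original activations elementwise. The version below is an illustration for a single frame, not the author's implementation. The function names, the random weights, and the use of 4 units instead of 512 are assumptions made to keep the example small.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_gate(features, weights, bias):
    """Scale each unit of `features` by softmax(weights @ features + bias)."""
    scores = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, bias)]
    probs = softmax(scores)                           # role of attention_prob
    gated = [x * p for x, p in zip(features, probs)]  # role of attention_mul
    return gated, probs


random.seed(0)
units = 4  # the snippet uses 512; 4 keeps the example readable
features = [random.random() for _ in range(units)]  # stand-in for one frame of dense1
weights = [[random.gauss(0, 1) for _ in range(units)] for _ in range(units)]
bias = [0.0] * units

gated, probs = attention_gate(features, weights, bias)
print(probs)
print(gated)
```

Since the attention weights sum to 1 across all units, the gated activations are much smaller than the inputs, which is one plausible reason the snippet follows the multiplication with BatchNormalization.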

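Transfer learning

Retraining further fine-tunes the initial model and can improve its accuracy. See the train_modelSpeech script for the training procedure. Here all network layers are fine-tuned, and the results improve on the initial model; see figure 1) for details.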

Citation

W. Zhang, M. H. Zhai, Z. L. Huang, et al. Towards End-to-End Speech Recognition with Deep Multipath Convolutional Neural Networks[C]. https://doi.org/10.1007/978-3-030-27529-7_29

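Related project links

Personal blog: contains my recent study notes
Reference links
ASR_WORD: builds a speech recognition acoustic model with characters as the modeling unit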