tugstugi / Pytorch Speech Commands

Speech commands recognition with PyTorch

Projects that are alternatives of or similar to Pytorch Speech Commands

Pytorch Classification
Classification with PyTorch.
Stars: ✭ 1,268 (+890.63%)
Mutual labels:  classification, resnet, densenet, cifar10, resnext
speech-recognition-transfer-learning
Speech command recognition DenseNet transfer learning from UrbanSound8k in keras tensorflow
Stars: ✭ 18 (-85.94%)
Mutual labels:  kaggle, speech-recognition, densenet
Pytorch Cifar100
Practice on cifar100(ResNet, DenseNet, VGG, GoogleNet, InceptionV3, InceptionV4, Inception-ResNetv2, Xception, Resnet In Resnet, ResNext,ShuffleNet, ShuffleNetv2, MobileNet, MobileNetv2, SqueezeNet, NasNet, Residual Attention Network, SENet, WideResNet)
Stars: ✭ 2,423 (+1792.97%)
Mutual labels:  resnet, densenet, resnext
Tianchi Medical Lungtumordetect
Tianchi Medical AI Competition [Season 1]: intelligent diagnosis of pulmonary nodules with UNet/VGG/Inception/ResNet/DenseNet
Stars: ✭ 314 (+145.31%)
Mutual labels:  classification, resnet, densenet
Chainer Cifar10
Various CNN models for CIFAR10 with Chainer
Stars: ✭ 134 (+4.69%)
Mutual labels:  resnet, densenet, cifar10
Segmentation models
Segmentation models with pretrained backbones. Keras and TensorFlow Keras.
Stars: ✭ 3,575 (+2692.97%)
Mutual labels:  resnet, densenet, resnext
Caffe Model
Caffe models (including classification, detection and segmentation) and deploy files for famous networks
Stars: ✭ 1,258 (+882.81%)
Mutual labels:  classification, resnet, resnext
Keras Idiomatic Programmer
Books, Presentations, Workshops, Notebook Labs, and Model Zoo for Software Engineers and Data Scientists wanting to learn the TF.Keras Machine Learning framework
Stars: ✭ 720 (+462.5%)
Mutual labels:  resnet, densenet, resnext
Cifar Zoo
PyTorch implementation of CNNs for CIFAR benchmark
Stars: ✭ 584 (+356.25%)
Mutual labels:  resnet, densenet, resnext
Pytorch classification
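A complete codebase for image classification with PyTorch: training, prediction, TTA, model ensembling, model deployment, CNN feature extraction, classification with SVM or random forest, and model distillation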
Stars: ✭ 395 (+208.59%)
Mutual labels:  resnet, densenet, resnext
Basic cnns tensorflow2
A tensorflow2 implementation of some basic CNNs(MobileNetV1/V2/V3, EfficientNet, ResNeXt, InceptionV4, InceptionResNetV1/V2, SENet, SqueezeNet, DenseNet, ShuffleNetV2, ResNet).
Stars: ✭ 374 (+192.19%)
Mutual labels:  resnet, densenet, resnext
Classification models
Classification models trained on ImageNet. Keras.
Stars: ✭ 938 (+632.81%)
Mutual labels:  resnet, densenet, resnext
Pytorch Asr
ASR with PyTorch
Stars: ✭ 124 (-3.12%)
Mutual labels:  speech-recognition, resnet, densenet
Segmentationcpp
A c++ trainable semantic segmentation library based on libtorch (pytorch c++). Backbone: ResNet, ResNext. Architecture: FPN, U-Net, PAN, LinkNet, PSPNet, DeepLab-V3, DeepLab-V3+ by now.
Stars: ✭ 49 (-61.72%)
Mutual labels:  resnet, resnext
Tensornets
High level network definitions with pre-trained weights in TensorFlow
Stars: ✭ 982 (+667.19%)
Mutual labels:  resnet, densenet
Pretrained Models.pytorch
Pretrained ConvNets for pytorch: NASNet, ResNeXt, ResNet, InceptionV4, InceptionResnetV2, Xception, DPN, etc.
Stars: ✭ 8,318 (+6398.44%)
Mutual labels:  resnet, resnext
Resnet On Cifar10
Reimplementation ResNet on cifar10 with caffe
Stars: ✭ 123 (-3.91%)
Mutual labels:  resnet, cifar10
Randwire tensorflow
tensorflow implementation of Exploring Randomly Wired Neural Networks for Image Recognition
Stars: ✭ 29 (-77.34%)
Mutual labels:  classification, cifar10
Gluon2pytorch
Gluon to PyTorch deep neural network model converter
Stars: ✭ 70 (-45.31%)
Mutual labels:  resnet, densenet
Mlbox
MLBox is a powerful Automated Machine Learning python library.
Stars: ✭ 1,199 (+836.72%)
Mutual labels:  kaggle, classification

Convolutional neural networks for Google speech commands data set with PyTorch.

General

We, xuyuan and tugstugi, participated in the Kaggle competition TensorFlow Speech Recognition Challenge and finished in 10th place. This repository contains a simplified and cleaned-up version of our team's code.

Features

  • 1x32x32 mel-spectrogram as network input
  • single network implementation both for CIFAR10 and Google speech commands data sets
  • faster audio data augmentation on STFT
  • Kaggle private LB scores evaluated on 150,000+ audio files
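
As a rough illustration of how a one-second clip could become a 1x32x32 network input, the sketch below computes a log mel-spectrogram with NumPy only. The sample rate (16 kHz), FFT size (2048), hop length (512), and the helper names `mel_filterbank` and `log_mel_spectrogram` are assumptions chosen so that one second of audio yields 32 frames; they are not taken from this repository's code.

```python
import numpy as np

def hz_to_mel(freq_hz):
    return 2595.0 * np.log10(1.0 + freq_hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(sample_rate, n_fft, n_mels):
    """Triangular filters spaced evenly on the mel scale (hypothetical helper)."""
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    edges_hz = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_mels + 2))
    bank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        low, center, high = edges_hz[i], edges_hz[i + 1], edges_hz[i + 2]
        rising = (fft_freqs - low) / (center - low)
        falling = (high - fft_freqs) / (high - center)
        bank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank

def log_mel_spectrogram(samples, sample_rate=16000, n_fft=2048, hop=512, n_mels=32):
    """Return a (1, n_mels, n_frames) log mel-spectrogram in decibels."""
    # Reflect-pad so that frames are centered on multiples of the hop length.
    padded = np.pad(samples, n_fft // 2, mode="reflect")
    n_frames = 1 + (len(padded) - n_fft) // hop
    window = np.hanning(n_fft)
    frames = np.stack([padded[t * hop:t * hop + n_fft] * window for t in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2           # (n_frames, n_fft // 2 + 1)
    mel = mel_filterbank(sample_rate, n_fft, n_mels) @ power.T  # (n_mels, n_frames)
    log_mel = 10.0 * np.log10(np.maximum(mel, 1e-10))
    return log_mel[np.newaxis]                                  # add the channel axis

# One second of a synthetic 440 Hz tone stands in for a real recording.
t = np.arange(16000) / 16000.0
features = log_mel_spectrogram(np.sin(2.0 * np.pi * 440.0 * t))
print(features.shape)
```

With these assumed settings, 16,000 samples give 1 + 16000 // 512 = 32 frames, and 32 mel bands give the second dimension of 32.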

Results

Because of the competition's time limit, we trained most of the networks with SGD and ReduceLROnPlateau for 70 epochs. For the training parameters and dependencies, see TRAINING.md. Stopping training earlier sometimes produces a better Kaggle score.
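
To make the learning-rate schedule concrete, here is a simplified pure-Python sketch of the reduce-on-plateau rule. PyTorch provides this behaviour as `torch.optim.lr_scheduler.ReduceLROnPlateau`; the class `PlateauSchedule` below is a hypothetical stand-in that omits the threshold and cooldown options, and the `factor`, `patience`, and loss values are illustrative assumptions rather than the settings in TRAINING.md.

```python
class PlateauSchedule:
    """Simplified reduce-on-plateau rule for a metric that should decrease."""

    def __init__(self, lr, factor=0.1, patience=5, min_lr=0.0):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.min_lr = min_lr
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, metric):
        # Reset the counter on improvement, otherwise count a "bad" epoch.
        if metric < self.best:
            self.best = metric
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        # Shrink the learning rate once the metric has stalled for
        # more than `patience` consecutive epochs.
        if self.bad_epochs > self.patience:
            self.lr = max(self.lr * self.factor, self.min_lr)
            self.bad_epochs = 0
        return self.lr

# Made-up validation losses: improvement stalls at 0.7 for several epochs.
losses = [1.0, 0.8, 0.7, 0.7, 0.7, 0.7, 0.65]
schedule = PlateauSchedule(lr=0.01, factor=0.1, patience=2)
history = [schedule.step(loss) for loss in losses]
print(history)
```

In this run the learning rate stays at 0.01 until the third consecutive epoch without improvement, then drops by the factor of 10.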

| Model | CIFAR10 test set accuracy | Speech Commands test set accuracy | Speech Commands test set accuracy with crop | Speech Commands Kaggle private LB score | Speech Commands Kaggle private LB score with crop | Remarks |
|---|---|---|---|---|---|---|
| VGG19 BN | 93.56% | 97.337235% | 97.527432% | 0.87454 | 0.88030 | |
| ResNet32 | - | 96.181419% | 96.196050% | 0.87078 | 0.87419 | |
| WRN-28-10 | - | 97.937089% | 97.922458% | 0.88546 | 0.88699 | |
| WRN-28-10-dropout | 96.22% | 97.702999% | 97.717630% | 0.89580 | 0.89568 | |
| WRN-52-10 | - | 98.039503% | 97.980980% | 0.88159 | 0.88323 | another trained model has 97.52%/0.89322 |
| ResNext29 8x64 | - | 97.190929% | 97.161668% | 0.89533 | 0.89733 | our best model during competition |
| DPN92 | - | 97.190929% | 97.249451% | 0.89075 | 0.89286 | |
| DenseNet-BC (L=100, k=12) | 95.52% | 97.161668% | 97.147037% | 0.88946 | 0.89134 | |
| DenseNet-BC (L=190, k=40) | - | 97.117776% | 97.147037% | 0.89369 | 0.89521 | |

Results with Mixup

After the competition, some of the networks were retrained using mixup, as described in "mixup: Beyond Empirical Risk Minimization" by Hongyi Zhang, Moustapha Cisse, Yann N. Dauphin and David Lopez-Paz.
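
For readers unfamiliar with mixup, the following NumPy sketch shows the core idea from the paper: each training example is replaced by a convex combination of itself and another example from the batch, and the labels are blended with the same weight. The function name `mixup_batch`, the value `alpha=0.2`, and the 12-class one-hot labels are illustrative assumptions, not this repository's training code.

```python
import numpy as np

def mixup_batch(inputs, one_hot_labels, alpha=0.2, rng=None):
    """Blend each example with a randomly chosen partner from the same batch."""
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)          # mixing weight in [0, 1]
    perm = rng.permutation(len(inputs))   # partner index for every example
    mixed_inputs = lam * inputs + (1.0 - lam) * inputs[perm]
    mixed_labels = lam * one_hot_labels + (1.0 - lam) * one_hot_labels[perm]
    return mixed_inputs, mixed_labels, lam

rng = np.random.default_rng(0)
batch = rng.standard_normal((4, 1, 32, 32))   # four fake 1x32x32 spectrograms
labels = np.eye(12)[[0, 3, 3, 7]]             # one-hot labels, 12 classes assumed
mixed_x, mixed_y, lam = mixup_batch(batch, labels, alpha=0.2, rng=rng)
print(mixed_x.shape, mixed_y.shape)
```

Because the same weight is applied to inputs and labels, every row of the mixed labels still sums to 1, so it can be used as a soft target for a cross-entropy loss.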

| Model | CIFAR10 test set accuracy | Speech Commands test set accuracy | Speech Commands test set accuracy with crop | Speech Commands Kaggle private LB score | Speech Commands Kaggle private LB score with crop | Remarks |
|---|---|---|---|---|---|---|
| VGG19 BN | - | 97.483541% | 97.542063% | 0.89521 | 0.89839 | |
| WRN-52-10 | - | 97.454279% | 97.498171% | 0.90273 | 0.90355 | same score as the 16-th place in Kaggle |