Xr3player: 🎧 🎼 Advanced JavaFX Media Player
Stars: ★ 472 (+972.73%)
Julius: Open-Source Large Vocabulary Continuous Speech Recognition Engine
Stars: ★ 1,258 (+2759.09%)
audio noise clustering: https://dodiku.github.io/audio_noise_clustering/results/ ==> An experiment with a variety of clustering (and clustering-like) techniques to reduce noise on an audio speech recording.
Stars: ★ 24 (-45.45%)
Praat: Doing Phonetics By Computer
Stars: ★ 675 (+1434.09%)
Java Speech Api: The J.A.R.V.I.S. Speech API is designed to be simple and efficient, using the speech engines created by Google to provide functionality for parts of the API. Essentially, it is an API written in Java, including a recognizer, synthesizer, and a microphone capture utility. The project uses Google services for the synthesizer and recognizer. While this requires an Internet connection, it provides a complete, modern, and fully functional speech API in Java.
Stars: ★ 490 (+1013.64%)
Nnaudio: Audio processing using PyTorch 1D convolution networks
Stars: ★ 428 (+872.73%)
Auto-Editor: Effort free video editing!
Stars: ★ 382 (+768.18%)
Arcan: [Display Server, Multimedia Framework, Game Engine] -> "Desktop Engine"
Stars: ★ 885 (+1911.36%)
Beethoven: 🎸 A maestro of pitch detection.
Stars: ★ 601 (+1265.91%)
Twilio Java: A Java library for communicating with the Twilio REST API and generating TwiML.
Stars: ★ 371 (+743.18%)
Wave U Net: Implementation of the Wave-U-Net for audio source separation
Stars: ★ 506 (+1050%)
Annyang: 💬 Speech recognition for your site
Stars: ★ 6,216 (+14027.27%)
Q: C++ Library for Audio Digital Signal Processing
Stars: ★ 481 (+993.18%)
Specaugment: An implementation of SpecAugment with TensorFlow & PyTorch, introduced by Google Brain
Stars: ★ 408 (+827.27%)
Speech Emotion Analyzer: The neural network model is capable of detecting five different male/female emotions from audio speech. (Deep Learning, NLP, Python)
Stars: ★ 633 (+1338.64%)
Musig: A Shazam-like tool to store song fingerprints and retrieve them
Stars: ★ 388 (+781.82%)
Kfr: Fast, modern C++ DSP framework, FFT, Sample Rate Conversion, FIR/IIR/Biquad Filters (SSE, AVX, AVX-512, ARM NEON)
Stars: ★ 985 (+2138.64%)
Audio Visualizer Android: 🎵 [Android Library] A light-weight and easy-to-use Audio Visualizer for Android.
Stars: ★ 581 (+1220.45%)
Voice Builder: An open-source text-to-speech (TTS) voice building tool
Stars: ★ 362 (+722.73%)
Aaxaudioconverter: Convert Audible aax files to mp3 and m4a/m4b
Stars: ★ 336 (+663.64%)
Tts: 🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
Stars: ★ 5,427 (+12234.09%)
Baresip: A modular SIP User-Agent with audio and video support
Stars: ★ 817 (+1756.82%)
Soundfingerprinting: Open source audio fingerprinting in .NET. An efficient algorithm for acoustic fingerprinting written purely in C#.
Stars: ★ 554 (+1159.09%)
Eqmac: macOS System-wide Audio Equalizer & Volume Mixer 🎧
Stars: ★ 3,947 (+8870.45%)
Audino: Open source audio annotation tool for humans™
Stars: ★ 740 (+1581.82%)
Tacotron: Audio samples accompanying publications related to Tacotron, an end-to-end speech synthesis model.
Stars: ★ 493 (+1020.45%)
LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search
Stars: ★ 31 (-29.55%)
Ffmediaelement: FFME, the Advanced WPF MediaElement (based on FFmpeg)
Stars: ★ 733 (+1565.91%)
Vectorhub: Vector Hub, a library for easy discovery and consumption of state-of-the-art models to turn data into vectors. (text2vec, image2vec, video2vec, graph2vec, bert, inception, etc)
Stars: ★ 317 (+620.45%)
Cboard: AAC communication system with text-to-speech for the browser
Stars: ★ 437 (+893.18%)
Discordspeechbot: A speech-to-text bot for Discord with music commands and more, built with Node.js. Ideal for controlling your Discord server using voice commands; it can also be useful for hearing-impaired people.
Stars: ★ 35 (-20.45%)
Mobly: E2E test framework for tests with complex environment requirements.
Stars: ★ 424 (+863.64%)
Segan: Speech Enhancement Generative Adversarial Network in TensorFlow
Stars: ★ 661 (+1402.27%)
Neural sp: End-to-end ASR/LM implementation with PyTorch
Stars: ★ 408 (+827.27%)
Giada: Your Hardcore Loop Machine.
Stars: ★ 903 (+1952.27%)
Awesome Kaldi: A list of features, scripts, blogs and resources for making better use of Kaldi (http://kaldi-asr.org/)
Stars: ★ 393 (+793.18%)
Vad: Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.
Stars: ★ 622 (+1313.64%)
Dialectid e2e: End to End Dialect Identification using Convolutional Neural Network
Stars: ★ 40 (-9.09%)
Tts: 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Stars: ★ 305 (+593.18%)
Inaspeechsegmenter: CNN-based audio segmentation toolkit. Detects speech, music and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.
Stars: ★ 352 (+700%)
Mlt: MLT Multimedia Framework
Stars: ★ 836 (+1800%)
Dplug: Audio plugin framework. VST2/VST3/AU/AAX/LV2 for Linux/macOS/Windows.
Stars: ★ 341 (+675%)
Klio: Smarter data pipelines for audio.
Stars: ★ 560 (+1172.73%)
Ios 10 Sampler: Code examples for new APIs of iOS 10.
Stars: ★ 3,341 (+7493.18%)
Wsay: Windows "say"
Stars: ★ 36 (-18.18%)
Surfboard: Novoic's audio feature extraction library
Stars: ★ 318 (+622.73%)
Chromaprint: C library for generating audio fingerprints used by AcoustID
Stars: ★ 553 (+1156.82%)
Sincnet: SincNet is a neural architecture for efficiently processing raw audio samples.
Stars: ★ 764 (+1636.36%)
Android Speech: Android speech recognition and text to speech made easy
Stars: ★ 310 (+604.55%)
Nodejs Speech: Node.js client for Google Cloud Speech: Speech to text conversion powered by machine learning.
Stars: ★ 545 (+1138.64%)
Dali: A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
Stars: ★ 3,624 (+8136.36%)
CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages
Stars: ★ 302 (+586.36%)
Twilio Csharp: Twilio C#/.NET Helper Library for .NET Framework 3.5+ and supported .NET Core versions
Stars: ★ 541 (+1129.55%)
Dc tts: A TensorFlow Implementation of DC-TTS: yet another text-to-speech model
Stars: ★ 1,017 (+2211.36%)
Urban Sound Classification: Urban sound source tagging from an aggregation of four-second noisy audio clips via 1D and 2D CNN (Xception)
Stars: ★ 39 (-11.36%)