Vaibhavs10 / ml-with-audio

Licence: other

HF's ML for Audio study group

Programming language: Jupyter Notebook

Labels: speech-synthesis, speech-recognition, huggingface

Hugging Face Machine Learning for Audio Study Group

Welcome to the ML for Audio Study Group. Through a series of presentations, paper readings, and discussions, we'll explore how Machine Learning is applied in the audio domain. Some examples of this are:

Generating synthetic sound from a given text (think of conversational assistants).
Transcribing audio signals to text.
Removing noise from an audio signal.
Separating different sources of audio.
Identifying which speaker is talking.
And much more!

We suggest you join the community Discord at http://hf.co/join/discord, and we look forward to meeting you in the #ml-4-audio-study-group channel 🤗. Remember, this is a community effort, so make this space your own!

Organisation

We'll kick off with some basics and then collaboratively decide the further direction of the group.

Before each session:

Read/watch related resources

During each session, you can

Ask questions in the forum
Give a short (~10-15 min) presentation on the topic (agree beforehand)

Before/after:

Keep discussing and asking questions about the topic (#ml-4-audio-study channel on Discord)
Share interesting resources

Schedule

Date	Topics	Resources (to read before)
Dec 14, 2021	Kickoff + Overview of audio-related use cases (video, questions)	The 3 DL Frameworks for e2e Speech Recognition that power your devices
Dec 21, 2021	Intro to Audio; Automatic Speech Recognition Deep Dive (video, questions)	Intro to Audio for FastAI Sections 1 and 2; Speech and Language Processing 26.1-26.5
Jan 4, 2022	Text to Speech Deep Dive (video, questions)	Intro to Audio & ASR Notebooks; Speech and Language Processing 26.6
Jan 18, 2022	pyctcdecode: A simple & fast STT prediction decoding algorithm (demo, slides, questions)	Beam search CTC decoding; pyctcdecode
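
As a warm-up for the CTC decoding session, here is a minimal sketch of greedy (best-path) CTC decoding in plain Python. This is not beam search and it does not use pyctcdecode: it is the simpler baseline that beam search decoders improve on. The function name `ctc_greedy_decode`, the toy vocabulary, and the frame scores are all made up for illustration.

```python
from itertools import groupby


def ctc_greedy_decode(frame_scores, vocab, blank_id=0):
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks."""
    # Pick the highest-scoring symbol id at every time step.
    best_ids = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # Collapse runs of the same id, then remove the blank symbol.
    collapsed = [symbol_id for symbol_id, _ in groupby(best_ids)]
    return "".join(vocab[i] for i in collapsed if i != blank_id)


# Hypothetical toy input: index 0 of the vocabulary is the CTC blank.
vocab = ["_", "h", "i"]
frame_scores = [
    [0.10, 0.80, 0.10],  # h
    [0.10, 0.70, 0.20],  # h (repeat, merged with the previous frame)
    [0.90, 0.05, 0.05],  # blank
    [0.10, 0.10, 0.80],  # i
    [0.20, 0.10, 0.70],  # i (repeat, merged with the previous frame)
]
print(ctc_greedy_decode(frame_scores, vocab))  # prints "hi"
```

Note that a blank between two identical symbols keeps them separate, which is how CTC can emit doubled letters. Beam search keeps several candidate prefixes per frame instead of a single argmax, which is what lets it combine acoustic scores with a language model.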

Supplementary Resources

For when you want to solidify a concept, or just want to go further down the speech processing rabbit hole.

General Resources

Slides from LSA352: Slides (no videos available)
Slides from CS224S (Latest): Slides (no videos available)
Speech & Language Processing Book (Chapters 25 & 26) - E-book

Research Papers

Speech Recognition Papers: GitHub repo
Speech Synthesis Papers: GitHub repo

Toolkits

SpeechBrain - GitHub repo
Toucan - GitHub repo
ESPnet - GitHub repo

Demos

Add interesting effects to your audio files - Hugging Face Spaces
Generate speech from text (TTS) - Hugging Face Spaces
Generate text from speech (ASR) - Hugging Face Spaces
