Recurrent Neural Network and Long Short-Term Memory (LSTM) with Connectionist Temporal Classification, implemented in Theano. Includes a toy training example.

Stars: ✭ 220 (-82.51%)

Mutual labels: speech-recognition

Voc

A physical model of the human vocal tract using literate programming, based on Pink Trombone.

Stars: ✭ 129 (-89.75%)

Mutual labels: speech

CCAligner

🔮 Word by word audio subtitle synchronisation tool and API. Developed under GSoC 2017 with CCExtractor.

Stars: ✭ 131 (-89.59%)

Mutual labels: speech-recognition

Dragonfly

Speech recognition framework allowing powerful Python-based scripting and extension of Dragon NaturallySpeaking (DNS), Windows Speech Recognition (WSR), Kaldi and CMU Pocket Sphinx

Stars: ✭ 209 (-83.39%)

Mutual labels: speech-recognition

Code Switching Papers

A curated list of research papers and resources on code-switching

Stars: ✭ 122 (-90.3%)

Mutual labels: speech

gtranscribe

Software for interview transcription

Stars: ✭ 12 (-99.05%)

Mutual labels: speech

Speech And Text Unity Ios Android

Speech to text in Unity on iOS using native Speech Recognition

Stars: ✭ 117 (-90.7%)

Mutual labels: speech

Inaspeechsegmenter

CNN-based audio segmentation toolkit. Detects speech, music and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.

Stars: ✭ 352 (-72.02%)

Mutual labels: speech

Durian

Implementation of "Duration Informed Attention Network for Multimodal Synthesis" (https://arxiv.org/pdf/1909.01700.pdf) paper.

Stars: ✭ 111 (-91.18%)

Mutual labels: speech

linear16

Converts an audio file to LINEAR16 Google-speech compatible file.

Stars: ✭ 14 (-98.89%)

Mutual labels: speech

Gtts

Python library and CLI tool to interface with Google Translate's text-to-speech API

Stars: ✭ 1,303 (+3.58%)

Mutual labels: speech

Arcan

Arcan - [Display Server, Multimedia Framework, Game Engine] -> "Desktop Engine"

Stars: ✭ 885 (-29.65%)

Mutual labels: audio-processing

Audio

Data manipulation and transformation for audio signal processing, powered by PyTorch

Stars: ✭ 1,262 (+0.32%)

Mutual labels: speech

speech-transformer

Transformer implementation specialized in speech recognition tasks using PyTorch.

Stars: ✭ 40 (-96.82%)

Mutual labels: speech

Labelimg

🖍️ LabelImg is a graphical image annotation tool for labeling object bounding boxes in images

Stars: ✭ 16,088 (+1178.86%)

Mutual labels: recognition

All Contributors Cli

Tool to help automate adding contributor acknowledgements according to the all-contributors specification ✨

Stars: ✭ 345 (-72.58%)

Mutual labels: recognition

Idcardrecognition

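🇨🇳 Recognition of second-generation mainland China ID cards 🆔: automatically reads the information on the card (name, gender, ethnicity, address, ID number) and crops the ID photo. iOS developer discussion groups: group 1: 446310206, group 2: 426087546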

Stars: ✭ 191 (-84.82%)

Mutual labels: recognition

farm-animal-tracking

Farm Animal Tracking (FAT)

Stars: ✭ 19 (-98.49%)

Mutual labels: recognition

Deep Text Recognition Benchmark

Text recognition (optical character recognition) with deep learning methods.

Stars: ✭ 2,665 (+111.84%)

Mutual labels: recognition

Ccpd

[ECCV 2018] CCPD: a diverse and well-annotated dataset for license plate detection and recognition

Stars: ✭ 1,252 (-0.48%)

Mutual labels: recognition

Text Detector

A tool that lets you detect and translate text.

Stars: ✭ 173 (-86.25%)

Mutual labels: recognition

TensorFlow-Powered Robot Vision

No description or website provided.

Stars: ✭ 34 (-97.3%)

Mutual labels: recognition

Lc Finder

An image annotation and object detection tool written in C

Stars: ✭ 163 (-87.04%)

Mutual labels: recognition

Dplug

Audio plugin framework. VST2/VST3/AU/AAX/LV2 for Linux/macOS/Windows.

Stars: ✭ 341 (-72.89%)

Mutual labels: audio-processing

Faceid

An implementation of YOLO v2 for direct facial recognition within detection layer.

Stars: ✭ 144 (-88.55%)

Mutual labels: recognition

Vst3HostDemo

Stars: ✭ 16 (-98.73%)

Mutual labels: audio-processing

Yolo Powered robot vision

Stars: ✭ 133 (-89.43%)

Mutual labels: recognition

Speechpy

💬 SpeechPy - A Library for Speech Processing and Recognition: http://speechpy.readthedocs.io/en/latest/

Stars: ✭ 833 (-33.78%)

Mutual labels: speech-recognition

Labelbox

Labelbox is the fastest way to annotate data to build and ship computer vision applications.

Stars: ✭ 1,588 (+26.23%)

Mutual labels: recognition

JD-NMF

Joint Dictionary Learning-based Non-Negative Matrix Factorization for Voice Conversion (TBME 2016)

Stars: ✭ 20 (-98.41%)

Mutual labels: speech

Dcnets

Implementation of "Decoupled Networks" (CVPR'18).

Stars: ✭ 115 (-90.86%)

Mutual labels: recognition

Php Opencv Examples

Tutorial for computer vision and machine learning in PHP 7/8 using OpenCV (installation + examples + documentation)

Stars: ✭ 333 (-73.53%)

Mutual labels: recognition

Facial Expression Recognition

💡 My solution to the Facial Emotion Recognition Kaggle competition

Stars: ✭ 88 (-93%)

Mutual labels: recognition

Deep-learning-And-Paper

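[For exchange and study purposes only] Machine intelligence: related books and classic papers, including experiment code for AutoML, sentiment classification, speech recognition, voiceprint recognition, speech synthesis, and more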

Stars: ✭ 62 (-95.07%)

Mutual labels: speech-recognition

R8brain Free Src

High-quality pro audio sample rate converter / resampler C++ library

Stars: ✭ 238 (-81.08%)

Mutual labels: audio-processing

Sound Source Localization Algorithm doa estimation

Some traditional algorithms used for sound source localization (DOA estimation) of speech signals

Stars: ✭ 58 (-95.39%)

Mutual labels: speech

Otto

Sampler, Sequencer, Multi-engine synth and effects - in a box! [WIP]

Stars: ✭ 2,390 (+89.98%)

Mutual labels: audio-processing

openface mass compare

An openface script that runs a REST server. Posted images are compared against a large dataset, and the most likely match is returned. Works with https://hub.docker.com/r/uoacer/openface-mass-compare/

Stars: ✭ 22 (-98.25%)

Mutual labels: recognition

Ios 10 Sampler

Code examples for new APIs of iOS 10.

Stars: ✭ 3,341 (+165.58%)

Mutual labels: speech

deep avsr

A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper.

Stars: ✭ 104 (-91.73%)

Mutual labels: speech-recognition

Crnn chinese characters rec

(CRNN) Chinese Characters Recognition.