kuangdd / aukit

Licence: MIT license
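Audio toolkit: an easy-to-use speech processing toolbox with modules for speech denoising, audio format conversion, spectrogram feature generation, and more.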

aukit

audio toolkit: a toolbox for speech and spectrogram processing.

Installation


pip install -U aukit

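  • Notes

    • Dependencies that may need to be installed separately: tensorflow, pyaudio, sounddevice.

    • tensorflow<=1.13.1

    • pyaudio currently cannot be installed directly with pip on Python 3.7 and later. Download a whl file and install from it. Download location: https://www.lfd.uci.edu/~gohlke/pythonlibs/#pyaudio

    • sounddevice depends on pyaudio.

    • The default audio sample rate in aukit is 16 kHz.

Version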

v1.4.6

audio_cli

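Command-line tools for playing audio, removing background noise, and converting audio formats.

Supports recursive processing of all audio files in a folder.

Command line

Notes
  • The tools are controlled with positional arguments.

  • Argument names

    • inpath: input audio file path or directory.

    • outpath: output audio file path or directory. If it is a directory, the output subdirectories mirror the subdirectory layout of inpath.

    • sr: audio sample rate. Defaults to 16000 or to the automatically detected sample rate.

    • in_format: input audio format, mainly used to restrict processing to files with the given extension. If it is not set, every file in the directory is processed.

    • out_format: output audio format, mainly used for format conversion. It sets the extension of the output audio.

  • Arguments in square brackets [] are optional.

Tools

  • auplay: play audio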

auplay inpath [sr] [in_format]

  • aunoise: speech denoising

aunoise inpath outpath [in_format]

  • auformat: audio format conversion

auformat inpath outpath out_format [in_format]
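The path arguments above imply a simple mapping from input files to output files when inpath is a directory. The following is a minimal sketch of that mapping, not aukit's own code: the helper name plan_outputs and its exact behavior are assumptions made for illustration, using only the Python standard library.

```python
from pathlib import Path
from typing import Iterator, Optional, Tuple


def plan_outputs(inpath: str, outpath: str,
                 in_format: Optional[str] = None,
                 out_format: Optional[str] = None) -> Iterator[Tuple[Path, Path]]:
    """Yield (source, destination) pairs for a recursive batch job.

    Hypothetical helper for illustration only; it is not part of aukit.
    """
    src_root, dst_root = Path(inpath), Path(outpath)
    if src_root.is_file():
        # Single file: outpath is taken as the output file path.
        yield src_root, dst_root
        return
    for src in sorted(src_root.rglob("*")):
        if not src.is_file():
            continue
        # in_format restricts processing to files with that extension.
        if in_format and src.suffix.lower() != "." + in_format.lower().lstrip("."):
            continue
        # Keep the same relative subdirectory under the output root.
        dst = dst_root / src.relative_to(src_root)
        if out_format:
            # out_format only changes the extension of the output file.
            dst = dst.with_suffix("." + out_format.lstrip("."))
        yield src, dst
```

Under these assumptions, calling plan_outputs with in_format="wav" and out_format="mp3" maps a file at inpath/a/b/x.wav to outpath/a/b/x.mp3 and skips files with other extensions.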

audio_changer

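Voice changer: raise or lower the pitch, change the speaking rate, apply a "loli" (high-pitched girl) voice, add echo.

Voice changing is based on librosa.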
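Pitch and rate changes would typically go through librosa's librosa.effects.pitch_shift and librosa.effects.time_stretch. The echo effect is simple enough to sketch with NumPy alone. The function below is an illustrative assumption, not aukit's implementation: the name add_echo and the delay_s and decay parameters are hypothetical.

```python
import numpy as np


def add_echo(wav: np.ndarray, sr: int = 16000,
             delay_s: float = 0.25, decay: float = 0.5) -> np.ndarray:
    """Mix one delayed, attenuated copy of the signal into itself.

    Hypothetical helper for illustration; not aukit's API.
    """
    delay = int(round(delay_s * sr))          # echo delay in samples
    out = np.zeros(len(wav) + delay, dtype=np.float64)
    out[:len(wav)] += wav                     # the dry signal
    out[delay:] += decay * np.asarray(wav)    # the delayed echo
    return out
```

The output is longer than the input by the delay so that the tail of the echo is not cut off.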

audio_editor

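Speech editing: split audio, remove long silences inside speech, trim leading and trailing silence, set the sample rate, set the number of channels.

Convert between audio formats, for example from wav to mp3.

Splitting audio, removing silence, and trimming leading and trailing silence all support wav for both input and output.

The speech editing functions build on pydub's methods and add support for more data formats.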
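To make the idea of trimming leading and trailing silence concrete, here is a minimal NumPy sketch. It is not the pydub-based code that aukit uses: the function name trim_silence, the frame length, and the -40 dB threshold are all assumptions chosen for illustration, and the input is assumed to be a mono float array scaled to [-1, 1].

```python
import numpy as np


def trim_silence(wav: np.ndarray, sr: int = 16000,
                 frame_ms: int = 20, threshold_db: float = -40.0) -> np.ndarray:
    """Cut everything before the first and after the last non-silent frame.

    Hypothetical helper for illustration; not aukit's API.
    """
    frame = max(1, int(sr * frame_ms / 1000))
    n_frames = len(wav) // frame
    if n_frames == 0:
        return wav
    frames = np.asarray(wav[:n_frames * frame], dtype=np.float64)
    frames = frames.reshape(n_frames, frame)
    # RMS level of each frame in dB relative to full scale (1.0).
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    level_db = 20 * np.log10(np.maximum(rms, 1e-10))
    voiced = np.flatnonzero(level_db > threshold_db)
    if voiced.size == 0:
        return wav[:0]                        # the whole clip is silent
    start = voiced[0] * frame
    # Keep the partial tail frame if the last full frame is voiced.
    end = len(wav) if voiced[-1] == n_frames - 1 else (voiced[-1] + 1) * frame
    return wav[start:end]
```

Silence inside the speech is left untouched. Only the two ends are cut.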

audio_griffinlim

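Griffin-Lim vocoder: linear spectrogram to speech, mel spectrogram to speech, a TensorFlow version of spectrogram-to-speech, and conversion between mel and linear spectrograms.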
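The Griffin-Lim algorithm recovers a waveform from a magnitude spectrogram by alternating between the time domain and the STFT domain, keeping the known magnitudes and re-estimating the phase each round. The following is a minimal NumPy-only sketch of the textbook iteration, not aukit's implementation: the function names, the Hann window, and the n_fft and hop defaults are assumptions.

```python
import numpy as np


def stft(x, n_fft=512, hop=128):
    """Complex STFT with a Hann window, shape (n_frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    x = np.pad(x, n_fft // 2, mode="reflect")
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)


def istft(spec, n_fft=512, hop=128):
    """Overlap-add inverse of stft, normalized by the summed squared window."""
    window = np.hanning(n_fft)
    n_frames = spec.shape[0]
    length = n_fft + hop * (n_frames - 1)
    out = np.zeros(length)
    norm = np.zeros(length)
    frames = np.fft.irfft(spec, n=n_fft, axis=1)
    for i in range(n_frames):
        out[i * hop:i * hop + n_fft] += frames[i] * window
        norm[i * hop:i * hop + n_fft] += window ** 2
    out /= np.maximum(norm, 1e-8)
    return out[n_fft // 2:length - n_fft // 2]   # drop the reflect padding


def griffin_lim(magnitude, n_iter=32, n_fft=512, hop=128, seed=0):
    """Estimate a waveform whose STFT magnitude approximates `magnitude`."""
    rng = np.random.default_rng(seed)
    # Start from random phase on the unit circle.
    angles = np.exp(2j * np.pi * rng.random(magnitude.shape))
    for _ in range(n_iter):
        wav = istft(magnitude * angles, n_fft, hop)
        rebuilt = stft(wav, n_fft, hop)
        # Keep only the phase of the re-analyzed signal.
        angles = rebuilt / np.maximum(np.abs(rebuilt), 1e-8)
    return istft(magnitude * angles, n_fft, hop)
```

The input here is a linear magnitude spectrogram in the layout produced by the stft function above. A mel spectrogram would first have to be mapped back to a linear one, which this sketch does not cover.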

audio_io

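Speech I/O: save and load speech in wav and mp3 formats, convert between speech representations (np.array, bytes, io.BytesIO), and a dictionary that supports the dot (.) operator.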
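The conversions between np.array, bytes, and io.BytesIO can be sketched with the standard wave module. This is an illustration under stated assumptions, not aukit's API: the names DotDict, array_to_wav_bytes, and wav_bytes_to_array are hypothetical, and the sketch handles only 16-bit mono WAV data with float samples in [-1, 1].

```python
import io
import wave

import numpy as np


class DotDict(dict):
    """Dictionary whose keys can also be read and written as attributes."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self[name] = value


def array_to_wav_bytes(wav: np.ndarray, sr: int = 16000) -> bytes:
    """Encode a float array in [-1, 1] as 16-bit mono WAV bytes."""
    pcm = (np.clip(wav, -1.0, 1.0) * 32767).astype("<i2")
    buf = io.BytesIO()                        # in-memory file object
    with wave.open(buf, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(sr)
        f.writeframes(pcm.tobytes())
    return buf.getvalue()


def wav_bytes_to_array(data: bytes):
    """Decode 16-bit mono WAV bytes into (float32 array, sample rate)."""
    with wave.open(io.BytesIO(data), "rb") as f:
        if f.getnchannels() != 1 or f.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit mono WAV")
        sr = f.getframerate()
        pcm = np.frombuffer(f.readframes(f.getnframes()), dtype="<i2")
    return pcm.astype(np.float32) / 32767.0, sr
```

A DotDict built as DotDict(sr=16000) can be read as either d["sr"] or d.sr, which is convenient for hyperparameter sets.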

audio_noise_remover

Speech denoising: reduces environmental noise.

audio_normalizer

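Speech normalization: remove low-volume audio segments (silence removal) and adjust the volume.

The normalization method is based on VAD (voice activity detection).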
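The two steps, dropping low-volume segments and adjusting the volume, can be sketched as follows. The source does not say which VAD aukit relies on, so this example substitutes a crude frame-energy threshold for a real VAD. The function names and the default values are assumptions, and the input is assumed to be a mono float array in [-1, 1].

```python
import numpy as np


def remove_quiet_frames(wav: np.ndarray, sr: int = 16000,
                        frame_ms: int = 20,
                        threshold_db: float = -40.0) -> np.ndarray:
    """Keep only the frames whose RMS level exceeds threshold_db.

    An energy threshold stands in for a real VAD here. A trailing
    partial frame, if any, is discarded.
    """
    frame = max(1, int(sr * frame_ms / 1000))
    n_frames = len(wav) // frame
    frames = np.asarray(wav[:n_frames * frame], dtype=np.float64)
    frames = frames.reshape(n_frames, frame)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    level_db = 20 * np.log10(np.maximum(rms, 1e-10))
    return frames[level_db > threshold_db].reshape(-1)


def set_peak_level(wav: np.ndarray, target_db: float = -3.0) -> np.ndarray:
    """Scale the signal so its peak sits at target_db relative to 1.0."""
    wav = np.asarray(wav, dtype=np.float64)
    peak = np.max(np.abs(wav)) if wav.size else 0.0
    if peak == 0.0:
        return wav.copy()                     # nothing to scale
    return wav * (10.0 ** (target_db / 20.0) / peak)
```

Running remove_quiet_frames first and set_peak_level second keeps silent stretches from influencing the gain.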

audio_player

Speech playback: play a file given its name, play wave data, play bytes data.

audio_spectrogram

Speech spectrograms: speech to linear spectrogram, speech to mel spectrogram.
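As a rough illustration of the two outputs, the sketch below computes a linear magnitude spectrogram with a windowed FFT and then projects it onto triangular mel filters. It is not aukit's implementation: the function names, the HTK-style mel formula, and the n_fft, hop, and n_mels defaults are assumptions.

```python
import numpy as np


def linear_spectrogram(wav, n_fft=512, hop=128):
    """Magnitude STFT, shape (n_frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    padded = np.pad(wav, n_fft // 2, mode="reflect")
    n_frames = 1 + (len(padded) - n_fft) // hop
    frames = np.stack([padded[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))


def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + np.asarray(hz) / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel) / 2595.0) - 1.0)


def mel_filterbank(sr=16000, n_fft=512, n_mels=40):
    """Triangular filters spaced evenly on the mel scale.

    Returns an array of shape (n_mels, n_fft // 2 + 1).
    """
    fft_freqs = np.linspace(0, sr / 2, n_fft // 2 + 1)
    edges = mel_to_hz(np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_mels + 2))
    bank = np.zeros((n_mels, len(fft_freqs)))
    for m in range(n_mels):
        lo, mid, hi = edges[m], edges[m + 1], edges[m + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        bank[m] = np.maximum(0.0, np.minimum(rising, falling))
    return bank


def mel_spectrogram(wav, sr=16000, n_fft=512, hop=128, n_mels=40):
    """Linear magnitudes projected onto the mel filters, shape (n_frames, n_mels)."""
    return linear_spectrogram(wav, n_fft, hop) @ mel_filterbank(sr, n_fft, n_mels).T
```

Both functions return magnitudes. Taking a logarithm or converting to decibels, which is common before feeding a model, is left out of the sketch.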

audio_tuner

Speech tuning: adjust the speaking rate and the pitch.

audio_world

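WORLD vocoder: extract the fundamental frequency, spectral envelope, and aperiodicity of speech, and convert the spectrum back to speech. Adjust the pitch or produce a robot voice.

Release history

v1.4.6

  • Speech tools that accompany ttskit.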