# speech-processing

Here are 342 public repositories matching this topic...

A PyTorch-based Speech Toolkit
Topics: audio, transformers, pytorch, voice-recognition, speech-recognition, speech-to-text, language-model, speaker-recognition, speaker-verification, speech-processing, audio-processing, asr, speaker-diarization, speechrecognition, speech-separation, speech-enhancement, spoken-language-understanding, huggingface, speech-toolkit, speechbrain
Updated Nov 11, 2021 - Python
Reading list for research topics in multimodal machine learning
Topics: machine-learning, natural-language-processing, reinforcement-learning, computer-vision, deep-learning, robotics, healthcare, reading-list, representation-learning, speech-processing, multimodal-learning
Updated Nov 5, 2021
WaveNet vocoder
Updated Nov 2, 2020 - Python
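WaveNet-style vocoders conventionally quantize the waveform with mu-law companding before autoregressive modeling. As a minimal numpy sketch of that encode/decode step (mu=255 is the standard choice for 8-bit audio; this is an illustration, not code from the repo above):

```python
import numpy as np

def mu_law_encode(x, mu=255):
    """Compress a waveform in [-1, 1] with mu-law, then quantize to mu+1 levels."""
    x = np.clip(x, -1.0, 1.0)
    compressed = np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)
    # Map [-1, 1] onto the integer bins {0, ..., mu}
    return ((compressed + 1) / 2 * mu + 0.5).astype(np.int64)

def mu_law_decode(q, mu=255):
    """Invert the quantization and the companding."""
    c = 2 * (q.astype(np.float64) / mu) - 1
    return np.sign(c) * ((1 + mu) ** np.abs(c) - 1) / mu
```

Mu-law spends quantization levels roughly logarithmically, so quiet samples keep more resolution than a uniform 8-bit quantizer would give them.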
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Topics: tutorial, detection, extraction, citation, pytorch, pretrained-models, speaker-recognition, speaker-verification, speech-processing, speaker-diarization, voice-activity-detection, speech-activity-detection, speaker-change-detection, speaker-embedding, pyannote-audio, overlapped-speech-detection, speaker-diarization-pipeline
Updated Nov 9, 2021 - Python
SincNet is a neural architecture for efficiently processing raw audio samples.
Topics: audio, python, deep-learning, signal-processing, waveform, cnn, pytorch, artificial-intelligence, speech-recognition, neural-networks, convolutional-neural-networks, digital-signal-processing, filtering, speaker-recognition, speaker-verification, speech-processing, audio-processing, asr, timit, speaker-identification
Updated Apr 28, 2021 - Python
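SincNet's first convolutional layer constrains each kernel to a parametrized band-pass filter built as the difference of two windowed sinc low-pass filters, so only the two cutoff frequencies are learned per kernel. A rough numpy sketch of that kernel construction (cutoffs, kernel size, and window choice here are illustrative, not the project's defaults):

```python
import numpy as np

def sinc_bandpass_kernel(f1, f2, kernel_size=101, fs=16000):
    """Band-pass FIR kernel: difference of two windowed low-pass sinc filters.
    In SincNet, f1 and f2 are the only learned parameters of each kernel."""
    t = (np.arange(kernel_size) - (kernel_size - 1) / 2) / fs  # time axis, seconds
    low = (2 * f1 / fs) * np.sinc(2 * f1 * t)    # low-pass with cutoff f1
    high = (2 * f2 / fs) * np.sinc(2 * f2 * t)   # low-pass with cutoff f2
    return (high - low) * np.hamming(kernel_size)  # window to reduce ripple
```

Compared with a free-form first convolution layer, this leaves far fewer parameters to learn from raw waveforms and yields filters that are interpretable as frequency bands.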
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
Topics: machine-learning, awesome, deep-learning, speech-recognition, awesome-list, speech-processing, speaker-diarization
Updated Sep 27, 2021
manrajgrover commented Jul 16, 2020:
Currently, the API manually constructs its own messages and errors. We should move them to werkzeug exceptions.
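As a rough sketch of what that change could look like (the endpoint and data below are made up for illustration): raising werkzeug's `HTTPException` subclasses lets the framework render consistent HTTP error responses instead of hand-built message dicts and status codes.

```python
# Hypothetical endpoint logic: replace manual error construction with
# werkzeug's exception hierarchy.
from werkzeug.exceptions import BadRequest, NotFound

def get_user(users, user_id):
    # Raise the matching werkzeug exception instead of assembling an
    # error payload and status code by hand; the WSGI layer turns these
    # into proper 400/404 responses.
    if not isinstance(user_id, int):
        raise BadRequest("user_id must be an integer")
    if user_id not in users:
        raise NotFound(f"no user with id {user_id}")
    return users[user_id]
```

Each exception class carries its status code (`BadRequest.code == 400`, `NotFound.code == 404`), so error semantics live in one place rather than being repeated at every call site.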
A neural network for end-to-end speech denoising
Topics: machine-learning, deep-learning, end-to-end, speech, neural-networks, wavenet, speech-processing, speech-denoising
Updated Jul 24, 2019 - Python
Speech recognition toolkit for Arduino
Updated May 5, 2021 - C++
A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world's resources for speech enhancement and make them universally accessible and useful.
Topics: deep-neural-networks, signal-processing, machine-learning-algorithms, speech-processing, speech-enhancement
Updated Dec 1, 2020 - MATLAB
Problem Agnostic Speech Encoder
Topics: deep-learning, pytorch, unsupervised-learning, speech-processing, multi-task-learning, waveform-analysis, self-supervised-learning
Updated May 20, 2020 - Python
Novoic's audio feature extraction library
Topics: audio, python, machine-learning, statistics, signal-processing, waveform, healthcare, feature-extraction, dimension, speech-processing, audio-processing, docstrings, alzheimers-disease, parkinsons-disease
Updated Oct 19, 2020 - Python
Library to build speech synthesis systems designed for easy and fast prototyping.
Updated Aug 11, 2021 - Python

A Python wrapper for the Speech Signal Processing Toolkit (SPTK).
Updated May 22, 2021 - Python
This repository contains an implementation of "Neural Voice Cloning With Few Samples"
Topics: deep-learning, voice, tts, speech-processing, voice-synthesis, saidl, speaker-adaptation, voice-cloning, speaker-encodings, mel-spectogram
Updated Feb 23, 2021 - Python
The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain, users can easily create speech processing systems, including speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many others.
Topics: deep-learning, neural-network, speech, speech-recognition, neural-networks, deeplearning, speech-to-text, speaker-recognition, speaker-verification, speech-processing, speech-recognizer, beamforming, speech-analysis, timit, speechrecognition, speech-api, speech-separation, librispeech, speech-emotion-recognition, speaker-identification
Updated Nov 3, 2021 - HTML
TensorFlow 2.x implementation of the DTLN real-time speech denoising model, with TF-Lite, ONNX, and real-time audio processing support.
Topics: audio, raspberry-pi, deep-learning, tensorflow, keras, speech-processing, dns-challenge, noise-reduction, audio-processing, real-time-audio, speech-enhancement, speech-denoising, onnx, tf-lite, noise-suppression, dtln-model
Updated Nov 5, 2020 - Python
Real-time GCC-NMF Blind Speech Separation and Enhancement
Topics: machine-learning, real-time, gcc, speech, ipython-notebook, low-latency, dictionary-learning, speaker, speech-processing, cross-correlation, nmf, real-time-processing, unsupervised-machine-learning, speech-separation, speech-enhancement, gcc-nmf, generalized-cross-correlation, tdoa
Updated Apr 8, 2019 - Python
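The "GCC" above is the generalized cross-correlation; with PHAT weighting, it keeps only the phase of the cross-spectrum, which sharpens the delay peak used for TDOA estimation. A minimal numpy sketch of GCC-PHAT delay estimation between two channels (an illustration of the technique, not code from this repo):

```python
import numpy as np

def gcc_phat(x, y, fs=1):
    """Estimate the delay of y relative to x (in samples / fs) via GCC-PHAT."""
    n = len(x) + len(y)                       # zero-pad against circular wrap
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    R = Y * np.conj(X)                        # cross-spectrum
    R /= np.abs(R) + 1e-12                    # PHAT: keep phase, drop magnitude
    cc = np.fft.irfft(R, n=n)
    max_shift = n // 2
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))  # lags -m..+m
    return (np.argmax(np.abs(cc)) - max_shift) / fs
```

Because the magnitude is whitened away, the peak location depends only on inter-channel phase, which makes the estimate robust to the source's spectral shape.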
Implementation of the "Neural Voice Cloning with Few Samples" research paper by Baidu
Topics: speech, speech-synthesis, encodings, speech-processing, speaker-embeddings, mel-spectrogram, voice-cloning, speaker-encodings
Updated Feb 23, 2021 - Python
This repo summarizes the tutorials, datasets, papers, code, and tools for the speech separation and speaker extraction tasks. Pull requests are welcome.
Topics: deep-neural-networks, deep-learning, signal-processing, speech-processing, speech-analysis, speech-separation
Updated Jan 9, 2021 - MATLAB
PyTorch implementation of VQ-VAE + WaveNet by [Chorowski et al., 2019] and VQ-VAE on speech signals by [van den Oord et al., 2017]
Updated Aug 13, 2019 - Python
VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network
Updated Oct 8, 2021 - Python
Tracking the progress in non-autoregressive generation (translation, transcription, etc.)
Topics: natural-language-processing, machine-translation, artificial-intelligence, speech-recognition, natural-language-generation, speech-processing
Updated Oct 29, 2021
A React-Native Bridge for the Google Dialogflow (API.AI) SDK
Topics: google, react-native, voice, speech, text-recognition, apiai, api-ai, speech-processing, speak, speech-to-function, dialogflow
Updated Jun 3, 2021 - JavaScript
Deep neural network based speech enhancement toolkit
Updated Jun 14, 2019 - MATLAB
Front-end speech processing aims at extracting proper features from short-term segments of a speech utterance, known as frames. It is a prerequisite step toward any pattern recognition problem employing speech or audio (e.g., music). Here, we are interested in voice disorder classification; that is, developing two-class classifiers that can discriminate between utterances of a subject suffering from, say, vocal fold paralysis and utterances of a healthy subject.

The mathematical modeling of the speech production system in humans suggests that an all-pole system function is justified [1-3]. As a consequence, linear prediction coefficients (LPCs) constitute a first choice for modeling the magnitude of the short-term spectrum of speech. LPC-derived cepstral coefficients are guaranteed to discriminate between the system (e.g., vocal tract) contribution and that of the excitation. Taking into account the characteristics of the human ear, the mel-frequency cepstral coefficients (MFCCs) emerged as descriptive features of the speech spectral envelope. Similarly to MFCCs, the perceptual linear prediction coefficients (PLPs) can also be derived.

The aforementioned traditional features will be tested against agnostic features extracted by convolutional neural networks (CNNs) (e.g., auto-encoders) [4]. The pattern recognition step will be based on Gaussian mixture model based classifiers, K-nearest neighbor classifiers, Bayes classifiers, as well as deep neural networks. The Massachusetts Eye and Ear Infirmary Dataset (MEEI-Dataset) [5] will be exploited. At the application level, a library for feature extraction and classification in Python will be developed. Credible publicly available resources, such as KALDI, will be used toward achieving our goal. Comparisons will be made against [6-8].

Topics: nlp, classifier, natural-language-processing, feature-extraction, nltk, gaussian-mixture-models, support-vector-machines, mfcc, principal-component-analysis, speech-processing, linear-discriminant-analysis, isomap, spectral-clustering, long-short-term-memory, kernel-pca, spectral-embedding, locally-linear-embedding, linear-prediction-coefficients, speech-utterance
Updated Jul 15, 2020 - Python
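The MFCC pipeline the description refers to (power spectrum, then a mel filterbank, then a log, then a DCT) can be sketched for a single frame in plain numpy. The 512-sample frame, 26-filter, 13-coefficient choices below are conventional defaults, not values taken from this project:

```python
import numpy as np

def mfcc(signal, fs, n_fft=512, n_mels=26, n_ceps=13):
    """Minimal single-frame MFCC: power spectrum -> mel filterbank -> log -> DCT-II."""
    # Power spectrum of one Hamming-windowed frame
    frame = signal[:n_fft] * np.hamming(n_fft)
    power = np.abs(np.fft.rfft(frame)) ** 2

    # Triangular mel filterbank between 0 Hz and the Nyquist frequency
    def hz_to_mel(f):
        return 2595.0 * np.log10(1.0 + f / 700.0)

    def mel_to_hz(m):
        return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(fs / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / fs).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        fbank[i, l:c] = (np.arange(l, c) - l) / max(c - l, 1)  # rising edge
        fbank[i, c:r] = (r - np.arange(c, r)) / max(r - c, 1)  # falling edge

    log_mel = np.log(fbank @ power + 1e-10)

    # DCT-II of the log mel energies, keeping the first n_ceps coefficients
    k = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * k + 1) / (2 * n_mels))
    return dct @ log_mel
```

The final DCT decorrelates the log filterbank energies, which is what makes MFCCs well suited to the diagonal-covariance Gaussian mixture models mentioned above.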
PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
Topics: audio, reproducible-research, paper, speech, pytorch, band, speech-processing, noise-reduction, denoising, speech-separation, speech-enhancement, narrow-band, single-channel, pretrained-model, band-fusion-model, full-band, sub-band
Updated Nov 1, 2021 - Python
ngragaei commented Jul 27, 2020:
frames[-1] = np.append(frames[-1], np.array([0]*(frame_length - len(frames[0]))))
TypeError: can't multiply sequence by non-int of type 'float'
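The traceback indicates that `frame_length` reached the padding line as a float (e.g. computed as `0.025 * sample_rate`), so the list repetition `[0] * (...)` fails. A hypothetical reconstruction of that padding step with the cast applied (it also pads relative to the last frame's own length rather than `len(frames[0])`, which is what the padding actually needs):

```python
import numpy as np

def pad_last_frame(frames, frame_length):
    """Zero-pad the final frame up to frame_length samples."""
    frame_length = int(round(frame_length))   # the fix: [0] * n needs an int n
    pad = frame_length - len(frames[-1])
    if pad > 0:
        frames[-1] = np.append(frames[-1], np.zeros(pad, dtype=frames[-1].dtype))
    return frames
```

Casting at the point where the length is first computed (rather than at every use) would be the more robust variant of the same fix.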