30 open-source projects similar to chanil1218/dcunet.pytorch, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best DCUnet.pytorch alternative.
ClearerVoice-Studio is a speech processing studio and framework designed for speech enhancement, audio super-resolution, and targeted voice extraction. It provides a suite of tools to remove background noise, increase the sampling rate of low-resolution recordings, and quantify audio clarity through objective quality evaluation metrics. The project features a target speaker extraction tool that isolates specific voices from mixed audio using acoustic, visual, or neural reference signals. It also includes capabilities for overlapping speech separation by capturing temporal patterns and long-ra
ESPnet is a comprehensive speech processing toolkit and PyTorch-based trainer designed for building end-to-end speech recognition, synthesis, and translation models. It provides a structured framework for developing automatic speech recognition systems using transducer and encoder-decoder architectures, alongside engines for text-to-speech synthesis and speech translation pipelines. The project distinguishes itself through a recipe-based workflow execution system that ensures experimental reproducibility by running standardized sequences of scripts for data preparation and model training. It
TensorFlow 2.x implementation of the stacked dual-signal transformation LSTM network (DTLN) for real-time noise suppression. This repository provides the code for training, inferring and serving the DTLN model in Python. It also provides pretrained models in SavedModel, TF-Lite and ONNX format,…
We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model working on the raw waveform that runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder…
The implementation of Uformer: A Unet based dilated complex & real dual-path conformer network for simultaneous speech enhancement and dereverberation
This is an implementation of NSNet 1 in PyTorch and PyTorch Lightning. NSNet is a recurrent neural network for single-channel speech enhancement. This was implemented as part of my thesis for the Master in Electrical Engineering at Ghent University.
A minimal unofficial PyTorch implementation of "A Convolutional Recurrent Neural Network for Real-Time Speech Enhancement" (CRN).
Speech enhancement with an LSTM based on the Ideal Ratio Mask (IRM).
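For readers unfamiliar with the Ideal Ratio Mask (IRM), the sketch below shows how such a training target is commonly computed from clean-speech and noise magnitude spectrograms. It is not taken from the project above: the function names, the nested-list spectrogram layout and the exponent `beta=0.5` are assumptions made for illustration. A network such as an LSTM would typically be trained to predict this mask from noisy features.

```python
import math

def ideal_ratio_mask(clean_mag, noise_mag, beta=0.5, eps=1e-12):
    """Compute a per-bin IRM from magnitude spectrograms.

    Both inputs are nested lists indexed as [frame][frequency_bin].
    The layout, beta and eps are illustrative assumptions.
    """
    mask = []
    for clean_frame, noise_frame in zip(clean_mag, noise_mag):
        row = []
        for s, n in zip(clean_frame, noise_frame):
            s_pow, n_pow = s * s, n * n
            # Ratio of speech power to total power, compressed by beta.
            row.append((s_pow / (s_pow + n_pow + eps)) ** beta)
        mask.append(row)
    return mask

def apply_mask(mask, noisy_mag):
    """Scale each noisy magnitude bin by the corresponding mask value."""
    return [[m * y for m, y in zip(m_row, y_row)]
            for m_row, y_row in zip(mask, noisy_mag)]

# Toy 2-frame, 2-bin example (made-up numbers).
clean = [[3.0, 0.0], [1.0, 2.0]]
noise = [[4.0, 1.0], [1.0, 0.0]]
irm = ideal_ratio_mask(clean, noise)
```

Mask values lie between 0 (noise only) and 1 (speech only), so multiplying the noisy magnitude by the mask attenuates noise-dominated bins.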
This is the repository of the "Listening to Sounds of Silence for Speech Denoising" project. Our approach is based on a key observation about human speech: there is often a short pause between each sentence or word. In a recorded speech signal, those pauses introduce a series…
This is the Git repository for the official PyTorch implementation of "FullSubNet+: Channel Attention FullSubNet with Complex Spectrograms for Speech Enhancement", accepted by ICASSP 2022.
Authors: Yanxin Hu, Yun Liu, Shubo Lv, Mengtao Xing, Shimin Zhang, Yihui Fu, Jian Wu, Bihong Zhang, Lei Xie
Unofficial PyTorch implementation of MSRA's PHASEN: A Phase-and-Harmonics-Aware Speech Enhancement Network.
An updated version, MetricGAN+, is available (paper and code).
This toolkit is the implementation of the following paper:
This repository provides an implementation of the convolutional recurrent network (CRN) for monaural speech enhancement, developed in "A convolutional recurrent neural network for real-time speech enhancement", Proceedings of Interspeech, pp. 3229-3233, 2018. In the paper, a causal convolutional…
This repository provides an implementation of the gated convolutional recurrent network (GCRN) for monaural speech enhancement, developed in "Learning complex spectral mapping with gated convolutional recurrent networks for monaural speech enhancement", IEEE/ACM Transactions on Audio, Speech,…
Unofficial implementation of PercepNet: A Perceptually-Motivated Approach for Low-Complexity, Real-Time Enhancement of Fullband Speech described in https://arxiv.org/abs/2008.04259
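This project performs speech enhancement using DNN and CNN methods. The DNN uses three hidden layers with 512 nodes each, and the CNN uses the R-CED network structure with some ResNet connections added to prevent overfitting. You can also choose whether to use dropout, L2 regularization, and so on.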
Convolutional neural networks (CNNs) with residual links (ResNets) and causal dilated convolutional units have been the network of choice for deep learning approaches to speech enhancement. While residual links improve gradient flow during training, feature diminution of shallow layer outputs…
DeepFilterNet is a deep learning noise suppression framework and real-time audio filter designed to isolate speech and remove background noise from audio signals. It functions as a speech enhancement tool and audio processing middleware that cleans microphone input and speaker output for communication applications. The system operates as an audio server interceptor, delivering noise-suppressed streams between hardware devices and software applications. It provides capabilities for microphone signal cleaning and audio output filtering, including the creation of virtual noise-suppressed microph
This is an unofficial PyTorch implementation of the paper HiFi-GAN: High Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks.
Recently, convolution-augmented transformer (Conformer) has achieved promising performance in automatic speech recognition (ASR) and time-domain speech enhancement (SE), as it can capture both local and global dependencies in the speech signal. In this paper, we propose a conformer-based metric…
This repository contains the non-sequential VAE and STCN speech models and the NMF noise model for single-channel speech enhancement.
StoRM: A Diffusion-based Stochastic Regeneration Model for Speech Enhancement and Dereverberation