mumoumou

ZXMu mumoumou

XJTU

Stars

mravanelli / pySpeechRev

This python code performs an efficient speech reverberation starting from a dataset of close-talking speech signals and a collection of acoustic impulse responses.

Python 95 25 Updated May 30, 2020

Audio-WestlakeU / RealMAN

A description of "RealMAN: A Real-Recorded and Annotated Microphone Array Dataset for Dynamic Speech Enhancement and Localization" [NIPS 2024]

Python 93 11 Updated Oct 12, 2024

lllyasviel / IC-Light

More relighting!

Python 5,506 361 Updated Oct 27, 2024

jishengpeng / WavTokenizer

SOTA discrete acoustic codec models with 40 tokens per second for audio language modeling

Python 787 43 Updated Oct 23, 2024

sp-uhh / ears_benchmark

Generation scripts for EARS-WHAM and EARS-Reverb

Python 23 3 Updated Sep 16, 2024

WenzheLiu-Speech / awesome-speech-enhancement

Speech enhancement, speech separation, sound source localization

1,048 221 Updated Nov 14, 2023

ddlBoJack / Awesome-Speech-Language-Model

Paper, Code and Resources for Speech Language Model and End2End Speech Dialogue System.

106 7 Updated Nov 10, 2024

descriptinc / descript-audio-codec

State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.

Python 1,209 112 Updated Jul 11, 2024

audacity / audacity

Audio Editor

C 12,606 2,264 Updated Nov 17, 2024

Yip-Jia-Qi / codecformer

Python 15 1 Updated Jul 15, 2024

RookieJunChen / dns_mos_calculate

Code for calculating DNS_MOS.

Python 31 10 Updated Dec 18, 2022

GeWu-Lab / awesome-audiovisual-learning

A curated list of audio-visual learning methods and datasets.

231 17 Updated Oct 28, 2024

facebookresearch / facestar

Facestar dataset. High quality audio-visual recordings of human conversational speech.

Python 104 6 Updated Mar 29, 2022

louaaron / Score-Entropy-Discrete-Diffusion

[ICML 2024 Best Paper] Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution (https://arxiv.org/abs/2310.16834)

Python 403 36 Updated Feb 29, 2024

yxlu-0102 / MP-SENet

Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement

Python 316 45 Updated Oct 28, 2024

RoyChao19477 / SEMamba

This is the official implementation of the SEMamba paper. (Accepted to IEEE SLT 2024)

Python 143 13 Updated Sep 9, 2024

microsoft / DNS-Challenge

This repo contains the scripts, models, and required files for the Deep Noise Suppression (DNS) Challenge.

Python 1,110 412 Updated Jul 25, 2024

Aria-K-Alethia / BigCodec

Official implementation of the paper "BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec"

Python 82 4 Updated Sep 19, 2024

haidog-yaqub / EzAudio

High-quality Text-to-Audio Generation with Efficient Diffusion Transformer

Python 238 8 Updated Nov 12, 2024

wyf0912 / SinSR

[CVPR 2024] SinSR: Diffusion-Based Image Super-Resolution in a Single Step

Python 330 19 Updated Sep 12, 2024

WangHelin1997 / SSR-Speech

SSR-Speech: Towards Stable, Safe and Robust Zero-shot Speech Editing and Synthesis

Python 97 10 Updated Nov 1, 2024

urgent-challenge / urgent2024_challenge

Official data preparation scripts for the URGENT 2024 Challenge

Python 66 5 Updated Aug 12, 2024

zsyOAOA / ResShift

ResShift: Efficient Diffusion Model for Image Super-resolution by Residual Shifting (NeurIPS@2023 Spotlight, TPAMI@2024)

Python 944 50 Updated Sep 14, 2024

facebookresearch / AudioMAE

This repo hosts the code and models of "Masked Autoencoders that Listen".

Python 544 45 Updated Apr 5, 2024

google-research / text-to-text-transfer-transformer

Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

Python 6,175 756 Updated Sep 20, 2024

haoheliu / AudioLDM

AudioLDM: Generate speech, sound effects, music and beyond, with text.

Python 2,451 222 Updated Oct 14, 2024

haoheliu / AudioLDM2

Text-to-Audio/Music Generation

Python 2,305 180 Updated Sep 29, 2024

LetheSec / HuggingFace-Download-Accelerator

Uses HuggingFace's official download tool to download at high speed from a mirror site.

Python 827 77 Updated Oct 12, 2024

LoieSun / Auto-ACD

Code for "A Large-scale Dataset for Audio-Language Representation Learning"

C 10 Updated Sep 18, 2024

cdjkim / audiocaps

🔊 Repository for our NAACL-HLT 2019 paper: AudioCaps

Python 144 18 Updated Apr 23, 2024
