sinhprous1
  • Hanoi, Vietnam


A one stop repository for generative AI research updates, interview resources, notebooks and much more!

8,387 1,834 Updated Sep 16, 2024

Community list of startups working with AI in audio and music technology

1,531 135 Updated Aug 9, 2024

A powerful Python library for getting rich data from the Vietnam Stock Market using just a few lines of code

Python 524 133 Updated Aug 14, 2024

Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model

Python 88 3 Updated Sep 25, 2024

FlashSpeech: Efficient Zero-Shot Speech Synthesis

Python 80 3 Updated Sep 20, 2024

TextrolSpeech: A Text Style Control Speech Corpus With Codec Language Text-to-Speech Models (2024 ICASSP)

Python 122 5 Updated Aug 29, 2024

[ECCV 2022] AutoTransition: Learning to Recommend Video Transition Effects

Python 54 9 Updated Nov 3, 2022

[IJCAI 2024] EAT: Self-Supervised Pre-Training with Efficient Audio Transformer

Python 100 3 Updated Apr 19, 2024

Unofficial PyTorch Implementation of UnivNet Vocoder (https://arxiv.org/abs/2106.07889)

Python 264 46 Updated Oct 8, 2021

Codebase for BirdClef 2023 solution

Python 31 11 Updated Jun 5, 2023
Jupyter Notebook 87 11 Updated Apr 8, 2024

The code for the bark-voicecloning model. Training and inference.

Python 648 109 Updated Sep 13, 2023

Foundational model for human-like, expressive TTS

Python 3,762 650 Updated Jul 30, 2024

Versatile audio super resolution (any -> 48kHz) with AudioSR.

Python 1,102 107 Updated May 10, 2024

Inference and training library for high-quality TTS models.

Python 4,267 428 Updated Sep 23, 2024

Implementation of TTS model based on NVIDIA P-Flow TTS Paper

Python 65 5 Updated May 12, 2024

Awesome speech/audio LLMs, representation learning, and codec models

615 27 Updated Sep 24, 2024

[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching

Jupyter Notebook 639 79 Updated Sep 23, 2024

[Findings of NAACL 2024] Source code of paper CM-TTS: Enhancing Real Time Text-to-Speech Synthesis Efficiency through Weighted Samplers and Consistency Models

Python 61 3 Updated Mar 31, 2024

Instrument your FastAPI with Prometheus metrics.

Python 942 84 Updated Jun 17, 2024

Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3

Python 347 39 Updated Sep 13, 2024

Zero-Shot Speech Editing and Text-to-Speech in the Wild

Jupyter Notebook 7,512 739 Updated Jun 24, 2024
Python 7 Updated Mar 11, 2024
Python 30 5 Updated Aug 12, 2023

[ICASSP 2024] StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations

HTML 132 4 Updated Apr 27, 2024

This repository provides some useful snippets that you may need in some situations.

Shell 10 Updated Jan 16, 2024

Efficient few-shot learning with Sentence Transformers

Jupyter Notebook 2,173 219 Updated Sep 19, 2024
HTML 3 Updated Dec 5, 2023

Official Implementation of StyleTTS

Jupyter Notebook 1 Updated Nov 3, 2023

vits2 backbone with multilingual-bert

Python 7,844 1,110 Updated Sep 25, 2024