- SZ, China
Stars
The fast Rust-based web bundler with webpack-compatible API 🦀️
The Rspack-based build tool. It's fast, out-of-the-box and extensible.
The swiss army knife of lossless video/audio editing
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…
Real-time face swap and one-click video deepfake with only a single image
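A visual no-code web crawler/spider. 易采集: a visual tool for browser automation testing, data collection, and web crawling that lets you design and run crawler tasks graphically without writing code. Also known as ServiceWrapper, an intelligent service wrapping system for web applications.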
Official inference repo for FLUX.1 models
😎 Awesome lists about all kinds of interesting topics
A modular graph-based Retrieval-Augmented Generation (RAG) system
Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
A simple local web interface that uses ChatTTS to synthesize text into speech, and also exposes an API for external use.
A generative speech model for daily dialogue.
[CVPR 2024] This is the official source for our paper "SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis"
Official implementation of EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars
Enjoy the magic of Diffusion models!
Build smaller, faster, and more secure desktop and mobile applications with a web frontend.
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
A massively parallel, high-level programming language
NocoBase is a scalability-first, open-source no-code/low-code platform for building business applications and enterprise solutions.
Accepted as [NeurIPS 2024] Spotlight Presentation Paper
🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
Instant voice cloning by MIT and MyShell.
InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models
Perceptual video quality assessment based on multi-method fusion.