- Imperial College, ZJU, HIT
- London, UK
- https://kxhit.github.io/
- @XinKong_IC
Lists (22)
AD
diffusion
dreammaping
embodied AI
GBP
graphics
LPR
multi-task
Navigation
nerf reading list
NN
open world
PointCloud
RL
RM
Robot arm
segmentation
Semantic Point Cloud
SLAM
tools
video-seg
world model
Starred repositories
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
Movie Gen Bench - two media generation evaluation benchmarks released with Meta Movie Gen
DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model. NeurIPS 2024 Spotlight.
Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Source code of paper "NVS-Solver: Video Diffusion Model as Zero-Shot Novel View Synthesizer"
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
From anything to mesh like human artists. Official impl. of "MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers"
TorchCFM: a Conditional Flow Matching library
3D LiDAR Mapping in Dynamic Environments using a 4D Implicit Neural Representation (CVPR 2024)
[ECCV 2024] EchoScene: Indoor Scene Generation via Information Echo over Scene Graph Diffusion.
[ACCV 2024 (Oral)] Official Implementation of "Moving Object Segmentation: All You Need Is SAM (and Flow)" Junyu Xie, Charig Yang, Weidi Xie, Andrew Zisserman
[ECCV 2024] Official PyTorch implementation of "Getting it Right: Improving Spatial Consistency in Text-to-Image Models"
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-sim…
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
Code for SPAD: Spatially Aware Multiview Diffusers, CVPR 2024
[CVPR'24, Demo Track Honourable Mention] SuperPrimitive: Scene Reconstruction at a Primitive Level
[ECCV'24] GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image
Open-Sora: Democratizing Efficient Video Production for All
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
[CVPR'24 Highlight & Best Demo Award] Gaussian Splatting SLAM
[CVPR'24] MorpheuS: Neural Dynamic 360° Surface Reconstruction from Monocular RGB-D Video