This guide describes how to train your own MOFA-Adapter.
🔧🔧🔧 Stay tuned. There may still be bugs; feel free to contact me or open an issue!
First, set up the training environment and install the dependencies:

```bash
conda create -n mofa_train python==3.10
conda activate mofa_train
pip install -r requirements.txt
```
We train our MOFA-Adapter on WebVid-10M. Please refer to our implementation of the `WebVid10M` class in `./train_utils/dataset.py` for details on how the data is read. You need to download WebVid-10M first, or you can modify the `WebVid10M` class and train your own MOFA-Adapter on another dataset (see the sketch below).
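If you plan to swap in another dataset, the sketch below shows the general shape of a video dataset class. It is a minimal, hypothetical example: the returned key (`pixel_values`), the frame count, the resolution, and the use of `decord` for video reading are all assumptions, so match them to whatever the `WebVid10M` class in `./train_utils/dataset.py` actually produces.

```python
import os

import torch
from decord import VideoReader  # assumption: any video reader works here
from torch.utils.data import Dataset


class MyVideoDataset(Dataset):
    """Hypothetical drop-in replacement for the WebVid10M class."""

    def __init__(self, video_dir, sample_n_frames=25, sample_size=(320, 576)):
        self.video_paths = [
            os.path.join(video_dir, f)
            for f in sorted(os.listdir(video_dir))
            if f.endswith(".mp4")
        ]
        self.sample_n_frames = sample_n_frames
        self.sample_size = sample_size

    def __len__(self):
        return len(self.video_paths)

    def __getitem__(self, idx):
        reader = VideoReader(self.video_paths[idx])
        # Sample a fixed number of frames, evenly spaced over the clip.
        indices = torch.linspace(0, len(reader) - 1, self.sample_n_frames).long()
        frames = torch.from_numpy(reader.get_batch(indices.tolist()).asnumpy())
        # (f, h, w, c) uint8 -> (f, c, h, w) float32 in [-1, 1]
        frames = frames.permute(0, 3, 1, 2).float() / 127.5 - 1.0
        frames = torch.nn.functional.interpolate(frames, size=self.sample_size)
        return {"pixel_values": frames}  # hypothetical key; check dataset.py
```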
Next, prepare the pretrained checkpoints:

- Download the pretrained checkpoint folder of SVD_xt from huggingface to `./ckpts`. The structure of the checkpoint folder should be:

  ```
  ./ckpts
  |-- stable-video-diffusion-img2vid-xt-1-1
  |   |-- feature_extractor
  |   |   |-- ...
  |   |-- image_encoder
  |   |   |-- ...
  |   |-- scheduler
  |   |   |-- ...
  |   |-- unet
  |   |   |-- ...
  |   |-- vae
  |   |   |-- ...
  |   |-- svd_xt_1_1.safetensors
  |   `-- model_index.json
  ```

- Download the Unimatch checkpoint from here and put it into `./train_utils/unimatch/pretrained`.
- Download the checkpoint of CMP from here and put it into `./models/cmp/experiments/semiauto_annot/resnet50_vip+mpii_liteflow/checkpoints`.
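Optionally, you can run a quick sanity check that all pretrained weights are in place before launching training. This is a minimal sketch; the paths are taken from the download steps above.

```python
import os

# Paths follow the download steps above; svd_xt_1_1.safetensors and
# model_index.json live inside the SVD_xt checkpoint folder.
required = [
    "./ckpts/stable-video-diffusion-img2vid-xt-1-1/model_index.json",
    "./ckpts/stable-video-diffusion-img2vid-xt-1-1/svd_xt_1_1.safetensors",
    "./train_utils/unimatch/pretrained",
    "./models/cmp/experiments/semiauto_annot/resnet50_vip+mpii_liteflow/checkpoints",
]
for path in required:
    print(f"[{'ok' if os.path.exists(path) else 'MISSING'}] {path}")
```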
Run the following command to start stage-1 training:

```bash
./train_stage1.sh
```
For stage 2, change the value of `--controlnet_model_name_or_path` in `train_stage2.sh` so that it points to the MOFA-Adapter checkpoint produced by stage 1.
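If you are unsure which stage-1 checkpoint directory to use, a small helper like the one below can locate the newest one. It assumes a hypothetical output layout in which `train_stage1.sh` writes `checkpoint-<step>` directories under `./outputs_stage1`; adjust `output_dir` to match your script.

```python
import os
import re

output_dir = "./outputs_stage1"  # hypothetical; match your train_stage1.sh output dir
ckpts = [d for d in os.listdir(output_dir) if re.fullmatch(r"checkpoint-\d+", d)]
assert ckpts, f"no checkpoints found under {output_dir}"
latest = max(ckpts, key=lambda d: int(d.split("-")[1]))
print(f'--controlnet_model_name_or_path="{os.path.join(output_dir, latest)}"')
```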
Then run:

```bash
./train_stage2.sh
```