xiguan97 committed (verified)
Commit 1fbba67 · Parent(s): 9314882

Update README.md

Files changed (1): README.md (+101 -3)
README.md CHANGED
@@ -1,3 +1,101 @@
- ---
- license: apache-2.0
- ---
# Magi-1: Autoregressive Video Generation Are Scalable World Models

<!-- TODO: add image -->
<div align="center" style="margin-top: 0px; margin-bottom: 0px;">
  <img src=https://github.com/user-attachments/.... width="30%"/>
  <!-- add the official image here -->
</div>

-----
This repository contains the pre-trained weights and inference code for the Magi-1 model. You can find more information on our [project page](http://sand.ai).
## 1. Introduction

We present Magi-1, a world model that generates videos by autoregressively predicting a sequence of video chunks, defined as fixed-length segments of consecutive frames. Trained to denoise per-chunk noise that increases monotonically over time, Magi-1 enables causal temporal modeling and naturally supports streaming generation. It achieves strong performance on image-to-video (I2V) tasks conditioned on text instructions, providing high temporal consistency and scalability, made possible by several algorithmic innovations and a dedicated infrastructure stack. Magi-1 further supports controllable generation via chunk-wise prompting, enabling smooth scene transitions, long-horizon synthesis, and fine-grained text-driven control. We believe Magi-1 offers a promising direction for unifying high-fidelity video generation with flexible instruction control and real-time deployment.
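
To make the chunk-wise schedule concrete, below is a minimal conceptual sketch of streaming generation with monotonically increasing per-chunk noise. This is not the Magi-1 implementation: the `denoise_step` function, chunk shapes, and the linear noise schedule are illustrative assumptions.

```python
import torch

def denoise_step(model, chunks, noise_levels, text_embedding):
    """Hypothetical one-step denoiser applied jointly to all active chunks."""
    return model(chunks, noise_levels, text_embedding)

def generate_streaming(model, text_embedding, num_chunks, chunk_shape, num_steps=8):
    """Conceptual sketch: older chunks always sit at lower noise than newer ones,
    so the oldest chunk finishes first and can be streamed out immediately."""
    window, noise, finished = [], [], []  # active chunks, their noise levels, emitted chunks
    pending = num_chunks

    while pending > 0 or window:
        if pending > 0:
            # A new chunk enters the window at full noise; chunks already in the
            # window are partially denoised, so noise increases monotonically
            # from the oldest chunk to the newest one.
            window.append(torch.randn(chunk_shape))
            noise.append(1.0)
            pending -= 1

        # One joint denoising step over the active window (causal in time).
        levels = torch.tensor(noise)
        window = list(denoise_step(model, torch.stack(window), levels, text_embedding).unbind(0))
        noise = [max(n - 1.0 / num_steps, 0.0) for n in noise]

        # The oldest chunk reaches zero noise first and is emitted right away,
        # which is what enables streaming generation.
        while noise and noise[0] == 0.0:
            finished.append(window.pop(0))
            noise.pop(0)

    return torch.cat(finished, dim=0)
```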

## 2. Model and Checkpoints

We provide pre-trained weights for Magi-1, including the 24B and 4.5B models as well as the corresponding distilled and distilled+quantized variants. Links to the model weights are listed in the table below.

| Model | Link | Recommended Machine |
| ----------------------------- | ------------------------------------------------------------ | ------------------------------- |
| Magi-1-24B | [Magi-1-24B](https://huggingface.co/sand-ai/Magi-1/tree/main/ckpt/magi/24B_base) | H100/H800 \* 8 |
| Magi-1-24B-distill | [Magi-1-24B-distill](https://huggingface.co/sand-ai/Magi-1/tree/main/ckpt/magi/24B_distill) | H100/H800 \* 8 |
| Magi-1-24B-distill+fp8_quant | [Magi-1-24B-distill+quant](https://huggingface.co/sand-ai/Magi-1/tree/main/ckpt/magi/24B_distill_quant) | H100/H800 \* 4 or RTX 4090 \* 8 |
| Magi-1-4.5B | Magi-1-4.5B (Coming Soon) | RTX 4090 \* 1 |
| Magi-1-4.5B-distill | Magi-1-4.5B-distill (Coming Soon) | RTX 4090 \* 1 |
| Magi-1-4.5B-distill+fp8_quant | Magi-1-4.5B-distill+fp8_quant (Coming Soon) | RTX 4090 \* 1 |
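
If you prefer to fetch only one checkpoint folder from the Hub, the snippet below shows one possible way using `huggingface_hub`; the chosen sub-folder and local directory are illustrative assumptions, and the official run scripts may expect a different layout.

```python
# Optional: download a single checkpoint folder from the Hugging Face Hub.
# Assumes `pip install huggingface_hub`; the sub-folder and local directory
# below are illustrative, not an official recommendation.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="sand-ai/Magi-1",
    allow_patterns=["ckpt/magi/24B_distill_quant/*"],  # fetch only one model variant
    local_dir="./ckpt",                                # keep the repo's folder layout locally
)
print("Checkpoint downloaded to:", local_dir)
```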

## 3. How to run

### 3.1 Environment preparation

We provide two ways to run Magi-1, with the Docker environment being the recommended option.

**Run with Docker environment (recommended)**

```bash
docker pull sandai/magi:latest

docker run -it --gpus all --privileged --shm-size=32g --name magi --net=host --ipc=host --ulimit memlock=-1 --ulimit stack=6710886 sandai/magi:latest /bin/bash
```

**Run with source code**

```bash
# Create a new environment
conda create -n magi python=3.10.12

# Install PyTorch
conda install pytorch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 pytorch-cuda=12.4 -c pytorch -c nvidia

# Install other dependencies
pip install -r requirements.txt

# Install magi-attention (prebuilt flash_attn_3 wheel; --no-deps skips dependency resolution)
pip install --no-cache-dir "https://python-artifacts.oss-cn-shanghai.aliyuncs.com/flash_attn_3-3.0.0b2-cp310-cp310-linux_x86_64.whl" --no-deps
```
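
After either setup, a quick sanity check of the PyTorch/CUDA environment can save debugging time. This is a generic check, not part of the official Magi-1 instructions:

```python
# Generic sanity check of the PyTorch/CUDA setup (not Magi-1 specific).
import torch

print("PyTorch version:", torch.__version__)          # expected 2.4.0 for the setup above
print("CUDA available: ", torch.cuda.is_available())
print("CUDA version:   ", torch.version.cuda)          # expected 12.4 for the setup above
print("GPU count:      ", torch.cuda.device_count())   # the 24B model expects 8 (or 4 with fp8) GPUs
```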

### 3.2 Inference command

```bash
# Run 24B Magi-1 model
bash example/24B/run.sh

# Run 4.5B Magi-1 model
bash example/4.5B/run.sh
```

### 3.3 Useful configs

| Config | Help |
| -------------- | ------------------------------------------------------------ |
| seed | Random seed used for video generation |
| video_size_h | Height of the generated video |
| video_size_w | Width of the generated video |
| num_frames | Controls the duration of the generated video |
| fps | Frames per second; 4 video frames correspond to 1 latent frame |
| cfg_number | Use cfg_number=2 for the base model and cfg_number=1 for the distilled and quantized models |
| load | Directory containing a model checkpoint |
| t5_pretrained | Path to the pretrained T5 model |
| vae_pretrained | Path to the pretrained VAE model |
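
For reference, here is a hypothetical set of values for the configs above. How they are actually passed (for instance, as arguments or variables inside `example/24B/run.sh`) depends on the run scripts, so treat this purely as an illustration:

```python
# Hypothetical example values for the configs listed above; how they are wired
# into example/24B/run.sh is repo-specific, so adjust to the script's format.
example_config = {
    "seed": 1234,                   # random seed for reproducible generation
    "video_size_h": 720,            # output height
    "video_size_w": 1280,           # output width
    "num_frames": 96,               # 96 video frames = 24 latent frames (4 frames per latent frame)
    "fps": 24,                      # playback frame rate; 96 frames / 24 fps = 4 seconds of video
    "cfg_number": 2,                # 2 for the base model, 1 for distill / quant models
    "load": "ckpt/magi/24B_base",   # checkpoint directory (see the table in section 2)
    "t5_pretrained": "ckpt/t5",     # path to the pretrained T5 text encoder (illustrative path)
    "vae_pretrained": "ckpt/vae",   # path to the pretrained VAE (illustrative path)
}
```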

## 4. Acknowledgements

## 5. Contact

Please feel free to cite our paper if you find our code or model useful in your research.

```bibtex
% TODO: add correct citation
@article{magi1,
  title={Magi-1: Autoregressive Video Generation Are Scalable World Models},
  author={Magi-1},
  journal={arXiv preprint arXiv:2504.06165},
  year={2025}
}
```

If you have any questions, please feel free to raise an issue.