---
title: Pipeline Parallelism Schedule Visualizer
emoji: π
colorFrom: indigo
colorTo: blue
sdk: docker
app_file: app.py
pinned: false
suggested_hardware: cpu-basic
suggested_storage: small
header: default
---
# Pipeline Parallelism Schedule Visualizer
An interactive visualization tool for exploring different pipeline parallelism scheduling strategies in large language models.
## Features
- Visualize multiple scheduling strategies for pipeline parallelism
- Adjust parameters like number of devices, stages, and batches
- Compare execution timelines between different strategies
- Explore operation timings and their effects on performance
## Supported Strategies
- 1F1B (One-Forward-One-Backward)
- 1F1B with Interleaved Placement
- 1F1B with Overlapped Operations
- 1F1B with Interleaved Placement and Overlapped Operations
- Zero-Bubble 1 Pipeline (ZB1P)
- Dual Pipeline (DualPipe)
## Usage
Simply adjust the parameters and select the strategies you want to compare, then click "Generate Schedule" to visualize the results.
## Deployment
This app is deployed on Hugging Face Spaces using Dash.
## Overview
This project provides tools for emulating and visualizing pipeline parallelism strategies used in large language model training.
Pipeline parallelism is a technique used to train large models by partitioning the model across multiple devices and processing data in a pipelined fashion. This project allows you to:
- Simulate different pipeline parallelism strategies (1F1B, Interleaved, Zero-Bubble, etc.)
- Visualize the execution schedule on multiple devices
- Compare different strategies for efficiency
## Features
- **Supported Pipeline Strategies**:
- 1F1B (One-Forward-One-Backward)
- Interleaved 1F1B
- Zero-Bubble 1F1B (ZB-1P)
- 1F1B with computation-communication overlap
- Interleaved 1F1B with computation-communication overlap
- DualPipe (Bidirectional pipeline parallelism with full forward-backward overlap)
- **Visualization**:
- Interactive visualization dashboard using Plotly/Dash
- **Configuration**:
- Configurable simulation parameters through Hydra
- Customizable stage latency and communication costs
## Installation
This project uses [uv](https://github.com/astral-sh/uv) for dependency management.
Set up `uv` if it is not already installed on your computer:
```bash
# On macOS and Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
```
## Running the Interactive Server
To visualize schedules interactively:
```bash
uv run src/server.py
```
This will start a Dash server (usually on `http://127.0.0.1:8050/`). Open this URL in your web browser.
You can then adjust parameters like the number of devices, stages, batches, operation times, and select different scheduling strategies to see the resulting pipeline visualization.
## Running from Command Line
### Running for 1F1B strategy:
```bash
uv run python main.py strategy=1f1b num_devices=4 num_stages=4 num_batches=8
```

### Running for interleaved strategy:
```bash
uv run python main.py strategy=interleave num_devices=4 num_stages=8 num_batches=8
```

### Running for ZB-1P strategy:
```bash
uv run python main.py strategy=zb1p num_devices=4 num_stages=4 num_batches=8
```

### Running for DualPipe strategy:
```bash
uv run python main.py strategy=dualpipe num_devices=8 num_stages=8 num_batches=20
```

### Running for 1F1B-batch-overlap strategy:
```bash
uv run python main.py strategy=1f1b_overlap num_devices=4 num_stages=4 num_batches=8
```

### Running for 1F1B-interleave-overlap strategy:
```bash
uv run python main.py strategy=1f1b_interleave_overlap num_devices=4 num_stages=8 num_batches=8
```

## Configuration
The default configuration is in `conf/config.yaml`. You can override any parameter on the command line or create configuration groups for different scenarios.
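For orientation, here is a minimal sketch of what `conf/config.yaml` might contain, assuming it exposes the same parameter names used in the command-line examples in this README (the layout of the actual file may differ):
```yaml
# Hypothetical sketch of conf/config.yaml -- keys mirror the command-line
# overrides shown in this README; the real file may be organized differently.
strategy: 1f1b      # 1f1b | interleave | zb1p | dualpipe | 1f1b_overlap | 1f1b_interleave_overlap
num_devices: 4      # number of pipeline devices
num_stages: 4       # number of pipeline stages
num_batches: 8      # number of microbatches
op_times:
  forward: 1.0      # forward pass time for one microbatch
  backward: 2.0     # full backward pass time
  backward_D: 1.0   # input-gradient (activation) part of the backward pass
  backward_W: 1.0   # weight-gradient part of the backward pass
  overlapped_forward_backward: 2.5  # time for a fused forward+backward block
```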
#### Override Specific Parameters
You can override specific parameters at runtime:
```bash
uv run python main.py op_times.forward=0.5 op_times.backward=1.0 num_batches=6
```
Using DualPipe as an example, you can manually set different times for the forward, backward, backward_D, backward_W, and overlapped_forward_backward operations:
```bash
uv run python main.py strategy=dualpipe num_devices=8 num_stages=8 num_batches=32 op_times.forward=1.0 op_times.backward=2.0 op_times.backward_D=1.0 op_times.backward_W=1.0 op_times.overlapped_forward_backward=2.5
```
### Using Different Configuration Files
You can use different configuration files with Hydra in several ways:
#### Recommended Approach
1. Create multiple configuration files in the `conf` directory for different use cases:
```
conf/
├── config.yaml   # Default configuration
└── model_A.yaml  # Create your own config with stage-specific latency for performance projection
```
2. Run with your desired configuration using the `--config-name` flag:
```bash
uv run python main.py --config-name=model_A
```
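What goes into `model_A.yaml` is up to you. As one possibility (a sketch only; this project does not prescribe a per-stage latency schema), you can use Hydra's standard defaults-list composition to inherit `config.yaml` and override just the timings you have measured for your model:
```yaml
# Hypothetical conf/model_A.yaml -- inherits the base config through Hydra's
# defaults list and overrides only what differs. All numbers are placeholders;
# substitute latencies measured for your own model.
defaults:
  - config    # reuse conf/config.yaml as the base
  - _self_    # values below take precedence over the base

strategy: dualpipe
num_devices: 8
num_stages: 8
op_times:
  forward: 0.8
  backward: 1.6
  backward_D: 0.8
  backward_W: 0.8
  overlapped_forward_backward: 2.0
```
A fully standalone `model_A.yaml` that repeats every field of `config.yaml` works just as well if you prefer to keep each scenario self-contained.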
## Project Structure
```
PP-Emulation/
├── conf/                   # Hydra configuration files
│   └── config.yaml         # Default configuration
├── src/                    # Source code
│   ├── __init__.py         # Package initialization
│   ├── execution_model.py  # Schedule execution models
│   ├── strategies.py       # Pipeline parallelism strategies
│   └── visualizer.py       # Visualization utilities
├── main.py                 # Main entry point
├── pyproject.toml          # Project metadata and dependencies
└── README.md               # This file
```
## References
1. _PipeDream: Fast and Efficient Pipeline Parallel DNN Training_. [arxiv](https://arxiv.org/abs/1806.03377)
2. _Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM_. [arxiv](https://arxiv.org/abs/2104.04473)
3. _Zero Bubble Pipeline Parallelism_. [arxiv](https://arxiv.org/abs/2401.10241)
4. _Communication-Computation Overlap in MoE Training with 1F1B Pipeline Parallelism_. [blog](https://zhuanlan.zhihu.com/p/28463368206)
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.