Commit 0c350fd · Parent(s): 14b802d
fix: update InferBench description in app.py to emphasize speed instead of compression, and add a link to a related blog post for further information
app.py CHANGED

@@ -84,7 +84,7 @@ with gr.Blocks("ParityError/Interstellar", css=custom_css) as demo:
 We are Pruna AI, an open source AI optimisation engine and we simply make your models cheaper, faster, smaller, greener!
 
 # 📊 About InferBench
-InferBench is a leaderboard for inference providers, focusing on cost, quality, and
+InferBench is a leaderboard for inference providers, focusing on cost, quality, and speed.
 Over the past few years, we’ve observed outstanding progress in image generation models fueled by ever-larger architectures.
 Due to their size, state-of-the-art models such as FLUX take more than 6 seconds to generate a single image on a high-end H100 GPU.
 While compression techniques can reduce inference time, their impact on quality often remains unclear.
@@ -96,6 +96,8 @@ with gr.Blocks("ParityError/Interstellar", css=custom_css) as demo:
 
 FLUX-juiced was obtained using a combination of compilation and caching algorithms and we are proud to say that it consistently outperforms alternatives, while delivering performance on par with the original model.
 This combination is available in our Pruna Pro package and can be applied to almost every image generation model.
+
+A full blogpost on the method can be found [here](https://pruna.ai/blog/flux-juiced). # TODO: Add link
 """
 )
 with gr.Column(scale=1):
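For context, a minimal sketch of how the description edited in this commit likely sits inside the Gradio layout. Only gr.Blocks("ParityError/Interstellar", css=custom_css) and gr.Column(scale=1) are visible in the diff; the gr.Row/gr.Markdown nesting, the custom_css placeholder, the column scales, and the second column's contents are assumptions, not the Space's actual code.

import gradio as gr

custom_css = ""  # placeholder; the real Space defines its own CSS

with gr.Blocks("ParityError/Interstellar", css=custom_css) as demo:
    with gr.Row():
        with gr.Column(scale=2):
            # The markdown string edited in this commit is assumed to live in a
            # gr.Markdown call like this one.
            gr.Markdown(
                """
                # 📊 About InferBench
                InferBench is a leaderboard for inference providers, focusing on cost, quality, and speed.

                A full blogpost on the method can be found [here](https://pruna.ai/blog/flux-juiced).
                """
            )
        with gr.Column(scale=1):
            gr.Markdown("…")  # the diff does not show what this column contains

if __name__ == "__main__":
    demo.launch()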