davidberenstein1957 committed on
Commit
0c350fd
·
1 Parent(s): 14b802d

fix: update InferBench description in app.py to emphasize speed instead of compression, and add a link to a related blog post for further information

Files changed (1): app.py +3 −1
app.py CHANGED
@@ -84,7 +84,7 @@ with gr.Blocks("ParityError/Interstellar", css=custom_css) as demo:
     We are Pruna AI, an open source AI optimisation engine and we simply make your models cheaper, faster, smaller, greener!

     # 📊 About InferBench
-    InferBench is a leaderboard for inference providers, focusing on cost, quality, and compression.
+    InferBench is a leaderboard for inference providers, focusing on cost, quality, and speed.
     Over the past few years, we’ve observed outstanding progress in image generation models fueled by ever-larger architectures.
     Due to their size, state-of-the-art models such as FLUX take more than 6 seconds to generate a single image on a high-end H100 GPU.
     While compression techniques can reduce inference time, their impact on quality often remains unclear.
@@ -96,6 +96,8 @@ with gr.Blocks("ParityError/Interstellar", css=custom_css) as demo:

     FLUX-juiced was obtained using a combination of compilation and caching algorithms and we are proud to say that it consistently outperforms alternatives, while delivering performance on par with the original model.
     This combination is available in our Pruna Pro package and can be applied to almost every image generation model.
+
+    A full blogpost on the method can be found [here](https://pruna.ai/blog/flux-juiced). # TODO: Add link
     """
     )
     with gr.Column(scale=1):
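The substance of the commit is a one-word edit to the description string passed to `gr.Markdown`. A minimal sanity check of the updated copy might look like the sketch below (plain Python, no Gradio required; the `ABOUT_INFERBENCH` variable name is illustrative, not taken from app.py):

```python
# Excerpt of the updated InferBench description committed to app.py.
# (ABOUT_INFERBENCH is an illustrative name, not the variable used in the app.)
ABOUT_INFERBENCH = (
    "# 📊 About InferBench\n"
    "InferBench is a leaderboard for inference providers, "
    "focusing on cost, quality, and speed.\n"
)

# The commit swaps "compression" for "speed" in the focus line.
assert "speed" in ABOUT_INFERBENCH
assert "compression" not in ABOUT_INFERBENCH
```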