Low GPU Utilization during inference?
#39
by BagelBig
Running on WSL-Ubuntu:
GPU utilization never reaches 60%, usually staying in the 45-55% range.
I have tried this with both the 4b-it and 12b-it models and noticed the same behaviour.
Both with a plain text prompt and with an image+text prompt.
I have reproduced this with the default sample script shown on the model card, as well as with various other ways of loading/running the model.
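For reference, here is the kind of monitoring loop I used to collect the numbers above; it is a minimal sketch that just polls `nvidia-smi` from inside WSL while inference runs in another terminal (the function and parameter names are my own, not from the model card script):

```python
import shutil
import subprocess
import time


def parse_util(field: str) -> int:
    """Parse an nvidia-smi utilization field like '47 %' into an int percentage."""
    return int(field.strip().rstrip("%").strip())


def sample_gpu_util(samples: int = 10, interval: float = 1.0) -> list[int]:
    """Poll GPU utilization a few times while inference runs elsewhere."""
    if shutil.which("nvidia-smi") is None:
        print("nvidia-smi not found on PATH")
        return []
    readings = []
    for _ in range(samples):
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=utilization.gpu",
             "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        ).stdout.strip()
        # One line per GPU; take the first GPU's reading.
        readings.append(parse_util(out.splitlines()[0]))
        time.sleep(interval)
    return readings


if __name__ == "__main__":
    vals = sample_gpu_util()
    if vals:
        print(f"min={min(vals)}% max={max(vals)}% avg={sum(vals) / len(vals):.1f}%")
```

I sampled once per second for the duration of a generation; the average is where the 45-55% figure comes from.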
I am unsure if this is expected behaviour or not.
If someone can assist me with this, I would greatly appreciate it.
Thank you.