Low GPU Utilization during inference?

#39
by BagelBig - opened

Running on WSL-Ubuntu:
GPU utilization never reaches 60%; it usually stays in the 45-55% range.
I have tried this on both the 4b-it and the 12b-it models and noticed the same behaviour.
This happens with plain text prompts as well as image+text prompts.
I have reproduced this with the default sample script shown in the model card, as well as various other ways of loading/running the model.
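For reference, here is a minimal sketch of how I sample utilization while the model runs. It polls `nvidia-smi` on a background thread; the query flags are standard `nvidia-smi` options, and the spot marked with a comment is where the model's `generate` call (or any inference loop) would go.

```python
import subprocess
import threading
import time


def parse_utilization(line: str) -> int:
    """Parse one line of `nvidia-smi --query-gpu=utilization.gpu
    --format=csv,noheader,nounits` output into an integer percent."""
    return int(line.strip())


def sample_gpu_utilization(samples: list, stop: threading.Event,
                           interval: float = 0.5) -> None:
    """Append one utilization reading per interval until `stop` is set."""
    while not stop.is_set():
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=utilization.gpu",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        ).stdout.splitlines()[0]
        samples.append(parse_utilization(out))
        time.sleep(interval)


if __name__ == "__main__":
    readings: list = []
    stop = threading.Event()
    sampler = threading.Thread(target=sample_gpu_utilization,
                               args=(readings, stop))
    sampler.start()
    # ... run model.generate(...) / the model-card sample script here ...
    stop.set()
    sampler.join()
    if readings:
        print(f"mean GPU utilization: {sum(readings) / len(readings):.1f}%")
```

With this running alongside the sample script, the mean never exceeds the 45-55% range described above.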

I am unsure if this is expected behaviour or not.

If someone can assist me with this, I would greatly appreciate it.

Thank you.
