Multiple tags do not work

#3
by mhassanch - opened

For the same input image:

  • caption1 = "a cat. a remote control."
    → Only detects cat; the confidence score for remote control is very low.

  • caption2 = "a remote control. a cat."
    → Only detects remote control; the confidence score for cat becomes very low.

This behavior suggests that the ONNX model only gives high confidence to the first object in the caption, regardless of what's actually present in the image.
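
To check whether the score really depends on position rather than on the image content, something like the following could be used. It is only a rough sketch: `detect` is a hypothetical callable (not part of this repo) that would wrap the ONNX model for one fixed image and return the best confidence per phrase, and `fake_detect` is a stand-in that just imitates the behavior described above so the script runs without the model.

```python
from itertools import permutations
from typing import Callable, Dict, List, Sequence


def build_caption(labels: Sequence[str]) -> str:
    """Join phrases into the 'a cat. a remote control.' caption style."""
    return " ".join(f"{label}." for label in labels)


def scores_by_position(
    detect: Callable[[str], Dict[str, float]],
    labels: Sequence[str],
) -> Dict[str, Dict[int, List[float]]]:
    """Run every ordering of the phrases and group each phrase's score
    by the position it occupied in the caption."""
    results: Dict[str, Dict[int, List[float]]] = {label: {} for label in labels}
    for order in permutations(labels):
        scores = detect(build_caption(order))
        for position, label in enumerate(order):
            # A phrase missing from the output is treated as score 0.0.
            results[label].setdefault(position, []).append(scores.get(label, 0.0))
    return results


def fake_detect(caption: str) -> Dict[str, float]:
    """Hypothetical stand-in for the real model call (assumption, not the
    actual ONNX inference): only the first phrase gets a high score."""
    phrases = [p.strip() for p in caption.split(".") if p.strip()]
    return {p: (0.9 if i == 0 else 0.05) for i, p in enumerate(phrases)}


if __name__ == "__main__":
    table = scores_by_position(fake_detect, ["a cat", "a remote control"])
    for label, by_position in table.items():
        print(label, by_position)
```

If the real model is plugged in for `fake_detect` and each phrase scores high only at position 0, that would support the position-dependence reading; if the scores stay roughly constant across positions, the cause is probably elsewhere.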
