multiple tags does not works
#3
by
mhassanch
- opened
For the same input image:
caption1 = "a cat. a remote control."
β Only detects cat; the confidence score for remote control is very low.caption2 = "a remote control. a cat."
β Only detects remote control; the confidence score for cat becomes very low.
This behavior suggests that the ONNX model only gives high confidence to the first object in the caption, regardless of what's actually present in the image.