nuance

#5
by ehartford - opened

I recommend a third category, "avoidant", where the model dumbs down the answer, gives a short answer, "misunderstands" the question, changes the subject, or questions the user instead of answering.

Maybe "questions the user instead of answering" should be a separate category, because it's a brand-new feature of the OpenAI o3 and o4-mini models. It happens when the user hasn't given the model enough info to answer properly.

Three categories then, maybe, for the next one: non-refusal, avoidant, refusal. Coming up with a strict definition of the avoidant category will be the key, i.e. coming up with an exhaustive list of everything that should be considered avoidance.

Or: make a strict definition for refusal and compliant, and a loose definition for avoidant.
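
For what it's worth, here is a rough sketch of how the three-way labelling discussed in this thread could be written down. It is only an illustration: the tag names in `AVOIDANT_BEHAVIORS`, the `label_response` function, and the `explicit_refusal` flag are all hypothetical, and the rule that refusal takes precedence over avoidance is an assumption, not something settled above. The avoidant list just mirrors the behaviours named in the first post.

```python
from enum import Enum


class Label(Enum):
    NON_REFUSAL = "non-refusal"
    AVOIDANT = "avoidant"
    REFUSAL = "refusal"


# Avoidant behaviours named in this thread. The tag names are hypothetical,
# and the list is a starting point rather than an exhaustive definition.
AVOIDANT_BEHAVIORS = frozenset({
    "dumbs_down_answer",
    "short_answer",
    "misunderstands_question",
    "changes_subject",
    "questions_user",
})


def label_response(observed, explicit_refusal=False):
    """Map behaviour tags observed in one response to a single label.

    Assumption: an explicit refusal wins over any avoidant behaviour.
    """
    if explicit_refusal:
        return Label.REFUSAL
    if AVOIDANT_BEHAVIORS & set(observed):
        return Label.AVOIDANT
    return Label.NON_REFUSAL


if __name__ == "__main__":
    print(label_response(["changes_subject"]).value)
    print(label_response([], explicit_refusal=True).value)
    print(label_response([]).value)
```

If "questions the user" ends up as its own category, it would just be removed from the set and given a fourth `Label` member.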
