qwen2-vl-2b Image features and image tokens do not match

#22
by novicetyro - opened

How can I solve this problem?

I am also getting this error, and my user data does contain video:

python3.12/site-packages/transformers/models/qwen2_vl/modeling_qwen2_vl.py", line 1688, in forward
[rank2]: raise ValueError(
[rank2]: ValueError: Video features and video tokens do not match: tokens: 0, features 936

unsloth_compiled_cache/unsloth_compiled_module_qwen2_vl.py:930, in Qwen2VLForConditionalGeneration_forward(self, input_ids, attention_mask, position_ids, past_key_values, inputs_embeds, labels, use_cache, output_attentions, output_hidden_states, return_dict, pixel_values, pixel_values_videos, image_grid_thw, video_grid_thw, rope_deltas, cache_position, **loss_kwargs)
    928 n_image_features = image_embeds.shape[0]
    929 if n_image_tokens != n_image_features:
--> 930     raise ValueError(
    931         f"Image features and image tokens do not match: tokens: {n_image_tokens}, features {n_image_features}"
    932     )
    933 image_mask = (
    934     (input_ids == self.config.image_token_id)
    935     .unsqueeze(-1)
    936     .expand_as(inputs_embeds)
    937     .to(inputs_embeds.device)
    938 )
    939 image_embeds = image_embeds.to(inputs_embeds.device, inputs_embeds.dtype)

ValueError: Image features and image tokens do not match: tokens: 568, features 600
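
The check in the traceback compares the number of image placeholder tokens in input_ids with the number of rows in image_embeds. A rough way to reproduce that comparison before calling the model is sketched below. It assumes that each (t, h, w) entry of image_grid_thw yields t*h*w / 4 features (a spatial merge size of 2) and that the image placeholder id is 151655; both values are assumptions here and should be read from the model config. The helper names and the (1, 40, 60) grid are hypothetical, chosen only so the numbers line up with the error above. A frequently reported cause of a deficit like 568 vs 600 is truncating the sequence to a maximum length, which cuts off placeholder tokens, but nothing in this thread confirms that this is what happened here.

```python
from math import prod

# Assumed placeholder id; prefer model.config.image_token_id in real code.
IMAGE_TOKEN_ID = 151655


def expected_feature_count(grid_thw, spatial_merge_size=2):
    """Visual embeddings implied by a list of (t, h, w) patch grids.

    Assumes merge_size x merge_size patches are merged into one feature.
    """
    merge_area = spatial_merge_size ** 2
    return sum(prod(thw) // merge_area for thw in grid_thw)


def count_placeholder_tokens(input_ids, token_id):
    """Count how many positions in a flat id list hold the placeholder."""
    return sum(1 for tok in input_ids if tok == token_id)


def check_alignment(input_ids, grid_thw, token_id, spatial_merge_size=2):
    """Return (n_tokens, n_features, matches) for one flattened sample."""
    n_tokens = count_placeholder_tokens(input_ids, token_id)
    n_features = expected_feature_count(grid_thw, spatial_merge_size)
    return n_tokens, n_features, n_tokens == n_features


# Hypothetical sample: a 40 x 60 patch grid, but only 568 placeholders
# left in the token sequence (for example after truncation).
sample_ids = [1, 2] + [IMAGE_TOKEN_ID] * 568
result = check_alignment(sample_ids, [(1, 40, 60)], IMAGE_TOKEN_ID)
print(result)
```

The same comparison can be run for video inputs by passing video_grid_thw and the video placeholder id instead. A count of 0 tokens, as in the video error above, would mean the tokenized text contains no video placeholders at all.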
