Unable to serve using OpenVINO Model Server

#1 opened by KYLN24

I used the Model Server Docker image (openvino/model_server:2025.1-gpu) to serve this model and got the following error:

```
[2025-04-28 16:54:19.976][64][modelmanager][error][modelinstance.cpp:842] Cannot compile model into target device; error: Exception from src/inference/src/cpp/core.cpp:109:
Exception from src/inference/src/dev/plugin.cpp:53:
Exception from src/plugins/intel_gpu/src/plugin/program_builder.cpp:225:
Operation: VocabDecoder_136799 of type VocabDecoder(extension) is not supported
```
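
From what I can tell, VocabDecoder is an extension op registered by the openvino_tokenizers package, and the GPU plugin has no implementation for it, so compiling the exported detokenizer for GPU fails while CPU works. A minimal sketch of the split (the detokenizer path is a placeholder for wherever the converted model lives):

```python
import openvino as ov
import openvino_tokenizers  # noqa: F401 -- importing registers VocabDecoder and the other tokenizer ops

core = ov.Core()

# openvino_detokenizer.xml is exported alongside the LLM by optimum-intel;
# adjust the path to your converted model folder.
detokenizer = core.read_model("model/openvino_detokenizer.xml")

core.compile_model(detokenizer, "CPU")  # compiles: the CPU plugin implements the extension ops
core.compile_model(detokenizer, "GPU")  # raises: "VocabDecoder ... is not supported"
```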

I also created an issue on GitHub: https://github.com/openvinotoolkit/model_server/issues/3263

OVMS use cases are often production-based, so maybe this won't work for you, but with my project OpenArc I haven't had issues running this model (or the other DeepSeek distills I have converted) on GPU.

https://github.com/SearchSavior/OpenArc
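
I haven't dug into the OVMS pipeline itself, but if you just need this model on GPU, a minimal optimum-intel sketch works too (the model path is a placeholder; tokenization happens in Python via transformers, so the GPU plugin never has to compile the VocabDecoder op):

```python
from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer

model_id = "path/to/converted-model"  # placeholder: local folder with the OpenVINO IR

# Load the IR and target the Intel GPU plugin directly.
model = OVModelForCausalLM.from_pretrained(model_id, device="GPU")
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("Why does the sky appear blue?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```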
