Unable to serve using OpenVINO Model Server
#1 opened by KYLN24
I used the model server docker image (openvino/model_server:2025.1-gpu) to serve this model and got the following error:
```
[2025-04-28 16:54:19.976][64][modelmanager][error][modelinstance.cpp:842] Cannot compile model into target device; error: Exception from src/inference/src/cpp/core.cpp:109:
Exception from src/inference/src/dev/plugin.cpp:53:
Exception from src/plugins/intel_gpu/src/plugin/program_builder.cpp:225:
Operation: VocabDecoder_136799 of type VocabDecoder(extension) is not supported
```
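For anyone trying to reproduce: the exact command isn't shown above, but a typical OVMS GPU serving invocation looks roughly like this (the model name and paths here are placeholders, not necessarily what was actually used):

```sh
# Serve a local OpenVINO model on an Intel GPU via OVMS.
# --device /dev/dri exposes the GPU to the container.
# Model name and paths below are placeholders.
docker run --rm -it --device /dev/dri \
  -v "$(pwd)/models:/models" \
  -p 9000:9000 \
  openvino/model_server:2025.1-gpu \
  --model_name deepseek-distill \
  --model_path /models/deepseek-distill \
  --port 9000 \
  --target_device GPU
```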
I also created an issue on GitHub: https://github.com/openvinotoolkit/model_server/issues/3263
OVMS use cases are often production-based, so maybe this won't work for you, but with my project OpenArc I haven't had issues running this model (or the other DeepSeek distills I have converted) on GPU.
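For comparison, loading the converted model directly with OpenVINO GenAI also runs on GPU for me; as I understand it, the tokenizer/detokenizer sub-models (where extension ops like VocabDecoder live) are compiled separately rather than going through the GPU plugin. A minimal sketch, with the model directory path as a placeholder:

```python
import openvino_genai as ov_genai

# Load the exported OpenVINO model directory on the GPU device.
# The tokenizer/detokenizer are separate sub-models, so the GPU
# plugin never has to compile extension ops like VocabDecoder.
# "path/to/deepseek-distill-ov" is a placeholder for your local dir.
pipe = ov_genai.LLMPipeline("path/to/deepseek-distill-ov", "GPU")

# Short generation to confirm the pipeline works end to end.
print(pipe.generate("What is OpenVINO?", max_new_tokens=64))
```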