Skip to content

Unsupported model IR version #355

@netw0rkf10w

Description

@netw0rkf10w

Feature request

I tried running some recent models and obtained the following error:

cpu-1.5: Pulling from huggingface/text-embeddings-inference
Digest: sha256:0502794a4d86974839e701dadd6d06e693ec78a0f6e87f68c391e88c52154f3f
Status: Image is up to date for ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
2024-07-25T12:40:25.295596Z  INFO text_embeddings_router: router/src/main.rs:175: Args { model_id: "/dat*/******_**_***M_v5", revision: None, tokenization_workers: None, dtype: None, pooling: None, max_concurrent_requests: 512, max_batch_tokens: 16384, max_batch_requests: None, max_client_batch_size: 32, auto_truncate: false, default_prompt_name: None, default_prompt: None, hf_api_token: None, hostname: "c165dfa0057d", port: 80, uds_path: "/tmp/text-embeddings-inference-server", huggingface_hub_cache: Some("/data"), payload_limit: 2000000, api_key: None, json_output: false, otlp_endpoint: None, otlp_service_name: "text-embeddings-inference.server", cors_allow_origin: None }
2024-07-25T12:40:25.306206Z  INFO text_embeddings_router: router/src/lib.rs:199: Maximum number of tokens per request: 512
2024-07-25T12:40:25.306330Z  INFO text_embeddings_core::tokenization: core/src/tokenization.rs:28: Starting 2 tokenization workers
2024-07-25T12:40:25.316131Z  INFO text_embeddings_router: router/src/lib.rs:241: Starting model backend
Error: Could not create backend

Caused by:
    Could not start backend: Failed to create ONNX Runtime session: Load model from /data/stella_en_400M_v5/onnx/model.onnx failed:/home/runner/work/onnxruntime-build/onnxruntime-build/onnxruntime/onnxruntime/core/graph/model.cc:179 onnxruntime::Model::Model(onnx::ModelProto&&, const PathString&, const IOnnxRuntimeOpSchemaRegistryList*, const onnxruntime::logging::Logger&, const onnxruntime::ModelOptions&) Unsupported model IR version: 10, max supported IR version: 9

It seems that the ONNX runtime is not the latest version.

Motivation

It would be great to update the ONNX runtime to the latest version so that the latest models could be used.

Your contribution

Sorry I'm not familiar with the tech stack, but I could help testing.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions