cpu-1.5.0: TEI doesn't download all needed ONNX Files #341

@freinold

System Info

Image: v1.5 CPU
Model used: intfloat/multilingual-e5-large
Deployment: Docker

Information

  • Docker
  • The CLI directly

Tasks

  • An officially supported command
  • My own modifications

Reproduction

When using the latest CPU image with ONNX support, running the model intfloat/multilingual-e5-large fails at backend startup:

docker run --pull always ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 --model-id intfloat/multilingual-e5-large 
cpu-1.5: Pulling from huggingface/text-embeddings-inference
Digest: sha256:0502794a4d86974839e701dadd6d06e693ec78a0f6e87f68c391e88c52154f3f
Status: Image is up to date for ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
2024-07-12T10:41:32.130048Z  INFO text_embeddings_router: router/src/main.rs:175: Args { model_id: "int*****/************-**-**rge", revision: None, tokenization_workers: None, dtype: None, pooling: None, max_concurrent_requests: 512, max_batch_tokens: 16384, max_batch_requests: None, max_client_batch_size: 32, auto_truncate: false, default_prompt_name: None, default_prompt: None, hf_api_token: None, hostname: "91e8108076dd", port: 80, uds_path: "/tmp/text-embeddings-inference-server", huggingface_hub_cache: Some("/data"), payload_limit: 2000000, api_key: None, json_output: false, otlp_endpoint: None, otlp_service_name: "text-embeddings-inference.server", cors_allow_origin: None }
2024-07-12T10:41:32.130194Z  INFO hf_hub: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/hf-hub-0.3.2/src/lib.rs:55: Token file not found "/root/.cache/huggingface/token"
2024-07-12T10:41:32.212394Z  INFO download_pool_config: text_embeddings_core::download: core/src/download.rs:38: Downloading `1_Pooling/config.json`
2024-07-12T10:41:33.091211Z  INFO download_new_st_config: text_embeddings_core::download: core/src/download.rs:62: Downloading `config_sentence_transformers.json`
2024-07-12T10:41:33.218900Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:21: Starting download
2024-07-12T10:41:33.218946Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:23: Downloading `config.json`
2024-07-12T10:41:33.473482Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:26: Downloading `tokenizer.json`
2024-07-12T10:41:41.465955Z  INFO download_artifacts: text_embeddings_backend: backends/src/lib.rs:313: Downloading `model.onnx`
2024-07-12T10:41:41.608869Z  WARN download_artifacts: text_embeddings_backend: backends/src/lib.rs:317: Could not download `model.onnx`: request error: HTTP status client error (404 Not Found) for url (https://huggingface.co/intfloat/multilingual-e5-large/resolve/main/model.onnx)
2024-07-12T10:41:41.608925Z  INFO download_artifacts: text_embeddings_backend: backends/src/lib.rs:318: Downloading `onnx/model.onnx`
2024-07-12T10:41:42.273395Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:32: Model artifacts downloaded in 9.0544947s
2024-07-12T10:41:42.865307Z  INFO text_embeddings_router: router/src/lib.rs:199: Maximum number of tokens per request: 512
2024-07-12T10:41:42.865553Z  INFO text_embeddings_core::tokenization: core/src/tokenization.rs:28: Starting 8 tokenization workers
2024-07-12T10:41:45.079783Z  INFO text_embeddings_router: router/src/lib.rs:241: Starting model backend
Error: Could not create backend

Caused by:
    Could not start backend: Failed to create ONNX Runtime session: Deserialize tensor encoder.layer.10.attention.output.dense.bias failed.GetFileLength for /data/models--intfloat--multilingual-e5-large/snapshots/ab10c1a7f42e74530fe7ae5be82e6d4f11a719eb/onnx/model.onnx_data failed:Invalid fd was supplied: -1

Expected behavior

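The failure occurs because this model keeps its weights in an additional external-data file, onnx/model.onnx_data, next to onnx/model.onnx, which only contains the graph.
TEI downloads onnx/model.onnx but never downloads onnx/model.onnx_data, so ONNX Runtime cannot open it when creating the session.
The backend should download every file needed to run the ONNX model, including external-data files.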
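
To illustrate the expected behavior, here is a minimal Python sketch of the file selection. It is not TEI's implementation (TEI is written in Rust), and the helper names `select_onnx_artifacts` and `download_onnx_artifacts` are hypothetical. It assumes the external data sits next to the graph file under the name `<graph>_data`, which is what the path in the error message suggests; other models may name the file differently.

```python
from typing import List, Optional

# Candidate graph locations, in the order the log shows TEI trying them.
ONNX_CANDIDATES = ("model.onnx", "onnx/model.onnx")
# Assumption: external weights are stored as "<graph>_data" beside the graph,
# matching the path in the error message of this issue.
EXTERNAL_DATA_SUFFIX = "_data"


def select_onnx_artifacts(repo_files: List[str]) -> List[str]:
    """Return the ONNX files to fetch: the graph plus its external-data file, if present."""
    available = set(repo_files)
    graph: Optional[str] = next((c for c in ONNX_CANDIDATES if c in available), None)
    if graph is None:
        return []
    artifacts = [graph]
    sidecar = graph + EXTERNAL_DATA_SUFFIX
    if sidecar in available:
        artifacts.append(sidecar)
    return artifacts


def download_onnx_artifacts(repo_id: str, cache_dir: Optional[str] = None) -> List[str]:
    """Fetch the selected files with huggingface_hub (requires network access)."""
    # Imported lazily so the selection logic above works without the package.
    from huggingface_hub import hf_hub_download, list_repo_files

    files = select_onnx_artifacts(list_repo_files(repo_id))
    return [hf_hub_download(repo_id=repo_id, filename=f, cache_dir=cache_dir) for f in files]
```

For a repository that lists `onnx/model.onnx` and `onnx/model.onnx_data`, `select_onnx_artifacts` returns both paths, so the external-data file is fetched together with the graph instead of being skipped.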
