Closed
Description
System Info
Image: v1.5 CPU
Model used: intfloat/multilingual-e5-large
Deployment: Docker
Information
- Docker
- The CLI directly
Tasks
- An officially supported command
- My own modifications
Reproduction
When using the latest CPU image with ONNX support, running the model intfloat/multilingual-e5-large doesn't work:
docker run --pull always ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 --model-id intfloat/multilingual-e5-large
cpu-1.5: Pulling from huggingface/text-embeddings-inference
Digest: sha256:0502794a4d86974839e701dadd6d06e693ec78a0f6e87f68c391e88c52154f3f
Status: Image is up to date for ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
2024-07-12T10:41:32.130048Z INFO text_embeddings_router: router/src/main.rs:175: Args { model_id: "int*****/************-**-**rge", revision: None, tokenization_workers: None, dtype: None, pooling: None, max_concurrent_requests: 512, max_batch_tokens: 16384, max_batch_requests: None, max_client_batch_size: 32, auto_truncate: false, default_prompt_name: None, default_prompt: None, hf_api_token: None, hostname: "91e8108076dd", port: 80, uds_path: "/tmp/text-embeddings-inference-server", huggingface_hub_cache: Some("/data"), payload_limit: 2000000, api_key: None, json_output: false, otlp_endpoint: None, otlp_service_name: "text-embeddings-inference.server", cors_allow_origin: None }
2024-07-12T10:41:32.130194Z INFO hf_hub: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/hf-hub-0.3.2/src/lib.rs:55: Token file not found "/root/.cache/huggingface/token"
2024-07-12T10:41:32.212394Z INFO download_pool_config: text_embeddings_core::download: core/src/download.rs:38: Downloading `1_Pooling/config.json`
2024-07-12T10:41:33.091211Z INFO download_new_st_config: text_embeddings_core::download: core/src/download.rs:62: Downloading `config_sentence_transformers.json`
2024-07-12T10:41:33.218900Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:21: Starting download
2024-07-12T10:41:33.218946Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:23: Downloading `config.json`
2024-07-12T10:41:33.473482Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:26: Downloading `tokenizer.json`
2024-07-12T10:41:41.465955Z INFO download_artifacts: text_embeddings_backend: backends/src/lib.rs:313: Downloading `model.onnx`
2024-07-12T10:41:41.608869Z WARN download_artifacts: text_embeddings_backend: backends/src/lib.rs:317: Could not download `model.onnx`: request error: HTTP status client error (404 Not Found) for url (https://huggingface.co/intfloat/multilingual-e5-large/resolve/main/model.onnx)
2024-07-12T10:41:41.608925Z INFO download_artifacts: text_embeddings_backend: backends/src/lib.rs:318: Downloading `onnx/model.onnx`
2024-07-12T10:41:42.273395Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:32: Model artifacts downloaded in 9.0544947s
2024-07-12T10:41:42.865307Z INFO text_embeddings_router: router/src/lib.rs:199: Maximum number of tokens per request: 512
2024-07-12T10:41:42.865553Z INFO text_embeddings_core::tokenization: core/src/tokenization.rs:28: Starting 8 tokenization workers
2024-07-12T10:41:45.079783Z INFO text_embeddings_router: router/src/lib.rs:241: Starting model backend
Error: Could not create backend
Caused by:
Could not start backend: Failed to create ONNX Runtime session: Deserialize tensor encoder.layer.10.attention.output.dense.bias failed.GetFileLength for /data/models--intfloat--multilingual-e5-large/snapshots/ab10c1a7f42e74530fe7ae5be82e6d4f11a719eb/onnx/model.onnx_data failed:Invalid fd was supplied: -1
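The last error line shows ONNX Runtime trying to open `onnx/model.onnx_data` next to the downloaded `model.onnx` and failing because the file was never fetched. As a quick local check, one can detect up front whether a downloaded ONNX export points at an external data companion: external tensor locations are stored as plain UTF-8 strings inside the protobuf, so a byte scan is a cheap (if crude) heuristic. This is an illustrative sketch, not TEI code; the function names and the `".onnx_data"` marker convention are assumptions:

```python
# Hypothetical pre-flight check: detect whether an ONNX export references
# an external data file (e.g. "model.onnx_data") that must sit next to it
# on disk. ONNX serializes external tensor locations as plain UTF-8
# strings, so searching the raw bytes is a cheap heuristic.
from pathlib import Path


def references_external_data(model_bytes: bytes, marker: bytes = b".onnx_data") -> bool:
    """Return True if the serialized model appears to reference an
    external data file (heuristic, byte-level search)."""
    return marker in model_bytes


def missing_companion_files(model_path: Path) -> list[str]:
    """List companion files the model seems to need but that are absent
    from the model's directory (heuristic, name-based)."""
    blob = model_path.read_bytes()
    if not references_external_data(blob):
        return []
    companion = model_path.name + "_data"  # e.g. "model.onnx_data"
    if (model_path.parent / companion).exists():
        return []
    return [companion]
```

Running this against the cached snapshot above would report `model.onnx_data` as missing, matching the backend error.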
Expected behavior
This issue stems from this specific model storing its weights in an additional file, model.onnx_data, which holds the actual ONNX tensor data. TEI never downloads this file.
The backend should download all files necessary to run the ONNX model.
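The fix amounts to extending the existing fallback (try `model.onnx`, then `onnx/model.onnx`, as seen in the log) so that any external data companion is fetched as well. A minimal sketch of that resolution logic, assuming the companion follows the `<file>_data` naming seen here (`onnx/model.onnx_data`); the function name is hypothetical and not part of TEI:

```python
# Hypothetical sketch of the download resolution the ONNX backend would
# need: after picking `model.onnx` vs `onnx/model.onnx`, also include any
# external data companion named "<file>_data".
def onnx_files_to_download(repo_files: set[str]) -> list[str]:
    """Given a model repo's file listing, return every file the ONNX
    backend must download, including external data companions."""
    for candidate in ("model.onnx", "onnx/model.onnx"):
        if candidate in repo_files:
            wanted = [candidate]
            companion = candidate + "_data"  # e.g. "onnx/model.onnx_data"
            if companion in repo_files:
                wanted.append(companion)
            return wanted
    raise FileNotFoundError("no ONNX export found in repo")
```

For intfloat/multilingual-e5-large this would yield both `onnx/model.onnx` and `onnx/model.onnx_data`, so ONNX Runtime would find the tensor data it expects next to the model file.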
dbc-2024