Description
System Info
The full command line used that causes issues: text-embeddings-router --port 80
OS version: Debian 12
Rust version: 1.74.1
Model being used: intfloat/multilingual-e5-large
Hardware used:
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 555.42.06 Driver Version: 555.42.06 CUDA Version: 12.5 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 Tesla T4 On | 00000001:00:00.0 Off | Off |
| N/A 46C P0 54W / 70W | 1301MiB / 16384MiB | 100% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 686 C ...r/.cargo/bin/text-embeddings-router 1298MiB |
+-----------------------------------------------------------------------------------------+
Deployment specificities: Azure VMSS with application gateway
The current version being used: 1.5.0
Information
- Docker
- The CLI directly
Tasks
- An officially supported command
- My own modifications
Reproduction
- Add an API_KEY
- Try to GET /health
Expected behavior
I expect to get the health status of the server, but I receive 401 Unauthorized instead. The problem doesn't arise with gRPC. Because the Azure health probe is an HTTP request to a specific port and path, all my instances are considered unhealthy.
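For anyone trying to reproduce this, the check can be sketched as a small Python script that requests /health once without credentials (as a health probe would) and once with the key. This is a minimal sketch, not the project's own test: the helper names `build_health_request` and `probe`, the base URL, and the key value are all placeholders, and it assumes the key is sent as an `Authorization: Bearer <key>` header.

```python
import urllib.error
import urllib.request
from typing import Optional


def build_health_request(base_url: str, api_key: Optional[str] = None) -> urllib.request.Request:
    """Build a GET request for /health, optionally carrying a Bearer token.

    Hypothetical helper for this reproduction; not part of text-embeddings-router.
    """
    request = urllib.request.Request(base_url.rstrip("/") + "/health", method="GET")
    if api_key is not None:
        # Assumption: the server expects the key as a Bearer token.
        request.add_header("Authorization", f"Bearer {api_key}")
    return request


def probe(request: urllib.request.Request, timeout: float = 5.0) -> int:
    """Send the request and return the HTTP status code.

    urllib raises HTTPError for 4xx/5xx responses, so the code is read
    from the exception in that case instead of letting it propagate.
    """
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


# Example usage (requires a running server; values are placeholders):
#   base_url = "http://localhost:80"
#   print("without key:", probe(build_health_request(base_url)))
#   print("with key:   ", probe(build_health_request(base_url, "my-secret-key")))
```

With API_KEY set on the server, the unauthenticated call is the one that returns 401 in my setup, which is the same request the Azure probe makes.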