feature(vllm-router): have a more readable config file for dynamic router config #506

Open
@antoineauger

Describe the feature

Right now, the dynamic router config only accepts a JSON file through the --dynamic-config-json option.

It would be nice if we could pass a config file in a more user-friendly format, for example one keyed by model name. For instance, we implemented some wrappers that let us define a more readable YAML config file (see below).

Why do you need this feature?

With the current format, our JSON config would look like this:

{
    "service_discovery": "static",
    "routing_logic": "roundrobin",
    "static_backends": "https://endpoint1.example.com/bge-m3,https://endpoint2.example.com/bge-m3,https://endpoint3.example.com/bge-reranker-v2-m3,https://endpoint4.example.com/bge-reranker-v2-m3",
    "static_models": "bge-m3,bge-m3,bge-reranker-v2-m3,bge-reranker-v2-m3"
}

However, this syntax is not convenient once there are several models and backends, because the backends and models are two long comma-separated strings that must be kept aligned by position. Also, some options (e.g., --callbacks, --static-model-types) cannot be set through the JSON file, so we would have had to keep the JSON config file in sync with the CLI options we use to start vllm-router.

For instance, to enable health checks, we have to start vllm-router with the --static-backend-health-checks --static-model-types embeddings,rerank options.

Instead, we decided to add extra logic to support the following YAML config file:

---
routing-logic: roundrobin
service-discovery: static
callbacks: callbacks.custom_callbacks.custom_callback_handler_instance
models:
  bge-m3:
    endpoints:
      - https://endpoint1.example.com/bge-m3
      - https://endpoint2.example.com/bge-m3
    model_type: embeddings
  bge-reranker-v2-m3:
    endpoints:
      - https://endpoint3.example.com/bge-reranker-v2-m3
      - https://endpoint4.example.com/bge-reranker-v2-m3
    model_type: rerank

We then parse this file in our Docker entrypoint and start vllm-router with the matching config options (context: we deploy it as a Docker image within a Kubernetes cluster).
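For illustration, the translation step could look roughly like the sketch below. It is not the actual entrypoint code: `build_router_args` and `EXAMPLE_CONFIG` are hypothetical names, the input is assumed to be the mapping obtained by parsing the YAML file (for example with PyYAML's `yaml.safe_load`), and the CLI flags `--service-discovery`, `--routing-logic`, `--static-backends` and `--static-models` are assumed to mirror the JSON keys shown earlier. It also assumes that `--static-model-types` takes one entry per backend, aligned by position with `--static-backends`, which this issue does not confirm.

```python
from typing import Any

# Hypothetical stand-in for the parsed YAML file from this issue.
EXAMPLE_CONFIG: dict[str, Any] = {
    "routing-logic": "roundrobin",
    "service-discovery": "static",
    "callbacks": "callbacks.custom_callbacks.custom_callback_handler_instance",
    "models": {
        "bge-m3": {
            "endpoints": [
                "https://endpoint1.example.com/bge-m3",
                "https://endpoint2.example.com/bge-m3",
            ],
            "model_type": "embeddings",
        },
        "bge-reranker-v2-m3": {
            "endpoints": [
                "https://endpoint3.example.com/bge-reranker-v2-m3",
                "https://endpoint4.example.com/bge-reranker-v2-m3",
            ],
            "model_type": "rerank",
        },
    },
}


def build_router_args(config: dict[str, Any], health_checks: bool = True) -> list[str]:
    """Translate the parsed YAML mapping into a list of CLI arguments."""
    backends: list[str] = []
    models: list[str] = []
    model_types: list[str] = []

    # Emit one (backend, model, model type) triple per endpoint so the
    # three comma-separated lists stay aligned by position (assumption).
    for model_name, spec in config["models"].items():
        for endpoint in spec["endpoints"]:
            backends.append(endpoint)
            models.append(model_name)
            model_types.append(spec["model_type"])

    args = [
        "--service-discovery", config["service-discovery"],
        "--routing-logic", config["routing-logic"],
        "--static-backends", ",".join(backends),
        "--static-models", ",".join(models),
        "--static-model-types", ",".join(model_types),
    ]
    if health_checks:
        args.append("--static-backend-health-checks")
    # The callbacks entry is optional in this sketch.
    if "callbacks" in config:
        args += ["--callbacks", config["callbacks"]]
    return args


if __name__ == "__main__":
    print(" ".join(["vllm-router", *build_router_args(EXAMPLE_CONFIG)]))
```

Keeping the per-model grouping in the file and generating the positional lists at startup means the alignment between backends, models and model types is produced by code instead of being maintained by hand.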

Additional context

Creating a new --dynamic-config-yaml config option seems logical for this but we would have to think about how this new YAML file co-exists with the already-available options (JSON and "CLI options").
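One possible way for the two formats to co-exist without a breaking change would be to flatten the YAML layout into the four fields used by the JSON example above. The sketch below is only an illustration under that assumption: `to_dynamic_config` is a hypothetical helper, not part of vllm-router, and its input is assumed to be the already-parsed YAML mapping. The `model_type` and `callbacks` entries are dropped because the JSON example in this issue has no fields for them.

```python
import json
from typing import Any


def to_dynamic_config(config: dict[str, Any]) -> dict[str, str]:
    """Flatten the per-model YAML layout into the flat JSON layout."""
    backends: list[str] = []
    models: list[str] = []
    # One entry per endpoint, so both lists stay aligned by position.
    for model_name, spec in config["models"].items():
        for endpoint in spec["endpoints"]:
            backends.append(endpoint)
            models.append(model_name)
    return {
        "service_discovery": config["service-discovery"],
        "routing_logic": config["routing-logic"],
        "static_backends": ",".join(backends),
        "static_models": ",".join(models),
    }


if __name__ == "__main__":
    # Hypothetical, shortened version of the YAML config from this issue.
    parsed = {
        "routing-logic": "roundrobin",
        "service-discovery": "static",
        "models": {
            "bge-m3": {
                "endpoints": ["https://endpoint1.example.com/bge-m3"],
                "model_type": "embeddings",
            },
        },
    }
    print(json.dumps(to_dynamic_config(parsed), indent=4))
```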

Maybe a refactoring of the JSON config file would first be required to align things. This would be a breaking change, though.

WDYT folks?

/cc @max-wittig @bufferoverflow @YuhanLiu11
