I know how to serve an LLM with speculative decoding and a draft model via `llama-server`, and then call its API:

```
./build/bin/llama-server --model Qwen3-14B-Q8_0.gguf --reasoning-budget 0 --model-draft Qwen3-0.6B-Q8_0.gguf --n-gpu-layers 99 -ngld 99 -fa --draft-max 16 --draft-min 0 --temp 0.6 --top-k 20 --top-p 0.95 --min-p 0
```

But how can I serve a model using lookahead decoding instead? The command

```
./build/bin/llama-lookahead --model Qwen3-14B-Q8_0.gguf --n-gpu-layers 99
```

doesn't work for this, because it requires an input prompt rather than starting a server.

Reference: https://github.com/ggml-org/llama.cpp/pull/4207

Thanks in advance.
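
P.S. For context, this is roughly how I call the speculative-decoding server above once it's running: a standard request against the OpenAI-compatible chat completions endpoint. This sketch assumes the default host and port (`127.0.0.1:8080`); adjust if you pass `--host`/`--port`. Since only one model is loaded, I don't pass a `model` field.

```
# Query the running llama-server instance via its OpenAI-compatible API
curl http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Hello"}],
    "temperature": 0.6,
    "top_p": 0.95
  }'
```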