Name and Version
build: 5686 (e434e691) with cc (GCC) 15.1.1 20250425 for x86_64-pc-linux-gnu
Operating systems
Linux
Which llama.cpp modules do you know to be affected?
llama-server
Command line
llama-server --fim-qwen-1.5b-default
Problem description & steps to reproduce
When using llama-server with the Qwen 1.5B FIM model and llama.vscode, I get an HTTP 500 error on the completion (/infill) endpoint after a couple of minutes of use.
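The failing request can be replayed without the editor by calling /infill directly. A minimal sketch, assuming the server's default address and a hypothetical prefix/suffix payload (the real payload sent by llama.vscode is larger):

curl -s http://127.0.0.1:8080/infill \
  -H "Content-Type: application/json" \
  -d '{"input_prefix": "def add(a, b):\n    ", "input_suffix": "\n", "n_predict": 32}'

In my case the failure only appeared after a couple of minutes of use, once the prompt had grown (n_prompt_tokens = 5594 in the log below), so a single short request like this may not trigger it immediately.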
First Bad Commit
Bisecting, the regression seems to have been introduced here:
3555b3004ba7687be3d734acade52a3345758aa4 is the first bad commit
commit 3555b3004ba7687be3d734acade52a3345758aa4 (HEAD, tag: b5675)
Author: xctan <[email protected]>
Date: Mon Jun 16 13:54:15 2025 +0800
ggml-cpu : rework weak alias on apple targets (#14146)
* ggml-cpu : rework weak alias on apple targets
* fix powerpc detection
* fix ppc detection
* fix powerpc detection on darwin
ggml/cmake/common.cmake | 3 +-
ggml/src/ggml-cpu/apple-fallback.h | 88 +++++++++++++++++++++++++++++++++++++
ggml/src/ggml-cpu/ggml-cpu-impl.h | 2 +-
ggml/src/ggml-cpu/quants.c | 4 ++
ggml/src/ggml-cpu/quants.h | 27 ------------
ggml/src/ggml-cpu/repack.cpp | 4 ++
ggml/src/ggml-cpu/repack.h | 18 +-------
7 files changed, 99 insertions(+), 47 deletions(-)
If I check out the master branch state from a day earlier, it seems to work fine.
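For anyone wanting to verify this, the bisect was the standard procedure (a sketch; <last-known-good-commit> stands in for the master state from a day earlier):

git bisect start
git bisect bad HEAD                       # current master, which shows the 500s
git bisect good <last-known-good-commit>  # day-earlier state that works
# rebuild llama-server and retest at each step, answering
# 'git bisect good' or 'git bisect bad', until git reports
# 3555b3004ba7687be3d734acade52a3345758aa4 as the first bad commit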
Relevant log output
decode: failed to initialize batch
llama_decode: failed to decode, ret = -1
srv update_slots: Invalid input batch., i = 0, n_batch = 1024, ret = -1
slot release: id 0 | task 2529 | stop processing: n_past = 5594, truncated = 0
srv send_error: task id = 2529, error: Invalid input batch.
srv update_slots: all slots are idle
srv cancel_tasks: cancel task, id_task = 2529
srv update_slots: all slots are idle
srv log_server_r: request: POST /infill 127.0.0.1 500
slot launch_slot_: id 0 | task 2532 | processing task
slot update_slots: id 0 | task 2532 | new prompt, n_ctx_slot = 32768, n_keep = 0, n_prompt_tokens = 5594
slot update_slots: id 0 | task 2532 | need to evaluate at least 1 token for each active slot, n_past = 5594, n_prompt_tokens = 5594
slot update_slots: id 0 | task 2532 | kv cache rm [5593, end)
slot update_slots: id 0 | task 2532 | prompt processing progress, n_past = 5594, n_tokens = 1, progress = 0.000179
slot update_slots: id 0 | task 2532 | prompt done, n_past = 5594, n_tokens = 1
init: sequence 0 does not start from the last position stored in the memory
decode: failed to initialize batch
llama_decode: failed to decode, ret = -1