
Misc. bug: Completion fails with error 500 #14298

Closed
@gvcgael

Description


Name and Version

build: 5686 (e434e691) with cc (GCC) 15.1.1 20250425 for x86_64-pc-linux-gnu

Operating systems

Linux

Which llama.cpp modules do you know to be affected?

llama-server

Command line

llama-server --fim-qwen-1.5b-default

Problem description & steps to reproduce

When using llama-server with the Qwen 1.5b FIM model and llama.vscode, I get an error 500 on the completion endpoint after a couple of minutes.
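For context, a FIM client like llama.vscode hits the server's `/infill` endpoint with a prefix/suffix pair. Below is a minimal, hedged sketch of such a request body; the field names follow the llama-server `/infill` API (`input_prefix`, `input_suffix`, `n_predict`), while the example strings and the localhost URL in the comment are assumptions for illustration:

```python
import json

def build_infill_request(prefix: str, suffix: str, n_predict: int = 64) -> str:
    """Build a JSON body for llama-server's /infill endpoint (sketch)."""
    payload = {
        "input_prefix": prefix,   # code before the cursor
        "input_suffix": suffix,   # code after the cursor
        "n_predict": n_predict,   # cap on generated tokens
    }
    return json.dumps(payload)

body = build_infill_request("def add(a, b):\n    return ", "\n")
# A client would POST this body to e.g. http://127.0.0.1:8080/infill;
# when the bug triggers, the server answers 500 with "Invalid input batch."
```

Repeated requests like this over a few minutes of editing are what eventually hit the 500 described above.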

First Bad Commit

Bisecting, it seems to have been introduced here:

3555b3004ba7687be3d734acade52a3345758aa4 is the first bad commit
commit 3555b3004ba7687be3d734acade52a3345758aa4 (HEAD, tag: b5675)
Author: xctan <[email protected]>
Date:   Mon Jun 16 13:54:15 2025 +0800

    ggml-cpu : rework weak alias on apple targets (#14146)
    
    * ggml-cpu : rework weak alias on apple targets
    
    * fix powerpc detection
    
    * fix ppc detection
    
    * fix powerpc detection on darwin

 ggml/cmake/common.cmake            |  3 +-
 ggml/src/ggml-cpu/apple-fallback.h | 88 +++++++++++++++++++++++++++++++++++++
 ggml/src/ggml-cpu/ggml-cpu-impl.h  |  2 +-
 ggml/src/ggml-cpu/quants.c         |  4 ++
 ggml/src/ggml-cpu/quants.h         | 27 ------------
 ggml/src/ggml-cpu/repack.cpp       |  4 ++
 ggml/src/ggml-cpu/repack.h         | 18 +-------
 7 files changed, 99 insertions(+), 47 deletions(-)


If I check out the master branch state from a day earlier, it seems to work fine.
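A first-bad-commit result like the one above can be found mechanically with `git bisect run`, which marks a commit good or bad from a test command's exit code. This is a self-contained toy sketch of that workflow (a scratch repo with a synthetic "bug"), not the actual llama.cpp bisect:

```python
import subprocess, tempfile, pathlib

def run(*args, cwd):
    """Run a git command in the scratch repo and return its stdout."""
    return subprocess.run(args, cwd=cwd, check=True,
                          capture_output=True, text=True).stdout

repo = pathlib.Path(tempfile.mkdtemp())
run("git", "init", "-q", cwd=repo)
run("git", "config", "user.email", "[email protected]", cwd=repo)
run("git", "config", "user.name", "bisect", cwd=repo)

# Two good commits, then the commit that "breaks" status.txt, then one more.
(repo / "status.txt").write_text("ok\n")
run("git", "add", "status.txt", cwd=repo)
run("git", "commit", "-qm", "good 1", cwd=repo)
run("git", "commit", "-q", "--allow-empty", "-m", "good 2", cwd=repo)
(repo / "status.txt").write_text("broken\n")
run("git", "add", "status.txt", cwd=repo)
run("git", "commit", "-qm", "introduce bug", cwd=repo)
run("git", "commit", "-q", "--allow-empty", "-m", "after bug", cwd=repo)

# Bisect between known-bad HEAD and known-good HEAD~3; exit code 0 = good.
run("git", "bisect", "start", "HEAD", "HEAD~3", cwd=repo)
run("git", "bisect", "run", "grep", "-q", "ok", "status.txt", cwd=repo)
first_bad = run("git", "bisect", "log", cwd=repo).strip().splitlines()[-1]
run("git", "bisect", "reset", "-q", cwd=repo)
print(first_bad)
```

With llama.cpp the test command would instead build the server and exercise the `/infill` endpoint, returning non-zero when the 500 appears.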

Relevant log output

decode: failed to initialize batch
llama_decode: failed to decode, ret = -1
srv  update_slots: Invalid input batch., i = 0, n_batch = 1024, ret = -1
slot      release: id  0 | task 2529 | stop processing: n_past = 5594, truncated = 0
srv    send_error: task id = 2529, error: Invalid input batch.
srv  update_slots: all slots are idle
srv  cancel_tasks: cancel task, id_task = 2529
srv  update_slots: all slots are idle
srv  log_server_r: request: POST /infill 127.0.0.1 500
slot launch_slot_: id  0 | task 2532 | processing task
slot update_slots: id  0 | task 2532 | new prompt, n_ctx_slot = 32768, n_keep = 0, n_prompt_tokens = 5594
slot update_slots: id  0 | task 2532 | need to evaluate at least 1 token for each active slot, n_past = 5594, n_prompt_tokens = 5594
slot update_slots: id  0 | task 2532 | kv cache rm [5593, end)
slot update_slots: id  0 | task 2532 | prompt processing progress, n_past = 5594, n_tokens = 1, progress = 0.000179
slot update_slots: id  0 | task 2532 | prompt done, n_past = 5594, n_tokens = 1
init: sequence 0 does not start from the last position stored in the memory
decode: failed to initialize batch
llama_decode: failed to decode, ret = -1

Labels

bug (Something isn't working)
