In the current speculative.cpp implementation, params.sparams.temp is forced to -1.0f.
However, if I change this value to 0, draft sampling seems to fail completely:
(speculative.log)
Is this intended behavior?
I'm working on #5625, which removes the temperature limit, so I'd like to get this fixed.