Hi all
I just learned about CLBlast, so I wanted to try it at home on my Intel macOS system with an AMD 6900 XT GPU.
I have no idea if it's meant to work on this system or with AMD GPUs. Maybe it's only designed for NVIDIA on Linux or Windows at the moment? But since it uses OpenCL, I figured it should work with any GPU? Maybe? :)
Installing CLBlast via Homebrew is easy:
tomj@Eddie ~/src/llama.cpp (master●●)$ brew install clblast
==> Downloading https://formulae.brew.sh/api/formula.jws.json
....
==> Pouring clblast--1.5.3_1.ventura.bottle.tar.gz
🍺 /usr/local/Cellar/clblast/1.5.3_1: 41 files, 11.6MB
==> Running `brew cleanup clblast`...
Disable this behaviour by setting HOMEBREW_NO_INSTALL_CLEANUP.
Hide these hints with HOMEBREW_NO_ENV_HINTS (see `man brew`).
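(As far as I can tell, Homebrew drops the headers and library under /usr/local/Cellar/clblast/1.5.3_1 and symlinks them into /usr/local/include and /usr/local/lib, which is where the -lclblast and /usr/local/include/clblast_c.h seen in the build output below come from. Running brew list clblast shows the exact files installed.)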
Compiling went fine:
tomj@Eddie ~/src/llama.cpp (master●●)$ make clean && LLAMA_CLBLAST=1 make
......
c++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native -c llama.cpp -o llama.o
c++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native -c examples/common.cpp -o common.o
examples/common.cpp:750:24: warning: comparison of integers of different signs: 'char32_t' and '__darwin_wint_t' (aka 'int') [-Wsign-compare]
if (input_char == WEOF || input_char == 0x04 /* Ctrl+D*/) {
~~~~~~~~~~ ^ ~~~~
examples/common.cpp:765:45: warning: comparison of integers of different signs: 'char32_t' and '__darwin_wint_t' (aka 'int') [-Wsign-compare]
while ((code = getchar32()) != WEOF) {
~~~~~~~~~~~~~~~~~~ ^ ~~~~
2 warnings generated.
cc -I. -O3 -std=c11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -pthread -march=native -mtune=native -DGGML_USE_ACCELERATE -DGGML_USE_CLBLAST -c ggml-opencl.c -o ggml-opencl.o
In file included from ggml-opencl.c:4:
/usr/local/include/clblast_c.h:1686:47: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
CLBlastStatusCode PUBLIC_API CLBlastClearCache();
^
void
1 warning generated.
c++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native examples/main/main.cpp ggml.o llama.o common.o ggml-opencl.o -o main -framework Accelerate -lclblast -framework OpenCL
==== Run ./main -h for help. ====
c++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native examples/quantize/quantize.cpp ggml.o llama.o ggml-opencl.o -o quantize -framework Accelerate -lclblast -framework OpenCL
c++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native examples/quantize-stats/quantize-stats.cpp ggml.o llama.o ggml-opencl.o -o quantize-stats -framework Accelerate -lclblast -framework OpenCL
c++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native examples/perplexity/perplexity.cpp ggml.o llama.o common.o ggml-opencl.o -o perplexity -framework Accelerate -lclblast -framework OpenCL
c++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native examples/embedding/embedding.cpp ggml.o llama.o common.o ggml-opencl.o -o embedding -framework Accelerate -lclblast -framework OpenCL
c++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native pocs/vdot/vdot.cpp ggml.o ggml-opencl.o -o vdot -framework Accelerate -lclblast -framework OpenCL
tomj@Eddie ~/src/llama.cpp (master●●●)$
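(To double-check that the binary really linked against CLBlast and Apple's OpenCL framework, otool -L ./main should list libclblast and OpenCL.framework among its dependencies.)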
On my first attempt it picked the wrong device:
tomj@Eddie ~/src/llama.cpp (master●●)$ ./main -t 16 -m ~/src/huggingface/Wizard-Vicuna-13B-Uncensored-GGML/Wizard-Vicuna-13B-Uncensored.ggml.q5_1.bin -n 512 -p "### Instruction: write a story about llamas\n### Response:"
main: build = 540 (f048af0)
main: seed = 1683982897
llama.cpp: loading model from /Users/tomj/src/huggingface/Wizard-Vicuna-13B-Uncensored-GGML/Wizard-Vicuna-13B-Uncensored.ggml.q5_1.bin
llama_model_load_internal: format = ggjt v2 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 512
llama_model_load_internal: n_embd = 5120
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 40
llama_model_load_internal: n_layer = 40
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 9 (mostly Q5_1)
llama_model_load_internal: n_ff = 13824
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 13B
llama_model_load_internal: ggml ctx size = 85.08 KB
llama_model_load_internal: mem required = 11359.04 MB (+ 1608.00 MB per state)
Initializing CLBlast (First Run)...
Attempting to use: Platform=0, Device=0 (If invalid, program will crash)
Using Platform: Apple Device: Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz
OpenCL clCreateCommandQueue error -30 at ggml-opencl.c:215
My CPU, not GPU.
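To figure out which index maps to which device before touching anything, I put together a tiny standalone enumerator (my own sketch, not code from llama.cpp; it just walks clGetPlatformIDs / clGetDeviceIDs and prints the names):

// list_cl_devices.c - build with: cc list_cl_devices.c -framework OpenCL -o list_cl_devices
#define CL_SILENCE_DEPRECATION
#include <stdio.h>
#include <OpenCL/opencl.h>

int main(void) {
    cl_platform_id platforms[8];
    cl_uint n_platforms = 0;
    clGetPlatformIDs(8, platforms, &n_platforms);
    for (cl_uint p = 0; p < n_platforms; p++) {
        char pname[128];
        clGetPlatformInfo(platforms[p], CL_PLATFORM_NAME, sizeof(pname), pname, NULL);
        cl_device_id devices[8];
        cl_uint n_devices = 0;
        clGetDeviceIDs(platforms[p], CL_DEVICE_TYPE_ALL, 8, devices, &n_devices);
        for (cl_uint d = 0; d < n_devices; d++) {
            char dname[128];
            clGetDeviceInfo(devices[d], CL_DEVICE_NAME, sizeof(dname), dname, NULL);
            printf("Platform=%u (%s)  Device=%u (%s)\n", p, pname, d, dname);
        }
    }
    return 0;
}

On this machine I would expect it to show a single Apple platform with the i9 CPU as Device=0 and the Radeon RX 6900 XT as Device=1, matching the log above.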
So I edited ggml-opencl.c and changed this line to use device 1:
int dev_num = (GGML_CLBLAST_DEVICE == NULL ? 1 : atoi(GGML_CLBLAST_DEVICE));
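(Side note: GGML_CLBLAST_DEVICE in that line looks like it comes from a getenv() call a few lines above, so I suspect the same switch could be made without editing the source by running with GGML_CLBLAST_DEVICE=1 set in the environment, but I haven't verified that.)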
Now it tries to use my GPU, but still fails with exactly the same error:
tomj@Eddie ~/src/llama.cpp (master●●●)$ ./main -t 16 -m ~/src/huggingface/Wizard-Vicuna-13B-Uncensored-GGML/Wizard-Vicuna-13B-Uncensored.ggml.q5_1.bin -n 512 -p "### Instruction: write a story about llamas\n### Response:"
main: build = 540 (f048af0)
main: seed = 1683983030
llama.cpp: loading model from /Users/tomj/src/huggingface/Wizard-Vicuna-13B-Uncensored-GGML/Wizard-Vicuna-13B-Uncensored.ggml.q5_1.bin
llama_model_load_internal: format = ggjt v2 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 512
llama_model_load_internal: n_embd = 5120
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 40
llama_model_load_internal: n_layer = 40
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 9 (mostly Q5_1)
llama_model_load_internal: n_ff = 13824
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 13B
llama_model_load_internal: ggml ctx size = 85.08 KB
llama_model_load_internal: mem required = 11359.04 MB (+ 1608.00 MB per state)
Initializing CLBlast (First Run)...
Attempting to use: Platform=0, Device=1 (If invalid, program will crash)
Using Platform: Apple Device: AMD Radeon RX 6900 XT Compute Engine
OpenCL clCreateCommandQueue error -30 at ggml-opencl.c:215
I've never used CLBlast before, so I have no clue what this error means or what might be wrong.
Any help or advice would be appreciated!
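In case it helps with diagnosis: as far as I can tell from CL/cl.h, error -30 is CL_INVALID_VALUE. A minimal standalone check (again my own sketch, not taken from llama.cpp) that just creates a context and a command queue on a given device index would be something like:

// cq_test.c - build with: cc cq_test.c -framework OpenCL -o cq_test
#define CL_SILENCE_DEPRECATION
#include <stdio.h>
#include <stdlib.h>
#include <OpenCL/opencl.h>

int main(int argc, char ** argv) {
    int dev_num = argc > 1 ? atoi(argv[1]) : 0;   // device index to test, e.g. ./cq_test 1
    cl_platform_id platform;
    clGetPlatformIDs(1, &platform, NULL);
    cl_device_id devices[8];
    cl_uint n_devices = 0;
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_ALL, 8, devices, &n_devices);
    if ((cl_uint)dev_num >= n_devices) {
        fprintf(stderr, "no such device index %d\n", dev_num);
        return 1;
    }
    cl_int err = 0;
    cl_context ctx = clCreateContext(NULL, 1, &devices[dev_num], NULL, NULL, &err);
    printf("clCreateContext err = %d\n", err);
    // try with no queue properties; llama.cpp may pass different flags at ggml-opencl.c:215
    cl_command_queue queue = clCreateCommandQueue(ctx, devices[dev_num], 0, &err);
    printf("clCreateCommandQueue err = %d\n", err);  // -30 would be CL_INVALID_VALUE
    (void)queue;
    return 0;
}

If that also returns -30 for device 1, the problem is presumably on the Apple OpenCL driver side for the 6900 XT rather than in llama.cpp; if it succeeds, it might be the queue properties ggml-opencl.c passes at that line.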