Hi all
I just learned about CLBlast, so I wanted to try it at home on my Intel macOS system with an AMD 6900 XT GPU.
I have no idea if it's meant to work on this system or with AMD GPUs. Maybe it's only designed for NVIDIA on Linux or Windows at the moment? But since it uses OpenCL, I figured it should work with any GPU? Maybe? :)
Installing CLBlast via Homebrew is easy:
tomj@Eddie ~/src/llama.cpp (master●●)$ brew install clblast
==> Downloading https://formulae.brew.sh/api/formula.jws.json
....
==> Pouring clblast--1.5.3_1.ventura.bottle.tar.gz
🍺 /usr/local/Cellar/clblast/1.5.3_1: 41 files, 11.6MB
==> Running `brew cleanup clblast`...
Disable this behaviour by setting HOMEBREW_NO_INSTALL_CLEANUP.
Hide these hints with HOMEBREW_NO_ENV_HINTS (see `man brew`).
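(As far as I can tell, Homebrew drops the headers and library under /usr/local/Cellar/clblast/1.5.3_1 and symlinks them into /usr/local/include and /usr/local/lib, which is where the -lclblast and /usr/local/include/clblast_c.h seen in the build output below come from. Running brew list clblast shows the exact files installed.)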
Compiling went fine:
tomj@Eddie ~/src/llama.cpp (master●●)$ make clean && LLAMA_CLBLAST=1 make
......
c++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native -c llama.cpp -o llama.o
c++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native -c examples/common.cpp -o common.o
examples/common.cpp:750:24: warning: comparison of integers of different signs: 'char32_t' and '__darwin_wint_t' (aka 'int') [-Wsign-compare]
if (input_char == WEOF || input_char == 0x04 /* Ctrl+D*/) {
~~~~~~~~~~ ^ ~~~~
examples/common.cpp:765:45: warning: comparison of integers of different signs: 'char32_t' and '__darwin_wint_t' (aka 'int') [-Wsign-compare]
while ((code = getchar32()) != WEOF) {
~~~~~~~~~~~~~~~~~~ ^ ~~~~
2 warnings generated.
cc -I. -O3 -std=c11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -pthread -march=native -mtune=native -DGGML_USE_ACCELERATE -DGGML_USE_CLBLAST -c ggml-opencl.c -o ggml-opencl.o
In file included from ggml-opencl.c:4:
/usr/local/include/clblast_c.h:1686:47: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
CLBlastStatusCode PUBLIC_API CLBlastClearCache();
^
void
1 warning generated.
c++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native examples/main/main.cpp ggml.o llama.o common.o ggml-opencl.o -o main -framework Accelerate -lclblast -framework OpenCL
==== Run ./main -h for help. ====
c++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native examples/quantize/quantize.cpp ggml.o llama.o ggml-opencl.o -o quantize -framework Accelerate -lclblast -framework OpenCL
c++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native examples/quantize-stats/quantize-stats.cpp ggml.o llama.o ggml-opencl.o -o quantize-stats -framework Accelerate -lclblast -framework OpenCL
c++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native examples/perplexity/perplexity.cpp ggml.o llama.o common.o ggml-opencl.o -o perplexity -framework Accelerate -lclblast -framework OpenCL
c++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native examples/embedding/embedding.cpp ggml.o llama.o common.o ggml-opencl.o -o embedding -framework Accelerate -lclblast -framework OpenCL
c++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native pocs/vdot/vdot.cpp ggml.o ggml-opencl.o -o vdot -framework Accelerate -lclblast -framework OpenCL
tomj@Eddie ~/src/llama.cpp (master●●●)$
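(To double-check that the binary really linked against CLBlast and Apple's OpenCL framework, otool -L ./main should list libclblast and OpenCL.framework among its dependencies.)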
On my first attempt it picked the wrong device:
tomj@Eddie ~/src/llama.cpp (master●●)$ ./main -t 16 -m ~/src/huggingface/Wizard-Vicuna-13B-Uncensored-GGML/Wizard-Vicuna-13B-Uncensored.ggml.q5_1.bin -n 512 -p "### Instruction: write a story about llamas\n### Response:"
main: build = 540 (f048af0)
main: seed = 1683982897
llama.cpp: loading model from /Users/tomj/src/huggingface/Wizard-Vicuna-13B-Uncensored-GGML/Wizard-Vicuna-13B-Uncensored.ggml.q5_1.bin
llama_model_load_internal: format = ggjt v2 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 512
llama_model_load_internal: n_embd = 5120
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 40
llama_model_load_internal: n_layer = 40
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 9 (mostly Q5_1)
llama_model_load_internal: n_ff = 13824
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 13B
llama_model_load_internal: ggml ctx size = 85.08 KB
llama_model_load_internal: mem required = 11359.04 MB (+ 1608.00 MB per state)
Initializing CLBlast (First Run)...
Attempting to use: Platform=0, Device=0 (If invalid, program will crash)
Using Platform: Apple Device: Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz
OpenCL clCreateCommandQueue error -30 at ggml-opencl.c:215
My CPU, not GPU.
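To figure out which index maps to which device before touching anything, I put together a tiny standalone enumerator (my own sketch, not code from llama.cpp; it just walks clGetPlatformIDs / clGetDeviceIDs and prints the names):

// list_cl_devices.c - build with: cc list_cl_devices.c -framework OpenCL -o list_cl_devices
#define CL_SILENCE_DEPRECATION
#include <stdio.h>
#include <OpenCL/opencl.h>

int main(void) {
    cl_platform_id platforms[8];
    cl_uint n_platforms = 0;
    clGetPlatformIDs(8, platforms, &n_platforms);
    for (cl_uint p = 0; p < n_platforms; p++) {
        char pname[128];
        clGetPlatformInfo(platforms[p], CL_PLATFORM_NAME, sizeof(pname), pname, NULL);
        cl_device_id devices[8];
        cl_uint n_devices = 0;
        clGetDeviceIDs(platforms[p], CL_DEVICE_TYPE_ALL, 8, devices, &n_devices);
        for (cl_uint d = 0; d < n_devices; d++) {
            char dname[128];
            clGetDeviceInfo(devices[d], CL_DEVICE_NAME, sizeof(dname), dname, NULL);
            printf("Platform=%u (%s)  Device=%u (%s)\n", p, pname, d, dname);
        }
    }
    return 0;
}

On this machine I would expect it to show a single Apple platform with the i9 CPU as Device=0 and the Radeon RX 6900 XT as Device=1, matching the log above.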
So I edited ggml-opencl.c and changed this line to use device 1:
int dev_num = (GGML_CLBLAST_DEVICE == NULL ? 1 : atoi(GGML_CLBLAST_DEVICE));
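(Side note: GGML_CLBLAST_DEVICE in that line looks like it comes from a getenv() call a few lines above, so I suspect the same switch could be made without editing the source by running with GGML_CLBLAST_DEVICE=1 set in the environment, but I haven't verified that.)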
Now it tries to use my GPU, but still fails with exactly the same error:
tomj@Eddie ~/src/llama.cpp (master●●●)$ ./main -t 16 -m ~/src/huggingface/Wizard-Vicuna-13B-Uncensored-GGML/Wizard-Vicuna-13B-Uncensored.ggml.q5_1.bin -n 512 -p "### Instruction: write a story about llamas\n### Response:"
main: build = 540 (f048af0)
main: seed = 1683983030
llama.cpp: loading model from /Users/tomj/src/huggingface/Wizard-Vicuna-13B-Uncensored-GGML/Wizard-Vicuna-13B-Uncensored.ggml.q5_1.bin
llama_model_load_internal: format = ggjt v2 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 512
llama_model_load_internal: n_embd = 5120
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 40
llama_model_load_internal: n_layer = 40
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 9 (mostly Q5_1)
llama_model_load_internal: n_ff = 13824
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 13B
llama_model_load_internal: ggml ctx size = 85.08 KB
llama_model_load_internal: mem required = 11359.04 MB (+ 1608.00 MB per state)
Initializing CLBlast (First Run)...
Attempting to use: Platform=0, Device=1 (If invalid, program will crash)
Using Platform: Apple Device: AMD Radeon RX 6900 XT Compute Engine
OpenCL clCreateCommandQueue error -30 at ggml-opencl.c:215
I've never used CLBlast before, so I have no clue what this error means or what might be wrong.
Any help or advice would be appreciated!
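In case it helps with diagnosis: as far as I can tell from CL/cl.h, error -30 is CL_INVALID_VALUE. A minimal standalone check (again my own sketch, not taken from llama.cpp) that just creates a context and a command queue on a given device index would be something like:

// cq_test.c - build with: cc cq_test.c -framework OpenCL -o cq_test
#define CL_SILENCE_DEPRECATION
#include <stdio.h>
#include <stdlib.h>
#include <OpenCL/opencl.h>

int main(int argc, char ** argv) {
    int dev_num = argc > 1 ? atoi(argv[1]) : 0;   // device index to test, e.g. ./cq_test 1
    cl_platform_id platform;
    clGetPlatformIDs(1, &platform, NULL);
    cl_device_id devices[8];
    cl_uint n_devices = 0;
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_ALL, 8, devices, &n_devices);
    if ((cl_uint)dev_num >= n_devices) {
        fprintf(stderr, "no such device index %d\n", dev_num);
        return 1;
    }
    cl_int err = 0;
    cl_context ctx = clCreateContext(NULL, 1, &devices[dev_num], NULL, NULL, &err);
    printf("clCreateContext err = %d\n", err);
    // try with no queue properties; llama.cpp may pass different flags at ggml-opencl.c:215
    cl_command_queue queue = clCreateCommandQueue(ctx, devices[dev_num], 0, &err);
    printf("clCreateCommandQueue err = %d\n", err);  // -30 would be CL_INVALID_VALUE
    (void)queue;
    return 0;
}

If that also returns -30 for device 1, the problem is presumably on the Apple OpenCL driver side for the 6900 XT rather than in llama.cpp; if it succeeds, it might be the queue properties ggml-opencl.c passes at that line.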