Closed
Labels
triaged: This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
Description
I'm trying to run the Qwen3-0.6B model on Android using XNNPACK, following the instructions in the qwen3 example and step 4 of the llama example.
When running ./llama_main --model_path qwen3-0_6b.pte --tokenizer_path tokenizer.json --prompt "Hi" --seq_len 120
on a Galaxy S23, I got the following error while setting up the pretokenizer.
I 00:00:00.003396 executorch:cpuinfo_utils.cpp:62] Reading file /sys/devices/soc0/image_version
I 00:00:00.003583 executorch:main.cpp:76] Resetting threadpool with num threads = 4
I 00:00:00.008963 executorch:runner.cpp:90] Creating LLaMa runner: model_path=qwen3-0_6b.pte, tokenizer_path=tokenizer.json
Setting up pretokenizer...
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1747192804.212017 24371 re2.cc:237] Error parsing '((?i:'s|'t|'re|'ve|'m|'ll|'d)|[^\r\n\p{L}\p{N}]?\p{L}+|\p{N}| ?[^\s\p{L}\p{N}]+[\r\n]*|\s*[\r\n]+|\s...': invalid perl operator: (?!
RE2 failed to compile pattern with lookahead: (?i:'s|'t|'re|'ve|'m|'ll|'d)|[^\r\n\p{L}\p{N}]?\p{L}+|\p{N}| ?[^\s\p{L}\p{N}]+[\r\n]*|\s*[\r\n]+|\s+(?!\S)|\s+
Error: invalid perl operator: (?!
Compile with SUPPORT_REGEX_LOOKAHEAD=ON to enable support for lookahead patterns.
libc++abi: terminating due to uncaught exception of type std::runtime_error: Error: 4
Aborted
It seems the runner now accepts the .json format for the tokenizer. Is there anything I'm missing?
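For context on the error itself: RE2 deliberately omits lookaround, so the pretokenizer pattern's trailing `\s+(?!\S)` term (whitespace not followed by a non-space character, i.e. trailing whitespace) cannot compile under RE2, which is exactly what the log reports. A small sketch with Python's backtracking `re` engine, which does support negative lookahead, shows the construct is valid Perl-style syntax:

```python
import re

# \s+(?!\S): one or more whitespace characters NOT followed by a
# non-space character -- i.e. whitespace at the end of the input.
# Python's re engine accepts this; RE2 rejects it with
# "invalid perl operator: (?!" as seen in the log above.
pat = re.compile(r"\s+(?!\S)")

assert pat.fullmatch("   ")       # pure whitespace: lookahead holds at end
assert pat.search("word   ")      # trailing whitespace matches
assert pat.search("a b") is None  # inner space is followed by 'b', so no match
```

This is why the runtime suggests rebuilding with SUPPORT_REGEX_LOOKAHEAD=ON: that flag switches in a regex engine that handles lookahead patterns like this one.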
Environments
- executorch main branch (b173722)
- Android NDK r28b
- Galaxy S23
- Ran install_executorch.sh --pybind xnnpack and examples/models/llama/install_requirements.sh
Commands used (the commands from the instructions)
- model
python -m examples.models.llama.export_llama \
  --model qwen3-0_6b \
  --params examples/models/qwen3/0_6b_config.json \
  -kv \
  --use_sdpa_with_kv_cache \
  -d fp32 \
  -X \
  --xnnpack-extended-ops \
  -qmode 8da4w \
  --metadata '{"get_bos_id": 151644, "get_eos_ids":[151645]}' \
  --output_name="qwen3-0_6b.pte" \
  --verbose
- runner
cmake -DCMAKE_TOOLCHAIN_FILE=$ANDROID_NDK/build/cmake/android.toolchain.cmake \
  -DANDROID_ABI=arm64-v8a \
  -DANDROID_PLATFORM=android-23 \
  -DCMAKE_INSTALL_PREFIX=cmake-out-android \
  -DCMAKE_BUILD_TYPE=Release \
  -DEXECUTORCH_BUILD_EXTENSION_DATA_LOADER=ON \
  -DEXECUTORCH_BUILD_EXTENSION_MODULE=ON \
  -DEXECUTORCH_BUILD_EXTENSION_TENSOR=ON \
  -DEXECUTORCH_ENABLE_LOGGING=1 \
  -DPYTHON_EXECUTABLE=python \
  -DEXECUTORCH_BUILD_XNNPACK=ON \
  -DEXECUTORCH_BUILD_KERNELS_OPTIMIZED=ON \
  -DEXECUTORCH_BUILD_KERNELS_QUANTIZED=ON \
  -DEXECUTORCH_BUILD_KERNELS_CUSTOM=ON \
  -Bcmake-out-android .

cmake --build cmake-out-android -j16 --target install --config Release

cmake -DCMAKE_TOOLCHAIN_FILE=$ANDROID_NDK/build/cmake/android.toolchain.cmake \
  -DANDROID_ABI=arm64-v8a \
  -DANDROID_PLATFORM=android-23 \
  -DCMAKE_INSTALL_PREFIX=cmake-out-android \
  -DCMAKE_BUILD_TYPE=Release \
  -DPYTHON_EXECUTABLE=python \
  -DEXECUTORCH_BUILD_XNNPACK=ON \
  -DEXECUTORCH_BUILD_KERNELS_OPTIMIZED=ON \
  -DEXECUTORCH_BUILD_KERNELS_QUANTIZED=ON \
  -DEXECUTORCH_BUILD_KERNELS_CUSTOM=ON \
  -Bcmake-out-android/examples/models/llama \
  examples/models/llama

cmake --build cmake-out-android/examples/models/llama -j16 --config Release
Afterwards, I pushed the .pte file, tokenizer.json, and llama_main to the device with adb and ran the command above.
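For completeness, the deployment step looked roughly like the following sketch. The on-device directory /data/local/tmp/llama and the llama_main build path are assumptions; adjust them to your setup:

```shell
# Assumed working directory on the device (hypothetical; adjust as needed).
DEVICE_DIR=/data/local/tmp/llama

adb shell mkdir -p "$DEVICE_DIR"
adb push qwen3-0_6b.pte "$DEVICE_DIR"
adb push tokenizer.json "$DEVICE_DIR"
adb push cmake-out-android/examples/models/llama/llama_main "$DEVICE_DIR"

# Run the same command that produced the error above.
adb shell "cd $DEVICE_DIR && ./llama_main \
  --model_path qwen3-0_6b.pte \
  --tokenizer_path tokenizer.json \
  --prompt \"Hi\" \
  --seq_len 120"
```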