Slow model loading time for CoreML quantized model #5718

@cccclai

🐛 Describe the bug

Check out #5710 and run:

python -m executorch.examples.apple.coreml.scripts.export -m resnet18 --quantize

The FP32 model runs fully resident on the ANE, at 0.9 ms on average and 11.13 ms cold-start (first inference).
The int8 quantized model also runs fully resident on the ANE, at 0.54 ms on average and 3.10 ms cold-start. Looking at the layers, there also appear to be many quantize ops immediately followed by dequantize ops.
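
For anyone reproducing these numbers, here is a minimal sketch of how cold-start and steady-state latency could be measured separately. It is not the harness used for the figures above: `measure_latency_ms` and the `run_inference` callable are hypothetical stand-ins for whatever actually loads and executes the exported model, and the default of 100 warm runs is an arbitrary choice.

```python
import time
from statistics import mean
from typing import Callable, Tuple


def measure_latency_ms(
    run_inference: Callable[[], object], warm_runs: int = 100
) -> Tuple[float, float]:
    """Return (cold_start_ms, average_warm_ms) for a zero-argument callable.

    `run_inference` is a hypothetical placeholder, not an ExecuTorch API.
    """
    if warm_runs < 1:
        raise ValueError("warm_runs must be at least 1")

    # First call: may include one-time costs (loading, compilation, caching).
    start = time.perf_counter()
    run_inference()
    cold_start_ms = (time.perf_counter() - start) * 1000.0

    # Subsequent calls: steady-state latency, averaged over warm_runs.
    warm_ms = []
    for _ in range(warm_runs):
        start = time.perf_counter()
        run_inference()
        warm_ms.append((time.perf_counter() - start) * 1000.0)

    return cold_start_ms, mean(warm_ms)
```

Reporting the first call separately from the average keeps one-time setup cost from being hidden inside (or inflating) the steady-state figure.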

Versions

Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 15.0 (arm64)
GCC version: Could not collect
Clang version: 16.0.0 (clang-1600.0.26.3)
CMake version: version 3.29.2
Libc version: N/A

Python version: 3.11.5 (main, Sep 11 2023, 08:31:25) [Clang 14.0.6 ] (64-bit runtime)
Python platform: macOS-15.0-arm64-arm-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Apple M1 Pro

Versions of relevant libraries:
[pip3] executorch==0.4.0a0+7047162
[pip3] flake8==6.0.0
[pip3] mypy-extensions==1.0.0
[pip3] numpy==1.26.4
[pip3] numpydoc==1.5.0
[pip3] torch==2.5.0.dev20240618
[pip3] torchaudio==2.4.0.dev20240618
[pip3] torchsr==1.0.4
[pip3] torchvision==0.20.0.dev20240618
[conda] executorch                0.4.0a0+7047162          pypi_0    pypi
[conda] numpy                     1.26.4                   pypi_0    pypi
[conda] numpydoc                  1.5.0           py311hca03da5_0
[conda] torch                     2.4.0a0+gitae81855           dev_0    <develop>
[conda] torchaudio                2.4.0.dev20240618          pypi_0    pypi
[conda] torchsr                   1.0.4                    pypi_0    pypi
[conda] torchvision               0.20.0.dev20240618          pypi_0    pypi

Metadata

Labels

module: coreml (Issues related to Apple's Core ML delegation and code under backends/apple/coreml/)
triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

Projects

Status

To Triage
