Commit 01b8210

atalman, tinglvv, malfet, nWEIdia, and jithunnair-amd authored

Python 3.13 work - rebase on main (#1868)
* Remove triton constraint for py312 (#1846)
* Cache OpenBLAS to docker image for SBSA builds (#1842)
  * apply openblas cache for cpu-aarch64
  * reapply for cuda-aarch64
* [MacOS] Don't build wheel while building libtorch (not sure why this was ever done twice)
* Allow validate docker images to be called from different workflow (#1850)
* Revert "[MacOS] Don't build wheel while building libtorch" (reverts commit d88495a)
* [MacOS] Don't build libtorch twice (take 2), by not invoking `tools/build_libtorch.py`, as it's not done on Linux
* [MacOs][LibTorch] Copy libomp.dylib into libtorch package
* Update cudnn from v8 to v9 across CUDA versions and x86/arm (#1847)
  * Update cudnn to v9.1.0.70 for cuda11.8, cuda12.1, and cuda12.4
  * Add CUDNN_VERSION variable
  * Remove 2 spaces for install_cu124
  * Fix DEPS_LIST and DEPS_SONAME for x86; update cudnn to v9 for arm cuda binary as well
  * libcudnn_adv_infer/libcudnn_adv_train becomes libcudnn_adv
  * Change DEPS due to cudnn v9 library name changes (and additions)
  * Fix lint
  * Add missing changes to cu121/cu124
* Change OpenSSL URL (#1854): change to use openssl URL (but no longer ftp!)
* Update build-manywheel-images.yml: add a note about manylinux_2_28 state
* Revert "Update cudnn from v8 to v9 across CUDA versions and x86/arm" (#1855) (reverts commit 5783bcc)
* Don't run torch.compile on runtime images in docker validations (#1858)
* Update cudnn from v8 to v9 across CUDA versions and x86/arm (#1857)
  * (same cuDNN v9 changes as #1847, plus:) Fix aarch64 cuda typos
* Update validate-docker-images.yml
  * disable runtime error check for now
  * use validation_runner rather than hardcoded one
  * fix MATRIX_GPU_ARCH_TYPE setting for cpu-only workflows
* [aarch64 cuda cudnn] Add RUNPATH to libcudnn_graph.so.9 (#1859)
* Add executorch to pypi prep, promotion and validation scripts (#1860)
* Add AOTriton install step for ROCm manylinux images (#1862)
  * No common_utils.sh needed
  * temporarily disable runtime error check
* Add python 3.13 builder (#1845)

Co-authored-by: Ting Lu <[email protected]>
Co-authored-by: Nikita Shulga <[email protected]>
Co-authored-by: Wei Wang <[email protected]>
Co-authored-by: Jithun Nair <[email protected]>
1 parent 5d8c7af commit 01b8210

24 files changed: +224 −131 lines

.github/scripts/validate_binaries.sh

Lines changed: 1 addition & 1 deletion
@@ -62,7 +62,7 @@ else
     if [[ ${TARGET_OS} == 'windows' ]]; then
         python ./test/smoke_test/smoke_test.py ${TEST_SUFFIX}
     else
-        python3 ./test/smoke_test/smoke_test.py ${TEST_SUFFIX}
+        python3 ./test/smoke_test/smoke_test.py ${TEST_SUFFIX} --runtime-error-check "disabled"
     fi

     if [[ ${TARGET_OS} == 'macos-arm64' ]]; then
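The diff above passes a new `--runtime-error-check "disabled"` flag to the smoke test. As a hypothetical sketch only (the real `smoke_test.py` argument handling is not shown in this diff), such a flag could be parsed like this:

```python
import argparse

def parse_args(argv=None):
    # Hypothetical mirror of the flag used in the diff above;
    # the actual smoke_test.py may define it differently.
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--runtime-error-check",
        choices=["enabled", "disabled"],
        default="enabled",
        help="whether runtime-error checks should fail the smoke test",
    )
    return parser.parse_args(argv)

args = parse_args(["--runtime-error-check", "disabled"])
print(args.runtime_error_check)
```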

.github/workflows/build-manywheel-images.yml

Lines changed: 1 addition & 0 deletions
@@ -60,6 +60,7 @@ jobs:
       - name: Build Docker Image
         run: |
           manywheel/build_docker.sh
+  # NOTE: manylinux_2_28 are still experimental, see https://github.com/pytorch/pytorch/issues/123649
   build-docker-cuda-manylinux_2_28:
     runs-on: linux.12xlarge
     strategy:

.github/workflows/validate_docker_images.yml renamed to .github/workflows/validate-docker-images.yml

Lines changed: 37 additions & 6 deletions
@@ -1,5 +1,22 @@
-name: Validate Docker Images (with Matrix Generation)
+name: Validate Nightly Docker Images
 on:
+  workflow_call:
+    inputs:
+      channel:
+        description: 'PyTorch channel to use (nightly, test, release, all)'
+        required: true
+        type: string
+        default: 'nightly'
+      generate_dockerhub_images:
+        description: 'Generate Docker Hub images (strip ghcr.io/ prefix for release)'
+        default: false
+        required: false
+        type: boolean
+      ref:
+        description: 'Reference to checkout, defaults to empty'
+        default: ""
+        required: false
+        type: string
   workflow_dispatch:
     inputs:
       channel:
@@ -15,8 +32,13 @@ on:
         description: 'Generate Docker Hub images (strip ghcr.io/ prefix for release)'
         default: false
         required: false
-        type: boolean
-
+        type: boolean
+      ref:
+        description: 'Reference to checkout, defaults to empty'
+        default: ""
+        required: false
+        type: string
+
 jobs:
   generate-matrix:
     uses: pytorch/test-infra/.github/workflows/generate_docker_release_matrix.yml@main
@@ -31,7 +53,7 @@ jobs:
       fail-fast: false
     uses: pytorch/test-infra/.github/workflows/linux_job.yml@main
     with:
-      runner: linux.g5.4xlarge.nvidia.gpu
+      runner: ${{ matrix.validation_runner }}
       repository: "pytorch/builder"
      ref: ${{ inputs.ref || github.ref }}
      job-name: cuda${{ matrix.cuda }}-cudnn${{ matrix.cudnn_version }}-${{ matrix.image_type }}
@@ -40,7 +62,16 @@ jobs:
      timeout: 180
      script: |
        set -ex
-        export MATRIX_GPU_ARCH_TYPE="cuda"
+
        export MATRIX_GPU_ARCH_VERSION="${{ matrix.cuda }}"
+        export MATRIX_IMAGE_TYPE="${{ matrix.image_type }}"
        export TARGET_OS="linux"
-        python test/smoke_test/smoke_test.py --package torchonly --runtime-error-check enabled
+        TORCH_COMPILE_CHECK="--torch-compile-check enabled"
+        if [[ ${MATRIX_IMAGE_TYPE} == "runtime" ]]; then
+          TORCH_COMPILE_CHECK="--torch-compile-check disabled"
+        fi
+        export MATRIX_GPU_ARCH_TYPE="cuda"
+        if [[ ${MATRIX_GPU_ARCH_VERSION} == "cpu" ]]; then
+          export MATRIX_GPU_ARCH_TYPE="cpu"
+        fi
+        python test/smoke_test/smoke_test.py --package torchonly --runtime-error-check disabled ${TORCH_COMPILE_CHECK}
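The script block above makes two independent decisions: runtime images skip the torch.compile check, and a matrix entry whose CUDA version is "cpu" gets a CPU arch type. A small Python sketch of that same gating (function name and dict keys are illustrative, not part of the workflow):

```python
def validation_flags(image_type: str, cuda_version: str) -> dict:
    # Mirrors the shell logic in the workflow script above:
    # runtime images disable torch.compile, "cpu" entries are CPU-only,
    # and the runtime-error check is temporarily disabled in this commit.
    return {
        "gpu_arch_type": "cpu" if cuda_version == "cpu" else "cuda",
        "torch_compile_check": "disabled" if image_type == "runtime" else "enabled",
        "runtime_error_check": "disabled",
    }
```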

aarch64_linux/aarch64_wheel_ci_build.py

Lines changed: 11 additions & 46 deletions
@@ -14,44 +14,6 @@ def list_dir(path: str) -> List[str]:
     """
     return check_output(["ls", "-1", path]).decode().split("\n")

-
-def build_OpenBLAS() -> None:
-    '''
-    Building OpenBLAS, because the package in many linux is old
-    '''
-    print('Building OpenBLAS')
-    openblas_build_flags = [
-        "NUM_THREADS=128",
-        "USE_OPENMP=1",
-        "NO_SHARED=0",
-        "DYNAMIC_ARCH=1",
-        "TARGET=ARMV8",
-        "CFLAGS=-O3",
-    ]
-    openblas_checkout_dir = "OpenBLAS"
-
-    check_call(
-        [
-            "git",
-            "clone",
-            "https://github.com/OpenMathLib/OpenBLAS.git",
-            "-b",
-            "v0.3.25",
-            "--depth",
-            "1",
-            "--shallow-submodules",
-        ]
-    )
-
-    check_call(["make", "-j8"]
-               + openblas_build_flags,
-               cwd=openblas_checkout_dir)
-    check_call(["make", "-j8"]
-               + openblas_build_flags
-               + ["install"],
-               cwd=openblas_checkout_dir)
-
-
 def build_ArmComputeLibrary() -> None:
     """
     Using ArmComputeLibrary for aarch64 PyTorch
@@ -103,7 +65,7 @@ def update_wheel(wheel_path) -> None:
     os.system(f"unzip {wheel_path} -d {folder}/tmp")
     libs_to_copy = [
         "/usr/local/cuda/extras/CUPTI/lib64/libcupti.so.12",
-        "/usr/local/cuda/lib64/libcudnn.so.8",
+        "/usr/local/cuda/lib64/libcudnn.so.9",
         "/usr/local/cuda/lib64/libcublas.so.12",
         "/usr/local/cuda/lib64/libcublasLt.so.12",
         "/usr/local/cuda/lib64/libcudart.so.12",
@@ -116,12 +78,13 @@ def update_wheel(wheel_path) -> None:
         "/usr/local/cuda/lib64/libnvJitLink.so.12",
         "/usr/local/cuda/lib64/libnvrtc.so.12",
         "/usr/local/cuda/lib64/libnvrtc-builtins.so.12.4",
-        "/usr/local/cuda/lib64/libcudnn_adv_infer.so.8",
-        "/usr/local/cuda/lib64/libcudnn_adv_train.so.8",
-        "/usr/local/cuda/lib64/libcudnn_cnn_infer.so.8",
-        "/usr/local/cuda/lib64/libcudnn_cnn_train.so.8",
-        "/usr/local/cuda/lib64/libcudnn_ops_infer.so.8",
-        "/usr/local/cuda/lib64/libcudnn_ops_train.so.8",
+        "/usr/local/cuda/lib64/libcudnn_adv.so.9",
+        "/usr/local/cuda/lib64/libcudnn_cnn.so.9",
+        "/usr/local/cuda/lib64/libcudnn_graph.so.9",
+        "/usr/local/cuda/lib64/libcudnn_ops.so.9",
+        "/usr/local/cuda/lib64/libcudnn_engines_runtime_compiled.so.9",
+        "/usr/local/cuda/lib64/libcudnn_engines_precompiled.so.9",
+        "/usr/local/cuda/lib64/libcudnn_heuristic.so.9",
         "/opt/conda/envs/aarch64_env/lib/libgomp.so.1",
         "/opt/OpenBLAS/lib/libopenblas.so.0",
         "/acl/build/libarm_compute.so",
@@ -134,6 +97,9 @@ def update_wheel(wheel_path) -> None:
     os.system(
         f"cd {folder}/tmp/torch/lib/; patchelf --set-rpath '$ORIGIN' {folder}/tmp/torch/lib/libtorch_cuda.so"
     )
+    os.system(
+        f"cd {folder}/tmp/torch/lib/; patchelf --set-rpath '$ORIGIN' {folder}/tmp/torch/lib/libcudnn_graph.so.9"
+    )
     os.mkdir(f"{folder}/cuda_wheel")
     os.system(f"cd {folder}/tmp/; zip -r {folder}/cuda_wheel/{wheelname} *")
     shutil.move(
@@ -227,7 +193,6 @@ def parse_arguments():
     elif branch.startswith(("v1.", "v2.")):
         build_vars += f"BUILD_TEST=0 PYTORCH_BUILD_VERSION={branch[1:branch.find('-')]} PYTORCH_BUILD_NUMBER=1 "

-    build_OpenBLAS()
     if enable_mkldnn:
        build_ArmComputeLibrary()
        print("build pytorch with mkldnn+acl backend")
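The `libs_to_copy` change above reflects the cuDNN 9 library reorganization: each v8 `*_infer`/`*_train` pair is merged into a single per-domain library, and several new libraries appear. A sketch of the rename mapping implied by the diff:

```python
# Mapping of the cuDNN v8 split libraries to the v9 libraries that
# replace them in the bundled wheel, as reflected in the diff above.
V8_TO_V9 = {
    "libcudnn_adv_infer.so.8": "libcudnn_adv.so.9",
    "libcudnn_adv_train.so.8": "libcudnn_adv.so.9",
    "libcudnn_cnn_infer.so.8": "libcudnn_cnn.so.9",
    "libcudnn_cnn_train.so.8": "libcudnn_cnn.so.9",
    "libcudnn_ops_infer.so.8": "libcudnn_ops.so.9",
    "libcudnn_ops_train.so.8": "libcudnn_ops.so.9",
}

# v9 also ships libraries with no direct v8 counterpart:
V9_NEW = [
    "libcudnn_graph.so.9",
    "libcudnn_engines_runtime_compiled.so.9",
    "libcudnn_engines_precompiled.so.9",
    "libcudnn_heuristic.so.9",
]

def v9_name(v8_lib: str) -> str:
    """Return the v9 library bundled in place of a v8 one."""
    return V8_TO_V9[v8_lib]
```

Note that `libcudnn_graph.so.9` also gets an explicit `patchelf --set-rpath '$ORIGIN'` in this commit, matching the separate RUNPATH fix (#1859).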

analytics/validate_pypi_staging.py

Lines changed: 11 additions & 4 deletions
@@ -15,13 +15,20 @@
     "win_amd64",
     "macosx_11_0_arm64",
 ]
-PYTHON_VERSIONS = ["cp38", "cp39", "cp310", "cp311", "cp312"]
+PYTHON_VERSIONS = [
+    "cp38",
+    "cp39",
+    "cp310",
+    "cp311",
+    "cp312"
+]
 S3_PYPI_STAGING = "pytorch-backup"
 PACKAGE_RELEASES = {
-    "torch": "2.3.0",
-    "torchvision": "0.18.0",
-    "torchaudio": "2.3.0",
+    "torch": "2.3.1",
+    "torchvision": "0.18.1",
+    "torchaudio": "2.3.1",
     "torchtext": "0.18.0",
+    "executorch": "0.2.1"
 }

 PATTERN_V = "Version:"
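These constants drive a check of staged wheels against expected releases and supported Python tags. A sketch of the kind of validation they enable (the helper below is illustrative; the real script's logic may differ):

```python
# Wheel names follow PEP 427: name-version-pythontag-abitag-platform.whl.
# Check a staged wheel against the expected release versions and
# allowed Python tags from the diff above.
PYTHON_VERSIONS = ["cp38", "cp39", "cp310", "cp311", "cp312"]
PACKAGE_RELEASES = {
    "torch": "2.3.1",
    "torchvision": "0.18.1",
    "torchaudio": "2.3.1",
    "torchtext": "0.18.0",
    "executorch": "0.2.1",
}

def wheel_is_expected(filename: str) -> bool:
    # Illustrative helper, not part of validate_pypi_staging.py.
    name, version, pytag = filename.removesuffix(".whl").split("-")[:3]
    return PACKAGE_RELEASES.get(name) == version and pytag in PYTHON_VERSIONS
```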

common/aotriton_version.txt

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+0.6b
+manylinux_2_17
+rocm6
+04b5df8c8123f90cba3ede7e971e6fbc6040d506
+3db6ecbc915893ff967abd6e1b43bd5f54949868873be60dc802086c3863e648

common/install_aotriton.sh

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+#!/bin/bash
+
+set -ex
+
+TARBALL='aotriton.tar.bz2'
+# This read command alwasy returns with exit code 1
+read -d "\n" VER MANYLINUX ROCMBASE PINNED_COMMIT SHA256 < aotriton_version.txt || true
+ARCH=$(uname -m)
+AOTRITON_INSTALL_PREFIX="$1"
+AOTRITON_URL="https://github.com/ROCm/aotriton/releases/download/${VER}/aotriton-${VER}-${MANYLINUX}_${ARCH}-${ROCMBASE}.tar.bz2"
+
+cd "${AOTRITON_INSTALL_PREFIX}"
+# Must use -L to follow redirects
+curl -L --retry 3 -o "${TARBALL}" "${AOTRITON_URL}"
+ACTUAL_SHA256=$(sha256sum "${TARBALL}" | cut -d " " -f 1)
+if [ "${SHA256}" != "${ACTUAL_SHA256}" ]; then
+  echo -n "Error: The SHA256 of downloaded tarball is ${ACTUAL_SHA256},"
+  echo " which does not match the expected value ${SHA256}."
+  exit
+fi
+tar xf "${TARBALL}" && rm -rf "${TARBALL}"
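The script reads five whitespace-separated fields from `aotriton_version.txt` and compares the tarball's SHA-256 digest against the pinned value. The same parse-and-verify flow, sketched in Python (field names taken from the shell `read`):

```python
import hashlib

def parse_version_file(text: str):
    # aotriton_version.txt holds five whitespace-separated fields,
    # matching `read VER MANYLINUX ROCMBASE PINNED_COMMIT SHA256`.
    ver, manylinux, rocmbase, commit, sha256 = text.split()
    return ver, manylinux, rocmbase, commit, sha256

def sha256_matches(data: bytes, expected: str) -> bool:
    # Same comparison the script does with sha256sum.
    return hashlib.sha256(data).hexdigest() == expected
```

One observation about the shell version: on a mismatch it calls bare `exit`, which returns the status of the preceding `echo` (0), so as written the mismatch path may not actually fail the build; `exit 1` would.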

common/install_cuda.sh

Lines changed: 17 additions & 15 deletions
@@ -2,6 +2,8 @@

 set -ex

+CUDNN_VERSION=9.1.0.70
+
 function install_cusparselt_040 {
     # cuSparseLt license: https://docs.nvidia.com/cuda/cusparselt/license.html
     mkdir tmp_cusparselt && pushd tmp_cusparselt
@@ -25,7 +27,7 @@ function install_cusparselt_052 {
 }

 function install_118 {
-    echo "Installing CUDA 11.8 and cuDNN 8.7 and NCCL 2.15 and cuSparseLt-0.4.0"
+    echo "Installing CUDA 11.8 and cuDNN ${CUDNN_VERSION} and NCCL 2.15 and cuSparseLt-0.4.0"
     rm -rf /usr/local/cuda-11.8 /usr/local/cuda
     # install CUDA 11.8.0 in the same container
     wget -q https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda_11.8.0_520.61.05_linux.run
@@ -36,10 +38,10 @@ function install_118 {

     # cuDNN license: https://developer.nvidia.com/cudnn/license_agreement
     mkdir tmp_cudnn && cd tmp_cudnn
-    wget -q https://developer.download.nvidia.com/compute/redist/cudnn/v8.7.0/local_installers/11.8/cudnn-linux-x86_64-8.7.0.84_cuda11-archive.tar.xz -O cudnn-linux-x86_64-8.7.0.84_cuda11-archive.tar.xz
-    tar xf cudnn-linux-x86_64-8.7.0.84_cuda11-archive.tar.xz
-    cp -a cudnn-linux-x86_64-8.7.0.84_cuda11-archive/include/* /usr/local/cuda/include/
-    cp -a cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/* /usr/local/cuda/lib64/
+    wget -q https://developer.download.nvidia.com/compute/cudnn/redist/cudnn/linux-x86_64/cudnn-linux-x86_64-${CUDNN_VERSION}_cuda11-archive.tar.xz -O cudnn-linux-x86_64-${CUDNN_VERSION}_cuda11-archive.tar.xz
+    tar xf cudnn-linux-x86_64-${CUDNN_VERSION}_cuda11-archive.tar.xz
+    cp -a cudnn-linux-x86_64-${CUDNN_VERSION}_cuda11-archive/include/* /usr/local/cuda/include/
+    cp -a cudnn-linux-x86_64-${CUDNN_VERSION}_cuda11-archive/lib/* /usr/local/cuda/lib64/
     cd ..
     rm -rf tmp_cudnn

@@ -58,7 +60,7 @@ function install_118 {
 }

 function install_121 {
-    echo "Installing CUDA 12.1 and cuDNN 8.9 and NCCL 2.20.5 and cuSparseLt-0.5.2"
+    echo "Installing CUDA 12.1 and cuDNN ${CUDNN_VERSION} and NCCL 2.20.5 and cuSparseLt-0.5.2"
     rm -rf /usr/local/cuda-12.1 /usr/local/cuda
     # install CUDA 12.1.0 in the same container
     wget -q https://developer.download.nvidia.com/compute/cuda/12.1.1/local_installers/cuda_12.1.1_530.30.02_linux.run
@@ -69,10 +71,10 @@ function install_121 {

     # cuDNN license: https://developer.nvidia.com/cudnn/license_agreement
     mkdir tmp_cudnn && cd tmp_cudnn
-    wget -q https://developer.download.nvidia.com/compute/cudnn/redist/cudnn/linux-x86_64/cudnn-linux-x86_64-8.9.2.26_cuda12-archive.tar.xz -O cudnn-linux-x86_64-8.9.2.26_cuda12-archive.tar.xz
-    tar xf cudnn-linux-x86_64-8.9.2.26_cuda12-archive.tar.xz
-    cp -a cudnn-linux-x86_64-8.9.2.26_cuda12-archive/include/* /usr/local/cuda/include/
-    cp -a cudnn-linux-x86_64-8.9.2.26_cuda12-archive/lib/* /usr/local/cuda/lib64/
+    wget -q https://developer.download.nvidia.com/compute/cudnn/redist/cudnn/linux-x86_64/cudnn-linux-x86_64-${CUDNN_VERSION}_cuda12-archive.tar.xz -O cudnn-linux-x86_64-${CUDNN_VERSION}_cuda12-archive.tar.xz
+    tar xf cudnn-linux-x86_64-${CUDNN_VERSION}_cuda12-archive.tar.xz
+    cp -a cudnn-linux-x86_64-${CUDNN_VERSION}_cuda12-archive/include/* /usr/local/cuda/include/
+    cp -a cudnn-linux-x86_64-${CUDNN_VERSION}_cuda12-archive/lib/* /usr/local/cuda/lib64/
     cd ..
     rm -rf tmp_cudnn

@@ -91,7 +93,7 @@ function install_121 {
 }

 function install_124 {
-    echo "Installing CUDA 12.4 and cuDNN 8.9 and NCCL 2.20.5 and cuSparseLt-0.5.2"
+    echo "Installing CUDA 12.4 and cuDNN ${CUDNN_VERSION} and NCCL 2.20.5 and cuSparseLt-0.5.2"
     rm -rf /usr/local/cuda-12.4 /usr/local/cuda
     # install CUDA 12.4.0 in the same container
     wget -q https://developer.download.nvidia.com/compute/cuda/12.4.0/local_installers/cuda_12.4.0_550.54.14_linux.run
@@ -102,10 +104,10 @@ function install_124 {

     # cuDNN license: https://developer.nvidia.com/cudnn/license_agreement
     mkdir tmp_cudnn && cd tmp_cudnn
-    wget -q https://developer.download.nvidia.com/compute/cudnn/redist/cudnn/linux-x86_64/cudnn-linux-x86_64-8.9.2.26_cuda12-archive.tar.xz -O cudnn-linux-x86_64-8.9.2.26_cuda12-archive.tar.xz
-    tar xf cudnn-linux-x86_64-8.9.2.26_cuda12-archive.tar.xz
-    cp -a cudnn-linux-x86_64-8.9.2.26_cuda12-archive/include/* /usr/local/cuda/include/
-    cp -a cudnn-linux-x86_64-8.9.2.26_cuda12-archive/lib/* /usr/local/cuda/lib64/
+    wget -q https://developer.download.nvidia.com/compute/cudnn/redist/cudnn/linux-x86_64/cudnn-linux-x86_64-${CUDNN_VERSION}_cuda12-archive.tar.xz -O cudnn-linux-x86_64-${CUDNN_VERSION}_cuda12-archive.tar.xz
+    tar xf cudnn-linux-x86_64-${CUDNN_VERSION}_cuda12-archive.tar.xz
+    cp -a cudnn-linux-x86_64-${CUDNN_VERSION}_cuda12-archive/include/* /usr/local/cuda/include/
+    cp -a cudnn-linux-x86_64-${CUDNN_VERSION}_cuda12-archive/lib/* /usr/local/cuda/lib64/
     cd ..
     rm -rf tmp_cudnn
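Factoring the version into `CUDNN_VERSION` means the archive file name is now built from one variable rather than hardcoded per function. The naming scheme used by the wget/tar/cp sequence can be sketched as:

```python
def cudnn_archive(cudnn_version: str, cuda_major: int, arch: str = "x86_64") -> str:
    # Reconstructs the archive name pattern from the diff above;
    # the redist URL layout is assumed from the wget lines shown.
    return f"cudnn-linux-{arch}-{cudnn_version}_cuda{cuda_major}-archive.tar.xz"
```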

common/install_cuda_aarch64.sh

Lines changed: 5 additions & 5 deletions
@@ -14,7 +14,7 @@ function install_cusparselt_052 {
 }

 function install_124 {
-    echo "Installing CUDA 12.4 and cuDNN 8.9 and NCCL 2.20.5 and cuSparseLt-0.5.2"
+    echo "Installing CUDA 12.4 and cuDNN 9.1 and NCCL 2.20.5 and cuSparseLt-0.5.2"
     rm -rf /usr/local/cuda-12.4 /usr/local/cuda
     # install CUDA 12.4.0 in the same container
     wget -q https://developer.download.nvidia.com/compute/cuda/12.4.0/local_installers/cuda_12.4.0_550.54.14_linux_sbsa.run
@@ -25,10 +25,10 @@ function install_124 {

     # cuDNN license: https://developer.nvidia.com/cudnn/license_agreement
     mkdir tmp_cudnn && cd tmp_cudnn
-    wget -q https://developer.download.nvidia.com/compute/cudnn/redist/cudnn/linux-sbsa/cudnn-linux-sbsa-8.9.2.26_cuda12-archive.tar.xz -O cudnn-linux-sbsa-8.9.2.26_cuda12-archive.tar.xz
-    tar xf cudnn-linux-sbsa-8.9.2.26_cuda12-archive.tar.xz
-    cp -a cudnn-linux-sbsa-8.9.2.26_cuda12-archive/include/* /usr/local/cuda/include/
-    cp -a cudnn-linux-sbsa-8.9.2.26_cuda12-archive/lib/* /usr/local/cuda/lib64/
+    wget -q https://developer.download.nvidia.com/compute/cudnn/redist/cudnn/linux-sbsa/cudnn-linux-sbsa-9.1.0.70_cuda12-archive.tar.xz -O cudnn-linux-sbsa-9.1.0.70_cuda12-archive.tar.xz
+    tar xf cudnn-linux-sbsa-9.1.0.70_cuda12-archive.tar.xz
+    cp -a cudnn-linux-sbsa-9.1.0.70_cuda12-archive/include/* /usr/local/cuda/include/
+    cp -a cudnn-linux-sbsa-9.1.0.70_cuda12-archive/lib/* /usr/local/cuda/lib64/
     cd ..
     rm -rf tmp_cudnn

common/install_openblas.sh

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+#!/bin/bash
+
+set -ex
+
+cd /
+git clone https://github.com/OpenMathLib/OpenBLAS.git -b v0.3.25 --depth 1 --shallow-submodules
+
+
+OPENBLAS_BUILD_FLAGS="
+NUM_THREADS=128
+USE_OPENMP=1
+NO_SHARED=0
+DYNAMIC_ARCH=1
+TARGET=ARMV8
+CFLAGS=-O3
+"
+
+OPENBLAS_CHECKOUT_DIR="OpenBLAS"
+
+make -j8 ${OPENBLAS_BUILD_FLAGS} -C ${OPENBLAS_CHECKOUT_DIR}
+make -j8 ${OPENBLAS_BUILD_FLAGS} install -C ${OPENBLAS_CHECKOUT_DIR}
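The multi-line `OPENBLAS_BUILD_FLAGS` string works because the unquoted `${OPENBLAS_BUILD_FLAGS}` is word-split by the shell into six separate `make` variable assignments. Since none of the values contain spaces or quotes, the splitting can be demonstrated equivalently with `shlex`:

```python
import shlex

# Same string as OPENBLAS_BUILD_FLAGS in the script above; with no
# quoting involved, shlex.split matches the shell's whitespace split.
OPENBLAS_BUILD_FLAGS = """
NUM_THREADS=128
USE_OPENMP=1
NO_SHARED=0
DYNAMIC_ARCH=1
TARGET=ARMV8
CFLAGS=-O3
"""

flags = shlex.split(OPENBLAS_BUILD_FLAGS)
```

This is the shell counterpart of the removed Python `build_OpenBLAS()` helper, which passed the same six flags as a list to `check_call`.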
