-
Notifications
You must be signed in to change notification settings - Fork 795
Get rid of post-commit testing #14383
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -15,7 +15,14 @@ jobs: | |
with: | ||
build_cache_root: "/__w/" | ||
build_artifact_suffix: default | ||
build_configure_extra_args: '--hip --cuda' | ||
build_configure_extra_args: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should we build precommit linux with the same flags (It looks like we don't but maybe I can't read) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I added this for more diversity i.e. pre-commit is static build (llvm config default), and then nightly testing gets exercised with a shared config. I'm not married to it, but I figured it was any easy way to increase coverage There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. One issue we actually do see relatively frequently is someone breaking the shared library build (usually missing a library dep in the cmake file) and today that is usually caught in post commit because that does shared lib build. With this, we would only catch it in the nightly. However I think it's reasonable to assume that most developers are using the default config (configure.py) and assuming we have someone guarneteed to be checking the nightly and quickly report and/or fix issues, I don't have a huge problem with this There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I wonder if it's possible to break static library build and not shared library build. If that's extremely unlikely such that either both are broken or shlib is broken, doing shlib in precommit would be fine with me. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We probably should just drop "sharedlib" build configuration. The primary use case for this configuration is to speed-up incremental build of LLVM libraries. This build configuration is not use in the production environment - it's used only by LLVM developers. If everyone uses standard build, there is no point to build "sharedlib" configuration in CI. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Personally I use it rather frequently and I would be annoyed if it was failing and not caught in CI. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Our downstream uses it a lot. It would be very bad if the issues will only be discovered when syncing downstream with upstream and not in upstream CI. |
||
--shared-libs | ||
--no-assertions | ||
--hip | ||
--cuda | ||
--native_cpu | ||
--cmake-opt="-DSYCL_ENABLE_STACK_PRINTING=ON" | ||
--cmake-opt="-DSYCL_LIB_WITH_DEBUG_SYMBOL=ON" | ||
merge_ref: '' | ||
retention-days: 90 | ||
|
||
|
@@ -80,6 +87,37 @@ jobs: | |
image_options: -u 1001 --device=/dev/dri --privileged --cap-add SYS_ADMIN | ||
target_devices: opencl:cpu | ||
tests_selector: cts | ||
|
||
- name: E2E tests with dev igc on Intel Arc A-Series Graphics | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Please update your commit message with the summary of changes related to Arc GPU testing. @sarnex and @YuriPlyakhin would have to approve the changes to the devigc usage. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It seems as part of this PR we stop doing arc dev igc testing in precommit and only do it in the nightly. This is fine with me but we need @jsji's approval, he worked on the dev igc CI and is more familiar with the use case. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @ldrumm , why do you suggest to remove dev igc testing from precommit? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think the time spent in running the dev igc testing is fairly small, we should try to keep it in pre-commit if it is possible -- moving it from pre-commit to post-commit (nightly) would make things complicated unless we can identify the culprit commit and ping author automatically in post-commit (nightly). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. +1 to what @jsji said There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The time testing the dev driver might be small but it results in an instability. The principle here is to have a stable testing platform where the user's PR is tested against a stable config, not where a users PR is tested against unstable GPU drivers. If we're to have any unstable test plastforms available in pre-commit they should be explicitly opt-in only There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It basically boils down to "GPU driver developers should do their own testing, not experiment on dpc++ contributors" There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks @ldrumm , understand the principle. However, the dev igc testing is not trying to do testing for GPU driver, but rather doing the continuous integration tests for Joint Matrix/ESIMD features, which require IGC changes time-to-time. The testing already restricted to Joint Matrix and ESIMD only. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also, @ldrumm , dev igc driver is being updated only after Joint Matrix/ESIMD E2E tests passed with it, so it is stable in context of precommit testing. |
||
runner: '["Linux", "arc"]' | ||
image: ghcr.io/intel/llvm/ubuntu2204_intel_drivers:devigc | ||
image_options: -u 1001 --device=/dev/dri --privileged --cap-add SYS_ADMIN | ||
target_devices: ext_oneapi_level_zero:gpu;opencl:gpu | ||
reset_gpu: true | ||
install_drivers: >- | ||
${{ contains(needs.detect_changes.outputs.filters, 'drivers') || | ||
contains(needs.detect_changes.outputs.filters, 'devigccfg') }} | ||
use_dev_igc: ${{ contains(needs.detect_changes.outputs.filters, 'devigccfg') }} | ||
extra_lit_opts: --param matrix-xmx8=True --param gpu-intel-dg2=True | ||
|
||
# Performance tests below. Specifics: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. These tests are very dumb for now and the idea behind them is to be able to access the logs to look through performance number on a per-commit basis. I don't see a reason to add this into nightly... There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sure. I'll move them back |
||
# - only run performance tests (use LIT_FILTER env) | ||
# - ask llvm-lit to show all the output, even for PASS (-a) | ||
# - run in single thread (-j 1) | ||
# - enable the tests in LIT (--param enable-perf-tests=True) | ||
# - run on all available devices. | ||
- name: Perf tests on Intel GEN12 Graphics system | ||
runner: '["Linux", "gen12"]' | ||
env: '{"LIT_FILTER":"PerformanceTests/"}' | ||
extra_lit_opts: -a -j 1 --param enable-perf-tests=True --param gpu-intel-gen12=True | ||
target_devices: all | ||
|
||
- name: Perf tests on Intel Arc A-Series Graphics system | ||
runner: '["Linux", "arc"]' | ||
env: '{"LIT_FILTER":"PerformanceTests/"}' | ||
extra_lit_opts: -a -j 1 --param enable-perf-tests=True | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we need There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Gotcha. Will fix |
||
target_devices: all | ||
|
||
uses: ./.github/workflows/sycl-linux-run-tests.yml | ||
with: | ||
name: ${{ matrix.name }} | ||
|
This file was deleted.
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -1,4 +1,4 @@ | ||||||
name: SYCL Pre Commit on Linux | ||||||
name: SYCL Pre Commit | ||||||
|
||||||
on: | ||||||
# We rely on "Fork pull request workflows from outside collaborators" - | ||||||
|
@@ -18,9 +18,6 @@ on: | |||||
- 'clang/docs/**' | ||||||
- '**.md' | ||||||
- '**.rst' | ||||||
- '.github/workflows/sycl-windows-*.yml' | ||||||
- '.github/workflows/sycl-macos-*.yml' | ||||||
- '.github/workflows/sycl-nightly.yml' | ||||||
- 'devops/containers/**' | ||||||
- 'devops/actions/build_container/**' | ||||||
|
||||||
|
@@ -35,7 +32,7 @@ jobs: | |||||
detect_changes: | ||||||
uses: ./.github/workflows/sycl-detect-changes.yml | ||||||
|
||||||
build: | ||||||
build_linux: | ||||||
needs: [detect_changes] | ||||||
if: always() && success() | ||||||
uses: ./.github/workflows/sycl-linux-build.yml | ||||||
|
@@ -46,12 +43,11 @@ jobs: | |||||
build_artifact_suffix: "default" | ||||||
build_cache_suffix: "default" | ||||||
changes: ${{ needs.detect_changes.outputs.filters }} | ||||||
build_image: "ghcr.io/intel/llvm/ubuntu2204_build:latest-0300ac924620a51f76c4929794637b82790f12ab" | ||||||
|
||||||
determine_arc_tests: | ||||||
name: Decide which Arc tests to run | ||||||
needs: [build, detect_changes] | ||||||
if: ${{ always() && !cancelled() && needs.build.outputs.build_conclusion == 'success' }} | ||||||
needs: [build_linux, detect_changes] | ||||||
if: ${{ always() && !cancelled() && needs.build_linux.outputs.build_conclusion == 'success' }} | ||||||
runs-on: [Linux, aux-tasks] | ||||||
timeout-minutes: 3 | ||||||
outputs: | ||||||
|
@@ -69,27 +65,38 @@ jobs: | |||||
else | ||||||
echo 'arc_tests="Matrix/"' >> "$GITHUB_OUTPUT" | ||||||
fi | ||||||
test: | ||||||
needs: [build, detect_changes, determine_arc_tests] | ||||||
if: ${{ always() && !cancelled() && needs.build.outputs.build_conclusion == 'success' }} | ||||||
|
||||||
test_linux: | ||||||
needs: [build_linux, detect_changes, determine_arc_tests] | ||||||
if: ${{ always() && !cancelled() && needs.build_linux.outputs.build_conclusion == 'success' }} | ||||||
strategy: | ||||||
fail-fast: false | ||||||
matrix: | ||||||
include: | ||||||
- name: AMD/HIP | ||||||
# XXX CUDA testing is in a separate workflow because it runs on AWS | ||||||
# which has different access restrictions/requirements. See | ||||||
# sycl-linux-precommit.yml if you're looking for the CUDA precommit | ||||||
# definition | ||||||
- name: E2E on AMD/HIP | ||||||
runner: '["Linux", "amdgpu"]' | ||||||
image: ghcr.io/intel/llvm/ubuntu2204_build:latest-0300ac924620a51f76c4929794637b82790f12ab | ||||||
image_options: -u 1001 --device=/dev/dri --device=/dev/kfd | ||||||
target_devices: ext_oneapi_hip:gpu | ||||||
- name: Intel | ||||||
|
||||||
- name: E2E on Intel GEN12; level zero GPU, OpenCL CPU | ||||||
runner: '["Linux", "gen12"]' | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Everywhere in the code: |
||||||
image: ghcr.io/intel/llvm/ubuntu2204_intel_drivers:latest | ||||||
image_options: -u 1001 --device=/dev/dri --privileged --cap-add SYS_ADMIN | ||||||
target_devices: ext_oneapi_level_zero:gpu;opencl:gpu;opencl:cpu | ||||||
reset_gpu: true | ||||||
install_drivers: ${{ contains(needs.detect_changes.outputs.filters, 'drivers') }} | ||||||
extra_lit_opts: --param gpu-intel-gen12=True | ||||||
- name: E2E tests on Intel Arc A-Series Graphics | ||||||
|
||||||
- name: E2E on Intel GEN12 Graphics; OpenCL GPU, OpenCL FPGA | ||||||
runner: '["Linux", "gen12"]' | ||||||
extra_lit_opts: --param gpu-intel-gen12=True | ||||||
target_devices: opencl:gpu;opencl:fpga | ||||||
|
||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This might not be an issue introduced by this PR and may already exist, but I think having @aelovikov-intel Any idea? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There is a parallel activity to auto-detect architecture. I think that would solve the issue, but I don't remember the status of it. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It should be already available: #13976 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Cool, thanks! Probably we need to either map the results there to the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
That's the "in progress" part that I'm unaware about the status of. @AlexeySachkov , someone from your team is looking into it, right? |
||||||
- name: E2E on Intel Arc A-Series Graphics; Level zero GPU, OpenCL CPU | ||||||
runner: '["Linux", "arc"]' | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm confused. Isn't this FGPA and not GPU? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Originally on ARC we tested with 2 GPU backends: level 0 and OpenCL: |
||||||
image: ghcr.io/intel/llvm/ubuntu2204_intel_drivers:latest | ||||||
image_options: -u 1001 --device=/dev/dri --privileged --cap-add SYS_ADMIN | ||||||
|
@@ -98,23 +105,16 @@ jobs: | |||||
install_drivers: ${{ contains(needs.detect_changes.outputs.filters, 'drivers') }} | ||||||
extra_lit_opts: --param matrix-xmx8=True --param gpu-intel-dg2=True | ||||||
env: '{"LIT_FILTER":${{ needs.determine_arc_tests.outputs.arc_tests }} }' | ||||||
- name: E2E tests with dev igc on Intel Arc A-Series Graphics | ||||||
|
||||||
- name: E2E on Intel Arc A-Series Graphics | ||||||
runner: '["Linux", "arc"]' | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should the name be more specific on what targets are being tested, for example: OpenCL FPGA, OpenCL GPU, etc.... There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Probably, I'll have a look |
||||||
image: ghcr.io/intel/llvm/ubuntu2204_intel_drivers:${{ contains(github.event.pull_request.labels.*.name, 'ci-no-devigc') && 'latest' || 'devigc' }} | ||||||
image_options: -u 1001 --device=/dev/dri --privileged --cap-add SYS_ADMIN | ||||||
target_devices: ext_oneapi_level_zero:gpu;opencl:gpu | ||||||
reset_gpu: true | ||||||
install_drivers: >- | ||||||
${{ contains(needs.detect_changes.outputs.filters, 'drivers') || | ||||||
contains(needs.detect_changes.outputs.filters, 'devigccfg') }} | ||||||
use_dev_igc: ${{ contains(needs.detect_changes.outputs.filters, 'devigccfg') }} | ||||||
extra_lit_opts: --param matrix-xmx8=True --param gpu-intel-dg2=True | ||||||
env: '{"LIT_FILTER":${{ needs.determine_arc_tests.outputs.arc_tests }} }' | ||||||
target_devices: opencl:gpu;opencl:opencl:fpga | ||||||
|
||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OpenCL GPU is being tested with previous "E2E on Intel Arc A-Series Graphics" run, right? Also, what is the point to test OpenCL FPGA on HW with Intel ARC? I would use more powerful system for that, and GPU is not necessary for it. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. +1 on this i think fpga testing on any system is the same, it just needs to be powerful enough to not timeout/etc |
||||||
uses: ./.github/workflows/sycl-linux-run-tests.yml | ||||||
with: | ||||||
name: ${{ matrix.name }} | ||||||
runner: ${{ matrix. runner }} | ||||||
runner: ${{ matrix.runner }} | ||||||
image: ${{ matrix.image }} | ||||||
image_options: ${{ matrix.image_options }} | ||||||
target_devices: ${{ matrix.target_devices }} | ||||||
|
@@ -128,15 +128,40 @@ jobs: | |||||
merge_ref: '' | ||||||
|
||||||
sycl_toolchain_artifact: sycl_linux_default | ||||||
sycl_toolchain_archive: ${{ needs.build.outputs.artifact_archive_name }} | ||||||
sycl_toolchain_decompress_command: ${{ needs.build.outputs.artifact_decompress_command }} | ||||||
sycl_toolchain_archive: ${{ needs.build_linux.outputs.artifact_archive_name }} | ||||||
sycl_toolchain_decompress_command: ${{ needs.build_linux.outputs.artifact_decompress_command }} | ||||||
|
||||||
build_win: | ||||||
if: | | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why are you removing the split between lin/win tasks? I would have expected you to link the commit that introduced the split in the PR's description together with your arguments on why you think that was wrong. Also, you're not removing There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also, even if we decide to keep that change, it should be done in a separate PR and not here. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
So people can see all precommit tests together. It never occurred to me that they might have been together at one point and then separated out, so I didn't consider something was done "wrong", thus no need to focus on
That's a mistake. Will fix.
Which change? Why does it need a separate merge request? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Merging win/lin into a single YML, obviously...
Because that would be an atomic change and per good software development practices PRs have to represent atomic changes, not a bunch of different fixes at once.
They were. If you decide to merge them back, please make it clear in your new atomic PR why original reasoning doesn't take place anymore. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Not obviously. Don't patronise me. Your description was unclear and I'm not a mind reader There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My first comment in this thread was attributed to a single change:
The second (and the only other comment at that time) in this thread was
What else could it possibly refer to, in your opinion? |
||||||
always() | ||||||
&& success() | ||||||
&& github.repository == 'intel/llvm' | ||||||
uses: ./.github/workflows/sycl-windows-build.yml | ||||||
|
||||||
test_e2e_win: | ||||||
needs: build_win | ||||||
# Continue if build was successful. | ||||||
if: | | ||||||
always() | ||||||
&& !cancelled() | ||||||
&& needs.build_win.outputs.build_conclusion == 'success' | ||||||
uses: ./.github/workflows/sycl-windows-run-tests.yml | ||||||
with: | ||||||
name: Intel GEN12 Graphics with Level Zero | ||||||
runner: '["Windows","gen12"]' | ||||||
sycl_toolchain_archive: ${{ needs.build_win.outputs.artifact_archive_name }} | ||||||
extra_lit_opts: --param gpu-intel-gen12=True | ||||||
|
||||||
build_macos: | ||||||
name: macOS Build and Test | ||||||
if: github.repository == 'intel/llvm' | ||||||
uses: ./.github/workflows/sycl-macos-build-and-test.yml | ||||||
|
||||||
test-perf: | ||||||
needs: [build, detect_changes] | ||||||
test_perf: | ||||||
needs: [build_linux, detect_changes] | ||||||
if: | | ||||||
always() && !cancelled() | ||||||
&& needs.build.outputs.build_conclusion == 'success' | ||||||
&& needs.build_linux.outputs.build_conclusion == 'success' | ||||||
&& (contains(github.event.pull_request.labels.*.name, 'run-perf-tests') | ||||||
|| contains(needs.detect_changes.outputs.filters, 'perf-tests')) | ||||||
strategy: | ||||||
|
@@ -165,7 +190,7 @@ jobs: | |||||
uses: ./.github/workflows/sycl-linux-run-tests.yml | ||||||
with: | ||||||
name: Perf tests on ${{ matrix.name }} | ||||||
runner: ${{ matrix. runner }} | ||||||
runner: ${{ matrix.runner }} | ||||||
image: ${{ matrix.image }} | ||||||
image_options: -u 1001 --privileged --cap-add SYS_ADMIN ${{ matrix.image_extra_opts }} | ||||||
target_devices: all | ||||||
|
@@ -179,5 +204,5 @@ jobs: | |||||
merge_ref: '' | ||||||
|
||||||
sycl_toolchain_artifact: sycl_linux_default | ||||||
sycl_toolchain_archive: ${{ needs.build.outputs.artifact_archive_name }} | ||||||
sycl_toolchain_decompress_command: ${{ needs.build.outputs.artifact_decompress_command }} | ||||||
sycl_toolchain_archive: ${{ needs.build_linux.outputs.artifact_archive_name }} | ||||||
sycl_toolchain_decompress_command: ${{ needs.build_linux.outputs.artifact_decompress_command }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this? At the very least, it should have a comment in the code explaining it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the same as the other places that use this tag.
What should the comment say? "Use stable tag 0300ac9"? Seems redundant to me