[Backend Test] Backend test reporting skeleton #12296
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/12296
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit 330d48e with merge base dd4488d.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Force-pushed from ac4a118 to 05f933d.
Looks good for a starting point. Left some comments.
    # We can do this if we ever see to_executorch() or serialize() fail due to a backend issue.
    return build_result(TestResult.UNKNOWN_FAIL, e)

# TODO We should consider refactoring the tester slightly to return more signal on
+1
def print_summary(summary: RunSummary):
    print()
    print("Test Session Summary:")
Add some version number? And make sure we are parsable
Sure, I'll add a version number. I was intending to add a machine-readable file output, such as JSON. Is that sufficient for the use cases you're envisioning, or do you think we need machine-parsable text output as well?
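For illustration, a versioned, machine-readable report along the lines discussed above might look like the following. This is only a sketch under assumptions: `REPORT_VERSION`, `build_json_report`, the field names, and the outcome strings are all hypothetical and are not taken from the PR.

```python
import json
from collections import Counter

# Hypothetical format version; bump it whenever the report layout changes
# so that downstream parsers can detect incompatible files.
REPORT_VERSION = 1


def build_json_report(outcomes):
    """Serialize per-test outcome labels into a versioned JSON summary."""
    counts = Counter(outcomes)
    report = {
        "version": REPORT_VERSION,
        "total": len(outcomes),
        "counts": dict(counts),
    }
    # sort_keys keeps the output stable across runs, which helps diffing.
    return json.dumps(report, indent=2, sort_keys=True)


# Toy input, made up for illustration only.
print(build_json_report(["pass", "pass", "lowering_fail"]))
```

Carrying the version inside the file lets consumers reject or adapt to layouts they do not understand, which is harder to do with free-form text output.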
Force-pushed from fbe335b to 6ee85f7.
@GregoryComer has imported this pull request. If you are a Meta employee, you can view this in D78311621.
Force-pushed from 6ee85f7 to 330d48e.
### Summary

Add the initial skeleton of reporting code for the backend tester. This PR is primarily focused on putting the hooks and runner structure in place. Follow-up work will expand the scope of collection and reporting outputs.

This PR adds the following:
- CLI runner for the test suite.
- Basic test result breakdown by success / fail and cause (failing in lowering vs output mismatch, for example).
- Refactoring of test suite logic to clean things up.

Next steps:
- Aggregate results by flow (backend).
- Add additional CLI flags to allow filtering backends and dtypes.
- Land more of the operator test suite.
- Wire up flows for quantized operators.

Note that this PR is stacked on (and thus includes) #11960. I accidentally broke my ghstack, so I'm converting this to a normal PR.

Sample output (XNNPACK):
```
Test Session Summary:

  84 Passed / 95
  11 Failed / 95
  0 Skipped / 95

[Success]
  66 Delegated
  18 Undelegated

[Failure]
  4 Lowering Fail
  0 PTE Load Fail
  0 PTE Run Fail
  6 Output Mismatch Fail
```

Reproduce with `ET_TEST_ENABLED_BACKENDS=xnnpack python -m executorch.backends.test.suite.runner.executorch.backends.test.suite`. I've temporarily commented out non-f32 dtypes to work around some crashes in XNNPACK, which are non-recoverable from Python.
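The breakdown by success / fail and cause described in this summary can be sketched roughly as follows. This is a minimal illustration and not the PR's implementation: the `Outcome` enum, its member names, and `format_summary` are hypothetical, with labels borrowed from the rows of the sample output.

```python
from collections import Counter
from enum import Enum


class Outcome(Enum):
    """Hypothetical result categories, labeled after the sample output rows."""

    DELEGATED = "Delegated"
    UNDELEGATED = "Undelegated"
    LOWERING_FAIL = "Lowering Fail"
    PTE_LOAD_FAIL = "PTE Load Fail"
    PTE_RUN_FAIL = "PTE Run Fail"
    OUTPUT_MISMATCH_FAIL = "Output Mismatch Fail"
    SKIPPED = "Skipped"


SUCCESSES = (Outcome.DELEGATED, Outcome.UNDELEGATED)
FAILURES = (
    Outcome.LOWERING_FAIL,
    Outcome.PTE_LOAD_FAIL,
    Outcome.PTE_RUN_FAIL,
    Outcome.OUTPUT_MISMATCH_FAIL,
)


def format_summary(outcomes):
    """Build a text summary grouped by success / failure and by cause."""
    counts = Counter(outcomes)  # missing categories count as 0
    total = len(outcomes)
    passed = sum(counts[o] for o in SUCCESSES)
    failed = sum(counts[o] for o in FAILURES)

    lines = [
        "Test Session Summary:",
        f"{passed} Passed / {total}",
        f"{failed} Failed / {total}",
        f"{counts[Outcome.SKIPPED]} Skipped / {total}",
        "[Success]",
    ]
    lines += [f"{counts[o]} {o.value}" for o in SUCCESSES]
    lines.append("[Failure]")
    lines += [f"{counts[o]} {o.value}" for o in FAILURES]
    return "\n".join(lines)


# Toy input, made up for illustration only (not real test results).
toy_run = [
    Outcome.DELEGATED,
    Outcome.DELEGATED,
    Outcome.UNDELEGATED,
    Outcome.LOWERING_FAIL,
    Outcome.SKIPPED,
]
print(format_summary(toy_run))
```

Keeping one enum member per cause means a new failure stage only needs a new member plus an entry in `FAILURES`, and per-flow aggregation could reuse the same counting by grouping outcomes by backend first.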