Description
Summary
The current CI pipeline that runs for pull requests provides a set of quality checks. Some of these checks are executed for the full monorepo even when the change-set is small and scoped to a single package. With the proper improvements we could shorten the feedback loop and reduce resource consumption when working on a single package.
Current Status
As of now the pipeline has several tasks running in parallel (lint, PR title check, API peer check, ...) plus the unit and TAV tests. This discussion focuses on these last two jobs because that is where we could get the biggest gains. Regarding tests, the current pipeline could be depicted as follows (simplified version):
```mermaid
---
title: Current PR Workflow
---
flowchart TD
UT[Unit Tests]
TAV[Test All Versions]
UTC[Compile All]
UT18["Test All (nodejs@18)"]
UT20["Test Diff (nodejs@20)"]
UT22["Test Diff (nodejs@22)"]
UTCOV[Upload Coverage]
TAVC[Compile All]
TAV18["Test Diff (nodejs@18)"]
TAV20["Test Diff (nodejs@20)"]
TAV22["Test Diff (nodejs@22)"]
UT --> UTC
UTC --> UT18
UTC --> UT20
UTC --> UT22
UT18 --> UTCOV
TAV --> TAVC
TAVC --> TAV18
TAVC --> TAV20
TAVC --> TAV22
```
Some observations can be extracted from this setup:
- Sending the coverage of all instrumentations at once forces us to compile all packages, which may take up to 5 minutes.
- Unit tests run against a single version of the instrumented package do not reflect the actual coverage of the tests: some code paths are not taken because they depend on the version being instrumented.
- Although we run TAV tests only on the affected packages, we still compile everything.
Proposed changes
If we were able to run the checks (compile, test & coverage report) only for the affected packages, that would reduce the time spent on most PRs. IMHO we can achieve that by relying on two features we already have in our tool belt: Codecov Flags and the GH upload-artifact action. The latter was already used in #2493 to cache the compilation results and reuse them, which significantly reduced CPU usage.
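As a rough illustration, here is a minimal sketch of that artifact-based reuse: compile once, stash the output, and let later jobs download it instead of recompiling. Job names, script names and paths are assumptions, not our actual workflow.

```yaml
# Hypothetical excerpt: compile once, stash the output as an artifact and
# reuse it in later jobs instead of recompiling (the idea used in #2493).
jobs:
  compile:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm run compile            # assumed script name
      - uses: actions/upload-artifact@v4
        with:
          name: compiled-packages
          path: packages/*/build/       # assumed output layout
  unit-test:
    needs: compile
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - uses: actions/download-artifact@v4
        with:
          name: compiled-packages
          path: packages/
      - run: npm test                   # assumed script name
```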
Codecov Flags claim that you can encapsulate each project/package coverage report independently. This encapsulation would allow us to compile and test only the affected packages and update their coverage independently.
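For instance, a hypothetical `codecov.yml` could scope one flag per package; the `carryforward` option keeps the last known coverage for flags that a given PR did not upload, so untouched packages keep their numbers. Package names below are examples only.

```yaml
# Hypothetical codecov.yml: one flag per package, scoped by path.
flags:
  instrumentation-foo:                  # example package name
    paths:
      - packages/instrumentation-foo/
    carryforward: true
  instrumentation-bar:                  # example package name
    paths:
      - packages/instrumentation-bar/
    carryforward: true
```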
As said before, unit tests alone are not enough to get the real coverage: running the tests with only one version of the instrumented library is unlikely to exercise all code paths. Instead, if we were able to run and collect coverage from test-all-versions, the coverage report would be more realistic. The problem here is that TAV runs on different CI nodes with different Node.js versions, and we need all test results in the same place to generate a report for Codecov. This can be achieved by using the upload-artifact action to stash the test results on each machine, then download and merge them on a single node and upload the result to Codecov.
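A possible shape for that stash/merge/upload sequence is sketched below. The artifact names, file layout and merge step are assumptions; a real setup has to make sure coverage files from different matrix jobs do not collide by name.

```yaml
# Hypothetical excerpt: every TAV matrix job stashes its coverage output;
# a final job downloads all of them, merges the results and uploads the
# combined report to Codecov with flags.
jobs:
  tav:
    strategy:
      matrix:
        node: [18, 20, 22]
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm run test-all-versions          # assumed script name
      - uses: actions/upload-artifact@v4
        with:
          name: coverage-node-${{ matrix.node }}
          path: coverage/
  report-coverage:
    needs: tav
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@v4
        with:
          pattern: coverage-node-*
          path: coverage-parts/        # one subfolder per artifact
      - run: |
          # flatten into one folder with unique names, then merge (assumed layout)
          mkdir -p merged coverage
          for d in coverage-parts/*; do
            cp "$d/coverage-final.json" "merged/$(basename "$d").json"
          done
          npx nyc merge merged/ coverage/merged.json
      - uses: codecov/codecov-action@v4
        with:
          files: coverage/merged.json
          flags: instrumentation-foo            # one flag per affected package
```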
The process becomes more sequential, but we use fewer resources since we only compile and test the affected packages, which is usually a small set. The changed workflow would look like:
```mermaid
---
title: Future PR Workflow
---
flowchart TD
T[Tests]
TC[Compile Affected]
T18["Unit Test Affected (nodejs@18)"]
T20["Unit Test Affected (nodejs@20)"]
T22["Unit Test Affected (nodejs@22)"]
TAV18["TAV Affected (nodejs@18)"]
TAV20["TAV Affected (nodejs@20)"]
TAV22["TAV Affected (nodejs@22)"]
UP18["Stash coverage (nodejs@18)"]
UP20["Stash coverage (nodejs@20)"]
UP22["Stash coverage (nodejs@22)"]
MC["Merge and upload coverage with flags"]
T --> TC
TC --> T18
TC --> T20
TC --> T22
T18 --> TAV18
T20 --> TAV20
T22 --> TAV22
TAV18 --> UP18
TAV20 --> UP20
TAV22 --> UP22
UP18 --> MC
UP20 --> MC
UP22 --> MC
```
NOTE: the term "affected" is literal, meaning that if a modified package has dependants in the same monorepo they become affected too, so compilation and tests will run for these as well (nx already takes care of this).
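For reference, the affected set could be computed directly in the workflow with nx; a minimal sketch, where the target names are assumptions about this repo's nx setup:

```yaml
# Hypothetical workflow steps: nx computes the affected projects (their
# dependants included) from the diff against the base branch, and runs
# the given target only for those.
steps:
  - run: npx nx affected --target=compile --base=origin/main
  - run: npx nx affected --target=test --base=origin/main
```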
- This only affects the JavaScript OpenTelemetry library
- This may affect other libraries, but I would like to get opinions here first