
[FEATURE][EXPERIMENTAL] Dependency graph based testing strategy and related pipeline #3738

Merged

Conversation

cdkini
Member

@cdkini cdkini commented Nov 23, 2021

THIS IS A PROOF OF CONCEPT

Changes proposed in this pull request:

Programmatically determine which test files to run in Azure through a script (we follow a similar approach in our docs integration). This reduces the coverage of each individual run but greatly improves performance. The number of tests selected depends on how many files have changed and how central those files are to the codebase: a change to data_context.py will cause 99% of the suite to run, while a change to rule.py will finish quickly since it is less impactful.
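
For intuition, here is a minimal sketch of what a dependency-graph-based selection script could look like. This is an illustration only, not the actual scripts/determine_test_files_to_run.py; the repository layout, the placeholder changed-module set, and the test-file naming check are all assumptions. The idea is to build a reverse import graph with the standard-library ast module and print every test module that transitively depends on a changed module, so the output can be piped to xargs pytest.

# Hypothetical sketch -- not the actual scripts/determine_test_files_to_run.py.
# Build a reverse import graph, then print every test file that (transitively)
# depends on a changed module; the output is intended for `xargs pytest`.
import ast
from collections import defaultdict
from pathlib import Path

def module_name(path: Path, root: Path) -> str:
    """Convert a file path into a dotted module name relative to the repo root."""
    return ".".join(path.relative_to(root).with_suffix("").parts)

def build_reverse_import_graph(root: Path) -> dict:
    """Map each imported module name to the set of modules that import it."""
    importers = defaultdict(set)
    for path in root.rglob("*.py"):
        mod = module_name(path, root)
        try:
            tree = ast.parse(path.read_text(encoding="utf-8"))
        except SyntaxError:
            continue  # skip files that are not plain importable Python
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                for alias in node.names:
                    importers[alias.name].add(mod)
            elif isinstance(node, ast.ImportFrom) and node.module:
                importers[node.module].add(mod)
    return importers

def affected_modules(changed: set, importers: dict) -> set:
    """Walk the reverse graph: anything importing an affected module is affected too."""
    seen, stack = set(changed), list(changed)
    while stack:
        for dependent in importers.get(stack.pop(), ()):
            if dependent not in seen:
                seen.add(dependent)
                stack.append(dependent)
    return seen

if __name__ == "__main__":
    repo_root = Path.cwd()
    graph = build_reverse_import_graph(repo_root)
    # Placeholder changed set; in practice it would be derived from `git diff --name-only`.
    changed = {"great_expectations.rule"}
    for mod in sorted(affected_modules(changed, graph)):
        if mod.rpartition(".")[2].startswith("test_"):
            print(mod.replace(".", "/") + ".py")

This is also why the cost scales with how central the changed file is: a module imported directly or transitively by most of the codebase, like data_context.py, pulls in nearly the whole test suite, while a leaf module like rule.py pulls in only a handful of test files. A real implementation would also need to handle relative imports and re-exports through __init__.py, which this sketch ignores.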

CI/CD currently
[Screenshot: current CI/CD pipeline run]

CI/CD with script (changed rule.py)
[Screenshot: CI/CD pipeline run with the script]
Note that this script is NOT running the same number of tests (that's where the performance gain comes from).
Also note that this is a best-case scenario; on average, runs will still be faster than the current pipeline, though not by this much.

If we supplement this dynamic test runner with daily runs of the entire suite, I think we can maintain strong coverage while dramatically improving CI/CD time. The open questions are whether the complexity of this script is worth the proposed benefit and whether it provides enough coverage to be reliable.


Definition of Done

Please delete options that are not relevant.

  • My code follows the Great Expectations style guide
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added unit tests where applicable and made sure that new and existing tests are passing
  • I have run any local integration tests and made sure that nothing is broken

Thank you for submitting!

@netlify

netlify bot commented Nov 23, 2021

✔️ Deploy Preview for niobium-lead-7998 ready!

🔨 Explore the source changes: 300d1c9

🔍 Inspect the deploy log: https://app.netlify.com/sites/niobium-lead-7998/deploys/61b0d3cf116cd7000780222f

😎 Browse the preview: https://deploy-preview-3738--niobium-lead-7998.netlify.app

@github-actions
Contributor

HOWDY! This is your friendly 🤖 CHANGELOG bot 🤖

Please don't forget to add a clear and succinct description of your change under the Develop header in docs_rtd/changelog.rst, if applicable. This will help us with the release process. See the Contribution checklist in the Great Expectations documentation for the type of labels to use!

Thank you!

@cdkini cdkini changed the title [EXPERIMENTAL][WIP][FEATURE] Script to streamline Azure pipeline testing runs [WIP][FEATURE] Script to streamline Azure pipeline testing runs Nov 23, 2021
@cdkini cdkini self-assigned this Nov 23, 2021
@cdkini cdkini changed the title [WIP][FEATURE] Script to streamline Azure pipeline testing runs [Proof of Concept] Script to streamline Azure pipeline testing runs Nov 23, 2021
@@ -155,7 +155,7 @@ stages:

- script: |
    pip install pytest pytest-cov pytest-azurepipelines
-   pytest $(GE_pytest_opts) --napoleon-docstrings --junitxml=junit/test-results.xml --cov=. --cov-report=xml --cov-report=html --ignore=tests/cli --ignore=tests/integration/usage_statistics
+   python scripts/determine_test_files_to_run.py | xargs pytest $(GE_pytest_opts) --napoleon-docstrings --junitxml=junit/test-results.xml --cov=. --cov-report=xml --cov-report=html --ignore=tests/cli --ignore=tests/integration/usage_statistics
Member Author

We should take the output of the script and store it in a variable so the script is only invoked once.

Comment on lines 1 to 3
# This is a test change
import copy
from typing import Dict, List, Optional
Member Author

A small change to trigger an example run.

@cdkini cdkini force-pushed the feature/script-for-streamlined-testing-in-azure-pipelines branch from d205a37 to 1d32194 on November 23, 2021 23:15
@cdkini cdkini closed this Nov 24, 2021
@cdkini cdkini reopened this Nov 24, 2021
@cdkini cdkini changed the title [Proof of Concept] Script to streamline Azure pipeline testing runs [FEATURE][EXPERIMENTAL] Dependency graph based testing strategy and related pipeline Dec 8, 2021
@cdkini cdkini marked this pull request as ready for review December 8, 2021 00:45
Comment on lines 1 to 10
# This pipeline is meant to run the GE test suite with an experimental test runner strategy.
# The significant differences between this YAML and the primary azure-pipelines.yml file are
# that this only tests Python 3.8 (for performance considerations) and utilizes a
# custom script to filter the test files selected and passed on to pytest.
trigger:
  branches:
    include:
      - pre_pr-*
      - develop
      - main
Member Author

Let me know if this should be streamlined further (or conversely, more closely follow the main YAML).

An alternative approach would be to add this change to our current pipeline as an additional stage in our compatibility or comprehensive matrices. This would introduce the change to production but reduce the overall burden of the experiment.

@@ -0,0 +1,265 @@
"""
Contributor

❤️

@@ -0,0 +1,333 @@
# This pipeline is meant to run the GE test suite with an experimental test runner strategy.
Contributor

❤️

Contributor

@alexsherstinsky alexsherstinsky left a comment


LGTM! ❤️ Beautiful code -- so excited to start using it! Thank you so much, @cdkini !

…tions into feature/script-for-streamlined-testing-in-azure-pipelines
@cdkini cdkini merged commit 1f150ff into develop Dec 8, 2021
@cdkini cdkini deleted the feature/script-for-streamlined-testing-in-azure-pipelines branch December 8, 2021 15:54
Shinnnyshinshin pushed a commit that referenced this pull request Dec 8, 2021
…NT-29/instrument-expectation-suite-for-usage-stats

* feature/ANT-29/datacontext-is-singleton:
  update tests after review and fantastic suggestsions
  [FEATURE][EXPERIMENTAL] Dependency graph based testing strategy and related pipeline (#3738)
  Fix issue where configuration store didn't allow nesting (#3811)
  [FEATURE] Add suite creation type field to CLI SUITE "new" and "edit" Usage Statistics events (#3810)