Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CI] Enable CI and setup all logistics #15

Merged
merged 35 commits into from
Aug 16, 2022
Merged

[CI] Enable CI and setup all logistics #15

merged 35 commits into from
Aug 16, 2022

Conversation

adam2392
Copy link
Collaborator

First step for #11

Summary for CI

We should keep CI and setup and installation workflow relatively similar. From my exp in OSS, this alleviates developer burden and overall technical debt.

This PR adds additional checks to the CI that are based off best practices:

  • circleCI will build documentation for every PR and commit to this repo
  • GH actions has a job to link output of circleCI to be linked to the checks
  • GH actions will test for style using poetry
  • GH actions will test for building the overall package
  • GH actions will run unit tests
  • GH actions will upload code coverage from unit tests
  • GH actions will also automatically release on pypi given a tag is made on main branch

TODO for CI:

  • GH actions should also weekly test integration suite. I.e. this will eventually encompass networkx, dowhy, pywhy_graphs and more.
  • ideally poetry doesn't install all dependencies for every single job it runs

Summary for repo docs

I've added more details to the README based on my experience. What would be nice is to extend the main figure in dowhy's README to encompass causal discovery. Otw, I think short READMEs are the way to go and then linking to the documentation.

Summary for code documentation

I've uploaded boiler plate documentation that will leverage the pydata sphinx layout. Inside we should track: contributors, and PRs. The API is also auto-rendered here via sphinx.

cc: @robertness

Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
Copy link
Collaborator

@darthtrevino darthtrevino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM; only comment is setup.py is no longer necessary


from setuptools import setup

setup()
Copy link
Collaborator

@darthtrevino darthtrevino Aug 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need setup.py with Poetry, on package build it will auto-generate a setup.py with metadata from pyproject.toml.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we if a developer git clones and wants to run pip install -e .? python-poetry/poetry#34

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does poetry install -E <extra_name> -E <other_extra_nome> not work for you?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No that only installs dev dependencies. Currently in python-poetry/poetry#34 it is said that one cannot run the equivalent of pip install -e ..

Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
@adam2392
Copy link
Collaborator Author

adam2392 commented Aug 12, 2022

@darthtrevino do you know why the circleCI is not catching the install for matplotlib?

Also do you know why poetry bugs out when installing numpy for python v3.10?

These are the only 2 issues left before this is good to merge.

README.md Outdated

dodiscover is a Python library for causal discovery, or structure learning. This is generally considered a "first step" in the causal inference pipeline, if one does not have access to a hypothesized causal graph for their situation.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming "structure learning" is a bigger scope than "causal discovery" Let's narrow scope down to "Python library for causal discovery (causal structure learning)."

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested rephrasing

dodiscovery is a Python library for causal discovery (causal structure learning). Causal discovery is useful when you aren't sure what specific causal questions you have in your domain and want to start by exploring causal relationships in the data. dodiscovery also supports finding the causal structure you need to target specific causal inferences (e.g., cause effect inference).

I think it is best to avoid framing global causal discovery necessarily as the first step of a cause inference pipeline. Causal discovery tends to focus on getting the "right" graph given the data you have, and most of causal effect inference methods focus on getting the "right" causal effect in a way that is robust to not knowing all of the graph or observing all of the data. I think we'll address this with learning ADMG's, providing Markov blanket discovery, etc. But we shouldn't overpromise.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Collaborator

@robertness robertness left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requested changes to README, otherwise LGTM

README.md Outdated

dodiscover is a Python library for causal discovery, or structure learning. This is generally considered a "first step" in the causal inference pipeline, if one does not have access to a hypothesized causal graph for their situation.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested rephrasing

dodiscovery is a Python library for causal discovery (causal structure learning). Causal discovery is useful when you aren't sure what specific causal questions you have in your domain and want to start by exploring causal relationships in the data. dodiscovery also supports finding the causal structure you need to target specific causal inferences (e.g., cause effect inference).

I think it is best to avoid framing global causal discovery necessarily as the first step of a cause inference pipeline. Causal discovery tends to focus on getting the "right" graph given the data you have, and most of causal effect inference methods focus on getting the "right" causal effect in a way that is robust to not knowing all of the graph or observing all of the data. I think we'll address this with learning ADMG's, providing Markov blanket discovery, etc. But we shouldn't overpromise.

README.md Outdated Show resolved Hide resolved
doc/references.bib Show resolved Hide resolved
Copy link
Collaborator Author

@adam2392 adam2392 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gonna try then what was suggested in discord.

pyproject.toml Outdated Show resolved Hide resolved
pyproject.toml Outdated Show resolved Hide resolved
Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
@adam2392
Copy link
Collaborator Author

Just tryna wrangle with the circleCI docs building

Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
@adam2392
Copy link
Collaborator Author

Cross-linking with poetry issue: python-poetry/poetry#6177

It seems there is an overall flaw in poetry that does not allow one to specify separate dev-dependencies. Therefore dev-dependencies must all be installed for any dev step (e.g. matplotlib must be installed to do style checking).

Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
@adam2392 adam2392 merged commit 7533e90 into py-why:main Aug 16, 2022
@adam2392 adam2392 deleted the ci2 branch August 16, 2022 19:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants