Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set data when building Linearmodel #249

Merged
merged 10 commits into from
Nov 13, 2023

Conversation

pdb5627
Copy link
Contributor

@pdb5627 pdb5627 commented Oct 7, 2023

Fixes #248

I also have a commit in here with some clean-up to ModelBuilder.sample_prior_predictive that I saw while working on this. It's unrelated but relatively minor.

Since model_builder must create model object and set data, data only
needs to be set if model object already exists and no need to check for
model object before sampling.
@twiecki
Copy link
Member

twiecki commented Nov 1, 2023

Thanks @pdb5627! Not sure what the failing test is about, I restarted for now.

pymc_experimental/model_builder.py Outdated Show resolved Hide resolved
pymc_experimental/tests/test_linearmodel.py Outdated Show resolved Hide resolved
pymc_experimental/tests/test_linearmodel.py Outdated Show resolved Hide resolved
pymc_experimental/tests/test_linearmodel.py Outdated Show resolved Hide resolved
@ricardoV94 ricardoV94 added the bug Something isn't working label Nov 1, 2023
pdb5627 and others added 3 commits November 1, 2023 14:01
Use np.testing.assert_allclose instead of pytest.approx

Make fitted_model use fewer samples for fitting and do a
separate model and fit for checking parameter recovery.
@twiecki
Copy link
Member

twiecki commented Nov 6, 2023

Ping @ricardoV94

@twiecki twiecki requested a review from ricardoV94 November 6, 2023 14:32
self.set_idata_attrs(prior_pred)
if extend_idata:
if self.idata is not None:
self.idata.extend(prior_pred, join="right")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we test that all these join="right" are doing the right thing (i.e., discarding the old value and replacing the new one), and that extend_idata=False is being respected?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be possible. I'll work on it and update the PR accordingly.

@@ -220,6 +220,68 @@ def test_sample_posterior_predictive(fitted_model_instance, combined):
assert np.issubdtype(pred[fitted_model_instance.output_var].dtype, np.floating)


@pytest.mark.parametrize("extend_idata", [True, False])
def test_sample_prior_extend_idata_param(fitted_model_instance, extend_idata):
Copy link
Member

@ricardoV94 ricardoV94 Nov 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm concerned about mutating the fitted_model_instance in these tests, as it may be used in other tests.
In general I am not a fan of fixtures that return mutable objects that we probably want to mutate further to investigate the behavior. I suggest creating a dummy model in this test and pass a copy of the fitted idata at the beginning.

Also can we parametrize the sample method to merge this and the posterior predictive tests. They seem to share most of the logic anyway.

Otherwise I think it's more than ready

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you're saying about modifying the fitted_model_instance object. My thought is to make the fitted_model_instance fixture return a copy of the model instance so individual tests can do what they like with it without any possibility of affecting other tests. The overhead for making a copy doesn't seem to add much to the test run time.

I originally wrote the prior and posterior predictive tests in the same test, but there were so many if else branches that I decided to split the test. But then I ended up making intermediate variables to clean up the code, so it turned out the same after all, so I'm combining them again. Thanks for the suggestion.

Make copies of `fitted_model_instance` to keep tests from interfering
with each other.

Combine `test_sample_prior_extend_idata_param` and
`test_sample_posterior_extend_idata_param` to reduce code repetition.
"""Get a fitted model instance. The instance is copied after being fit,
so tests using this fixture can modify the model object without affecting
other tests."""
return copy.deepcopy(fitted_model_instance_base)
Copy link
Member

@ricardoV94 ricardoV94 Nov 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copy doesn't really work for objects that have PyMC models: see pymc-devs/pymc#6985

The approach is not too bad though. What I suggest is to create the idata once and then in this fixture recreate the model and glue-in a copy of the idata. I did something like that with a helper method in this PR: pymc-labs/pymc-marketing@44985a8

Check the _build_with_idata method and how that's used by thin_fit_result. Something similar could be used for a ModelBuilder.copy(), but for now you can just reimplement the logic in this fixture if you want.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I came up with a workaround for copying the model without using copy.deepcopy.

I also noticed that there's a test marked for skipping on win32 due to lack of permissions for temp files, but the marked test doesn't use a temp file. There is a different test that does use a temp file. I thought maybe the annotation got onto the wrong test, so I made a commit to fix that possible issue. If that's wrong or you want to handle it as it's own issue, no problem, I'll take that commit back out.

Copy link
Member

@ricardoV94 ricardoV94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

@ricardoV94 ricardoV94 changed the title Linearmodel set data Linearmodel set data during fit Nov 13, 2023
@ricardoV94 ricardoV94 changed the title Linearmodel set data during fit Linearmodel set data during build Nov 13, 2023
@ricardoV94 ricardoV94 changed the title Linearmodel set data during build Set data when building Linearmodel Nov 13, 2023
@ricardoV94 ricardoV94 merged commit a362ff0 into pymc-devs:main Nov 13, 2023
6 checks passed
@twiecki
Copy link
Member

twiecki commented Nov 13, 2023

Thanks for the contribution @pdb5627, does this unblock you for the ModelBuilder refactor?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

LinearModel not setting data
3 participants