Set data when building Linearmodel #249

pdb5627 · 2023-10-07T06:30:28Z

Fixes #248

This PR also includes a commit with some clean-up of ModelBuilder.sample_prior_predictive that I noticed while working on this. It's unrelated but relatively minor.

Since model_builder must create the model object and set the data, the data only needs to be set separately if the model object already exists, and there is no need to check for the model object before sampling.
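
The control flow described above can be sketched roughly as follows. This is a minimal stand-in, not the actual ModelBuilder code: `ToyBuilder`, the bodies of `build_model` and `_data_setter`, and the dict used as a "model" are all hypothetical and only illustrate the branching.

```python
class ToyBuilder:
    """Hypothetical stand-in for a model builder; not the real ModelBuilder."""

    def __init__(self):
        self.model = None  # no model object exists until build_model runs

    def build_model(self, X, y):
        # Building creates the model object and sets its data in one step.
        self.model = {"X": list(X), "y": list(y)}

    def _data_setter(self, X, y):
        # Replaces the data on a model object that already exists.
        self.model["X"] = list(X)
        self.model["y"] = list(y)

    def sample_prior_predictive(self, X, y):
        if self.model is None:
            self.build_model(X, y)  # data is set as part of building
        else:
            self._data_setter(X, y)  # model exists, so only update the data
        # A model is guaranteed to exist here, so no further check is needed.
        return {"n_obs": len(self.model["X"])}
```

Calling `sample_prior_predictive` twice on the same instance exercises both branches: the first call builds the model, the second only swaps the data.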

twiecki · 2023-11-01T10:06:47Z

Thanks @pdb5627! Not sure what the failing test is about; I restarted it for now.

pymc_experimental/model_builder.py

pymc_experimental/tests/test_linearmodel.py

Co-authored-by: Ricardo Vieira <[email protected]>

Use np.testing.assert_allclose instead of pytest.approx. Make fitted_model use fewer samples for fitting, and use a separate model and fit for checking parameter recovery.
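
A parameter-recovery check in this style might look like the sketch below. It uses an ordinary least-squares fit via `np.polyfit` rather than a sampled PyMC model, and the true parameters, noise level, and tolerance are made-up values chosen for illustration.

```python
import numpy as np

# Simulate data from a known linear relationship (values are illustrative).
rng = np.random.default_rng(42)
true_slope, true_intercept = 2.0, -1.0
x = np.linspace(0.0, 10.0, 200)
y = true_intercept + true_slope * x + rng.normal(scale=0.1, size=x.size)

# Fit a degree-1 polynomial; polyfit returns the highest-order coefficient first.
slope, intercept = np.polyfit(x, y, deg=1)

# Passes when |actual - desired| <= rtol * |desired| for every element.
np.testing.assert_allclose(
    [slope, intercept], [true_slope, true_intercept], rtol=0.1
)
```

With `assert_allclose` the tolerance is stated explicitly, and a failure message reports the largest absolute and relative differences between the arrays.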

twiecki · 2023-11-06T14:32:09Z

Ping @ricardoV94

ricardoV94 · 2023-11-06T17:11:12Z

pymc_experimental/model_builder.py

+            self.set_idata_attrs(prior_pred)
+            if extend_idata:
+                if self.idata is not None:
+                    self.idata.extend(prior_pred, join="right")


Can we test that all these join="right" calls are doing the right thing (i.e., discarding the old value and replacing it with the new one), and that extend_idata=False is being respected?

It should be possible. I'll work on it and update the PR accordingly.
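
The two behaviours the review asks about can be illustrated with plain dictionaries standing in for the inference data. Everything here is hypothetical: `extend`, `sample_prior_predictive`, and `check` are not the ArviZ or ModelBuilder API, and the sketch assumes that join="right" means groups from the new object overwrite existing groups of the same name.

```python
def extend(groups, new_groups, join="left"):
    """Dict-based stand-in for merging inference-data groups (not the ArviZ API)."""
    for name, value in new_groups.items():
        # "left" keeps an existing group; "right" overwrites it with the new one.
        if join == "right" or name not in groups:
            groups[name] = value


def sample_prior_predictive(idata, draw, extend_idata=True):
    """Hypothetical sampler: returns new draws, optionally merging them into idata."""
    prior_pred = {"prior_predictive": draw}
    if extend_idata and idata is not None:
        extend(idata, prior_pred, join="right")
    return prior_pred


def check(extend_idata):
    """Return the stored prior_predictive group after one sampling call."""
    idata = {"posterior": "fit", "prior_predictive": "old"}
    sample_prior_predictive(idata, "new", extend_idata=extend_idata)
    return idata["prior_predictive"]
```

A test along these lines would assert that the stored group is replaced when `extend_idata=True` and left untouched when `extend_idata=False`.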

ricardoV94 · 2023-11-10T16:11:10Z

pymc_experimental/tests/test_model_builder.py

@@ -220,6 +220,68 @@ def test_sample_posterior_predictive(fitted_model_instance, combined):
    assert np.issubdtype(pred[fitted_model_instance.output_var].dtype, np.floating)


+@pytest.mark.parametrize("extend_idata", [True, False])
+def test_sample_prior_extend_idata_param(fitted_model_instance, extend_idata):


I'm concerned about mutating the fitted_model_instance in these tests, as it may be used in other tests.
In general I am not a fan of fixtures that return mutable objects that we probably want to mutate further to investigate the behavior. I suggest creating a dummy model in this test and passing a copy of the fitted idata at the beginning.

Also, can we parametrize the sample method to merge this and the posterior predictive tests? They seem to share most of the logic anyway.

Otherwise I think it's more than ready.

I see what you're saying about modifying the fitted_model_instance object. My thought is to make the fitted_model_instance fixture return a copy of the model instance so individual tests can do what they like with it without any possibility of affecting other tests. The overhead for making a copy doesn't seem to add much to the test run time.

I originally wrote the prior and posterior predictive tests as a single test, but there were so many if/else branches that I decided to split it. I then introduced intermediate variables to clean up the code, and the two tests turned out the same after all, so I'm combining them again. Thanks for the suggestion.

Make copies of `fitted_model_instance` to keep tests from interfering with each other. Combine `test_sample_prior_extend_idata_param` and `test_sample_posterior_extend_idata_param` to reduce code repetition.

ricardoV94 · 2023-11-11T11:41:12Z

pymc_experimental/tests/test_model_builder.py

+    """Get a fitted model instance. The instance is copied after being fit,
+    so tests using this fixture can modify the model object without affecting
+    other tests."""
+    return copy.deepcopy(fitted_model_instance_base)


Copying doesn't really work for objects that hold PyMC models; see pymc-devs/pymc#6985.

The approach is not too bad though. What I suggest is to create the idata once and then, in this fixture, recreate the model and glue in a copy of the idata. I did something like that with a helper method in this PR: pymc-labs/pymc-marketing@44985a8

Check the _build_with_idata method and how that's used by thin_fit_result. Something similar could be used for a ModelBuilder.copy(), but for now you can just reimplement the logic in this fixture if you want.
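
The suggestion above (fit once, then rebuild the model and attach a copy of the results) can be sketched in plain Python. This is only an illustration under stated assumptions: `ToyBuilder` and `copy_fitted` are hypothetical names, a bare `object()` stands in for a PyMC model, and a dict stands in for the idata. It is not the `_build_with_idata` helper referenced above.

```python
import copy


class ToyBuilder:
    """Hypothetical stand-in; the real ModelBuilder and PyMC model are not used."""

    def __init__(self, config):
        self.config = config
        self.model = None
        self.idata = None

    def build_model(self):
        # Stand-in for constructing a model object that should not be deep-copied.
        self.model = object()

    def fit(self):
        self.build_model()
        self.idata = {"posterior": [0.1, 0.2, 0.3]}  # placeholder results


def copy_fitted(base):
    """Rebuild the model from scratch and attach a copy of the fitted results."""
    clone = ToyBuilder(copy.deepcopy(base.config))
    clone.build_model()  # fresh model object instead of a copied one
    clone.idata = copy.deepcopy(base.idata)  # results are plain data, safe to copy
    return clone
```

A fixture built this way would fit the base instance once and hand each test its own clone, so mutating the clone's idata cannot leak into other tests.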

I came up with a workaround for copying the model without using copy.deepcopy.

I also noticed that there's a test marked for skipping on win32 due to lack of permissions for temp files, but the marked test doesn't use a temp file. There is a different test that does use a temp file. I thought maybe the annotation got onto the wrong test, so I made a commit to fix that possible issue. If that's wrong or you want to handle it as its own issue, no problem, I'll take that commit back out.

ricardoV94

Looks great!

twiecki · 2023-11-13T11:12:03Z

Thanks for the contribution @pdb5627, does this unblock you for the ModelBuilder refactor?

pdb5627 added 3 commits October 7, 2023 09:13

Create test of fitted parameter values

baa445b

Set data when creating model

299dabc

Clean up sample_prior_predictive

56293c0

theorashid mentioned this pull request Nov 1, 2023

model_builder scikit-learn integration #155

Closed

twiecki requested a review from ricardoV94 November 1, 2023 10:06

ricardoV94 reviewed Nov 1, 2023

ricardoV94 added the bug Something isn't working label Nov 1, 2023

pdb5627 and others added 3 commits November 1, 2023 14:01

Apply suggestions from code review

94ad096

Co-authored-by: Ricardo Vieira <[email protected]>

Add join=right to extend calls

28a6cb1

Update test_parameter_fit based on comments from PR review

40abe19

twiecki requested a review from ricardoV94 November 6, 2023 14:32

ricardoV94 reviewed Nov 6, 2023

Add tests for extend_idata parameters

f4bc623

ricardoV94 reviewed Nov 10, 2023

Update based on code review comments

d0f9704

ricardoV94 reviewed Nov 11, 2023

pdb5627 added 2 commits November 11, 2023 15:25

Work around deepcopy not working for model objects

101dae7

Mark correct test for skipping on win32

6850f4c

ricardoV94 approved these changes Nov 13, 2023

ricardoV94 changed the title ~~Linearmodel set data~~ Linearmodel set data during fit Nov 13, 2023

ricardoV94 changed the title ~~Linearmodel set data during fit~~ Linearmodel set data during build Nov 13, 2023

ricardoV94 changed the title ~~Linearmodel set data during build~~ Set data when building Linearmodel Nov 13, 2023

ricardoV94 merged commit a362ff0 into pymc-devs:main Nov 13, 2023
6 checks passed

ricardoV94 mentioned this pull request Jan 12, 2024

Handle new data correctly and extend functionality of MMM posterior predictive methods pymc-labs/pymc-marketing#482

Merged

13 tasks
