WIP: Improvements to designmatrix and fast random models #1334

scottransom · 2022-07-09T03:53:12Z

This provides a new, faster, default method of computing random model residuals that uses the design matrix. It appears to be factors of 2-6 faster than the earlier "exact" method, depending on how many TOAs there are.

In doing so, I improved the designmatrix() method of the timing models so that it doesn't use just a fixed F0 to convert from phases to times, but allows also Taylor expansion calculations, numerical derivatives, and a phase-only method. This is at least part of what has been discussed in #1331

To make the above work, I moved the fitter get_PSR_freq() method to become a timing model method called get_spin_freq(), which also takes the same options (except phase-only) that are mentioned above.

I think all the current tests pass and pintk works, but I haven't yet written new tests.

codecov · 2022-07-09T05:19:08Z

Codecov Report

Base: 65.31% // Head: 63.35% // Decreases project coverage by -1.95% ⚠️

Coverage data is based on head (00cf6f2) compared to base (962674d).
Patch coverage: 69.76% of modified lines in pull request are covered.

❗ Current head 00cf6f2 differs from pull request most recent head 10bd528. Consider uploading reports for the commit 10bd528 to get more accurate results

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1334      +/-   ##
==========================================
- Coverage   65.31%   63.35%   -1.96%     
==========================================
  Files          91       90       -1     
  Lines       21268    20941     -327     
  Branches     3666     3600      -66     
==========================================
- Hits        13891    13267     -624     
- Misses       6525     6861     +336     
+ Partials      852      813      -39

Impacted Files	Coverage Δ
src/pint/pintk/plk.py	`13.41% <0.00%> (ø)`
src/pint/pintk/pulsar.py	`32.67% <ø> (ø)`
src/pint/models/timing_model.py	`82.25% <68.18%> (-0.76%)`	⬇️
src/pint/simulation.py	`77.92% <78.37%> (-0.23%)`	⬇️
src/pint/residuals.py	`70.14% <100.00%> (+2.65%)`	⬆️
src/pint/templates/lctemplate.py	`28.48% <0.00%> (-25.17%)`	⬇️
src/pint/gridutils.py	`43.37% <0.00%> (-14.68%)`	⬇️
src/pint/templates/lcprimitives.py	`28.12% <0.00%> (-12.50%)`	⬇️
src/pint/templates/lcnorm.py	`47.34% <0.00%> (-8.59%)`	⬇️
src/pint/templates/lcfitters.py	`24.51% <0.00%> (-2.40%)`	⬇️
... and 4 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

paulray · 2022-07-09T12:22:56Z

If the calctype is numerical it isn't really returning the spin frequency right? It would be modified by orbital Doppler shifts and such. In fact, isn't it really returning the observed topocentric frequency, not the barycentric frequency?

scottransom · 2022-07-09T13:49:22Z

Yeah, I have not yet tested the detailed differences between the different methods. All I know for sure right now is that they give reasonably close values for an MSP binary system. I'm truthfully not sure exactly what delays should be removed for which use cases. It seems like most use cases would either require the spin rate at the SSB, or maybe at the pulsar system center. I think it should be possible to choose which delays get applied for the numerical method so that you can get either of those or for the topocenter. Maybe that is a good addition. And the default would be SSB-like? (which would include binary doppler shifts?) There are a lot of options.

aarchiba · 2022-07-10T11:15:18Z

This provides a new, faster, default method of computing random model residuals that uses the design matrix. It appears to be factors of 2-6 faster than the earlier "exact" method, depending on how many TOAs there are.

Just to check: "how many TOAs there are" means "how many places you want to sample the random models", right? Not how many input TOAs there are? (I know the current random models splices a few real TOAs at real frequencies in with the fake TOAs at infinite frequencies, which leads to weirdness when fitting DMs.)

Could one speed up the process by keeping design matrix entries for any real TOAs one wanted to include from when they were calculated during the fitting? The Downhill fitters already cache design matrices, though they probably don't expose that to the user.
One doesn't actually need to do a fit, just to compute the parameter covariance matrix, before one can use random matrices. Should fitters have a public interface for computing the covariance matrix only? (Even just "fit with zero iterations"?) Should there be a way to record parameter covariance matrices in, or adjacent to, par files?

aarchiba · 2022-07-10T11:18:43Z

src/pint/models/timing_model.py

+        ----------
+        times : float or long double MJD (can be array), astropy.Time object, TOAs
+            The times (barycentric, inf freq) at which to compute the spin frequency
+        calctype : {'modelF0', 'numerical', 'taylor'}


Perhaps a fourth option, that allowed passing in a fixed frequency, might be helpful to avoid fitter shenanigans where they realize they can rescale their errors by changing the F0 in the model? Or is that better handled by using the prefit model to do this computation?

Where would that fixed frequency come from? Would that be perhaps passed as an optional argument? Although, I guess, as you are saying, it could default to being the prefit F0....

My suggestion was that the caller be allowed to supply it, whether as an optional argument or actually as a floating-point number in calctype. But I'm not sure that's the right way to allow callers to avoid updating the frequency used mid-fit. Given the current implementation, Fitter instances could ensure that the model this was called on was the pre-fit model, not the model updated to reflect fitting.

src/pint/models/timing_model.py

aarchiba · 2022-07-10T11:23:25Z

src/pint/models/timing_model.py

+        # Handle various types of input times for "taylor" calc
+        if isinstance(times, TOAs):
+            # If we pass the TOA table, then barycenter the TOAs
+            bts = self.get_barycentric_toas(times)  # have units of days


Do we actually want to barycenter the TOAs (like, make this dependent on sky position)? If so, should there be a faster shortcut computation method (fasttaylor) that just uses the un-barycentered times?

Yeah, I don't know the answer to the first question. I think it depends on your use case and how precise you want things. A fasttaylor method is probably a good idea no matter what, though.

aarchiba · 2022-07-10T11:24:14Z

src/pint/models/timing_model.py

+            else:
+                raise ValueError("times as Time instance in needs scale=='tdb'")
+        elif isinstance(times, MJDParameter):
+            bts = np.asarray(times.value) << u.d  # .value is always a MJD long double


If it's a Parameter maybe better to use .quantity, which always returns a Quantity ready to go?

Because .quantity for an MJDParameter returns an astropy Time instance, which doesn't work for us here (at least not in an easy way).

aarchiba · 2022-07-10T11:25:08Z

src/pint/models/timing_model.py

+        elif isinstance(times, MJDParameter):
+            bts = np.asarray(times.value) << u.d  # .value is always a MJD long double
+        else:
+            bts = np.asarray(times) << u.d


What if it's already a Quantity but accidentally in seconds?

Good question. Maybe a ValueError since the input times are supposed to be dates (i.e. MJDs, if an array)?

aarchiba · 2022-07-10T11:25:59Z

src/pint/models/timing_model.py

+            bts = np.asarray(times.value) << u.d  # .value is always a MJD long double
+        else:
+            bts = np.asarray(times) << u.d
+        dts = bts - (self.PEPOCH.value << u.d)


Again using the .quantity member of a parameter is probably better than stripping the units and then reapplying days. Particularly since, as shown in the example notebook, users may have created their own PEPOCH parameter - if that's in time units, we want to convert to days, and if not we want an exception.

PEPOCH is an MJDParam, so it returns a Time instance and not a day quantity. So I think we have to do it something like this unless we want to work with astropy TimeDeltas, and we definitely don't want that.

aarchiba · 2022-07-10T11:30:49Z

src/pint/models/timing_model.py

        incfrozen : bool
            Whether to include frozen parameters in the design matrix
        incoffset : bool
            Whether to include the constant offset in the design matrix
+        calctype : {'modelF0', 'numerical', 'taylor'}


Is there a way to make the possibilities for calctype the same here as in other functions that accept this parameter? I guess it's a bit trivial for get_spin_freq but it's nicer to have a more uniform calling convention.

They are all identical now (including with the docstrings), except that .designmatrix also allows a phase option.

aarchiba · 2022-07-10T11:33:11Z

src/pint/pintk/plk.py

@@ -1068,11 +1068,11 @@ def plotResiduals(self, keepAxes=False):
                    else self.psr.postfit_resids
                )
                if self.y_unit == u.us:
-                    f0 = r.get_PSR_freq().to(u.MHz).value
+                    f0 = r.model.F0.quantity.to(u.MHz).value


Should this be r.model.get_spin_freq() with some appropriate options? At a minimum so we can easily identify this as a place we do that conversion?

Maybe? But get_spin_freq requires times to be specified. What time are we going to use for plotting? (and that is all this is used for: the 2nd axis of the residual plots)

aarchiba · 2022-07-10T11:34:01Z

src/pint/models/timing_model.py

@@ -1502,6 +1496,57 @@ def delete_jump_and_flags(self, toa_table, jump_num):
            return
        self.components["PhaseJump"].setup()

+    def get_spin_freq(self, times, calctype="modelF0"):


Should this allow times=None when just using modelF0 to avoid forcing users to make up a bogus time?

That's not a bad idea. Although if you know you are not giving times and explicitly specifying "modelF0", why wouldn't you just use m.F0.quantity rather than calling a method?

src/pint/simulation.py

aarchiba · 2022-07-10T11:43:18Z

src/pint/simulation.py

+    return resids << u.s if return_time else resids
+
+
+def compute_random_model_resids_fast(fitter, models, toas, return_time=False):


The design matrix can also be directly used to compute one-sigma "error bands", with a small additional amount of linear algebra. Such error bands don't fully capture the shape of deviations but can be a very tidy way to summarize the uncertainty in a model. Would it be good to have a function to compute these and a plotting mode in pintk to show them?

This is basically exactly how/why we use the random models now in pintk. But I definitely think it might be useful to be able to do this computation, for instance, at the time of a non-phase-connected TOA in the future to see if it is likely phase-connected or not. We use random models now for that, but this is probably a much more "surgical" option. Do you know of a ref for the linear algebra to do this?

scottransom · 2022-07-10T16:48:29Z

Just to check: "how many TOAs there are" means "how many places you want to sample the random models", right? Not how many input TOAs there are? (I know the current random models splices a few real TOAs at real frequencies in with the fake TOAs at infinite frequencies, which leads to weirdness when fitting DMs.)

Yes, that is correct. I'll edit that in the description.

Could one speed up the process by keeping design matrix entries for any real TOAs one wanted to include from when they were calculated during the fitting? The Downhill fitters already cache design matrices, though they probably don't expose that to the user.

I think cached design matrix calculations would be super useful.

One doesn't actually need to do a fit, just to compute the parameter covariance matrix, before one can use random matrices. Should fitters have a public interface for computing the covariance matrix only? (Even just "fit with zero iterations"?) Should there be a way to record parameter covariance matrices in, or adjacent to, par files?

That would also be really useful, I think.

aarchiba reviewed Jul 10, 2022

View reviewed changes

src/pint/models/timing_model.py Outdated Show resolved Hide resolved

aarchiba reviewed Jul 10, 2022

View reviewed changes

src/pint/simulation.py Outdated Show resolved Hide resolved

aarchiba reviewed Jul 10, 2022

View reviewed changes

src/pint/simulation.py Outdated Show resolved Hide resolved

aarchiba reviewed Jul 10, 2022

View reviewed changes

src/pint/simulation.py Outdated Show resolved Hide resolved

aarchiba reviewed Jul 10, 2022

View reviewed changes

scottransom mentioned this pull request Nov 11, 2022

What is the difference between pint.random_models.random_models and pint.simulation.calculate_random_models? #1442

Open

scottransom added 6 commits November 15, 2022 21:27

Improvements to designmatrix and fast random models

5c880f2

Changed PEPOCH_P0 to PEPOCH and added get_spin_terms() method

9573efc

only copy ref model rather than full fitter when making random models

29e7f68

Fixed docstring first lines

2db9de1

Specify Residuals load directly

110425f

Replace two asserts with exception raises, if needed

145abb3

scottransom force-pushed the random_models_cleanup branch from a92f788 to 145abb3 Compare November 16, 2022 02:36

scottransom added 3 commits November 15, 2022 21:47

Blacked

58d77f8

stylistic code cleanups recommended by Sorcery

00cf6f2

Merge branch 'nanograv:master' into random_models_cleanup

10bd528

abhisrkckl mentioned this pull request Jan 3, 2023

the correct way to get_PSR_freq #713

Open

abhisrkckl mentioned this pull request Nov 16, 2023

Wishlist for PINT development #1576

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

WIP: Improvements to designmatrix and fast random models #1334

WIP: Improvements to designmatrix and fast random models #1334

scottransom commented Jul 9, 2022

codecov bot commented Jul 9, 2022 •

edited

Loading

paulray commented Jul 9, 2022

scottransom commented Jul 9, 2022

aarchiba commented Jul 10, 2022

aarchiba Jul 10, 2022

scottransom Jul 10, 2022

aarchiba Jul 11, 2022

aarchiba Jul 10, 2022

scottransom Jul 10, 2022

aarchiba Jul 10, 2022

scottransom Jul 10, 2022

aarchiba Jul 10, 2022

scottransom Jul 10, 2022

aarchiba Jul 10, 2022 •

edited

Loading

scottransom Jul 10, 2022

aarchiba Jul 10, 2022

scottransom Jul 10, 2022

aarchiba Jul 10, 2022

scottransom Jul 10, 2022

aarchiba Jul 10, 2022

scottransom Jul 10, 2022

aarchiba Jul 10, 2022

scottransom Jul 10, 2022

scottransom commented Jul 10, 2022

		return resids << u.s if return_time else resids


		def compute_random_model_resids_fast(fitter, models, toas, return_time=False):

WIP: Improvements to designmatrix and fast random models #1334

Are you sure you want to change the base?

WIP: Improvements to designmatrix and fast random models #1334

Conversation

scottransom commented Jul 9, 2022

codecov bot commented Jul 9, 2022 • edited Loading

Codecov Report

paulray commented Jul 9, 2022

scottransom commented Jul 9, 2022

aarchiba commented Jul 10, 2022

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

aarchiba Jul 10, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

scottransom commented Jul 10, 2022

codecov bot commented Jul 9, 2022 •

edited

Loading

aarchiba Jul 10, 2022 •

edited

Loading