Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Improvements to designmatrix and fast random models #1334

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

scottransom
Copy link
Member

This provides a new, faster, default method of computing random model residuals that uses the design matrix. It appears to be factors of 2-6 faster than the earlier "exact" method, depending on how many TOAs there are.

In doing so, I improved the designmatrix() method of the timing models so that it doesn't use just a fixed F0 to convert from phases to times, but allows also Taylor expansion calculations, numerical derivatives, and a phase-only method. This is at least part of what has been discussed in #1331

To make the above work, I moved the fitter get_PSR_freq() method to become a timing model method called get_spin_freq(), which also takes the same options (except phase-only) that are mentioned above.

I think all the current tests pass and pintk works, but I haven't yet written new tests.

@codecov
Copy link

codecov bot commented Jul 9, 2022

Codecov Report

Base: 65.31% // Head: 63.35% // Decreases project coverage by -1.95% ⚠️

Coverage data is based on head (00cf6f2) compared to base (962674d).
Patch coverage: 69.76% of modified lines in pull request are covered.

❗ Current head 00cf6f2 differs from pull request most recent head 10bd528. Consider uploading reports for the commit 10bd528 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1334      +/-   ##
==========================================
- Coverage   65.31%   63.35%   -1.96%     
==========================================
  Files          91       90       -1     
  Lines       21268    20941     -327     
  Branches     3666     3600      -66     
==========================================
- Hits        13891    13267     -624     
- Misses       6525     6861     +336     
+ Partials      852      813      -39     
Impacted Files Coverage Δ
src/pint/pintk/plk.py 13.41% <0.00%> (ø)
src/pint/pintk/pulsar.py 32.67% <ø> (ø)
src/pint/models/timing_model.py 82.25% <68.18%> (-0.76%) ⬇️
src/pint/simulation.py 77.92% <78.37%> (-0.23%) ⬇️
src/pint/residuals.py 70.14% <100.00%> (+2.65%) ⬆️
src/pint/templates/lctemplate.py 28.48% <0.00%> (-25.17%) ⬇️
src/pint/gridutils.py 43.37% <0.00%> (-14.68%) ⬇️
src/pint/templates/lcprimitives.py 28.12% <0.00%> (-12.50%) ⬇️
src/pint/templates/lcnorm.py 47.34% <0.00%> (-8.59%) ⬇️
src/pint/templates/lcfitters.py 24.51% <0.00%> (-2.40%) ⬇️
... and 4 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@paulray
Copy link
Member

paulray commented Jul 9, 2022

If the calctype is numerical it isn't really returning the spin frequency right? It would be modified by orbital Doppler shifts and such. In fact, isn't it really returning the observed topocentric frequency, not the barycentric frequency?

@scottransom
Copy link
Member Author

Yeah, I have not yet tested the detailed differences between the different methods. All I know for sure right now is that they give reasonably close values for an MSP binary system. I'm truthfully not sure exactly what delays should be removed for which use cases. It seems like most use cases would either require the spin rate at the SSB, or maybe at the pulsar system center. I think it should be possible to choose which delays get applied for the numerical method so that you can get either of those or for the topocenter. Maybe that is a good addition. And the default would be SSB-like? (which would include binary doppler shifts?) There are a lot of options.

@aarchiba
Copy link
Contributor

This provides a new, faster, default method of computing random model residuals that uses the design matrix. It appears to be factors of 2-6 faster than the earlier "exact" method, depending on how many TOAs there are.

Just to check: "how many TOAs there are" means "how many places you want to sample the random models", right? Not how many input TOAs there are? (I know the current random models splices a few real TOAs at real frequencies in with the fake TOAs at infinite frequencies, which leads to weirdness when fitting DMs.)

  • Could one speed up the process by keeping design matrix entries for any real TOAs one wanted to include from when they were calculated during the fitting? The Downhill fitters already cache design matrices, though they probably don't expose that to the user.
  • One doesn't actually need to do a fit, just to compute the parameter covariance matrix, before one can use random matrices. Should fitters have a public interface for computing the covariance matrix only? (Even just "fit with zero iterations"?) Should there be a way to record parameter covariance matrices in, or adjacent to, par files?

----------
times : float or long double MJD (can be array), astropy.Time object, TOAs
The times (barycentric, inf freq) at which to compute the spin frequency
calctype : {'modelF0', 'numerical', 'taylor'}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps a fourth option, that allowed passing in a fixed frequency, might be helpful to avoid fitter shenanigans where they realize they can rescale their errors by changing the F0 in the model? Or is that better handled by using the prefit model to do this computation?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where would that fixed frequency come from? Would that be perhaps passed as an optional argument? Although, I guess, as you are saying, it could default to being the prefit F0....

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My suggestion was that the caller be allowed to supply it, whether as an optional argument or actually as a floating-point number in calctype. But I'm not sure that's the right way to allow callers to avoid updating the frequency used mid-fit. Given the current implementation, Fitter instances could ensure that the model this was called on was the pre-fit model, not the model updated to reflect fitting.

# Handle various types of input times for "taylor" calc
if isinstance(times, TOAs):
# If we pass the TOA table, then barycenter the TOAs
bts = self.get_barycentric_toas(times) # have units of days
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we actually want to barycenter the TOAs (like, make this dependent on sky position)? If so, should there be a faster shortcut computation method (fasttaylor) that just uses the un-barycentered times?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I don't know the answer to the first question. I think it depends on your use case and how precise you want things. A fasttaylor method is probably a good idea no matter what, though.

else:
raise ValueError("times as Time instance in needs scale=='tdb'")
elif isinstance(times, MJDParameter):
bts = np.asarray(times.value) << u.d # .value is always a MJD long double
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's a Parameter maybe better to use .quantity, which always returns a Quantity ready to go?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because .quantity for an MJDParameter returns an astropy Time instance, which doesn't work for us here (at least not in an easy way).

elif isinstance(times, MJDParameter):
bts = np.asarray(times.value) << u.d # .value is always a MJD long double
else:
bts = np.asarray(times) << u.d
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if it's already a Quantity but accidentally in seconds?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. Maybe a ValueError since the input times are supposed to be dates (i.e. MJDs, if an array)?

bts = np.asarray(times.value) << u.d # .value is always a MJD long double
else:
bts = np.asarray(times) << u.d
dts = bts - (self.PEPOCH.value << u.d)
Copy link
Contributor

@aarchiba aarchiba Jul 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again using the .quantity member of a parameter is probably better than stripping the units and then reapplying days. Particularly since, as shown in the example notebook, users may have created their own PEPOCH parameter - if that's in time units, we want to convert to days, and if not we want an exception.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PEPOCH is an MJDParam, so it returns a Time instance and not a day quantity. So I think we have to do it something like this unless we want to work with astropy TimeDeltas, and we definitely don't want that.

incfrozen : bool
Whether to include frozen parameters in the design matrix
incoffset : bool
Whether to include the constant offset in the design matrix
calctype : {'modelF0', 'numerical', 'taylor'}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to make the possibilities for calctype the same here as in other functions that accept this parameter? I guess it's a bit trivial for get_spin_freq but it's nicer to have a more uniform calling convention.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are all identical now (including with the docstrings), except that .designmatrix also allows a phase option.

@@ -1068,11 +1068,11 @@ def plotResiduals(self, keepAxes=False):
else self.psr.postfit_resids
)
if self.y_unit == u.us:
f0 = r.get_PSR_freq().to(u.MHz).value
f0 = r.model.F0.quantity.to(u.MHz).value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be r.model.get_spin_freq() with some appropriate options? At a minimum so we can easily identify this as a place we do that conversion?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe? But get_spin_freq requires times to be specified. What time are we going to use for plotting? (and that is all this is used for: the 2nd axis of the residual plots)

@@ -1502,6 +1496,57 @@ def delete_jump_and_flags(self, toa_table, jump_num):
return
self.components["PhaseJump"].setup()

def get_spin_freq(self, times, calctype="modelF0"):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this allow times=None when just using modelF0 to avoid forcing users to make up a bogus time?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not a bad idea. Although if you know you are not giving times and explicitly specifying "modelF0", why wouldn't you just use m.F0.quantity rather than calling a method?

src/pint/simulation.py Outdated Show resolved Hide resolved
src/pint/simulation.py Outdated Show resolved Hide resolved
src/pint/simulation.py Outdated Show resolved Hide resolved
return resids << u.s if return_time else resids


def compute_random_model_resids_fast(fitter, models, toas, return_time=False):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The design matrix can also be directly used to compute one-sigma "error bands", with a small additional amount of linear algebra. Such error bands don't fully capture the shape of deviations but can be a very tidy way to summarize the uncertainty in a model. Would it be good to have a function to compute these and a plotting mode in pintk to show them?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is basically exactly how/why we use the random models now in pintk. But I definitely think it might be useful to be able to do this computation, for instance, at the time of a non-phase-connected TOA in the future to see if it is likely phase-connected or not. We use random models now for that, but this is probably a much more "surgical" option. Do you know of a ref for the linear algebra to do this?

@scottransom
Copy link
Member Author

Just to check: "how many TOAs there are" means "how many places you want to sample the random models", right? Not how many input TOAs there are? (I know the current random models splices a few real TOAs at real frequencies in with the fake TOAs at infinite frequencies, which leads to weirdness when fitting DMs.)

Yes, that is correct. I'll edit that in the description.

  • Could one speed up the process by keeping design matrix entries for any real TOAs one wanted to include from when they were calculated during the fitting? The Downhill fitters already cache design matrices, though they probably don't expose that to the user.

I think cached design matrix calculations would be super useful.

  • One doesn't actually need to do a fit, just to compute the parameter covariance matrix, before one can use random matrices. Should fitters have a public interface for computing the covariance matrix only? (Even just "fit with zero iterations"?) Should there be a way to record parameter covariance matrices in, or adjacent to, par files?

That would also be really useful, I think.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants