-
-
Notifications
You must be signed in to change notification settings - Fork 510
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve approximate order on getitem calls #35265
Improve approximate order on getitem calls #35265
Conversation
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## develop #35265 +/- ##
========================================
Coverage 88.61% 88.61%
========================================
Files 2148 2148
Lines 398855 398841 -14
========================================
+ Hits 353438 353439 +1
+ Misses 45417 45402 -15
... and 23 files with indirect coverage changes Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While this might slow things down a little bit, it should make the code more robust. My proposed change is to try and simplify the logic a bit (at least to my eyes).
…/sage into improve_approximate_order_on__getitem__calls
Co-authored-by: Travis Scrimshaw <[email protected]>
I think that this should also simplify the computation of |
dense case is missing, working on it |
…dense case, adapt doctests
src/sage/data_structures/stream.py
Outdated
# Dense implementation | ||
i = n - self._offset | ||
if i >= len(self._cache): | ||
a = len(self._cache) + self._offset | ||
if self._true_order: | ||
# It is important to extend by generator: | ||
# self._iter might recurse, and thereby extend the | ||
# cache itself, too. | ||
self._cache.extend(next(self._iter) for _ in range(a, n+1)) | ||
c = self._cache[i] | ||
else: | ||
for _ in range(a, n+1): | ||
c = next(self._iter) | ||
self._cache.append(c) | ||
if c: | ||
self._true_order = True | ||
else: | ||
self._approximate_order += 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see how to make inline review suggestions that span such changes like this. It would be nice if things worked much more uniformly than they do...
# Dense implementation | |
i = n - self._offset | |
if i >= len(self._cache): | |
a = len(self._cache) + self._offset | |
if self._true_order: | |
# It is important to extend by generator: | |
# self._iter might recurse, and thereby extend the | |
# cache itself, too. | |
self._cache.extend(next(self._iter) for _ in range(a, n+1)) | |
c = self._cache[i] | |
else: | |
for _ in range(a, n+1): | |
c = next(self._iter) | |
self._cache.append(c) | |
if c: | |
self._true_order = True | |
else: | |
self._approximate_order += 1 | |
# Dense implementation | |
i = n - self._offset | |
if i >= len(self._cache): | |
a = len(self._cache) + self._offset | |
if not self._true_order: | |
for _ in range(a, n+1): | |
c = next(self._iter) | |
if c: | |
self._true_order = True | |
break | |
a = len(self._cache) + self._offset | |
self._approximate_order = a | |
if self._true_order: | |
# It is important to extend by generator: | |
# self._iter might recurse, and thereby extend the | |
# cache itself, too. | |
self._cache.extend(next(self._iter) for _ in range(a, n+1)) |
It is slightly more complicated logic, but it reduces some duplication and should be faster. Basically, it works by updating until it finds the true order then proceeds by a direct extension. If it doesn't find the true order but reaches n
, then it falls through with a large enough cache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea, however, as coded it does not work for me. One reason is that we have to put c
into the cache. However, this still breaks _offset
. I am working on it.
Your comment sparkled quite a simplification, please have a look. The idea is to get rid of I noticed, that some (or, at least one) of the (in)equality checks could be improved, I have not done this yet. There is one minor (?) optimisation that is possible in the sparse case: if we compute |
I think improving the (in)equality checks could be done in a separate ticket. |
Sorry, there is still a bug - I tried to make the logic clean, but I still tripped over a case. If |
Doing that for the dense case was something I remember us considering and then rejecting for some reason (which I don't remember). It might have been something relevant to the implementation at the time. I also think for the sparse case not putting things into the cache is the right way to go. The smaller the cache, the less we have to copy/manipulate/store-in-memory/etc. |
This should be OK now. It is not the most beautiful piece of code I've ever written, but I don't see how to make it much better. I also tried it locally with #35293, which works after resolving a trivial merge conflict. It is not clear to me whether in which circumstances I should update to develop. Feel free to click on "update branch" (or tell me to do it in case you do not have permission). |
Oh, and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you. I think this is a good improvement. Let's get this in. If we find anything else, we can do a followup PR. I will also update #35293 accordingly.
Documentation preview for this PR is ready! 🎉 |
Merge conflict |
<!-- ^^^^^ Please provide a concise, informative and self-explanatory title. Don't put issue numbers in there, do this in the PR body below. For example, instead of "Fixes #1234" use "Introduce new method to calculate 1+1" --> ### 📚 Description This address the arity problem noted in #35261. It also removes a `TODO` in a `revert()` implementation by making the behavior the same across different implementations. Warnings are added for the user for the assumptions made by the code. ### 📝 Checklist <!-- Put an `x` in all the boxes that apply. --> <!-- If your change requires a documentation PR, please link it appropriately --> <!-- If you're unsure about any of these, don't hesitate to ask. We're here to help! --> - [x] I have made sure that the title is self-explanatory and the description concisely explains the PR. - [x] I have linked an issue or discussion. - [x] I have created tests covering the changes. - [x] I have updated the documentation accordingly. ### ⌛ Dependencies <!-- List all open pull requests that this PR logically depends on --> - #35127 Avoiding a long test. - #35254 Avoiding merge conflicts. - #35291 Making sure these changes work together. - #35265 For some slight simplification of the new stream's `__getitem__`. URL: #35293 Reported by: Travis Scrimshaw Reviewer(s): Martin Rubey, Travis Scrimshaw
<!-- Please provide a concise, informative and self-explanatory title. --> <!-- Don't put issue numbers in the title. Put it in the Description below. --> <!-- For example, instead of "Fixes sagemath#12345", use "Add a new method to multiply two integers" --> We allow lazy series to compute infinite sums `\sum_{i \in I} p_i` and products `\prod_{i \in I} (1 + p_i)` over arbitrary (enumerated) index sets `I` subject to a somewhat mild technical constraint that the order of each input `p_i` is weakly increasing wrt the iteration order and the preimage of each order is finite. ### 📝 Checklist <!-- Put an `x` in all the boxes that apply. It should be `[x]` not `[x ]`. --> - [x] The title is concise, informative, and self-explanatory. - [x] The description explains in detail what this PR is about. - [x] I have linked a relevant issue or discussion. - [x] I have created tests covering the changes. - [x] I have updated the documentation accordingly. ### ⌛ Dependencies <!-- List all open PRs that this PR logically depends on - sagemath#12345: short description why this is a dependency - sagemath#34567: ... --> - sagemath#35265 <!-- If you're unsure about any of these, don't hesitate to ask. We're here to help! --> URL: sagemath#35362 Reported by: Travis Scrimshaw Reviewer(s): Martin Rubey, Travis Scrimshaw
<!-- Please provide a concise, informative and self-explanatory title. --> <!-- Don't put issue numbers in the title. Put it in the Description below. --> <!-- For example, instead of "Fixes sagemath#12345", use "Add a new method to multiply two integers" --> We allow lazy series to compute infinite sums `\sum_{i \in I} p_i` and products `\prod_{i \in I} (1 + p_i)` over arbitrary (enumerated) index sets `I` subject to a somewhat mild technical constraint that the order of each input `p_i` is weakly increasing wrt the iteration order and the preimage of each order is finite. ### 📝 Checklist <!-- Put an `x` in all the boxes that apply. It should be `[x]` not `[x ]`. --> - [x] The title is concise, informative, and self-explanatory. - [x] The description explains in detail what this PR is about. - [x] I have linked a relevant issue or discussion. - [x] I have created tests covering the changes. - [x] I have updated the documentation accordingly. ### ⌛ Dependencies <!-- List all open PRs that this PR logically depends on - sagemath#12345: short description why this is a dependency - sagemath#34567: ... --> - sagemath#35265 <!-- If you're unsure about any of these, don't hesitate to ask. We're here to help! --> URL: sagemath#35362 Reported by: Travis Scrimshaw Reviewer(s): Martin Rubey, Travis Scrimshaw
📚 Description
Calling
Stream_inexact.__getitem__(n)
forn == self._approximate_order
, we can updateself._true_order
andself._approximate_order
as follows:n
is the true ordern+1
is a lower bound for the order.This fixes #35261, but it may not be the best fix, because it potentially makes
__getitem__
slower.Fixes #34556, which was the original issue.
📝 Checklist
⌛ Dependencies