Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(Data/Nat): faster computation of Nat.log #17325

Open
wants to merge 13 commits into
base: master
Choose a base branch
from
Open

Conversation

b-mehta
Copy link
Contributor

@b-mehta b-mehta commented Oct 1, 2024

Give an alternate definition Nat.logC for Nat.log which computes more efficiently (roughly speaking, this should have logarithmic time in the output, while Nat.log has linear time in its output).
Prove that these two give the same output, and use csimp to ensure evaluation of Nat.log uses Nat.logC instead.


Open in Gitpod

@b-mehta b-mehta added the t-data Data (lists, quotients, numbers, etc) label Oct 1, 2024
Copy link

github-actions bot commented Oct 1, 2024

PR summary 5e2f2d7089

Import changes for modified files

No significant changes to the import graph

Import changes for all files
Files Import difference

Declarations diff

+ logC
+ logTR
+ log_eq_logC

You can run this locally as follows
## summary with just the declaration names:
./scripts/declarations_diff.sh <optional_commit>

## more verbose report:
./scripts/declarations_diff.sh long <optional_commit>

The doc-module for script/declarations_diff.sh contains some details about this script.


No changes to technical debt.

You can run this locally as

./scripts/technical-debt-metrics.sh pr_summary
  • The relative value is the weighted sum of the differences with weight given by the inverse of the current value of the statistic.
  • The absolute value is the relative value divided by the total sum of the inverses of the current values (i.e. the weighted average of the differences).

Mathlib/Data/Nat/Log.lean Outdated Show resolved Hide resolved
Comment on lines +347 to +348
#eval Nat.logTR 2 (2 ^ 1000000)
#eval Nat.logC 2 (2 ^ 1000000)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kim-em, what's the SOTA for committing benchmarks / performance tests like these? It would be great to have a test file that computes a big logarithm, and even better if we can assert it is "quick".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I read it again, the point of this was that Nat.logTR times out on this computation whereas Nat.logC doesn't, and Nat.log hits stack limit at something like Nat.log 2 (2 ^ 11000).
I wouldn't mind removing these lines either, it's largely to save a future person's time if they think "I wonder if tail-recursion would be better than this weird doubling algorithm"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comment really was "let's add a test somewhere of something that didn't work before your change"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm inclined to just create tests/nat/log.lean, if these are meant to be regression tests.

If they are documentation, they should be code blocks inside a module doc next to the definition.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong preference here. What's the distinction between having a code block as a module doc as opposed to a code block in the definition docstring?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean in particular that these aren't meant as regression tests?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm inclined to just create tests/nat/log.lean, if these are meant to be regression tests.

We should add this file, with an invocation that took >20s before and now takes <1s.

@b-mehta b-mehta requested a review from eric-wieser October 8, 2024 12:49
Mathlib/Data/Nat/Log.lean Outdated Show resolved Hide resolved
Mathlib/Data/Nat/Log.lean Outdated Show resolved Hide resolved
@kim-em kim-em added the awaiting-author A reviewer has asked the author a question or requested changes label Oct 25, 2024
Mathlib/Data/Nat/Log.lean Outdated Show resolved Hide resolved
Mathlib/Data/Nat/Log.lean Outdated Show resolved Hide resolved
Co-authored-by: Yury G. Kudryashov <[email protected]>
Mathlib/Data/Nat/Log.lean Outdated Show resolved Hide resolved
Mathlib/Data/Nat/Log.lean Outdated Show resolved Hide resolved
@b-mehta b-mehta removed the awaiting-author A reviewer has asked the author a question or requested changes label Nov 12, 2024
Adapted from https://downloads.haskell.org/~ghc/9.0.1/docs/html/libraries/ghc-bignum-1.0/GHC-Num-BigNat.html#v:bigNatLogBase-35-
-/
@[pp_nodot] def logC (b m : ℕ) : ℕ :=
if h : 1 < b then let (_, e) := step b h; e else 0 where
Copy link
Member

@urkud urkud Nov 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional (I didn't test if it results in the same IR):

Suggested change
if h : 1 < b then let (_, e) := step b h; e else 0 where
if h : 1 < b then (step b h).2 else 0 where

Copy link
Member

@jcommelin jcommelin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bors d+

Comment on lines +347 to +348
#eval Nat.logTR 2 (2 ^ 1000000)
#eval Nat.logC 2 (2 ^ 1000000)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean in particular that these aren't meant as regression tests?

@mathlib-bors
Copy link
Contributor

mathlib-bors bot commented Dec 20, 2024

✌️ b-mehta can now approve this pull request. To approve and merge a pull request, simply reply with bors r+. More detailed instructions are available here.

@b-mehta
Copy link
Contributor Author

b-mehta commented Jan 21, 2025

bors merge

@leanprover-community-mathlib4-bot leanprover-community-mathlib4-bot added the ready-to-merge This PR has been sent to bors. label Jan 21, 2025
test/Nat/log.lean Outdated Show resolved Hide resolved
mathlib-bors bot pushed a commit that referenced this pull request Jan 21, 2025
@mathlib-bors
Copy link
Contributor

mathlib-bors bot commented Jan 21, 2025

Canceled.

@mattrobball
Copy link
Collaborator

Can you also update the PR description with the changes and their reason (for the people who grep the git log)? Thanks.

@b-mehta
Copy link
Contributor Author

b-mehta commented Jan 21, 2025

Updated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
delegated ready-to-merge This PR has been sent to bors. t-data Data (lists, quotients, numbers, etc)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants