implement a fast atan approximation function #1583

zhichen3 · 2024-06-18T18:46:07Z

Implement a fast atan function that uses an approximation for x in [-1, 1]. For |x| > 1, it uses the identity arctan(x) = sign(pi/2, x) - arctan(1/x) (thanks to Eric), where sign(pi/2, x) is pi/2 carrying the sign of x.

Two versions of the approximation are included:

1) Efficient Approximations for the Arctangent Function by Rajan 2006, Equation 9: max error ~0.0015 rad
2) https://stackoverflow.com/questions/42537957/fast-accurate-atan-arctan-approximation-algorithm: max error ~0.00063 rad

In general, I find the form from Stack Overflow, 2), to be both more accurate and faster. 1) is also discussed in 2).

In testing, atanf is generally more than 2 times faster than std::atan.
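The range reduction described above can be sketched as follows. This is a minimal illustration, not the code in this PR: the names `atan_unit_interval`, `fast_atan_sketch`, and `max_abs_error` are hypothetical, and the coefficients 0.2447 and 0.0663 are the ones commonly quoted for Rajan et al.'s Equation 9, so they should be checked against the paper before being relied on.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Approximation of atan(x) for x in [-1, 1].
// Coefficients are assumed to match Rajan et al. (2006), Eq. 9.
inline double atan_unit_interval(double x) {
    constexpr double pi_4 = 0.78539816339744830962;
    const double ax = std::abs(x);
    return pi_4 * x - x * (ax - 1.0) * (0.2447 + 0.0663 * ax);
}

// Extend to all x with arctan(x) = sign(pi/2, x) - arctan(1/x) for |x| > 1.
inline double fast_atan_sketch(double x) {
    constexpr double pi_2 = 1.57079632679489661923;
    if (std::abs(x) <= 1.0) {
        return atan_unit_interval(x);
    }
    // 1/x lies in (-1, 1), so the polynomial is valid there.
    return std::copysign(pi_2, x) - atan_unit_interval(1.0 / x);
}

// Largest absolute deviation from std::atan over n evenly spaced samples.
inline double max_abs_error(double lo, double hi, int n) {
    assert(n > 1 && hi > lo);
    double worst = 0.0;
    for (int i = 0; i < n; ++i) {
        const double x = lo + (hi - lo) * i / (n - 1);
        worst = std::max(worst, std::abs(fast_atan_sketch(x) - std::atan(x)));
    }
    return worst;
}
```

Calling `max_abs_error(-1.0, 1.0, 2001)` gives a quick check of the quoted error bound, and widening the interval exercises the |x| > 1 branch, which inherits the same bound because it only evaluates the polynomial at 1/x.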

util/approx_math/fast_atan.H

zhichen3 · 2024-06-20T01:29:25Z

Not sure why the HIP build failed to compile. Also, I kept the volatile qualifier in the test because otherwise the runtime wouldn't increase with the number of iterations.
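The volatile remark can be illustrated with a small timing loop. This is only a sketch under assumptions, not the benchmark in this PR: `time_atan_loop` is a hypothetical name, and it times `std::atan` rather than the new function. Writing each result to a volatile variable forces the compiler to keep every call, so the measured time scales with the iteration count instead of the loop being optimized away.

```cpp
#include <chrono>
#include <cmath>

// Time n calls to std::atan and return the elapsed wall-clock seconds.
inline double time_atan_loop(int n) {
    // Each store to a volatile object is an observable side effect,
    // so the compiler cannot drop the loop body as dead code.
    volatile double sink = 0.0;
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < n; ++i) {
        sink = std::atan(1.0e-3 * i);
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Comparing `time_atan_loop(n)` for a few values of n is one way to confirm the loop is really being executed before trusting any speedup ratio.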

zingale · 2024-06-20T11:24:34Z

I suspect the HIP failure is due to changes in ROCm and there is a namespace clash now.

zingale · 2024-06-20T11:32:38Z

can you try not including <AMReX_GpuUtility.H> and instead directly include just what you need?

zingale · 2024-06-20T11:37:56Z

or alternately, include <AMReX_Gpu.H>, since that seems to do some namespace magic.

yut23 · 2024-06-20T13:57:47Z

It's probably this line:

Microphysics/networks/iso7/actual_network.H

Line 10 in 30469dd

using namespace amrex;

yut23 · 2024-06-20T14:02:49Z

atanf() may conflict with the definition from the standard library. I think fast_atan() works fine.

zingale · 2024-06-20T15:37:37Z

indeed, we should get rid of those using namespace amrex
I'll try that in a separate PR

zhichen3 · 2024-06-20T20:42:12Z

I figured the NaN check is unnecessary, so I removed it, and I also removed AMReX_GpuUtility.H since I no longer need isnan. I guess this also fixed the HIP compilation error.

yut23 · 2024-06-25T20:20:45Z

I've been looking at the screening code a bunch for the autodiff stuff, and I think we may want to include the third-order term in the Taylor series, since we do sqrt_gamma - atan(sqrt_gamma), which cancels to zero at first order. Here's a comparison between the first- and third-order series: the relative error of the first-order series goes up to around 32% and levels off. There's also a discontinuity when it switches from fast_atan_1 to the Taylor series.
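The cancellation can be seen from the series atan(x) = x - x^3/3 + x^5/5 - ...: if atan(x) is replaced by x alone, then x - atan(x) is exactly zero, while keeping the cubic term leaves the leading contribution x^3/3. The sketch below only illustrates that point. The function names are hypothetical and this is not the screening code or the small-x branch in this PR.

```cpp
#include <cmath>

// Taylor series of atan(x) truncated at first order.
inline double atan_first_order(double x) {
    return x;
}

// Taylor series of atan(x) truncated at third order.
inline double atan_third_order(double x) {
    return x - x * x * x / 3.0;
}

// The quantity discussed above, x - atan(x), using a chosen atan.
template <typename AtanFn>
inline double x_minus_atan(double x, AtanFn atan_fn) {
    return x - atan_fn(x);
}
```

For a small argument such as x = 1.0e-3, `x_minus_atan(x, atan_first_order)` is identically zero, whereas `x_minus_atan(x, atan_third_order)` stays close to the value obtained with `std::atan`.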

zhichen3 · 2024-06-25T21:38:58Z

sounds good to me

zhichen3 added 9 commits June 17, 2024 18:41

initial commit

0e28113

update test with nested loop

4080ed3

update

e836697

update screening to use fast_atan

f9968d9

fix make.package path

8228288

update to use fast_atan between [-1,1]

7e298ff

Fix sign issue

6753f4f

some update

07d7786

remove some comment

89c17d9

zingale reviewed Jun 18, 2024

util/approx_math/fast_atan.H (outdated)

zingale reviewed Jun 18, 2024

util/approx_math/fast_atan.H (outdated)

zhichen3 added 7 commits June 18, 2024 15:38

rename fast_atan -> approx_math

3a7944c

add reference to the approximation formula

7b4ab8f

update tests

f4d86a2

codespell

4126135

update benchmark

829e147

update benchmark again

f155af8

update benchmark again

284b918

zhichen3 changed the title ~~Fast atan~~ implement a fast atan approximation function Jun 19, 2024

zhichen3 added 3 commits June 19, 2024 13:50

update benchmark

fed1fea

update benchmark

4dea18f

update header guard name

2578057

zhichen3 marked this pull request as ready for review June 20, 2024 01:28

remove uncessary header

be7af14

zhichen3 added 2 commits June 20, 2024 11:23

revert name: atanf -> fast_atan

a045d52

update screening atan call

cf873b8

zhichen3 added 6 commits June 20, 2024 11:54

update unit test suggested by eric

860ff11

update when x < 1.e-2, then just return x

fe29743

do some cleaning and get rid of isnan check

682f4ac

slightly more cleaning

709cbed

update benchmark

7c52dc7

update benchmakr again

4e9696a

zhichen3 mentioned this pull request Jun 20, 2024

Fast exp algorithm implementation. #1586

Merged

zingale and others added 5 commits June 21, 2024 13:50

Merge branch 'development' into fast_atan

bfe01d6

Merge branch 'development' into fast_atan

7b1efed

update benchmark

8c4e5db

update benchmark again

fea71ae

Merge branch 'development' into fast_atan

3120318

yut23 approved these changes Jun 22, 2024

zingale merged commit 134200e into AMReX-Astro:development Jun 22, 2024
29 checks passed

yut23 mentioned this pull request Jul 8, 2024

Update microphysics ase_nse_net benchmark pynucastro/pynucastro#755

Merged
