Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First cut at new rational_adaptor. #366

Merged
merged 22 commits into from
Oct 4, 2021
Merged

Conversation

jzmaddock
Copy link
Collaborator

This is a first cut at a new rational_adaptor that doesn't use Boost.Rational and is better optimised for arbitrary precision integers. It passes our current tests, but more tests are required.

@jzmaddock jzmaddock marked this pull request as ready for review September 7, 2021 07:32
@jzmaddock
Copy link
Collaborator Author

If anyone (@ckormanyos ?) can take a look and review/double check this I'd really appreciate it. This removes the dependency to Boost.Rational. I still need to run some performance comparisons, but other than that it should be ready to go now.

@ckormanyos
Copy link
Member

ckormanyos commented Sep 7, 2021

anyone (@ckormanyos ?) can take a look and review/double

Thank you , John for diving into this topic and trying to remove the rational dependency --- a great step toward potential standalone.

  • In lines
    like these, I have gotten into the habit of setting the default parameter to nullptr instead of 0. This is really just a matter of style and I'm not sure if/or when this should/could be done consistently. To me nullptrmakes more sense.
  • When I saw this line, I wondered if std::is_arithmetic would confidently capture all the necessary cases?
  • This preprocessor line is always catching the 0, as has also been pointed out above

@jzmaddock
Copy link
Collaborator Author

I have gotten into the habit of setting the default parameter to nullptr instead of 0. This is really just a matter of style and I'm not sure if/or when this should/could be done consistently. To me nullptrmakes more sense.

Will fix.

When I saw this line, I wondered if std::is_arithmetic would confidently capture all the necessary cases?

The definition in here: https://github.com/boostorg/multiprecision/blob/1b31fbea21bcf5b5b1e7d5ef4a8870aabed2161e/include/boost/multiprecision/traits/std_integer_traits.hpp is almost the same as std::is_arithmetic but adds consistent support for __Int128 etc. Ironically, this workaround was only needed because we switched from boost type_traits to the std ones!

This preprocessor line is always catching the 0, as has also been pointed out above

Not good, good catch, will fix shortly.

Tidy up whitespace, and use nullptr where appropriate.
@afabri
Copy link

afabri commented Sep 9, 2021

At the end of arithmetic operations, instead of simplifying by dividing with the gcd, the gcd function is called several times to simplify intermediate expressions. Does anybody know if that pays off for all sizes of integers or if there is a threshold.

@jzmaddock
Copy link
Collaborator Author

At the end of arithmetic operations, instead of simplifying by dividing with the gcd, the gcd function is called several times to simplify intermediate expressions. Does anybody know if that pays off for all sizes of integers or if there is a threshold.

Good question, I'll experiment.

In the mean time, here's the preliminary performance comparisons, mostly the new code does OK, but there are a couple of strange outliers I need to look into:

Table 1.72. Operator *

Backend 128 Bits 256 Bits 512 Bits 1024 Bits
cpp_rational 1.10175 (1.29208s) 1.17481 (3.01313s) 1.26352 (6.78634s) 1.37821 (16.0559s)
cpp_rational(develop) 1.18315 (1.4364s) 1.39656 (3.60312s) 1.40385 (7.58984s) 1.41806 (17.0363s)
mpq_rational 1 (1.17276s) 1 (2.56478s) 1 (5.37097s) 1 (11.6499s)

Table 1.73. Operator *(int)

Backend 128 Bits 256 Bits 512 Bits 1024 Bits
cpp_rational 1 (0.0440046s) 1 (0.0849735s) 1 (0.148665s) 1 (0.278337s)
cpp_rational(develop) [4.1] (0.181914s) 5.6 (0.387531s) 2.9 (0.573863s) 3.12 (0.870819s)
mpq_rational 7.87033 (0.346331s) 5.60552 (0.476321s) 2.71581 (0.403744s) 1.66801 (0.464269s)

Table 1.74. Operator *(unsigned long long)

Backend 128 Bits 256 Bits 512 Bits 1024 Bits
cpp_rational 1 (0.245235s) 1 (0.482828s) 1.16943 (0.893633s) 2.11565 (1.67612s)
cpp_rational(develop) 1.8 (0.447167s) 1.7 (0.844429s) 1.90081 (1.35896s) 2.97039 (2.37626s)
mpq_rational 2.72477 (0.668207s) 1.42855 (0.689746s) 1 (0.764158s) 1 (0.792249s)

Table 1.75. Operator *=(unsigned long long)

Backend 128 Bits 256 Bits 512 Bits 1024 Bits
cpp_rational 1 (0.29657s) 1 (0.549148s) 1.2823 (0.96603s) 2.38235 (1.9245s)
cpp_rational(develop) 1.45 (0.435778s) 1.23469 (0.847931s) 2.20072 (1.57936s) 3.02908 (2.41011s)
mpq_rational 2.28287 (0.677029s) 1.24615 (0.684319s) 1 (0.753355s) 1 (0.807814s)

Table 1.76. Operator +

Backend 128 Bits 256 Bits 512 Bits 1024 Bits
cpp_rational 1.49938 (0.784278s) 1.34743 (1.6984s) 1.37717 (3.84149s) 1.41546 (9.19853s)
cpp_rational(develop) 2.17171 (1.18719s) 1.91454 (2.43372s) 1.69452 (4.78894s) 1.64027 (10.7223s)
mpq_rational 1 (0.52307s) 1 (1.26047s) 1 (2.78941s) 1 (6.49862s)

Table 1.77. Operator +(int)

Backend 128 Bits 256 Bits 512 Bits 1024 Bits
cpp_rational 5.57552 (0.641649s) 10.796 (1.46357s) 20.894 (3.25673s) 35.3507 (7.36671s)
cpp_rational(develop) 1.81035 (0.212141s) 3.50398 (0.443488s) 3.88847 (0.733996s) 4.12376 (0.902819s)
mpq_rational 1 (0.115083s) 1 (0.135566s) 1 (0.155869s) 1 (0.208389s)

Table 1.78. Operator +(unsigned long long)

Backend 128 Bits 256 Bits 512 Bits 1024 Bits
cpp_rational 2.29109 (0.681809s) 4.62899 (1.50387s) 8.4593 (3.33481s) 21.2893 (8.21041s)
cpp_rational(develop) 1.14587 (0.331196s) 1.59063 (0.467973s) 1.95391 (0.650182s) 2.52932 (0.971168s)
mpq_rational 1 (0.297591s) 1 (0.324881s) 1 (0.394219s) 1 (0.385659s)

Table 1.79. Operator +=(unsigned long long)

Backend 128 Bits 256 Bits 512 Bits 1024 Bits
cpp_rational 1 (0.0627029s) 1 (0.0695193s) 1 (0.0885983s) 1 (0.119314s)
cpp_rational(develop) 1.04679 (0.305264s) 1.48998 (0.453898s) 1.95382 (0.64145s) 2.2881 (0.974988s)
mpq_rational 4.63843 (0.290843s) 4.47252 (0.310927s) 3.82819 (0.339171s) 3.40139 (0.405833s)

Table 1.80. Operator -

Backend 128 Bits 256 Bits 512 Bits 1024 Bits
cpp_rational 1.4821 (0.781748s) 1.34429 (1.7058s) 1.37777 (3.89057s) 1.41853 (9.23662s)
cpp_rational(develop) 1.92786 (1.05892s) 1.88654 (2.40996s) 1.7012 (4.8025s) 1.61842 (10.7252s)
mpq_rational 1 (0.52746s) 1 (1.26893s) 1 (2.82382s) 1 (6.51138s)

Table 1.81. Operator -(int)

Backend 128 Bits 256 Bits 512 Bits 1024 Bits
cpp_rational 5.3698 (0.638218s) 8.3919 (1.46922s) 20.5619 (3.25605s) 34.2709 (7.42821s)
cpp_rational(develop) 1.47683 (0.19s) 3.42541 (0.440702s) 3.26896 (0.675656s) 3.98653 (0.917965s)
mpq_rational 1 (0.118853s) 1 (0.175076s) 1 (0.158353s) 1 (0.21675s)

Table 1.82. Operator -(unsigned long long)

Backend 128 Bits 256 Bits 512 Bits 1024 Bits
cpp_rational 2.28341 (0.680618s) 4.66372 (1.4809s) 8.5964 (3.26282s) 19.2239 (7.52569s)
cpp_rational(develop) 1.08424 (0.330445s) 1.55894 (0.4649s) 2.03842 (0.673936s) 2.60596 (1.03343s)
mpq_rational 1 (0.298071s) 1 (0.317537s) 1 (0.379557s) 1 (0.391477s)

Table 1.83. Operator -=(unsigned long long)

Backend 128 Bits 256 Bits 512 Bits 1024 Bits
cpp_rational 1 (0.0616676s) 1 (0.0716217s) 1 (0.0915789s) 1 (0.123615s)
cpp_rational(develop) 1.11705 (0.321743s) 1.5533 (0.471277s) 1.99514 (0.669887s) 2.5145 (0.994094s)
mpq_rational 4.84987 (0.29908s) 4.31996 (0.309403s) 3.79471 (0.347516s) 3.25761 (0.402688s)

Table 1.84. Operator /

Backend 128 Bits 256 Bits 512 Bits 1024 Bits
cpp_rational 1.34343 (3.80757s) 1.3953 (7.11993s) 1.46997 (15.1096s) 1.65132 (34.8349s)
cpp_rational(develop) 1.3776 (3.98234s) 1.61088 (8.19393s) 1.62661 (17.2913s) 1.73601 (37.1927s)
mpq_rational 1 (2.83421s) 1 (5.10279s) 1 (10.2789s) 1 (21.0952s)

Table 1.85. Operator /(int)

Backend 128 Bits 256 Bits 512 Bits 1024 Bits
cpp_rational 1 (0.0450395s) 1 (0.0887703s) 1 (0.145568s) 1 (0.26773s)
cpp_rational(develop) 4.122 (0.185692s) 4.39 (0.389876s) 3.9 (0.574013s) 3.2 (0.866507s)
mpq_rational 9.33538 (0.420461s) 6.45783 (0.573264s) 3.99751 (0.581909s) 2.04621 (0.547832s)

Table 1.86. Operator /(unsigned long long)

Backend 128 Bits 256 Bits 512 Bits 1024 Bits
cpp_rational 1 (0.248277s) 1 (0.477218s) 1.11799 (0.900395s) 1.77994 (1.70253s)
cpp_rational(develop) 1.81 (0.449759s) 1.77 (0.846367s) 1.71944 (1.3685s) 2.40262 (2.38789s)
mpq_rational 3.01522 (0.74861s) 1.60033 (0.763707s) 1 (0.80537s) 1 (0.956506s)

Table 1.87. Operator /=(unsigned long long)

Backend 128 Bits 256 Bits 512 Bits 1024 Bits
cpp_rational 1 (0.293623s) 1 (0.5531s) 1.10816 (0.964113s) 2.15991 (1.90076s)
cpp_rational(develop) 1.48 (0.4348s) 1.52 (0.839948s) 1.71274 (1.41427s) 2.66951 (2.37585s)
mpq_rational 2.54365 (0.746874s) 1.39243 (0.770152s) 1 (0.870013s) 1 (0.880021s)

Table 1.88. Operator construct

Backend 128 Bits 256 Bits 512 Bits 1024 Bits
cpp_rational 1.6 (0.0073899s) 1.8 (0.0077339s) 1.46 (0.0073439s) 1.65 (0.0075128s)
cpp_rational(develop) 1 (0.0045752s) 1 (0.0042987s) 1 (0.0050471s) 1 (0.0045598s)
mpq_rational 19.6985 (0.14557s) 25.3958 (0.196408s) 25.4413 (0.186838s) 18.6016 (0.13975s)

Table 1.89. Operator construct(unsigned long long)

Backend 128 Bits 256 Bits 512 Bits 1024 Bits
cpp_rational 1 (0.0041005s) 1 (0.0042113s) 1 (0.004049s) 1 (0.004131s)
cpp_rational(develop) 2.66 (0.0109136s) 2.7 (0.0115913s) 2.66 (0.0107292s) 2.47 (0.0102217s)
mpq_rational 93.727 (0.384328s) 107.33 (0.452s) 114.492 (0.463578s) 91.5304 (0.378112s)

Table 1.90. Operator construct(unsigned)

Backend 128 Bits 256 Bits 512 Bits 1024 Bits
cpp_rational 1 (0.0038144s) 1 (0.0040615s) 1 (0.0040119s) 1 (0.0043592s)
cpp_rational(develop) 2.8 (0.01064s) 2.7 (0.0108306s) 2.7 (0.0107739s) 2.6 (0.0112458s)
mpq_rational 37.6055 (0.143443s) 43.7209 (0.177572s) 45.7938 (0.18372s) 30.8193 (0.134348s)

Table 1.91. Operator str

Backend 128 Bits 256 Bits 512 Bits 1024 Bits
cpp_rational 2.71214 (0.001483s) 2.99688 (0.0037521s) 4.5722 (0.0085148s) 5.01046 (0.0215971s)
cpp_rational 5.79066 (0.0033111s) 4.84023 (0.0046534s) 4.92879 (0.0105407s) 5.34843 (0.0224324s)
mpq_rational 1 (0.0005468s) 1 (0.001252s) 1 (0.0018623s) 1 (0.0043104s)

@jzmaddock
Copy link
Collaborator Author

Update: there's a bug in the add-scalar routine that does an unnecessary gcd. Testing fixes now.

@jzmaddock
Copy link
Collaborator Author

At the end of arithmetic operations, instead of simplifying by dividing with the gcd, the gcd function is called several times to simplify intermediate expressions. Does anybody know if that pays off for all sizes of integers or if there is a threshold.

Leaving aside experiments for a moment, there is a good theoretical reason for this: if we assume that gcd is O(N^2) then it is better to do an O(N) operation twice, than an O(2N) operation once, ie 2N^2 is better than (2N)^2.

@jzmaddock
Copy link
Collaborator Author

Much better for +- a scalar now:

Operator +(int)

Backend 128 Bits 256 Bits 512 Bits 1024 Bits
cpp_rational 1 (0.0147207s) 1 (0.0151953s) 1 (0.018852s) 1 (0.0308647s)
cpp_rational(develop) 14.4 (0.212141s) 29.2 (0.443488s) 39.8 (0.733996s) 29.2 (0.902819s)
mpq_rational 7.85834 (0.11568s) 8.38258 (0.127376s) 8.62888 (0.162672s) 6.65762 (0.205485s)

@jzmaddock
Copy link
Collaborator Author

At the end of arithmetic operations, instead of simplifying by dividing with the gcd, the gcd function is called several times to simplify intermediate expressions. Does anybody know if that pays off for all sizes of integers or if there is a threshold.

I coded this up quickly for multiply/divide, and while the code is noticeably simpler, it's also slower at all the bit-counts I tested:

Backend 128 Bits 256 Bits 512 Bits 1024 Bits
cpp_rational, 1x gcd 1.54878s 5.27546s) 8.08625s 19.7865s
cpp_rational, 2x gcd 1.29208s 3.01313s 6.78634s 16.0559s

@afabri
Copy link

afabri commented Sep 11, 2021

Thank you for the benchmarks. Could you put them in this PR or on gist.github.com?
It might be interesting to also compare directly with gmpq_class and maybe even make a benchmark for an expression like a = b * c + d * e.

@jzmaddock
Copy link
Collaborator Author

Thank you for the benchmarks. Could you put them in this PR or on gist.github.com?

The benchmark program is libs/multiprecision/performance/performance_test.cpp (though this does need updating and documenting a bit better).

For the benefit of future googlers, the alternative multiply divide code is:

template <class Backend> 
void eval_multiply_imp(rational_adaptor<Backend>& result, const rational_adaptor<Backend>& a, const Backend& b_num, const Backend& b_denom)
{
   using default_ops::eval_multiply;
   using default_ops::eval_divide;
   using default_ops::eval_gcd;
   using default_ops::eval_get_sign;
   using default_ops::eval_eq;

   eval_multiply(result.num(), a.num(), b_num);
   eval_multiply(result.denom(), a.denom(), b_denom);

   Backend gcd;
   eval_gcd(gcd, result.num(), result.denom());
   if (!eval_eq(gcd, rational_adaptor<Backend>::one()))
   {
      Backend t;
      eval_divide(t, result.num(), gcd);
      result.num().swap(t);
      eval_divide(t, result.denom(), gcd);
      result.denom() = std::move(t);
   }
   //
   // We may have b_denom negative if this is actually division, if so just correct things now:
   //
   if (eval_get_sign(b_denom) < 0)
   {
      result.num().negate();
      result.denom().negate();
   }
}

@jzmaddock
Copy link
Collaborator Author

It might be interesting to also compare directly with gmpq_class and maybe even make a benchmark for an expression like a = b * c + d * e.

What I've been keeping my eye out for, but haven't found yet, is a nice example of a rational number sequence - not too crazy hard to compute, but hard enough to properly exercise the arithmetic operators. Any one any ideas?

@ckormanyos
Copy link
Member

nice example of a rational number sequence

Bernoulli numbers. But they diverge rapidly.

There are some really neat results here. I wonder if something like zeta(18) would work out?

@ckormanyos
Copy link
Member

ckormanyos commented Sep 12, 2021

Consider 10 terms of the expansion of zeta(18) via Table[1/(1 - (1/Prime[n]^18)), {n, 1, 10}]. It looks really neat. Also consider Product[1/(1 - (1/Prime[n]^18)), {n, 1, 10}]

The table is, for instance, here

The result of the product is, I believe:

11920749906148903974622678329766235165644684303503019936856106078371960015175135631236820267948711321333682716658606986520111083984375 / 11920704401324278745505365929928231339092941389995016138294384892997997915658307322817405715785120303841876752017118685699841898110976

@afabri
Copy link

afabri commented Sep 12, 2021

a nice example of a rational number sequence

What we have in our project (www.cgal,org) are things like orientation tests for plane/point, or sphere/point , both computing essentially signs of determinants of small matrices.

Fix generic_interconvert.hpp for unsigned types.
Update performance testing code to include testing with ::value_type.
Start testing unsigned and checked integer types with rational_adaptor.
Update arithmetic tests to test mixed arithmetic with ::value_type.
@jzmaddock
Copy link
Collaborator Author

OK, there are a couple of new Google benchmark programs added, one calculates Bernoulli(m) using the Louis Saalschütz explicit definition, the other calculates zeta(18) using Chris' equivalence.

These results were run on Windows, with the unoptimized "generic" MPIR library, I'd expect gmp to perform much better once architecture specific optimizations are enabled: if someone would like to run these on other platforms that would be cool.

zeta[18] calculation:

Type Time(ns)
BM_zeta18<cpp_rational> 54688
BM_zeta18<mpq_rational> 76730
BM_zeta18<number<rational_adaptor<gmp_int>>> 68011
BM_zeta18<mpq_class,mpz_class> 59375

This is a relatively trivial calculation where the numbers don't grow too large, I suspect cpp_int's better memory usage for small values is the main driver here, though mpq_class is doing rather well - I need to check to see if it's eluding some temporaries somewhere that rational_adaptor isn't.

Bernoulli calculation is a rather sterner test of arithmetic performance, here's what I'm seeing, all the result except those for cpp_rational are relative to cpp_rational's performance, so smaller is better:

m cpp_rational mpq_rational rational_adaptor<gmp_int> mpq_class
10 40806 4.230529824 4.7184237612 3.1453217664
12 59375 4.1353431579 4.4405894737 3.1722778947
14 83702 3.8508398844 4.4081145014 2.8667773769
16 106027 3.908495006 4.4736718006 2.9605194903
18 147524 3.5935441013 4.5392139584 2.7047327892
20 185904 3.6020903262 4.2755723384 2.6895602031
22 245857 3.3337061788 3.8285060015 2.6102205754
24 304813 3.22525286 3.5699592865 2.5172778064
26 374930 3.2558157523 3.4310297922 2.4547195476
28 464965 3.0365683439 3.2254492274 2.4153366382
30 585938 2.8173612225 2.948400684 2.1428547048
32 732422 2.5683868043 2.7826075678 2.0562244171
34 889369 2.4156879765 2.6441094754 1.9424783189
36 1074219 2.3376620596 2.3950339735 1.8328050425
38 1283482 2.200104871 2.3728817389 1.7507639375
40 1506024 2.094483886 2.2886029705 1.7044642051
42 1727580 2.0407975318 2.1764705542 1.7245759965
44 2083333 1.8854753417 1.962209594 1.6197184992
46 2299331 1.8533060268 2.0067115174 1.6352699981
48 2678571 1.8008953281 1.925000308 1.5600777429
50 3045551 1.7406830488 1.8781054725 1.5150276584
52 3216912 1.8214284382 1.9515305361 1.6514284506
54 3760027 1.7067459888 1.8180555618 1.5212300337
56 3997093 1.7451297731 1.8242425183 1.6055195613
58 4589844 1.664302534 1.7399525997 1.7021275669
60 5156250 1.6161615515 1.6565657212 1.6161615515
62 5580357 1.6053333147 1.680000043 1.555555675
64 5998884 1.5975193053 1.6322480981 1.5627906791
66 6556920 1.5638296639 1.6382978594 1.5265957187
68 7291667 1.5066963426 1.6071427837 1.5401785353
70 7812500 1.531250048 1.642857088 1.562499968
72 8333333 1.5401785816 1.6125000645 1.6071429043
74 9166667 1.4659090376 1.5681817612 1.5681817612
76 9583333 1.5217391486 1.5942029772 1.5942029772
78 10253906 1.4899471479 1.591534387 1.6724738846
80 10986328 1.4915989219 1.5609756053 1.6144143885
82 11718750 1.495934976 1.5934958933 1.6470588587
84 12276786 1.5135134717 1.6096255974 1.6704544659
86 13437500 1.4705882047 1.5988372093 1.6715116651
88 14062500 1.5277777778 1.6319444622 1.7063492267
90 15277778 1.5 1.570616879 1.6914335318
92 16666667 1.4397321312 1.4732142305 1.649999967
94 17530488 1.4642857061 1.5083612048 1.6756521553
96 18158784 1.4561717348 1.4799999824 1.7209302121
98 19301471 1.4571428261 1.5178571105 1.7346938479
100 20680147 1.4166666707 1.4767676942 1.7099415202

In the expression-template case these do not return expression templates as a result of #175.
However, the previous fix was inefficient and created unnecessary temporaries.
Speeds up construction from 2 components (rationals etc).
Correct ET operators for pre-C++17.
Add test case for allocator usage.
@jzmaddock
Copy link
Collaborator Author

OK big update to reduce temporary usage, memory allocations look much better now, here for Bernoulli[m] compiled with gcc-11 mingw:

m cpp_rational mpq_rational number<rational_adaptor<gmp_int>> mpq_class
2 0 77 123 101
4 0 187 320 252
6 0 345 612 471
8 0 551 988 758
10 0 805 1464 1113
12 0 1107 2044 1536
14 0 1457 2698 2027
16 0 1857 3458 2587
18 0 2336 4320 3216
20 0 2885 5297 3913
22 6 3511 6358 4706
24 22 4203 7601 5600
26 83 4963 8911 6575
28 377 5806 10370 7632
30 780 6738 11947 8769
32 1454 7771 13644 9988
34 2001 9357 15947 11289
36 2789 10598 18023 12704
38 3669 11948 20185 14252
40 4653 13403 22538 15891
42 5923 14976 24990 17620
44 7379 16622 27596 19449
46 8839 18367 30287 21367
48 10296 20227 33295 23431
50 12045 22857 36898 25646
52 13603 25044 40220 27962
54 15276 27331 43755 30389
56 17239 29749 47410 32919
58 19337 32257 51151 35552
60 21409 34958 55308 38417
62 23694 37800 59426 41396
64 27923 39556 62540 44498
66 30240 44706 69096 47711
68 32566 47934 73872 51042
70 35019 51417 78779 54637
72 37460 55047 84163 58363
74 40282 58777 89439 62211
76 42914 62691 95055 66183
78 45752 66694 100709 70296
80 48681 70905 106834 74620
82 51986 77633 115045 79160
84 54855 82364 121949 83842
86 59032 87239 128670 88659
88 63595 92256 135592 93618
90 68352 97486 142665 98820
92 72446 102974 150380 104256
94 76468 108620 158006 109844
96 80361 111109 162765 115594
98 84783 121460 174932 121486
100 89044 127730 183611 127633
102 93561 134241 192393 134050
104 98452 140919 201349 140616
106 103530 147763 210413 147364
108 108326 154804 220168 154276
110 113891 162184 229824 161562
112 119213 169736 239900 169038
114 124770 182417 255032 176693
116 130624 190589 265966 184519
118 137941 198975 276703 192525
120 144829 207759 288645 200952
122 152045 216736 300007 209561
124 158610 225905 311860 218371
126 165383 235283 323744 227360
128 173971 230678 322303 236670
130 181696 248593 342618 246308
132 189310 258615 356021 256148
134 197708 268800 369164 266192
136 205800 279215 382459 276436
138 212940 290112 396424 287167
140 217502 301235 410903 298101
142 223486 312564 425431 309243
144 229579 324133 440945 320605
146 237213 344671 464320 332333
148 248799 357200 479845 344420
150 261345 369947 495567 356745
152 272741 382909 512422 369279
154 283982 396162 528719 382048
156 293626 409993 546804 395353
158 304036 424022 564050 408907
160 313869 427747 571797 422690
162 323626 454723 601473 436715
164 333294 469863 620894 451304
166 343072 485238 639579 466149
168 352236 500922 659464 481221
170 362793 516840 678534 496561
172 372645 533169 698760 512315
174 382908 549962 719939 528510
176 392507 567018 741432 544947
178 404163 597694 775274 561666
180 417539 615615 797960 578647
182 432009 634079 819557 596234
184 444684 652902 842915 614114
186 457104 672042 866413 632248
188 469331 691451 890093 650654
190 481583 711512 913753 669753
192 477225 698188 905638 689111
194 488955 736905 948365 708760
196 499797 757502 973031 728675
198 511163 778591 998650 749102

Runtime performance looks rather better too, again mingw, and again performance figures are relative to cpp_rational, so smaller is better:

m cpp_rational (ms) mpq_rational rational_adaptor<gmp_int> mpq_class
50 2351589 1.2754958456 1.8687495987 1.4982567107
54 2949297 1.4165433322 1.6953192574 1.3906914088
58 3685897 1.5139752956 1.589674101 1.3989132089
62 4464286 1.249999888 1.5166664949 1.406249958
66 5468750 1.17346944 1.4285714286 1.4095237486
70 6423611 1.2379343955 1.4270271036 1.3783784541
74 7638889 1.136363678 1.406250045 1.406250045
78 8333333 1.175000087 1.4732143789 1.4397322176
82 10000000 1.1230469 1.3950893 1.4583333
86 11474609 1.1428571553 1.3617021722 1.4613389441
90 12555804 1.1891357973 1.7489488527 1.5471471202
94 14687500 1.1347517957 1.5248226723 1.5602837106
98 15625000 1.162162176 1.433333312 1.576923072
102 17530488 1.1534526592 1.3963768151 1.6043478082
106 19425676 1.1528985143 1.3612040065 1.6489130159
110 21484375 1.1688311622 1.3636363636 1.6459330095
114 23958333 1.1413043637 1.3369565403 1.6666667084
118 27644231 1.1047430836 1.2791761869 1.5449275113
122 30505952 1.097560994 1.2955523893 1.6463414746
126 32894737 1.1249999962 1.3953125085 1.7574999916
130 35361842 1.1291989823 1.4139534926 1.6871036017
134 38194444 1.1310160452 1.3500000157 1.6735537504
138 41360294 1.1333333366 1.3600000039 1.6828282942
142 45833333 1.1250000082 1.3636363736 1.6287878955
146 49107143 1.1280991647 1.3595041357 1.6969696852
150 53125000 1.1437908518 1.3636363671 1.7320261459
154 59659091 1.095238092 1.3386243347 1.7959183622
158 63920455 1.0888888854 1.3037036892 1.7111110989
162 68181818 1.0949074136 1.2767857114 1.756944454
166 72916667 1.1190476109 1.3928571365 1.6785714163
170 78125000 1.1111111168 1.2666666624 1.72
174 85069444 1.0758017532 1.2551020435 1.6989796007
178 90277778 1.1126373646 1.2692307624 1.9038461492
182 98214286 1.0909090863 1.2196969695 1.7897727221
186 104910714 1.0921985814 1.2212765991 1.7500000048
190 111979167 1.0930232496 1.2279069731 1.813953483
194 114583333 1.2000000035 1.2818181855 1.8181818206
198 125000000 1.125 1.3125 1.75

@afabri
Copy link

afabri commented Sep 29, 2021

Wouldn't it make sense to cherry-pick your fix of the gcd into this branch. It may have an impact on performance, and without it computations might be plain wrong.

@afabri
Copy link

afabri commented Sep 29, 2021

@jzmaddock do you have a (vague) schedule for an integration into the develop branch of boost or a boost release. It's a question coming from @lrineau, the release manager of the CGAL project.

@jzmaddock
Copy link
Collaborator Author

Wouldn't it make sense to cherry-pick your fix of the gcd into this branch. It may have an impact on performance, and without it computations might be plain wrong.

Yep.

do you have a (vague) schedule for an integration into the develop branch of boost or a boost release. It's a question coming from @lrineau, the release manager of the CGAL project.

It's basically good to go, I just need to work on documentation a bit, and do a quick matrix determinant performance test. I just got a bit distracted by bug reports elsewhere ;)

@jzmaddock
Copy link
Collaborator Author

Wouldn't it make sense to cherry-pick your fix of the gcd into this branch. It may have an impact on performance, and without it computations might be plain wrong.

Done, and 3x3 determinant calculation added as a benchmark - it's a surprisingly tough calculation for rational values isn't it?

Determinant calculation shows mpq significantly ahead of cpp_rational, this calculation mainly just hammers gcd I suspect, but if I have time, I'll investigate some more. As before, column for cpp_rational shows the actual time, the others are relative to that:

Bits cpp_rational (ms) mpq_rational mpq_class
512 42.3 0.3096926714 0.3026004728
1024 93.8 0.3187633262 0.3123667377
2048 229 0.3100436681 0.3100436681
4096 609 0.2955665025 0.2955665025
8192 1797 0.2693377852 0.2782415136
16384 5844 0.2299794661 0.2299794661
32768 20641 0.1824524006 0.1816772443

Bernoulli calculations are largely the same as before:

m cpp_rational mpq_rational rational_adaptor<gmp_int> mpq_class
50 2197266 1.3136989331 1.819121126 1.4603170486
54 2761044 1.3059440559 1.7172413768 1.4147728178
58 3446691 1.2016064103 1.6190476605 1.4381611232
62 4087936 1.223111125 1.6138271245 1.4674603516
66 4882812 1.2000001229 1.5786667601 1.4571429332
70 5937500 1.1748119579 1.4736842105 1.4035087158
74 6770833 1.1794872211 1.550480864 1.4461539666
78 7812500 1.173333376 1.535714304 1.464285696
82 8958333 1.1718750576 1.495016651 1.5000000558
86 10416667 1.1785714183 1.4666666411 1.4666666411
90 11718750 1.1666666667 1.495934976 1.549549568
94 14375000 1.0386473739 1.3219741217 1.4066496
98 14930556 1.1395348572 1.4389534455 1.5697673951
102 16666667 1.1402026572 1.4374999513 1.550480729
106 18581081 1.1377005461 1.4230769458 1.5416666555
110 20507812 1.1428571707 1.4285714634 1.5963719094
114 22949219 1.1671732271 1.3941236083 1.612541978
118 25240385 1.160714268 1.3928571216 1.6022408533
122 27500000 1.1600378909 1.4520202182 1.6287878909
126 31250000 1.125 1.6 1.6
130 33593750 1.1015911888 1.5221987423 1.6490486177
134 37006579 1.0928104432 1.4202020133 1.6505050359
138 40441176 1.1107954675 1.352272743 1.6859504283
142 43198529 1.1092198764 1.315280481 1.6769826121
146 50000000 1.05113636 1.25 1.5625
150 51562500 1.1212121212 1.3468013382 1.6450216533
154 57812500 1.0810810811 1.261261267 1.6216216216
158 66761364 0.9787233976 1.1702127596 1.5212765875
162 69602273 1.0226757393 1.1972789021 1.5714285653
166 69602273 1.0975056662 1.2827988247 1.7210884334
170 74652778 1.0930232496 1.2857142838 1.709302325
174 80357143 1.101851854 1.2777777702 1.7111111081
178 86805556 1.1057142817 1.2599999935 1.7099999912
182 91517857 1.1463414621 1.2804878069 1.7500000027
186 98214286 1.0909090863 1.272727269 1.7499999949
190 104166667 1.1249999964 1.2299999961 1.7624999944
194 109375000 1.1428571429 1.2857142857 1.7857142857
198 114583333 1.1590909125 1.2954545492 1.8181818206

@jzmaddock
Copy link
Collaborator Author

Barring CI SNAFU's, this is now ready to go.

operator*(const detail::expression<tag, Arg1, Arg2, Arg3, Arg4>& a, number<B, ET>&& b)
{
b *= a;
return static_cast<number<B, et_on>&&>(b);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seeing the CI results at https://drone.cpp.al/boostorg/multiprecision/292/179/2, can you comment on that static_cast? The type of b is number<B, ET>&&. Why should it cast nicely into number<B, et_on>&&?

operator*(number<B, ET>&& a, const detail::expression<tag, Arg1, Arg2, Arg3, Arg4>& b)
{
a *= b;
return static_cast<number<B, et_on>&&>(a);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here: why et_on and not ET?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch: too much blind refactoring in one go!

I'm a bit concerned that didn't show up in local testing, also thinking I should be using std::foward here.

Copy link
Contributor

@lrineau lrineau Oct 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::forward and the static_cast (where right) are completely equivalent. The advantage of using std::forward is a better understanding of the intent of the developer.

Edit: fix the typo indent/intent

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Brilliant! Now that I see your commit e4827ce... of course it was not std::forward, but std::move, and now the code is a lot clearer to read! Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants