Expanding 5 macros STAN_ADD_REQUIRE_* directly into code. #2965

Merged
syclik merged 152 commits into develop from feature/simplify-meta on Apr 13, 2024

Conversation

syclik
Member

@syclik syclik commented Oct 22, 2023

Summary

The mixed use of macros to generate a full set of template metaprograms is harder to read than having the template metaprograms in the code.

This PR expands the macros to the minimal set of macros required by the Math and Stan libraries.

Detail

There are 5 macros. I'll link to the latest released version (so the links are permanent), but they can be found on current develop. They are all inside stan/math/prim/meta/require_helpers.hpp:

  1. STAN_ADD_REQUIRE_UNARY: https://github.com/stan-dev/math/blob/v4.7.0/stan/math/prim/meta/require_helpers.hpp#L72
  2. STAN_ADD_REQUIRE_UNARY_INNER: https://github.com/stan-dev/math/blob/v4.7.0/stan/math/prim/meta/require_helpers.hpp#L114
  3. STAN_ADD_REQUIRE_BINARY: https://github.com/stan-dev/math/blob/v4.7.0/stan/math/prim/meta/require_helpers.hpp#L186
  4. STAN_ADD_REQUIRE_BINARY_INNER: https://github.com/stan-dev/math/blob/v4.7.0/stan/math/prim/meta/require_helpers.hpp#L229
  5. STAN_ADD_REQUIRE_CONTAINER: https://github.com/stan-dev/math/blob/v4.7.0/stan/math/prim/meta/require_helpers.hpp#L332

Each of these macros, when used for every type, expands into a lot of definitions. In practice, we don't use most of them.
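To illustrate the readability trade-off, here is a minimal sketch, not the actual Stan Math code. `DEMO_ADD_REQUIRE_UNARY` is a hypothetical, much-reduced stand-in for `STAN_ADD_REQUIRE_UNARY`, and the `demo` namespace, `require_t`, `require_not_t`, and `kind` are assumptions made for illustration. The first pair of aliases is generated by the macro; the second pair is written out by hand, so the names can be found by searching the source.

```cpp
#include <type_traits>

namespace demo {

// Hypothetical helpers: void when the check passes (or fails, for the
// "not" form); otherwise there is no type and substitution fails.
template <typename Check>
using require_t = std::enable_if_t<Check::value>;

template <typename Check>
using require_not_t = std::enable_if_t<!Check::value>;

// Macro style: one line at the use site, but the generated names
// (require_arithmetic_t, require_not_arithmetic_t) never appear in the source.
#define DEMO_ADD_REQUIRE_UNARY(check_type, checker)               \
  template <typename T>                                           \
  using require_##check_type##_t = require_t<checker<T>>;        \
  template <typename T>                                           \
  using require_not_##check_type##_t = require_not_t<checker<T>>;

DEMO_ADD_REQUIRE_UNARY(arithmetic, std::is_arithmetic)

// Expanded style: the same kind of aliases, written out directly.
template <typename T>
using require_floating_point_t = require_t<std::is_floating_point<T>>;

template <typename T>
using require_not_floating_point_t
    = require_not_t<std::is_floating_point<T>>;

// Both styles produce the same type (void) when the check passes.
static_assert(std::is_same<require_arithmetic_t<int>,
                           require_floating_point_t<double>>::value,
              "both aliases are void");

// Example overloads constrained by the macro-generated aliases.
template <typename T, require_arithmetic_t<T>* = nullptr>
constexpr int kind(T) {
  return 1;  // arithmetic types
}

template <typename T, require_not_arithmetic_t<T>* = nullptr>
constexpr int kind(T) {
  return 2;  // everything else
}

}  // namespace demo
```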

Tests

None added.

Side Effects

None.

Release notes

Removes 5 macros and expands them directly into code: STAN_ADD_REQUIRE_UNARY, STAN_ADD_REQUIRE_UNARY_INNER, STAN_ADD_REQUIRE_BINARY, STAN_ADD_REQUIRE_BINARY_INNER, STAN_ADD_REQUIRE_CONTAINER.

Checklist

  • Copyright holder: Daniel Lee

    The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:
    - Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
    - Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

  • the basic tests are passing

    • unit tests pass (to run, use: ./runTests.py test/unit)
    • header checks pass (make test-headers)
    • dependencies checks pass (make test-math-dependencies)
    • docs build (make doxygen)
    • code passes the built in C++ standards checks (make cpplint)
  • the code is written in idiomatic C++ and changes are documented in the doxygen

  • the new changes are tested

@SteveBronder
Collaborator

The way these were expanded deletes all the docs for the requires from the Stan Math website:

https://mc-stan.org/math/group__rev__row__vector__types_ga04400fc27cdcef7938e1a79d14068a7b.html#ga04400fc27cdcef7938e1a79d14068a7b

@syclik
Member Author

syclik commented Oct 23, 2023 via email

@SteveBronder
Collaborator

imo if we want to go this route I'd first see how many requires we can delete. Is this all the requires folded out? Adding ~5k lines is not the worst, but it is a lot of replicated code and I worry about the maintenance of it.

@syclik
Member Author

syclik commented Oct 23, 2023 via email

@syclik
Member Author

syclik commented Oct 23, 2023

@SteveBronder, I just walked through all 1108 require_* template metaprograms. Good news: we only use 201 of them. (I think some of those are used only by the other definitions. We may be able to reduce the footprint even more.)

@syclik syclik force-pushed the feature/simplify-meta branch 2 times, most recently from faa6a03 to e540cba Compare November 29, 2023 14:09
syclik and others added 23 commits December 3, 2023 22:05
@syclik
Member Author

syclik commented Mar 24, 2024

@SteveBronder, I think the doc in doxygen/contributor_help_pages/require_meta.md is much better now!! (I don't think it's perfect, but I think it documents the non-type template parameter much more clearly than it's been written about before.)

Mind taking a look? If you have any suggestions on how to make that doc better, happy to give it a try.

@stan-buildbot
Contributor


| Name | Old Result | New Result | Ratio | Performance change (1 - new / old) |
|---|---|---|---|---|
| arma/arma.stan | 0.19 | 0.18 | 1.04 | 3.66% faster |
| low_dim_corr_gauss/low_dim_corr_gauss.stan | 0.01 | 0.01 | 1.07 | 6.31% faster |
| gp_regr/gen_gp_data.stan | 0.02 | 0.02 | 1.08 | 7.28% faster |
| gp_regr/gp_regr.stan | 0.11 | 0.1 | 1.05 | 5.21% faster |
| sir/sir.stan | 76.46 | 75.7 | 1.01 | 0.98% faster |
| irt_2pl/irt_2pl.stan | 3.77 | 3.95 | 0.95 | -4.9% slower |
| eight_schools/eight_schools.stan | 0.05 | 0.05 | 0.99 | -1.26% slower |
| pkpd/sim_one_comp_mm_elim_abs.stan | 0.25 | 0.25 | 0.99 | -1.18% slower |
| pkpd/one_comp_mm_elim_abs.stan | 17.73 | 18.39 | 0.96 | -3.71% slower |
| garch/garch.stan | 0.44 | 0.47 | 0.94 | -6.07% slower |
| low_dim_gauss_mix/low_dim_gauss_mix.stan | 2.74 | 2.85 | 0.96 | -3.96% slower |
| arK/arK.stan | 1.61 | 1.67 | 0.96 | -3.71% slower |
| gp_pois_regr/gp_pois_regr.stan | 2.48 | 2.56 | 0.97 | -3.19% slower |
| low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan | 9.22 | 9.34 | 0.99 | -1.36% slower |
| performance.compilation | 176.9 | 174.14 | 1.02 | 1.56% faster |

Mean result: 0.998846365102633

Commit hash: a1e38b23f053b005fd392cec1e8f55b0f5a2c76f


Machine information
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.3 LTS
Release: 20.04
Codename: focal

CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 46 bits physical, 48 bits virtual
CPU(s): 80
On-line CPU(s) list: 0-79
Thread(s) per core: 2
Core(s) per socket: 20
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
Stepping: 4
CPU MHz: 2400.000
CPU max MHz: 3700.0000
CPU min MHz: 1000.0000
BogoMIPS: 4800.00
Virtualization: VT-x
L1d cache: 1.3 MiB
L1i cache: 1.3 MiB
L2 cache: 40 MiB
L3 cache: 55 MiB
NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46,48,50,52,54,56,58,60,62,64,66,68,70,72,74,76,78
NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39,41,43,45,47,49,51,53,55,57,59,61,63,65,67,69,71,73,75,77,79
Vulnerability Gather data sampling: Mitigation; Microcode
Vulnerability Itlb multihit: KVM: Mitigation: VMX disabled
Vulnerability L1tf: Mitigation; PTE Inversion; VMX conditional cache flushes, SMT vulnerable
Vulnerability Mds: Mitigation; Clear CPU buffers; SMT vulnerable
Vulnerability Meltdown: Mitigation; PTI
Vulnerability Mmio stale data: Mitigation; Clear CPU buffers; SMT vulnerable
Vulnerability Retbleed: Mitigation; IBRS
Vulnerability Spec rstack overflow: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; IBRS, IBPB conditional, STIBP conditional, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Mitigation; Clear CPU buffers; SMT vulnerable
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single pti intel_ppin ssbd mba ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts hwp hwp_act_window hwp_epp hwp_pkg_req pku ospke md_clear flush_l1d arch_capabilities

G++:
g++ (Ubuntu 9.4.0-1ubuntu1~20.04) 9.4.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Clang:
clang version 10.0.0-4ubuntu1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

Collaborator

@SteveBronder SteveBronder left a comment

Just a few changes for the docs but overall I think this is good!

@syclik syclik force-pushed the feature/simplify-meta branch from e41ec3c to a1d05d9 Compare April 2, 2024 01:43
@syclik
Member Author

syclik commented Apr 2, 2024

@SteveBronder, thank you for the detailed review!

  1. I accepted all the changes!
  2. I fixed the remaining ::type / ::value doc issues. (I went and wrote a simple test to make sure I knew exactly what compiled and didn't.)
  3. I added the second link to all the require meta traits.
  4. I made sure the numbering stayed contiguous. I think it looks ok. Please see the attached image. This is how it renders when I run doxygen.
image
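
As a general illustration of the ::type / ::value distinction mentioned in item 2 (this is not the test referred to there, which is not shown in this thread): `::value` names a compile-time constant, while `::type` names a type, so the two cannot be used interchangeably.

```cpp
#include <type_traits>

// ::value is a compile-time bool constant.
static_assert(std::is_floating_point<double>::value,
              "double is a floating point type");
static_assert(!std::is_floating_point<int>::value,
              "int is not a floating point type");

// ::type is a type. For enable_if<true, T> it is T, and T defaults to void.
static_assert(std::is_same<std::enable_if<true>::type, void>::value,
              "T defaults to void");
static_assert(std::is_same<std::enable_if<true, int>::type, int>::value,
              "T is passed through");

// The _t alias (C++14) is shorthand for typename ...::type.
static_assert(std::is_same<std::enable_if_t<true, int>,
                           std::enable_if<true, int>::type>::value,
              "alias and ::type name the same type");
```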

@stan-buildbot
Contributor


| Name | Old Result | New Result | Ratio | Performance change (1 - new / old) |
|---|---|---|---|---|
| arma/arma.stan | 0.19 | 0.19 | 1.03 | 2.69% faster |
| low_dim_corr_gauss/low_dim_corr_gauss.stan | 0.01 | 0.01 | 1.04 | 4.28% faster |
| gp_regr/gen_gp_data.stan | 0.02 | 0.02 | 1.03 | 2.64% faster |
| gp_regr/gp_regr.stan | 0.11 | 0.11 | 1.01 | 1.19% faster |
| sir/sir.stan | 77.54 | 77.28 | 1.0 | 0.34% faster |
| irt_2pl/irt_2pl.stan | 3.84 | 3.71 | 1.03 | 3.37% faster |
| eight_schools/eight_schools.stan | 0.05 | 0.05 | 1.1 | 8.86% faster |
| pkpd/sim_one_comp_mm_elim_abs.stan | 0.25 | 0.24 | 1.02 | 2.43% faster |
| pkpd/one_comp_mm_elim_abs.stan | 18.45 | 17.6 | 1.05 | 4.6% faster |
| garch/garch.stan | 0.46 | 0.43 | 1.05 | 4.69% faster |
| low_dim_gauss_mix/low_dim_gauss_mix.stan | 2.79 | 2.71 | 1.03 | 2.75% faster |
| arK/arK.stan | 1.64 | 1.59 | 1.04 | 3.39% faster |
| gp_pois_regr/gp_pois_regr.stan | 2.49 | 2.43 | 1.02 | 2.3% faster |
| low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan | 9.14 | 9.07 | 1.01 | 0.83% faster |
| performance.compilation | 182.67 | 175.88 | 1.04 | 3.72% faster |

Mean result: 1.033554873711316

Commit hash: 0fcf10855f923eb24c8e9958f1f19fde97572810


@syclik
Member Author

syclik commented Apr 7, 2024

@SteveBronder, anything else we need to change on this PR before merging?

@SteveBronder
Collaborator

Let me look this over one more time, but I think it is good! Should this get merged into the 5.0 branch since we are getting rid of a lot of the requires? If so, then I think we should wait until the other ones are merged into the 5.0 branch, as this will probably be the biggest diff.

Collaborator

@SteveBronder SteveBronder left a comment

Just one section of the docs to look over. Rest looks good!

Comment on lines 53 to 67
In the Math library, we use a technique that allows the definition of
multiple template functions where each handles a subset of allowable
types. We add a pointer [non-type template
parameter](https://en.cppreference.com/w/cpp/language/template_parameters#Non-type_template_parameter)
to the template parameter list with a default value of `nullptr`. Any
`void*` non-type template parameter with a default of `nullptr` is
valid and the non-type template parameter is ignored by the
compiler. Utilizing [substitution failure is not an error
(SFINAE)](https://en.cppreference.com/w/cpp/language/sfinae), a
substitution failure for the non-type template parameter will result
in that definition being removed from the possible function
definitions. Using this technique we have to be careful not to violate
the [One Definition
Rule](https://en.cppreference.com/w/cpp/language/definition) and only
provide one definition for any set of types.
Collaborator

How about something like the below instead? I think it would be good to flesh out the explanation of how this works, as in the below.

In the Math library, we use a technique similar to C++20's requires keyword that allows the definition of
multiple template functions where each handles a subset of allowable
types.

When the compiler resolves a call against a set of templated function signatures, exactly one valid signature must be available, and each function template may be defined only once (the One Definition Rule). For example, the following code would fail because the two signatures are identical apart from the name of the template parameter, so the compiler is unable to differentiate between them.

template <typename T>
T foo(T x) {
  return x;
}

template <typename K>
K foo(K x) {
  return x;
}

The compiler needs a way to differentiate between the two signatures to select one and satisfy the One Definition Rule. One trick to have a single valid definition is to utilize Substitution Failure Is Not An Error (SFINAE) to purposefully create conditions where only one signature is valid because all of the others fail to compile. The simplest way to do this is to start with a type trait like the below enable_if. The nested type enable_if::type is only defined for the case where B is true, so if B is false, any use of enable_if<B, T>::type is ill-formed and, outside of template argument substitution, is a compile error.

// Forward declare enable_if
template<bool B, class T = void>
struct enable_if {};

// Only define the case where B is true 
template<class T>
struct enable_if<true, T> { typedef T type; };

// Convenience alias; T defaults to void so that enable_if_t<B> is valid
template <bool B, typename T = void>
using enable_if_t = typename enable_if<B, T>::type;

Attempting to use enable_if_t with B being false anywhere else in the program would cause a compile error. But using it in the
template parameter list of a function signature allows SFINAE to select which signature we
would like to use.

// foo only works with floating point types
template <typename T, enable_if_t<std::is_floating_point<T>::value>* = nullptr>
T foo(T x) {
  return x;
}

// foo only works with integer types
template <typename K, enable_if_t<std::is_integral<K>::value>* = nullptr>
K foo(K x) {
  return x;
}

// Calls the first signature
double x_dbl = 1.0;
double y_dbl = foo(x_dbl); 

// Calls the second signature
int x = 1;
int y = foo(x);

The second template parameter is referred to as a [non-type template
parameter](https://en.cppreference.com/w/cpp/language/template_parameters#Non-type_template_parameter). The enable_if_t inside it has a default type of void.
When the templated signature has the correct type, the enable_if_t produces a void type, which is then made into a pointer and assigned a default value of nullptr.
When the templated signature does not have the correct type, the compiler utilizes Substitution Failure Is Not An Error (SFINAE) to remove the offending signature from the list of possible matches while continuing to search for the correct signature.
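
A self-contained version of the example above, as a sketch: it uses `std::enable_if_t` from `<type_traits>` (C++14) in place of the hand-written enable_if, and the name `which_foo` and its int return codes are assumptions added here only so that the selected overload can be observed.

```cpp
#include <type_traits>

// Selected only when T is a floating point type.
template <typename T,
          std::enable_if_t<std::is_floating_point<T>::value>* = nullptr>
constexpr int which_foo(T) {
  return 1;
}

// Selected only when T is an integral type.
template <typename T,
          std::enable_if_t<std::is_integral<T>::value>* = nullptr>
constexpr int which_foo(T) {
  return 2;
}

static_assert(which_foo(1.0) == 1,
              "double selects the floating point overload");
static_assert(which_foo(1) == 2, "int selects the integral overload");

// which_foo("text") would not compile: both signatures are removed by
// SFINAE, which leaves no viable overload for the call.
```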

Member Author

That looks good. I'll put that into the doc.

@syclik syclik force-pushed the feature/simplify-meta branch from 96dc4e9 to dd75ad7 Compare April 12, 2024 13:52
@syclik
Member Author

syclik commented Apr 12, 2024

@SteveBronder, I updated the last block of text! There were a couple of very minor edits made:

  • A couple words changed for grammar.
  • For the code example, the enable_if definition isn't a "forward declaration" -- it's an actual definition of the struct. I clarified the comment there. (A forward declaration would have looked like template<bool B, class T = void> struct enable_if;. The { } provides the definition of the struct, albeit an empty struct.)
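
The distinction can be checked directly. In this sketch (the names `declared_only`, `defined_empty`, and `is_complete` are hypothetical), a template that is only declared yields an incomplete type, while the empty definition yields a complete type that simply has no nested `type` member.

```cpp
#include <type_traits>

// Declaration only: its specializations are incomplete types.
template <bool B, class T = void>
struct declared_only;

// Definition of an empty struct: complete, but with no member named `type`.
template <bool B, class T = void>
struct defined_empty {};

// Partial specialization for B == true provides the nested type.
template <class T>
struct defined_empty<true, T> {
  typedef T type;
};

// Detects completeness: sizeof(T) is only valid for complete types, so the
// partial specialization below is discarded for incomplete ones.
template <class T, class = void>
struct is_complete : std::false_type {};

template <class T>
struct is_complete<T, decltype(void(sizeof(T)))> : std::true_type {};

static_assert(!is_complete<declared_only<false>>::value,
              "a declaration alone gives an incomplete type");
static_assert(is_complete<defined_empty<false>>::value,
              "the empty definition gives a complete type");
static_assert(std::is_same<defined_empty<true>::type, void>::value,
              "only the B == true case has ::type");
```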

It really clarifies what's happening. Thanks!!!

Should this get merged into the 5.0 branch since we are getting rid of a lot of the requires?

My slight (very slight) preference is for it to go into develop. Reasons:

  1. The doc is updated!! That is valuable. (If we put in effort, we can pull that out and get it on another PR, but I'm not really looking forward to trying that out.)
  2. The expansion of the macros makes it much more readable to a set of developers. (Not everyone. A set will not notice a difference, but this will help a different set.)
  3. In the strictest sense, removing the availability of these macros does change what's available, but to say that these traits were meant to be part of external API is pushing the definition a bit. There are other PRs that we merge that violate this with changing function signatures, so I don't think this is really a version-bumping breaking change.
  4. I think this moves the code base forward and we can start thinking about breaking changes to include in 5.0. (One thing I've been thinking about is why not have require_*_t<T> extend to require_*_t<T1, T2, ...> to replace require_all_*_t<T1, T2, ...>.)
  5. Having this in the code base in develop will help me figure out how to pull apart some of the utils in tests. Having it on another branch is just a context switch that makes it tougher.
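
The idea in item 4 could look roughly like the following. This is a hypothetical sketch written for C++14, not an existing or proposed Stan Math signature: the `demo` namespace, the `all_true` helper, and this variadic `require_arithmetic_t` are all assumptions made for illustration.

```cpp
#include <type_traits>

namespace demo {

// Hypothetical C++14 helper: true only when every value in the pack is true.
template <bool... Bs>
struct bool_pack {};

template <bool... Bs>
using all_true
    = std::is_same<bool_pack<true, Bs...>, bool_pack<Bs..., true>>;

// One variadic alias: with a single type it acts like require_*_t<T>;
// with several types it acts like require_all_*_t<T1, T2, ...>.
template <typename... Types>
using require_arithmetic_t
    = std::enable_if_t<all_true<std::is_arithmetic<Types>::value...>::value>;

// Multi-type use, in place of a separate require_all_* alias.
template <typename T1, typename T2,
          require_arithmetic_t<T1, T2>* = nullptr>
constexpr auto add(T1 a, T2 b) {
  return a + b;
}

// Single-type use.
template <typename T, require_arithmetic_t<T>* = nullptr>
constexpr T twice(T x) {
  return x + x;
}

}  // namespace demo

static_assert(demo::add(1, 2.5) == 3.5, "both arguments are arithmetic");
static_assert(demo::twice(4) == 8, "single-type form");
```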

With all that said, I'm really not fussed if we put it on 5.0.

@SteveBronder
Collaborator

I'm cool with merging to develop on this. Agree that those requires are not really part of the Stan Math API in the strict sense

@stan-buildbot
Contributor


| Name | Old Result | New Result | Ratio | Performance change (1 - new / old) |
|---|---|---|---|---|
| arma/arma.stan | 0.2 | 0.19 | 1.05 | 5.15% faster |
| low_dim_corr_gauss/low_dim_corr_gauss.stan | 0.01 | 0.01 | 1.02 | 1.67% faster |
| gp_regr/gen_gp_data.stan | 0.02 | 0.02 | 1.02 | 1.68% faster |
| gp_regr/gp_regr.stan | 0.11 | 0.11 | 1.02 | 1.88% faster |
| sir/sir.stan | 82.04 | 78.25 | 1.05 | 4.62% faster |
| irt_2pl/irt_2pl.stan | 4.21 | 4.15 | 1.02 | 1.54% faster |
| eight_schools/eight_schools.stan | 0.05 | 0.05 | 1.01 | 1.03% faster |
| pkpd/sim_one_comp_mm_elim_abs.stan | 0.28 | 0.26 | 1.09 | 8.07% faster |
| pkpd/one_comp_mm_elim_abs.stan | 18.94 | 18.6 | 1.02 | 1.83% faster |
| garch/garch.stan | 0.48 | 0.48 | 1.0 | -0.25% slower |
| low_dim_gauss_mix/low_dim_gauss_mix.stan | 2.86 | 2.89 | 0.99 | -1.2% slower |
| arK/arK.stan | 1.69 | 1.68 | 1.0 | 0.23% faster |
| gp_pois_regr/gp_pois_regr.stan | 2.58 | 2.59 | 1.0 | -0.47% slower |
| low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan | 9.53 | 9.52 | 1.0 | 0.16% faster |
| performance.compilation | 198.73 | 191.37 | 1.04 | 3.7% faster |

Mean result: 1.0207882235801728

Commit hash: fb02dc7240781e4fe7ea4bc1fb4e06e70079765a


@syclik syclik merged commit b9d0a33 into develop Apr 13, 2024
8 checks passed
@syclik syclik deleted the feature/simplify-meta branch April 13, 2024 01:16