Specialization of standard posits #343

Open
RaulMurillo opened this issue Jun 2, 2023 · 47 comments

@RaulMurillo
Collaborator

Since the latest standard for posits (https://github.com/posit-standard/Posit-Standard-Community-Feedback) fixes the exponent size to 2 bits, it would be desirable to have fast specializations of posit types with 2 exponent bits (at least 8, 16, and 32 bits, and even 64 bits) for performance in software experimentation.
I don't know how difficult it would be to adapt the current specialized/posit_8_0.hpp and specialized/posit_16_1.hpp to the new standard (specialized/posit_32_2.hpp will stay the same, I guess).

@ghost

ghost commented Jun 2, 2023

@RaulMurillo I totally agree with your request. I started that work a while back, along with a redesign of the implementation on a fast limb-based system. The specialization mechanism would still be the better-performing configuration, as we can make static decisions based on the specific parameters of the posit.

Thus a reasonable shortcut to get to high-performance standard posits with es=2 would be to add the specialized implementations. Would you be able to drive this development?

@RaulMurillo
Collaborator Author

> Would you be able to drive this development?

Sure! As far as I know, it should resemble previous specialized implementations, with adjustments to the parameters and arithmetic operations for those specific formats. Am I right?

Also, could you please point me to some regression tests to ensure that the specialized implementations provide the same functionality as the default ones?

@ghost

ghost commented Jun 3, 2023 via email

@Ravenwater
Contributor

@RaulMurillo I have created a branch 'fast_specialized_posit', and enabled the fast posit path in the regressions. I put a skeleton together for posit<16,2> and that yielded this:

Fast specialization posit<16,2> configuration tests
 posit< 16,2> useed scale     4     minpos scale        -56     maxpos scale         56  :  0

Special case tests
posit< 16, 2> Initialize to zero:            PASS
posit< 16, 2> Initialize to NAN              PASS
posit< 16, 2> Initialize to INFINITY         PASS
posit< 16, 2> sign is true                   PASS
posit< 16, 2> is negative                    PASS
posit< 16, 2> sign is false                  PASS
posit< 16, 2> is positive                    PASS
posit< 16, 2>                                                addition       PASS
posit< 16, 2>                                                subtraction    PASS
posit< 16, 2>                                                multiplication PASS
posit< 16, 2>                                                division       PASS

That is the output for REGRESSION_LEVEL_1. Levels 2, 3, and 4 will add logic, randoms, and the math library to the mix. I also enabled it in the CI, so any check-ins against that branch will automatically run the regression suite.

I can show you how to work with the REGRESSION levels and the manual test regression infrastructure.

The 32-bit and 64-bit posits are much more difficult to validate, as the native double and long double types are not sufficiently precise to serve as a reference. I have not weighed down the regression suite with a high-precision reference, as it would make the regression suites run too slowly. For 32-bit and 64-bit regression testing we should do one of two things:

1- create a judicious direct testing suite to target specific corner cases so that we are fast and light weight
2- create a double-double reference type for 32-bit randoms, and double-double-double reference type for 64-bit randoms

A third option is to complete the Priest arithmetic type that I started, as an adaptive-precision oracle that still has hardware support and thus will run quickly enough not to weigh down the CI regression cycle.
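
To make the double-double option above concrete, here is a minimal sketch of the two error-free transformations such a reference type is usually built on. The names `DoubleDouble`, `two_sum`, and `two_prod` are hypothetical and are not part of Universal. The motivation is that a posit<32,2> carries up to 27 fraction bits, so an exact product can need 56 significand bits, more than the 53 a double provides.

```cpp
#include <cmath>

// Hypothetical double-double value: the unevaluated sum hi + lo.
struct DoubleDouble {
    double hi;
    double lo;
};

// Knuth's TwoSum: returns a + b exactly as hi + lo (barring overflow).
inline DoubleDouble two_sum(double a, double b) {
    double s  = a + b;
    double bb = s - a;                      // the part of b that made it into s
    double e  = (a - (s - bb)) + (b - bb);  // rounding error of a + b
    return {s, e};
}

// TwoProd via fused multiply-add: returns a * b exactly as hi + lo
// (barring overflow and underflow).
inline DoubleDouble two_prod(double a, double b) {
    double p = a * b;
    double e = std::fma(a, b, -p);          // exact residual of the rounded product
    return {p, e};
}
```

Both routines rely on strict IEEE-754 semantics, so they would have to be compiled without value-unsafe options such as /fp:fast or -ffast-math.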

@Ravenwater
Contributor

This is what you should see when you build with cmake -DBUILD_NUMBER_POSITS=ON ..

1>Test project C:/Users/tomtz/Documents/dev/clones/universal/build
1>      Start  1: posit_api
1> 1/59 Test  #1: posit_api ........................   Passed    0.04 sec
1>      Start  2: posit_casting
1> 2/59 Test  #2: posit_casting ....................   Passed    0.04 sec
1>      Start  3: posit_constexpr_test
1> 3/59 Test  #3: posit_constexpr_test .............   Passed    0.03 sec
1>      Start  4: posit_decode
1> 4/59 Test  #4: posit_decode .....................   Passed    0.03 sec
1>      Start  5: posit_number_traits
1> 5/59 Test  #5: posit_number_traits ..............   Passed    0.04 sec
1>      Start  6: posit_postfix
1> 6/59 Test  #6: posit_postfix ....................   Passed    0.04 sec
1>      Start  7: posit_prefix
1> 7/59 Test  #7: posit_prefix .....................   Passed    0.03 sec
1>      Start  8: posit_quire_accumulation
1> 8/59 Test  #8: posit_quire_accumulation .........   Passed    0.04 sec
1>      Start  9: posit_reciprocal_tables
1> 9/59 Test  #9: posit_reciprocal_tables ..........   Passed    0.03 sec
1>      Start 10: posit_serialization
1>10/59 Test #10: posit_serialization ..............   Passed    0.03 sec
1>      Start 11: posit_ulp
1>11/59 Test #11: posit_ulp ........................   Passed    0.03 sec
1>      Start 12: posit_assignment
1>12/59 Test #12: posit_assignment .................   Passed    0.03 sec
1>      Start 13: posit_conversion
1>13/59 Test #13: posit_conversion .................   Passed    0.04 sec
1>      Start 14: posit_logic
1>14/59 Test #14: posit_logic ......................   Passed    0.26 sec
1>      Start 15: posit_addition
1>15/59 Test #15: posit_addition ...................   Passed    1.42 sec
1>      Start 16: posit_complex_add
1>16/59 Test #16: posit_complex_add ................   Passed    0.15 sec
1>      Start 17: posit_decrement
1>17/59 Test #17: posit_decrement ..................   Passed    0.03 sec
1>      Start 18: posit_division
1>18/59 Test #18: posit_division ...................   Passed    0.54 sec
1>      Start 19: posit_fma
1>19/59 Test #19: posit_fma ........................   Passed    0.03 sec
1>      Start 20: posit_increment
1>20/59 Test #20: posit_increment ..................   Passed    0.03 sec
1>      Start 21: posit_literals
1>21/59 Test #21: posit_literals ...................   Passed    0.43 sec
1>      Start 22: posit_multiplication
1>22/59 Test #22: posit_multiplication .............   Passed    0.36 sec
1>      Start 23: posit_negation
1>23/59 Test #23: posit_negation ...................   Passed    0.08 sec
1>      Start 24: posit_reciprocation
1>24/59 Test #24: posit_reciprocation ..............   Passed    0.03 sec
1>      Start 25: posit_sqrt
1>25/59 Test #25: posit_sqrt .......................   Passed    0.16 sec
1>      Start 26: posit_subtraction
1>26/59 Test #26: posit_subtraction ................   Passed    0.34 sec
1>      Start 27: posit_classify
1>27/59 Test #27: posit_classify ...................   Passed    0.04 sec
1>      Start 28: posit_complex
1>28/59 Test #28: posit_complex ....................   Passed    0.03 sec
1>      Start 29: posit_exponent
1>29/59 Test #29: posit_exponent ...................   Passed    0.50 sec
1>      Start 30: posit_hyperbolic
1>30/59 Test #30: posit_hyperbolic .................   Passed    0.05 sec
1>      Start 31: posit_hypotenuse
1>31/59 Test #31: posit_hypotenuse .................   Passed    0.03 sec
1>      Start 32: posit_logarithm
1>32/59 Test #32: posit_logarithm ..................   Passed    0.03 sec
1>      Start 33: posit_next
1>33/59 Test #33: posit_next .......................   Passed    0.03 sec
1>      Start 34: posit_pow
1>34/59 Test #34: posit_pow ........................   Passed    0.12 sec
1>      Start 35: posit_trigonometry
1>35/59 Test #35: posit_trigonometry ...............   Passed    0.03 sec
1>      Start 36: posit_truncate
1>36/59 Test #36: posit_truncate ...................   Passed    0.03 sec
1>      Start 37: fast_posit_128_2
1>37/59 Test #37: fast_posit_128_2 .................   Passed    0.34 sec
1>      Start 38: fast_posit_128_4
1>38/59 Test #38: fast_posit_128_4 .................   Passed    0.31 sec
1>      Start 39: fast_posit_16_1
1>39/59 Test #39: fast_posit_16_1 ..................   Passed    0.03 sec
1>      Start 40: fast_posit_16_2
1>40/59 Test #40: fast_posit_16_2 ..................   Passed    0.03 sec
1>      Start 41: fast_posit_256_2
1>41/59 Test #41: fast_posit_256_2 .................   Passed    1.11 sec
1>      Start 42: fast_posit_256_5
1>42/59 Test #42: fast_posit_256_5 .................   Passed    1.06 sec
1>      Start 43: fast_posit_2_0
1>43/59 Test #43: fast_posit_2_0 ...................   Passed    0.03 sec
1>      Start 44: fast_posit_32_2
1>44/59 Test #44: fast_posit_32_2 ..................   Passed    0.07 sec
1>      Start 45: fast_posit_3_0
1>45/59 Test #45: fast_posit_3_0 ...................   Passed    0.03 sec
1>      Start 46: fast_posit_48_2
1>46/59 Test #46: fast_posit_48_2 ..................   Passed    0.11 sec
1>      Start 47: fast_posit_4_0
1>47/59 Test #47: fast_posit_4_0 ...................   Passed    0.04 sec
1>      Start 48: fast_posit_64_2
1>48/59 Test #48: fast_posit_64_2 ..................   Passed    0.16 sec
1>      Start 49: fast_posit_64_3
1>49/59 Test #49: fast_posit_64_3 ..................   Passed    0.16 sec
1>      Start 50: fast_posit_8_0
1>50/59 Test #50: fast_posit_8_0 ...................   Passed    0.05 sec
1>      Start 51: fast_posit_8_1
1>51/59 Test #51: fast_posit_8_1 ...................   Passed    0.33 sec
1>      Start 52: fast_posit_8_2
1>52/59 Test #52: fast_posit_8_2 ...................   Passed    0.29 sec
1>      Start 53: fast_quire_32_2
1>53/59 Test #53: fast_quire_32_2 ..................   Passed    1.99 sec
1>      Start 54: posit2_api
1>54/59 Test #54: posit2_api .......................   Passed    0.03 sec
1>      Start 55: posit2_attributes
1>55/59 Test #55: posit2_attributes ................   Passed    0.03 sec
1>      Start 56: posit2_manipulators
1>56/59 Test #56: posit2_manipulators ..............   Passed    0.03 sec
1>      Start 57: posit2_traits
1>57/59 Test #57: posit2_traits ....................   Passed    0.03 sec
1>      Start 58: posit2_ulp
1>58/59 Test #58: posit2_ulp .......................   Passed    0.03 sec
1>      Start 59: posit2_addition
1>59/59 Test #59: posit2_addition ..................   Passed    0.38 sec
1>
1>100% tests passed, 0 tests failed out of 59

The fast specialized posit<64,2> is stubbed out, as we still need to figure out how to regression-test it properly.

@RaulMurillo
Collaborator Author

Thank you Theo!
I will start working on this. Meanwhile, we should define how to validate such datatypes.
What is this Priest arithmetic format you are talking about?

@Ravenwater
Contributor

@RaulMurillo https://www.semanticscholar.org/paper/Algorithms-for-arbitrary-precision-floating-point-Priest/93e08bcc5581478bf109c22e68b84cf08e4d7354

I started this work but ran into a problem reproducing the paper's results. I need to find a brother in arms to root-cause the problem and get over the hump.

@Ravenwater
Contributor

@RaulMurillo if you look at the regression suite for posits in UNIVERSAL_ROOT/include/universal/verification/posit_test_suite.hpp

you will find a test called VerifyConversion. This test uses the inductive property that a posit<nbits+1,es> with the last bit set falls exactly in the middle of two posit<nbits,es> values. So we can test all the rounding cases for an arbitrary posit by using this property and adding, keeping, or subtracting a delta to the higher posit to generate the round-up, tie-to-even, and round-down values.
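
A minimal sketch of that idea, worked in encoding space rather than in value space. This is not the VerifyConversion code: `RoundingProbes`, `make_probes`, and `expected_tie` are hypothetical names. It assumes that rounding a wider posit encoding to nbits behaves as round-to-nearest, ties-to-even on the bit string. It uses nbits+2 bits so that the delta around the midpoint is itself an encoding.

```cpp
#include <cstdint>

// Hypothetical helper: probe encodings, expressed as posit<nbits+2, es> bit
// patterns, around the midpoint between the posit<nbits, es> encodings k and k + 1.
struct RoundingProbes {
    std::uint64_t below;  // just under the midpoint: expected to round down to k
    std::uint64_t tie;    // the midpoint itself: expected to round to the even encoding
    std::uint64_t above;  // just over the midpoint: expected to round up to k + 1
};

// k must be a positive encoding other than maxpos, so that k + 1 is a valid neighbour.
inline RoundingProbes make_probes(std::uint64_t k) {
    const std::uint64_t tie = (k << 2) | 0x2u;  // k followed by the bits "10"
    return {tie - 1, tie, tie + 1};
}

// Expected posit<nbits, es> encoding for a tie between k and k + 1 (round to even).
inline std::uint64_t expected_tie(std::uint64_t k) {
    return (k & 1u) ? k + 1 : k;
}
```

Each probe could then be loaded into the wider posit with setbits, converted to the narrower type, and compared against k, expected_tie(k), or k + 1. Saturation at maxpos and minpos is deliberately left out of this sketch.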

One idea for a quick regression test is to visit all (and just) the encodings where the rounding decision is going to rejigger the regime, and use this property to generate the golden reference.

For 32-bit posits we would need to sit on a system that has proper long doubles, but for 64-bit posits we need at least a triple-double reference arithmetic.

That would be a lovely addition to Universal, to have David Bailey's double-double, quad-double, and add a triple-double arithmetic type so that the lib has a fast Oracle number system for these types of verifications. Douglas Priest would be the adaptive precision Oracle for arbitrary precision questions.

@Ravenwater
Contributor

@RaulMurillo also don't forget that there is a method: setbits(uint64_t) that allows you to set the posit to arbitrary bits. With the patterns suggested above we would have a super quick sanity test of only 64 tests in the case of posit<32,2> and 128 test cases for a posit<64,2>.

When the rounding doesn't change the field configuration, the rounding algorithm is 'invariant' across the field configurations, so no need to test all these cases when you have proven that one works. But when the rounding changes the field configuration the rounding needs to be verified.

@Ravenwater
Contributor

@RaulMurillo how is this task going? Do you need any help?

@RaulMurillo
Collaborator Author

Hi @Ravenwater, I'm sorry, but I've been very busy these days with the end of the course, and will be for the next few days too. I don't think I'll be able to dedicate myself to this for a few weeks, but I'll let you know when I have made progress. If you want to make progress on this in the meantime, go ahead; I don't want to hold up this task.

@Ravenwater
Contributor

@RaulMurillo thanks for the update. I am currently focused on logarithmic number systems, so I was happy to delegate the task, :-) When you get time, the posit community will thank you.

I am working on Priest and faithfully rounded number systems to try to solve the validation problem of big posits. But that is adjacent, so won't get in the way.

@davidmallasen
Contributor

I see that part of this has been merged in #384. What is the status of these specializations? I'm especially interested in the Posit16 case, in case it is incomplete. Thanks!

@davidmallasen
Contributor

Hello @Ravenwater @theo-lemurian . I tested the specialized posit<16,2> that was merged into main and the results of my application are incorrect (and different from the generic posit<16,2> that does work properly).

Could you give me some pointers to where I should have a look at how to specialize the posit<16,2> format and how to run some tests to check what is failing?

I guess I should only modify https://github.com/stillwater-sc/universal/blob/main/include/universal/number/posit/specialized/posit_16_2.hpp ?

@ghost

ghost commented Dec 12, 2023

Can you share the failures that you are seeing in the specialized posit<16,2>?

If you want to fix/extend these specialized implementations, the architecture is that of a standard template specialization for the specific template arguments, in this case nbits = 16 and es = 2. The code you have found is exactly the code for that posit<16,2> specialization.
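
As a rough illustration of that architecture (hypothetical toy types, not the Universal classes), a full template specialization for nbits = 16 and es = 2 could look like the sketch below. Constants that depend on es have to be rederived for the new parameters.

```cpp
#include <cstdint>

// Hypothetical toy types for illustration only.
// Primary template: generic storage, all decisions deferred to the parameters.
template <unsigned nbits, unsigned es>
struct toy_posit {
    static constexpr bool is_specialized = false;
    std::uint64_t bits = 0;  // generic, oversized storage
};

// Full specialization for nbits = 16, es = 2: every parameter is a compile-time
// constant, so storage width and derived constants can be fixed up front.
template <>
struct toy_posit<16, 2> {
    static constexpr bool is_specialized = true;
    static constexpr unsigned es_value   = 2;                  // must agree with the template argument
    static constexpr int maxpos_scale    = (16 - 2) * (1 << 2);  // (nbits - 2) * 2^es
    std::uint16_t bits = 0;  // exactly one 16-bit word
};
```

With these definitions, toy_posit<16, 2>::maxpos_scale evaluates to 56, which matches the "maxpos scale 56" line in the test output earlier in this thread.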

The functionality is always in a single include file, in this case, https://github.com/stillwater-sc/universal/blob/main/include/universal/number/posit/specialized/posit_16_2.hpp. Then there are the regression tests. For posits, you can find these here: https://github.com/stillwater-sc/universal/tree/main/static/posit/specialized

When you find the incorrect behavior, you can try to fix it and add regression tests to posit_16_2.cpp so that we close the hole.

@davidmallasen
Contributor

davidmallasen commented Dec 12, 2023

Unfortunately I cannot share the application I'm running, but it is quite large and I haven't pinned exactly when/how it fails. In any case, it seems like a generalized/common output error because I'm getting practically random output results.

To fix this I think it would be better to run and check the regression tests. Could you point me to how I can build and run the tests that you mention in https://github.com/stillwater-sc/universal/tree/main/static/posit/specialized ? I can't seem to find any documentation on that. The CMakeLists in that directory doesn't make much sense to me and I guess there is another top-level one.

@ghost

ghost commented Dec 12, 2023

The CMake you want to look at is the top level in the root.

The build of Universal allows you to enable and disable specific sets of tests. If you just want to run the posit tests, issue the following cmake configuration commands:

mkdir build
cd build
cmake -DBUILD_NUMBER_POSITS=ON -DBUILD_DEMONSTRATION=OFF ..

Now you will have all the posit regression tests enabled, and nothing else. A simple make followed by make test will then build and run the regression tests.

The specialized posit_16_2 regression test executable will sit in the build directory under specialized, so when you modify the posit_16_2.cpp, and rebuild, you can focus your testing on just that executable, much quicker than constantly running the full regression suite.

@davidmallasen
Contributor

Perfect. I will try this and see if I can fix it. Thanks @theo-lemurian !

@Ravenwater
Contributor

@davidmallasen I had a typo in the cmake command line, which I have fixed in the message above.

Here is the output you should see when you want to build just the posit regression tests:

tomtz@sw-desktop-300 MINGW64 /F/Users/tomtz/dev/clones/universal/testbuild (v3.74)
$ cmake -DBUILD_DEMONSTRATION=OFF -DBUILD_NUMBER_POSITS=ON ..

 _____  _____  ____  _____  _____  ____   ____  ________  _______     ______        _       _____
|_   _||_   _||_   \|_   _||_   _||_  _| |_  _||_   __  ||_   __ \  .' ____ \      / \     |_   _|
  | |    | |    |   \ | |    | |    \ \   / /    | |_ \_|  | |__) | | (___ \_|    / _ \      | |
  | '    ' |    | |\ \| |    | |     \ \ / /     |  _| _   |  __ /   _.____`.    / ___ \     | |   _
   \ \__/ /    _| |_\   |_  _| |_     \ ' /     _| |__/ | _| |  \ \_| \____) | _/ /   \ \_  _| |__/ |
    `.__.'    |_____|\____||_____|     \_/     |________||____| |___|\______.'|____| |____||________|

-- Selecting Windows SDK version 10.0.19041.0 to target Windows 10.0.22621.
-- C++20 has been enabled by default
--
-- PROJECT_NAME                = universal
-- PROJECT_NAME_NOSPACES       = universal
-- PROJECT_SOURCE_DIR          = F:/Users/tomtz/dev/clones/universal
-- PROJECT_VERSION             = 3.74.1.3ab79135
-- CMAKE_C_COMPILER            = C:/Program Files/Microsoft Visual Studio/2022/Professional/VC/Tools/MSVC/14.36.32532/bin/Hostx64/x64/cl.exe
-- CMAKE_CXX_COMPILER          = C:/Program Files/Microsoft Visual Studio/2022/Professional/VC/Tools/MSVC/14.36.32532/bin/Hostx64/x64/cl.exe
-- CMAKE_CURRENT_SOURCE_DIR    = F:/Users/tomtz/dev/clones/universal
-- CMAKE_CURRENT_BINARY_DIR    = F:/Users/tomtz/dev/clones/universal/testbuild
-- GIT_COMMIT_HASH             = 3ab79135
-- GIT_BRANCH                  = v3.74
-- include_install_dir         = Include
-- include_install_dir_full    = Include
-- config_install_dir          = CMake
-- include_install_dir_postfix =
--
-- ******************* Universal Arithmetic Library Configuration Summary *******************
-- General:
--   Version                          :   3.74.1.3ab79135
--   System                           :   Windows
--   C++ Language Requirement         :   C++20
--   C compiler                       :   C:/Program Files/Microsoft Visual Studio/2022/Professional/VC/Tools/MSVC/14.36.32532/bin/Hostx64/x64/cl.exe
--   Release C flags                  :   /O2 /Ob2 /DNDEBUG  /Oi /Ot /Ox /Oy /fp:fast /GS- /DWIN32 /D_WINDOWS  /MP /Zc:__cplusplus
--   Debug C flags                    :   /Zi /Ob0 /Od /RTC1  /Wall /bigobj /DWIN32 /D_WINDOWS  /MP /Zc:__cplusplus
--   C++ compiler                     :   C:/Program Files/Microsoft Visual Studio/2022/Professional/VC/Tools/MSVC/14.36.32532/bin/Hostx64/x64/cl.exe
--   Release CXX flags                :   /O2 /Ob2 /DNDEBUG   /MP /Zc:__cplusplus  /Oi /Ot /Ox /Oy /fp:fast /GS- /DWIN32 /D_WINDOWS /EHsc   /MP /Zc:__cplusplus /EHsc
--   Debug CXX flags                  :   /Zi /Ob0 /Od /RTC1   /MP /Zc:__cplusplus  /Wall /bigobj /DWIN32 /D_WINDOWS /EHsc   /MP /Zc:__cplusplus /EHsc
--   Build type                       :   Release
--
--   BUILD_ALL                        :   OFF
--   BUILD_CI                         :   OFF
--
--   BUILD_DEMONSTRATION              :   OFF
--   BUILD_NUMBERS                    :   OFF
--   BUILD_NUMERICS                   :   OFF
--   BUILD_BENCHMARKS                 :   OFF
--   BUILD_MIXEDPRECISION_SDK         :   OFF
--
--   BUILD_CMD_LINE_TOOLS             :   OFF
--   BUILD_EDUCATION                  :   OFF
--   BUILD_APPLICATIONS               :   OFF
--   BUILD_PLAYGROUND                 :   OFF
--
--   BUILD_NUMBER_INTERNALS           :   OFF
--   BUILD_NUMBER_NATIVE_TYPES        :   OFF
--   BUILD_NUMBER_ELASTICS            :   OFF
--   BUILD_NUMBER_STATICS             :   OFF
--   BUILD_NUMBER_CONVERSIONS         :   OFF
--
--   BUILD_NUMBER_EINTEGERS           :   OFF
--   BUILD_NUMBER_DECIMALS            :   OFF
--   BUILD_NUMBER_RATIONALS           :   OFF
--   BUILD_NUMBER_EFLOATS             :   OFF
--   BUILD_NUMBER_EPOSITS             :   OFF
--
--   BUILD_NUMBER_INTEGERS            :   OFF
--   BUILD_NUMBER_FIXPNTS             :   OFF
--   BUILD_NUMBER_BFLOATS             :   OFF
--   BUILD_NUMBER_CFLOATS             :   OFF
--   BUILD_NUMBER_DFLOATS             :   OFF
--   BUILD_NUMBER_AREALS              :   OFF
--   BUILD_NUMBER_UNUM1S              :   OFF
--   BUILD_NUMBER_UNUM2S              :   OFF
--   BUILD_NUMBER_POSITS              :   ON
--   BUILD_NUMBER_VALIDS              :   OFF
--   BUILD_NUMBER_LNS                 :   OFF
--   BUILD_NUMBER_DBNS                :   OFF
--   BUILD_NUMBER_SORNS               :   OFF
--
--   BUILD_NUMERIC_FUNCTIONS          :   OFF
--   BUILD_NUMERIC_QUIRES             :   OFF
--   BUILD_NUMERIC_CHALLENGES         :   OFF
--   BUILD_NUMERIC_UTILS              :   OFF
--   BUILD_NUMERIC_FPBENCH            :   OFF
--
--   BUILD_BENCHMARK_ERROR            :   OFF
--   BUILD_BENCHMARK_ACCURACY         :   OFF
--   BUILD_BENCHMARK_REPRODUCIBILITY  :   OFF
--   BUILD_BENCHMARK_PERFORMANCE      :   OFF
--   BUILD_BENCHMARK_ENERGY           :   OFF
--
--   BUILD_MIXEDPRECISION_ROOTS       :   OFF
--   BUILD_MIXEDPRECISION_APPROXIMATE :   OFF
--   BUILD_MIXEDPRECISION_INTEGRATE   :   OFF
--   BUILD_MIXEDPRECISION_INTERPOLATE :   OFF
--   BUILD_MIXEDPRECISION_OPTIMIZE    :   OFF
--   BUILD_MIXEDPRECISION_TENSOR      :   OFF
--
--   BUILD_LINEAR_ALGEBRA_BLAS        :   OFF
--   BUILD_LINEAR_ALGEBRA_VMATH       :   OFF
--   BUILD_LINEAR_ALGEBRA_DATA        :   OFF
--
--
--   BUILD_C_API_PURE_LIB             :   OFF
--   BUILD_C_API_SHIM_LIB             :   OFF
--   BUILD_C_API_LIB_PIC              :   OFF
--   BUILD_DOCS                       :   OFF
--
-- Regression Testing Level:
--   BUILD_REGRESSION_SANITY          :   ON
--
-- Dependencies:
--   SSE3                             :   NO
--   AVX                              :   NO
--   AVX2                             :   NO
--   Pthread                          :   NO
--   TBB                              :   NO
--   OMP                              :   NO
--
-- Utilities:
--   Serializer                       :   NO
--
-- Install:
--   Install path                     :   C:/Program Files (x86)/universal
--

 _____  _____  ____  _____  _____  ____   ____  ________  _______     ______        _       _____
|_   _||_   _||_   \|_   _||_   _||_  _| |_  _||_   __  ||_   __ \  .' ____ \      / \     |_   _|
  | |    | |    |   \ | |    | |    \ \   / /    | |_ \_|  | |__) | | (___ \_|    / _ \      | |
  | '    ' |    | |\ \| |    | |     \ \ / /     |  _| _   |  __ /   _.____`.    / ___ \     | |   _
   \ \__/ /    _| |_\   |_  _| |_     \ ' /     _| |__/ | _| |  \ \_| \____) | _/ /   \ \_  _| |__/ |
    `.__.'    |_____|\____||_____|     \_/     |________||____| |___|\______.'|____| |____||________|

-- Configuring done (0.2s)
-- Generating done (0.4s)
-- Build files have been written to: F:/Users/tomtz/dev/clones/universal/testbuild

When the build is done, the executable you are looking for will be ./static/posit/specialized/fast_posit_16_2

tomtz@sw-desktop-300 MINGW64 /F/Users/tomtz/dev/clones/universal/testbuild (v3.74)
$ ./static/posit/specialized/Debug/fast_posit_16_2.exe
Fast specialization posit<16,2> configuration tests
 posit< 16,2> useed scale     4     minpos scale        -56     maxpos scale         56  :  0

Special case tests
posit< 16, 2> Initialize to zero:            PASS
posit< 16, 2> Initialize to NAN              PASS
posit< 16, 2> Initialize to INFINITY         PASS
posit< 16, 2> sign is true                   PASS
posit< 16, 2> is negative                    PASS
posit< 16, 2> sign is false                  PASS
posit< 16, 2> is positive                    PASS
posit< 16, 2>                                                addition       PASS
posit< 16, 2>                                                subtraction    PASS
posit< 16, 2>                                                multiplication PASS
posit< 16, 2>                                                division       PASS

@davidmallasen
Contributor

Hello @Ravenwater ,
Thank you, I figured out the typo and how to run the stress tests too (not just the sanity ones) with -DBUILD_REGRESSION_STRESS=ON. However, the output seems to be all pass:

Fast specialization posit<16,2> configuration tests
 posit< 16,2> useed scale     4     minpos scale        -56     maxpos scale         56  :  0

Special case tests
posit< 16, 2> Initialize to zero:            PASS
posit< 16, 2> Initialize to NAN              PASS
posit< 16, 2> Initialize to INFINITY         PASS
posit< 16, 2> sign is true                   PASS
posit< 16, 2> is negative                    PASS
posit< 16, 2> sign is false                  PASS
posit< 16, 2> is positive                    PASS
posit< 16, 2>                                                addition       PASS
posit< 16, 2>                                                subtraction    PASS
posit< 16, 2>                                                multiplication PASS
posit< 16, 2>                                                division       PASS
Logic operator tests
posit< 16, 2>                                                    ==         (native)   PASS
posit< 16, 2>                                                    !=         (native)   PASS
posit< 16, 2>                                                    <          (native)   PASS
posit< 16, 2>                                                    <=         (native)   PASS
posit< 16, 2>                                                    >          (native)   PASS
posit< 16, 2>                                                    >=         (native)   PASS
Assignment/conversion tests
posit< 16, 2>                                                integer assign (native)   PASS
Arithmetic tests 1048576 randoms each
posit< 16, 2>                                                addition       (native)   PASS
posit< 16, 2>                                                +=             (native)   PASS
posit< 16, 2>                                                subtraction    (native)   PASS
posit< 16, 2>                                                -=             (native)   PASS
posit< 16, 2>                                                multiplication (native)   PASS
posit< 16, 2>                                                *=             (native)   PASS
posit< 16, 2>                                                division       (native)   PASS
posit< 16, 2>                                                /=             (native)   PASS
Elementary function tests
posit< 16, 2>                                                sqrt           (native)   PASS
posit< 16, 2>                                                exp                       PASS
posit< 16, 2>                                                exp2                      PASS
posit< 16, 2>                                                log                       PASS
posit< 16, 2>                                                log2                      PASS
posit< 16, 2>                                                log10                     PASS
posit< 16, 2>                                                sin                       PASS
posit< 16, 2>                                                cos                       PASS
posit< 16, 2>                                                tan                       PASS
posit< 16, 2>                                                asin                      PASS
posit< 16, 2>                                                acos                      PASS
posit< 16, 2>                                                atan                      PASS
posit< 16, 2>                                                sinh                      PASS
posit< 16, 2>                                                cosh                      PASS
posit< 16, 2>                                                tanh                      PASS
posit< 16, 2>                                                asinh                     PASS
posit< 16, 2>                                                acosh                     PASS
posit< 16, 2>                                                atanh                     PASS
VerifyPower has been truncated
posit< 16, 2>                                                pow                       PASS

I'm in the main branch with commit hash 92d08344. The specialized posit32 does work properly for me with the -DPOSIT_FAST_POSIT_32_2 compile flag. But the same for posit16 with the -DPOSIT_FAST_POSIT_16_2 flag doesn't. I've triple-checked that this is the only difference. Could it be that the stress tests are not up to date?

@Ravenwater
Contributor

The regression tests are the same and up to date for general and specialized, we just swap out the arithmetic type.

Given that the bit pattern checks are passing for both general and specialized arithmetic, I assume that the arithmetic type is tested.

I am wondering if we have a specific problem with the storage format. A posit32 fits nicely in a standard unsigned int, but a posit16 needs to sit in an uint16_t. If we aggregate that in vectors and matrices, we need to be certain that the compiler environment doesn't straddle past the uint16_t.
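
A minimal sketch of the kind of layout check meant here, using a hypothetical `Posit16Storage` wrapper rather than the actual specialized class. It asserts that the 16-bit type is exactly two bytes with no padding, and that a store to one array element leaves its neighbours untouched.

```cpp
#include <cstdint>

// Hypothetical 16-bit posit wrapper; only the storage layout matters here.
struct Posit16Storage {
    std::uint16_t bits;
};

// A 16-bit type should occupy exactly two bytes, so that arrays, vectors,
// and matrices of it are densely packed.
static_assert(sizeof(Posit16Storage) == sizeof(std::uint16_t), "unexpected padding");
static_assert(alignof(Posit16Storage) == alignof(std::uint16_t), "unexpected alignment");

// Writes the middle element of a small array and checks that the neighbours
// keep their bit patterns; a store wider than 16 bits would clobber one of them.
inline bool store_leaves_neighbours_intact() {
    Posit16Storage v[3] = {{0xAAAA}, {0x0000}, {0x5555}};
    v[1].bits = 0xFFFF;
    return v[0].bits == 0xAAAA && v[2].bits == 0x5555;
}
```

The same two static_assert lines could be pointed at the real posit<16,2> type inside the application, assuming its storage is meant to be a single uint16_t.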

Are there analytical test cases in your application that we can use to test memory alignment is honored?

@Ravenwater
Contributor

@davidmallasen I've got some new info. As you know, an exhaustive posit<16,2> arithmetic test has to trace about 4 billion operand combinations (2^16 x 2^16). To make these regressions complete in a reasonable time, I use randoms to quickly sample the state space and pick up obvious failures.
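
For comparison with random sampling, an exhaustive driver can be sketched as follows. `count_mismatches` is a hypothetical helper, not the Universal test-suite function. It takes any two callables that map a pair of encodings to a result encoding and visits every pair.

```cpp
#include <cstdint>

// Exhaustively compares a native operation against a golden reference over all
// pairs of nbits-wide encodings and returns the number of mismatches.
// Native and Golden are any callables taking two encodings and returning one.
template <typename Native, typename Golden>
std::uint64_t count_mismatches(unsigned nbits, Native native, Golden golden) {
    const std::uint64_t n = std::uint64_t{1} << nbits;  // 2^nbits encodings
    std::uint64_t failures = 0;
    for (std::uint64_t a = 0; a < n; ++a) {
        for (std::uint64_t b = 0; b < n; ++b) {         // 2^(2*nbits) pairs in total
            if (native(a, b) != golden(a, b)) ++failures;
        }
    }
    return failures;
}
```

For nbits = 16 the inner body runs 2^32 times per operator, which is why the regular regression runs fall back to random samples.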

I modified the regression test to do an exhaustive enumeration of the state space and this popped out:

Fast specialization posit<16,2> configuration tests
 posit< 16,2> useed scale     4     minpos scale        -56     maxpos scale         56  :  0

Special case tests
posit< 16, 2> Initialize to zero:            PASS
posit< 16, 2> Initialize to NAN              PASS
posit< 16, 2> Initialize to INFINITY         PASS
posit< 16, 2> sign is true                   PASS
posit< 16, 2> is negative                    PASS
posit< 16, 2> sign is false                  PASS
posit< 16, 2> is positive                    PASS
Arithmetic tests
posit< 16, 2>                                                add            (native)   PASS
posit< 16, 2>                                                subtract       (native)   PASS
posit< 16, 2>                                                multiply       (native)   PASS
posit< 16, 2>                                                divide         (native)   PASS
posit< 16, 2>                                                negate         (native)   PASS
Uncaught posit arithmetic exception: posit arithmetic exception: divide by zero

The specialized versions appear not to have the same exception behavior as the general version.

Can you think of a mechanism that you can use in your app to catch different exceptions?

The boilerplate I typically use is this:

int main()
try {
   ...
}
catch (char const* msg) {
	std::cerr << msg << std::endl;
	return EXIT_FAILURE;
}
catch (const sw::universal::posit_arithmetic_exception& err) {
	std::cerr << "Uncaught posit arithmetic exception: " << err.what() << std::endl;
	return EXIT_FAILURE;
}
catch (const sw::universal::quire_exception& err) {
	std::cerr << "Uncaught quire exception: " << err.what() << std::endl;
	return EXIT_FAILURE;
}
catch (const sw::universal::posit_internal_exception& err) {
	std::cerr << "Uncaught posit internal exception: " << err.what() << std::endl;
	return EXIT_FAILURE;
}
catch (const std::runtime_error& err) {
	std::cerr << "Uncaught runtime exception: " << err.what() << std::endl;
	return EXIT_FAILURE;
}
catch (...) {
	std::cerr << "Caught unknown exception" << std::endl;
	return EXIT_FAILURE;
}

@Ravenwater
Contributor

@davidmallasen oh, this is so embarrassing! I went through all the history of this issue that @RaulMurillo started. Raul offered to do the implementation as the posit<16,2> is now the standard, and I had implemented the posit<16,1> of the previous standard. I created the posit<16,2> file and regression test and was dependent on Raul to do the implementation. The specialized posit<16,2> is a verbatim copy of the posit<16,1> except for the template parameters.

What we are seeing is the result of the specialized posit<16,2> actually implementing a posit<16,1> behavior. The way the regression tests are written is that I enumerate the bit encodings, derive the double floating-point value, do a reference computation with the double values, then use the double to posit conversion to create the golden value, and then do the posit operation natively and compare the result to the golden value. So if the type implements any of these consistently, which in our case is doing it as a posit<16,1>, the regression test will pass.
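The effect described above can be reproduced with a toy type. This is a minimal sketch under stated assumptions: ToyFixed is a hypothetical fixed-point format invented for illustration (code c represents c / 2^FracBits), not anything from the library. The golden value is derived through the type's own conversions, so an implementation that uses the wrong parameter consistently still passes; only a cross-check against an independent reference exposes it.

```cpp
#include <cmath>

// Hypothetical toy format: integer code c represents c / 2^FracBits.
template <int FracBits>
struct ToyFixed {
    static double to_double(int code) { return std::ldexp(double(code), -FracBits); }
    static int from_double(double v) { return int(std::lround(std::ldexp(v, FracBits))); }
    static int add(int a, int b) { return a + b; }  // native operation on codes
};

// Self-referential regression: enumerate encodings, convert to double,
// compute the reference in double, convert back with the type's own
// conversion, and compare to the native operation.
template <typename T>
int self_consistent_failures() {
    int failures = 0;
    for (int a = 0; a < 64; ++a) {
        for (int b = 0; b < 64; ++b) {
            double ref = T::to_double(a) + T::to_double(b);
            int golden = T::from_double(ref);
            if (T::add(a, b) != golden) ++failures;
        }
    }
    return failures;
}

// Cross-check: compare decoded values against an independent reference type.
template <typename T, typename Ref>
int cross_check_failures() {
    int failures = 0;
    for (int code = 0; code < 64; ++code) {
        if (T::to_double(code) != Ref::to_double(code)) ++failures;
    }
    return failures;
}
```

Here ToyFixed<1> standing in for an intended ToyFixed<2> passes the self-referential test with zero failures, while the cross-check flags every nonzero encoding.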

Ok, so the solution is now known: we need to implement the specialized posit<16,2>.

@davidmallasen
Contributor

davidmallasen commented Dec 15, 2023

Hello @Ravenwater . Thanks! This makes sense now, although I have a couple of comments:

  1. In the VerifyBinaryOperatorThroughRandoms function that I see is used in the tests, I see these two lines:

    testresult = testref;
    if (testresult != testref) {

    Doesn't this mean that the code is never going to enter the if statement (aka, it's never going to fail)? I guess the solution would be to remove the first assignment which overrides what is being computed as the result to test.

  2. Since there was a division by zero in the exhaustive tests, does this mean that there is a bug in the specialized posit<16,1> implementation?

  3. How is it correctly implementing a posit<16,1> behavior if at the beginning of the specialized/posit_16_2.hpp the ES_IS_2 is used? Is it replacing the posit<16,2> name with the template and implementing it as if it were a posit<16,2>?

  4. In the case of testing the specialized posits, could it be possible to compare them directly to the computations done by the generic posit implementation that we know to be working properly? I guess this would be simpler as only the bit patterns would need to be checked to see that they are equal.

I'll work on trying to change the specialized/posit_16_2.hpp from implementing the specialized posit<16,1> to the proper posit<16,2>.
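The effect of the stray assignment in point 1 can be shown with a small sketch. Everything here is hypothetical and only mirrors the two quoted lines: count_failures, good_add, and broken_add are illustrative names, not functions from the test suite.

```cpp
#include <functional>

// Counts mismatches between an operation under test and a reference
// over a small grid of inputs. When mask_result is true, the result is
// overwritten with the reference before the comparison, as in the two
// quoted lines, so the comparison can never fail.
inline int count_failures(const std::function<int(int, int)>& under_test,
                          const std::function<int(int, int)>& reference,
                          bool mask_result) {
    int failures = 0;
    for (int a = 0; a < 10; ++a) {
        for (int b = 0; b < 10; ++b) {
            int testref = reference(a, b);
            int testresult = under_test(a, b);
            if (mask_result) testresult = testref;  // the stray assignment
            if (testresult != testref) ++failures;
        }
    }
    return failures;
}

inline int good_add(int a, int b) { return a + b; }
inline int broken_add(int a, int b) { return a + b + 1; }  // deliberately wrong
```

With the assignment in place, broken_add reports zero failures; without it, all 100 cases fail.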

@davidmallasen
Contributor

Also @Ravenwater do you have any documentation on the algorithm and notation you are following here? It differs from what I'm used to (I follow the scheme in section II-A of our PERCIVAL paper).

@Ravenwater
Contributor

Ravenwater commented Dec 17, 2023

@davidmallasen

  1. Dang, thanks for finding that bug. I vaguely remember that I was trying to implement posit<40,2> and posit<48,2>, as our application profiling showed these posits are good replacements for IEEE-754 doubles, typically providing slightly more precision. While trying to verify these posits, we found that our testing strategy needs an oracle with more precision, as even long double does not carry enough precision to cover the rounding cases of these posits. I put that line (nr 273) in to silence the regression failures but then forgot about it. I'll need to do some restructuring to get that back on the rails.
  2. Tracked the division by zero to reciprocal(). I need to just close the hole in that code path.
  3. The subtlety here is that the template specialization is just there to select the specialized code of choice. The posit<16,2> specialized code is not using any template parameters; it is all hardcoded for speed. Since that hardcoded code is functionally doing a posit<16,1>, it is all consistent: wrong, but consistent.
  4. I thought about that, but the specialization cuts you off from the generic. But we can just add another copy with a slightly different name and bring it in like that. I have wanted that functionality so often, this is the time, I'll add it.
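Point 3 can be illustrated with a toy template. This is a sketch under stated assumptions: toy_posit is a hypothetical type, not the library's posit, and the only property modeled is the useed scale 2^es. The full specialization is selected purely by its template arguments, but its body is hardcoded and never consults them.

```cpp
// Generic template: behavior is derived from the template parameters.
template <unsigned nbits, unsigned es>
struct toy_posit {
    static constexpr bool is_specialized = false;
    // log2(useed) = 2^es
    static constexpr int useed_scale() { return 1 << es; }
};

// Full specialization: <16, 2> only selects this code path.
// The body is hardcoded, and here it is deliberately hardcoded for
// es = 1 to mirror the copy-paste situation described above.
template <>
struct toy_posit<16, 2> {
    static constexpr bool is_specialized = true;
    static constexpr int useed_scale() { return 2; }  // should be 4 for es = 2
};
```

The compiler has no way to notice the mismatch, since nothing ties the hardcoded constant back to the es argument that selected the specialization.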

I also started to try to fix the posit<16,2>. I am working in the v3.74 branch. I can go and focus on the bugs you have just highlighted, and you can maybe finish the posit<16,2> implementation.

@Ravenwater
Contributor

@davidmallasen the specialized posits are following the softposit algorithms with the proper C++ skeleton around it to create a plug-in type.

@Ravenwater
Contributor

Ravenwater commented Dec 17, 2023

@davidmallasen This commit is a new baseline to work from: cac65dc

I discovered that the fast posit<32,2> has regressed and is failing as well. Need to RCA why that has gone sideways too.

Next step for me is to bring in a reference posit we can use to put this system back together.

@Ravenwater
Contributor

Ravenwater commented Dec 17, 2023

In this commit: 025e51f I have added a posito, short for posit oracle, that we can use to RCA the specialized posit failures.

You can bring in this type via:

#include <universal/number/posito/posito.hpp>

It has the same interface as the regular posit<nbits, es>.
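Because the oracle shares the interface of the type under test, a comparison harness can be written once as a template over both. The sketch below is an assumption-laden illustration, not the library's test code: RefSat8, FastSat8, and WrapSat8 are hypothetical 8-bit saturating adders standing in for the reference and specialized implementations, chosen so that the exhaustive loop stays small.

```cpp
#include <cstddef>
#include <cstdint>

// Reference: saturating 8-bit add computed in a wider type.
struct RefSat8 {
    static std::uint8_t add(std::uint8_t a, std::uint8_t b) {
        unsigned s = unsigned(a) + unsigned(b);
        return s > 255u ? std::uint8_t(255) : std::uint8_t(s);
    }
};

// "Fast" variant: detects overflow from the wrapped sum.
struct FastSat8 {
    static std::uint8_t add(std::uint8_t a, std::uint8_t b) {
        std::uint8_t s = std::uint8_t(a + b);
        return s < a ? std::uint8_t(255) : s;
    }
};

// Broken variant: wraps around instead of saturating.
struct WrapSat8 {
    static std::uint8_t add(std::uint8_t a, std::uint8_t b) {
        return std::uint8_t(a + b);
    }
};

// Exhaustively compares the result encodings of two implementations.
template <typename Fast, typename Ref>
std::size_t count_add_mismatches() {
    std::size_t mismatches = 0;
    for (unsigned a = 0; a < 256; ++a) {
        for (unsigned b = 0; b < 256; ++b) {
            std::uint8_t x = std::uint8_t(a);
            std::uint8_t y = std::uint8_t(b);
            if (Fast::add(x, y) != Ref::add(x, y)) ++mismatches;
        }
    }
    return mismatches;
}
```

Comparing result encodings directly avoids the round trip through double, so a specialization that is merely self-consistent can no longer pass.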

@davidmallasen
Contributor

Hello @Ravenwater . Thanks for all the info and the work so far! It looks like a plan to me.
I'll try to continue working on the specialized posit<16,2> on the branch that you mention, following the softposit algorithms. I had noticed some of the 14 -> 13 changes in the constants that I see you already added to the code.
Just to be sure, the failures that you mention are only with the specialized posit<32,2>, right? This means that the generic posit (and thus the oracle one) still works well as a reference in the tests? Once I have a full version of the posit<16,2> I'll try and test it with the updated test infrastructure.

@Ravenwater
Contributor

@davidmallasen correct, just the specialized posit<32,2>. I'll look at the commit history on that file to see when it changed and see if it has a quick fix.

@davidmallasen
Contributor

@Ravenwater I don't understand the rationale behind changing the 2 to a 3 in

. Isn't this function doing the same as decode_regime except for the different shift parameter (m)? I looked at the code in softposit, but there are no comments there to help guide what is going on.

@Ravenwater
Contributor

I am postulating here, but the algorithm that John and Cerlane developed accumulates the fraction bits and leaves the MSB set to 0 so as not to trigger any signed/unsigned reinterpretation. As the exponent field is now potentially 2 bits, we need to shift away the exponent field by adding one more to the left shift. One problem that I haven't reverse engineered is how they are dealing with the situation when the encoding doesn't have a full exponent field, like 0b0.000'0000'0000'001.0. Maybe it is just as simple as the fact that if you don't have enough bits for a full exponent, your fraction bits are zero too, and thus 'over shifting' does not matter.

@Ravenwater
Contributor

P.S. commit fb0d0ed contains an addition test case that brings in both posit and posito to compare specialized and generalized implementations.

@davidmallasen
Contributor

I am postulating here, but the algorithm that John and Cerlane developed accumulates the fraction bits and leaves the MSB set to 0 so as not to trigger any signed/unsigned reinterpretation. As the exponent field is now potentially 2 bits, we need to shift away the exponent field by adding one more to the left shift.

The exponent field comes after the regime, and in that function we are basically extracting the regime and returning the remaining bits (exponent and fraction), so this shouldn't be taken into account here? It is the same as in decode_regime. If we want to remove the exponent, that should be done the way you added it with remaining <<= 3; remaining >>= 2;. Otherwise the function would not work if the regime is only 2 bits (I haven't tested this for all the regime-length patterns though).

One problem that I haven't reverse engineered is how they are dealing with the situation when the encoding doesn't have a full exponent field, like 0b0.000'0000'0000'001.0. Maybe it is just as simple as the fact that if you don't have enough bits for a full exponent, your fraction bits are zero too, and thus 'over shifting' does not matter.

Since the values of the bits that are to the right of the LSB should be zero, I think this is accomplished when doing the shifts?
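The point about truncated fields can be checked with a plain field-by-field decoder. This is a minimal sketch of decoding a 16-bit, es=2 posit to double. It is not the SoftPosit algorithm or the library's code, and decode_posit16_es2 is a hypothetical name. It works in a 16-bit window and shifts zeros in from the right, so an encoding with a partial or missing exponent field decodes as if the missing bits were zero.

```cpp
#include <cmath>
#include <cstdint>

// Decodes a posit<16,2> encoding to double (NaR is mapped to NaN here).
inline double decode_posit16_es2(std::uint16_t bits) {
    if (bits == 0x0000u) return 0.0;
    if (bits == 0x8000u) return std::nan("");
    const bool negative = (bits & 0x8000u) != 0;
    // Negative values are decoded from the two's complement of the encoding.
    const std::uint32_t mag = negative ? ((0x10000u - bits) & 0xFFFFu) : bits;
    // Drop the sign bit: the regime now starts at bit 15 of a 16-bit window.
    std::uint32_t rest = (mag << 1) & 0xFFFFu;
    const bool lead = (rest & 0x8000u) != 0;
    int run = 0;  // length of the run of identical regime bits
    while (run < 15 && ((rest & 0x8000u) != 0) == lead) {
        rest = (rest << 1) & 0xFFFFu;
        ++run;
    }
    // Skip the regime terminator; if the regime filled the word, rest is 0 already.
    rest = (rest << 1) & 0xFFFFu;
    const int k = lead ? run - 1 : -run;
    // Top two bits are the exponent; bits shifted in from the right are zero,
    // so a truncated exponent field is padded with zeros.
    const int exponent = static_cast<int>(rest >> 14);
    const std::uint32_t fraction = (rest << 2) & 0xFFFFu;  // left-aligned, 16 bits
    const double significand = 1.0 + std::ldexp(static_cast<double>(fraction), -16);
    const double magnitude = std::ldexp(significand, 4 * k + exponent);
    return negative ? -magnitude : magnitude;
}
```

For example, 0x0001 and 0x7FFF decode to 2^-56 and 2^56, matching the minpos and maxpos scales printed by the regression test, and 0x0003 (a single exponent bit left after the regime) decodes to 2^-50.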

I will pull and check that new commit, thanks.

@davidmallasen
Contributor

I just realized that Softposit has a positX_2 code that could help us! I'll continue tomorrow looking at this. https://gitlab.com/cerlane/SoftPosit/-/blob/master/source/s_addMagsPX2.c

@Ravenwater
Contributor

I got the add for positive values working, but we need the subtraction routine to work as well.

I also restructured quite a bit of the regression tests across all the number systems, so quite a few files have changed.

If you have time tomorrow, @davidmallasen take a look at operator-=() and see how that works.

@davidmallasen
Contributor

davidmallasen commented Dec 21, 2023

Ah this makes sense @Ravenwater . I didn't realize that when adding two numbers with different signs, it's calling the -= operator and I didn't change that one. I'll have a look at it. I only tackled the += first to have something small working and then fill in the rest.

@davidmallasen
Contributor

Could you have a look at #404 @Ravenwater ? I think with this we could try all the exhaustive testing on the specialized posit<16,2> with the oracle posit<16,2> since the rest of the operations should be correct. I'm not sure about the integer_assign and the float_assign helper functions though.

@Ravenwater
Contributor

@davidmallasen I got the regime and exponent field algorithms figured out, but I haven't been able to understand the rounding algorithm that SoftPosit implements. I checked in the code as I need a second pair of eyes on this to try to figure out how this is supposed to work.

Here are the results of the fast_posit_16_2 regression tests:

Fast specialization posit<16,2>: results only
 posit< 16,2> useed scale     4     minpos scale        -56     maxpos scale         56  :  0

posit< 16, 2>                                                +=             (native)   PASS
posit< 16, 2>                                                -=             (native)   PASS
posit< 16, 2>                                                *=             (native)   PASS
FAIL
-13.62890625              /= -0.443115234375           != 30.765625                 golden reference is 30.75

0b1.10.11.10110100001     /= 0b1.01.10.11000101110     != 0b0.110.00.1110110001     golden reference is 0b0.110.00.1110110000
FAIL
124.1875                  /= 0.0001163482666015625     != 1081344                   golden reference is 1064960

0b0.110.10.1111000011     /= 0b0.00001.10.11101000     != 0b0.1111110.00.000010     golden reference is 0b0.1111110.00.000001
FAIL
-0.004791259765625        /= 0.2958984375              != -0.016204833984375        golden reference is -0.0161895751953125
0b1.001.00.0011101000     /= 0b0.01.10.00101111000     != 0b1.001.10.0000100110     golden reference is 0b1.001.10.0000100101
FAIL
-0.443359375              /= 0.005786895751953125      != -76.5625                  golden reference is -76.625

0b1.01.10.11000110000     /= 0b0.001.00.0111101101     != 0b1.110.10.0011001001     golden reference is 0b1.110.10.0011001010
FAIL
0.0018825531005859375     /= -2516                     != -7.5250864028930664062e-07 golden reference is -7.450580596923828125e-07
0b0.0001.10.111011011     /= 0b1.1110.11.001110101     != 0b1.0000001.11.100101     golden reference is 0b1.0000001.11.100100
FAIL
9.96875                   /= 7.662109375               != 1.30078125                golden reference is 1.30126953125

0b0.10.11.00111111000     /= 0b0.10.10.11101010011     != 0b0.10.00.01001101000     golden reference is 0b0.10.00.01001101001
FAIL
-0.1373291015625          /= 0.610107421875            != -0.22503662109375         golden reference is -0.22509765625

0b1.01.01.00011001010     /= 0b0.01.11.00111000011     != 0b1.01.01.11001100111     golden reference is 0b1.01.01.11001101000
FAIL
-1.791015625              /= 0.2860107421875           != -6.263671875              golden reference is -6.26171875

0b1.10.00.11001010100     /= 0b0.01.10.00100100111     != 0b1.10.10.10010000111     golden reference is 0b1.10.10.10010000110
FAIL
-27.765625                /= 2.162109375               != -12.83984375              golden reference is -12.84375

0b1.110.00.1011110001     /= 0b0.10.01.00010100110     != 0b1.10.11.10011010111     golden reference is 0b1.10.11.10011011000
FAIL
-0.120452880859375        /= -122.5                    != 0.0009822845458984375     golden reference is 0.00098419189453125
0b1.01.00.11101101011     /= 0b1.110.10.1110101000     != 0b0.0001.10.000000011     golden reference is 0b0.0001.10.000000100
FAIL
-0.0034027099609375       /= 0.772705078125            != -0.004405975341796875     golden reference is -0.00440216064453125
0b1.0001.11.101111100     /= 0b0.01.11.10001011101     != 0b1.001.00.0010000011     golden reference is 0b1.001.00.0010000010
FAIL
0.000214099884033203125   /= -0.0258026123046875       != -0.00829315185546875      golden reference is -0.00830078125

0b0.00001.11.11000001     /= 0b1.001.10.1010011011     != 0b1.001.01.0000111111     golden reference is 0b1.001.01.0001000000
FAIL
-0.2811279296875          /= -0.13555908203125         != 2.0732421875              golden reference is 2.07421875

0b1.01.10.00011111111     /= 0b1.01.01.00010101101     != 0b0.10.01.00001001011     golden reference is 0b0.10.01.00001001100
FAIL
7.080078125               /= 0.3287353515625           != 21.546875                 golden reference is 21.53125

0b0.10.10.11000101001     /= 0b0.01.10.01010000101     != 0b0.110.00.0101100011     golden reference is 0b0.110.00.0101100010
FAIL
10.140625                 /= -0.302001953125           != -33.59375                 golden reference is -33.5625

0b0.10.11.01000100100     /= 0b1.01.10.00110101010     != 0b1.110.01.0000110011     golden reference is 0b1.110.01.0000110010
FAIL
-1.5869140625             /= 0.06842041015625          != -23.203125                golden reference is -23.1875

0b1.10.00.10010110010     /= 0b0.01.00.00011000010     != 0b1.110.00.0111001101     golden reference is 0b1.110.00.0111001100
FAIL
314                       /= 0.005863189697265625      != 53632                     golden reference is 53504

0b0.1110.00.001110100     /= 0b0.001.00.1000000001     != 0b0.11110.11.10100011     golden reference is 0b0.11110.11.10100010
FAIL
-0.039520263671875        /= -0.14569091796875         != 0.2713623046875           golden reference is 0.271240234375

0b1.001.11.0100001111     /= 0b1.01.01.00101010011     != 0b0.01.10.00010101111     golden reference is 0b0.01.10.00010101110
FAIL
0.7255859375              /= -0.0028839111328125       != -251.5                    golden reference is -251.625

0b0.01.11.01110011100     /= 0b1.0001.11.011110100     != 0b1.110.11.1111011100     golden reference is 0b1.110.11.1111011101
FAIL
-0.051239013671875        /= 22.296875                 != -0.002300262451171875     golden reference is -0.00229644775390625
0b1.001.11.1010001111     /= 0b0.110.00.0110010011     != 0b1.0001.11.001011011     golden reference is 0b1.0001.11.001011010
FAIL
3.6796875                 /= 21.71875                  != 0.16937255859375          golden reference is 0.16943359375

0b0.10.01.11010111000     /= 0b0.110.00.0101101110     != 0b0.01.01.01011010111     golden reference is 0b0.01.01.01011011000
FAIL
-0.01110076904296875      /= -0.9736328125             != 0.01140594482421875       golden reference is 0.0113983154296875
0b1.001.01.0110101111     /= 0b1.01.11.11110010100     != 0b0.001.01.0111010111     golden reference is 0b0.001.01.0111010110
FAIL
2964                      /= 1.03662109375             != 2856                      golden reference is 2860

0b0.1110.11.011100101     /= 0b0.10.00.00001001011     != 0b0.1110.11.011001010     golden reference is 0b0.1110.11.011001011
FAIL
-0.00539398193359375      /= 122.125                   != -4.41074371337890625e-05  golden reference is -4.422664642333984375e-05
0b1.001.00.0110000110     /= 0b0.110.10.1110100010     != 0b1.00001.01.01110010     golden reference is 0b1.00001.01.01110011
FAIL
-0.311279296875           /= 0.10211181640625          != -3.0478515625             golden reference is -3.048828125

0b1.01.10.00111110110     /= 0b0.01.00.10100010010     != 0b1.10.01.10000110001     golden reference is 0b1.10.01.10000110010
FAIL
-81.625                   /= 0.777587890625            != -104.9375                 golden reference is -105

0b1.110.10.0100011010     /= 0b0.01.11.10001110001     != 0b1.110.10.1010001111     golden reference is 0b1.110.10.1010010000
FAIL
38.1875                   /= -1.14501953125            != -33.375                   golden reference is -33.34375

0b0.110.01.0011000110     /= 0b1.10.00.00100101001     != 0b1.110.01.0000101100     golden reference is 0b1.110.01.0000101011
FAIL
-1.580078125              /= -0.005619049072265625     != 281.5                     golden reference is 281

0b1.10.00.10010100100     /= 0b1.001.00.0111000001     != 0b0.1110.00.000110011     golden reference is 0b0.1110.00.000110010
FAIL
-0.1455078125             /= -0.4827880859375          != 0.30126953125             golden reference is 0.3013916015625

0b1.01.01.00101010000     /= 0b1.01.10.11101110011     != 0b0.01.10.00110100100     golden reference is 0b0.01.10.00110100101
FAIL
0.0171966552734375        /= -2.572265625              != -0.006683349609375        golden reference is -0.006687164306640625
0b0.001.10.0001100111     /= 0b1.10.01.01001001010     != 0b1.001.00.1011011000     golden reference is 0b1.001.00.1011011001
FAIL
-0.000171184539794921875  /= 3.5849609375              != -4.76837158203125e-05     golden reference is -4.780292510986328125e-05
0b1.00001.11.01100111     /= 0b0.10.01.11001010111     != 0b1.00001.01.10010000     golden reference is 0b1.00001.01.10010001
FAIL
2.7080078125              /= 1.3017578125              != 2.0810546875              golden reference is 2.080078125

0b0.10.01.01011010101     /= 0b0.10.00.01001101010     != 0b0.10.01.00001010011     golden reference is 0b0.10.01.00001010010
FAIL
107.5625                  /= -855                      != -0.1258544921875          golden reference is -0.12579345703125
0b0.110.10.1010111001     /= 0b1.1110.01.101010111     != 0b1.01.01.00000001110     golden reference is 0b1.01.01.00000001101
FAIL
2188                      /= 7.87890625                != 278                       golden reference is 277.5

0b0.1110.11.000100011     /= 0b0.10.10.11111000010     != 0b0.1110.00.000101100     golden reference is 0b0.1110.00.000101011
FAIL
-0.2962646484375          /= -42.375                   != 0.006988525390625         golden reference is 0.006992340087890625
0b1.01.10.00101111011     /= 0b1.110.01.0101001100     != 0b0.001.00.1100101000     golden reference is 0b0.001.00.1100101001
FAIL
253.625                   /= -112.75                   != -2.25                     golden reference is -2.2490234375

0b0.110.11.1111101101     /= 0b1.110.10.1100001100     != 0b1.10.01.00100000000     golden reference is 0b1.10.01.00011111111
FAIL
-0.4892578125             /= -2.07421875               != 0.23583984375             golden reference is 0.23590087890625
0b1.01.10.11110101000     /= 0b1.10.01.00001001100     != 0b0.01.01.11100011000     golden reference is 0b0.01.01.11100011001
FAIL
0.22015380859375          /= 0.105255126953125         != 2.0908203125              golden reference is 2.091796875

0b0.01.01.11000010111     /= 0b0.01.00.10101111001     != 0b0.10.01.00001011101     golden reference is 0b0.10.01.00001011110
FAIL
-0.8388671875             /= 414.5                     != -0.00202178955078125      golden reference is -0.002025604248046875
0b1.01.11.10101101100     /= 0b0.1110.00.100111101     != 0b1.0001.11.000010010     golden reference is 0b1.0001.11.000010011
FAIL
8.01953125                /= 0.00038623809814453125    != 20800                     golden reference is 20736

0b0.10.11.00000000101     /= 0b0.0001.00.100101010     != 0b0.11110.10.01000101     golden reference is 0b0.11110.10.01000100
FAIL
0.1378173828125           /= 209.375                   != 0.00065898895263671875    golden reference is 0.0006580352783203125
0b0.01.01.00011010010     /= 0b0.110.11.1010001011     != 0b0.0001.01.010110011     golden reference is 0b0.0001.01.010110010
FAIL
-10.20703125              /= -13216                    != 0.00077152252197265625    golden reference is 0.0007724761962890625
0b1.10.11.01000110101     /= 0b1.11110.01.10011101     != 0b0.0001.01.100101001     golden reference is 0b0.0001.01.100101010
FAIL
217                       /= -11.671875                != -18.578125                golden reference is -18.59375

0b0.110.11.1011001000     /= 0b1.10.11.01110101100     != 0b1.110.00.0010100101     golden reference is 0b1.110.00.0010100110
FAIL
15.4296875                /= -1.93017578125            != -7.9921875                golden reference is -7.994140625

0b0.10.11.11101101110     /= 0b1.10.00.11101110001     != 0b1.10.10.11111111100     golden reference is 0b1.10.10.11111111101
FAIL
-49920                    /= 56.84375                  != -879                      golden reference is -878

0b1.11110.11.10000110     /= 0b0.110.01.1100011011     != 0b1.1110.01.101101111     golden reference is 0b1.1110.01.101101110
FAIL
-3.29296875               /= -1.78955078125            != 1.83984375                golden reference is 1.84033203125

0b1.10.01.10100101100     /= 0b1.10.00.11001010001     != 0b0.10.00.11010111000     golden reference is 0b0.10.00.11010111001
FAIL
1.2744140625              /= 116.9375                  != 0.01090240478515625       golden reference is 0.010894775390625
0b0.10.00.01000110010     /= 0b0.110.10.1101001111     != 0b0.001.01.0110010101     golden reference is 0b0.001.01.0110010100
FAIL
12.0078125                /= -0.00049686431884765625   != -24128                    golden reference is -24192

0b0.10.11.10000000010     /= 0b1.0001.01.000001001     != 0b1.11110.10.01111001     golden reference is 0b1.11110.10.01111010
FAIL
0.0892333984375           /= 25.859375                 != 0.003448486328125         golden reference is 0.003452301025390625
0b0.01.00.01101101100     /= 0b0.110.00.1001110111     != 0b0.0001.11.110001000     golden reference is 0b0.0001.11.110001001
FAIL
2.1755695343017578125e-05 /= 0.24542236328125          != 8.84532928466796875e-05   golden reference is 8.869171142578125e-05
0b0.00001.00.01101101     /= 0b0.01.01.11110110101     != 0b0.00001.10.01110011     golden reference is 0b0.00001.10.01110100
FAIL
0.053314208984375         /= -0.073333740234375        != -0.726806640625           golden reference is -0.72705078125

0b0.001.11.1011010011     /= 0b1.01.00.00101100011     != 0b1.01.11.01110100001     golden reference is 0b1.01.11.01110100010
FAIL
-1.5273690223693847656e-07 /= 5.14453125                != -2.9336661100387573242e-08 golden reference is -2.98023223876953125e-08
0b1.0000001.01.010010     /= 0b0.10.10.01001001010     != 0b1.00000001.10.11111     golden reference is 0b1.00000001.11.00000
FAIL
-0.0052642822265625       /= 2820                      != -1.8626451492309570312e-06 golden reference is -1.8700957298278808594e-06
0b1.001.00.0101100100     /= 0b0.1110.11.011000001     != 0b1.000001.00.1111010     golden reference is 0b1.000001.00.1111011
FAIL
-8.55922698974609375e-05  /= -72.9375                  != 1.1697411537170410156e-06 golden reference is 1.1771917343139648438e-06
0b1.00001.10.01100111     /= 0b1.110.10.0010001111     != 0b0.000001.00.0011101     golden reference is 0b0.000001.00.0011110
posit< 16, 2>                                                /=             (native)   FAIL 54 failed test cases
0b0.000000000001.01.0 / 0b0.000000000001.00.1 = 0b0.10.00.01010101010
1.13687e-13 / 8.52651e-14 = 1.33301
0b0.000000000001.01.0 / 0b0.000000000001.00.1 = 0b0.10.00.01010101011
1.13687e-13 / 8.52651e-14 = 1.3335
FAIL
Fast specialization posit<16,2>: FAIL

It appears that the bitNPlusOne calculation is not correct and is causing these failures, but how to fix it is stumping me.
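All of the failures above differ from the golden reference by one unit in the last place, which is what a faulty rounding decision would produce. For orientation, here is a minimal sketch of generic round-to-nearest, ties-to-even on an integer significand. It is not the SoftPosit routine: round_nearest_even and bits_more are illustrative names, and the sketch ignores posit-specific details such as regime-dependent field widths and saturation at minpos and maxpos.

```cpp
#include <cstdint>

// Rounds `value` to nearest after discarding its `drop` low bits,
// with ties going to the even result. Assumes 0 <= drop <= 31.
inline std::uint32_t round_nearest_even(std::uint32_t value, unsigned drop) {
    if (drop == 0) return value;
    const std::uint32_t kept = value >> drop;
    // First discarded bit: decides whether we are at or above the halfway point.
    const bool bit_n_plus_one = ((value >> (drop - 1)) & 1u) != 0;
    // Any remaining discarded bits set: strictly above the halfway point.
    const std::uint32_t sticky_mask = (std::uint32_t(1) << (drop - 1)) - 1u;
    const bool bits_more = (value & sticky_mask) != 0;
    // On an exact tie, round up only if the kept value is odd.
    const bool lsb = (kept & 1u) != 0;
    const bool round_up = bit_n_plus_one && (bits_more || lsb);
    return kept + (round_up ? 1u : 0u);
}
```

If the first discarded bit is read from the wrong position, for instance because the exponent field is assumed to be one bit narrower than it is, the result lands one encoding above or below the correctly rounded value.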

@Ravenwater
Contributor

@davidmallasen @RaulMurillo finally got it figured out. We now have a fast posit<16,2>.

Fast specialization posit<16,2>: results only
 posit< 16,2> useed scale     4     minpos scale        -56     maxpos scale         56  :  0

Exhaustive tests
posit< 16, 2>                                                div            (native)   PASS
posit< 16, 2>                                                mul            (native)   PASS
posit< 16, 2>                                                sub            (native)   PASS
posit< 16, 2>                                                add            (native)   PASS


Fast specialization posit<16,2>: results only
 posit< 16,2> useed scale     4     minpos scale        -56     maxpos scale         56  :  0

Special case tests
posit< 16, 2> Initialize to zero:            PASS
posit< 16, 2> Initialize to NAN              PASS
posit< 16, 2> Initialize to INFINITY         PASS
posit< 16, 2> sign is true                   PASS
posit< 16, 2> is negative                    PASS
posit< 16, 2> sign is false                  PASS
posit< 16, 2> is positive                    PASS
posit< 16, 2>                                                addition       PASS
posit< 16, 2>                                                subtraction    PASS
posit< 16, 2>                                                multiplication PASS
posit< 16, 2>                                                division       PASS
Logic operator tests
posit< 16, 2>                                                    ==         (native)   PASS
posit< 16, 2>                                                    !=         (native)   PASS
posit< 16, 2>                                                    <          (native)   PASS
posit< 16, 2>                                                    <=         (native)   PASS
posit< 16, 2>                                                    >          (native)   PASS
posit< 16, 2>                                                    >=         (native)   PASS
Assignment/conversion tests
posit< 16, 2>                                                integer assign (native)   PASS
Arithmetic tests 1048576 randoms each
posit< 16, 2>                                                addition       (native)   PASS
posit< 16, 2>                                                +=             (native)   PASS
posit< 16, 2>                                                subtraction    (native)   PASS
posit< 16, 2>                                                -=             (native)   PASS
posit< 16, 2>                                                multiplication (native)   PASS
posit< 16, 2>                                                *=             (native)   PASS
posit< 16, 2>                                                division       (native)   PASS
posit< 16, 2>                                                /=             (native)   PASS
Elementary function tests
posit< 16, 2>                                                sqrt           (native)   PASS
posit< 16, 2>                                                exp                       PASS
posit< 16, 2>                                                exp2                      PASS
posit< 16, 2>                                                log                       PASS
posit< 16, 2>                                                log2                      PASS
posit< 16, 2>                                                log10                     PASS
posit< 16, 2>                                                sin                       PASS
posit< 16, 2>                                                cos                       PASS
posit< 16, 2>                                                tan                       PASS
posit< 16, 2>                                                asin                      PASS
posit< 16, 2>                                                acos                      PASS
posit< 16, 2>                                                atan                      PASS
posit< 16, 2>                                                sinh                      PASS
posit< 16, 2>                                                cosh                      PASS
posit< 16, 2>                                                tanh                      PASS
posit< 16, 2>                                                asinh                     PASS
posit< 16, 2>                                                acosh                     PASS
posit< 16, 2>                                                atanh                     PASS
VerifyPower has been truncated
posit< 16, 2>                                                pow                       PASS
Fast specialization posit<16,2>: PASS

@davidmallasen
Contributor

Hello @Ravenwater . Great news, thanks for debugging that! I've been out of office for a couple of weeks but I'll check this soon to corroborate that it's also working on our end. In the end did you use the posit oracle to check the bit patterns or was that too slow?

@Ravenwater
Contributor

@davidmallasen the posit oracle was very useful in debugging and generating hypotheses. Fundamentally, I had to reverse-engineer the algorithm that John and Cerlane had created. They created algorithms for es=0, 1, and 2, and these take different shortcuts to encode/decode and round. The debugging allowed me to rediscover that (it has been 5 years since I wrote any of the posit code). Once I knew what was going on, I realized that the rounding was using the shortcuts/interpretation of the es=1 algorithm, and thus I needed to rip and replace it with the es=2 algorithms.

Given the amount of time I spent on this, you better use this type, :-)

@davidmallasen
Contributor

Thanks for the info and for all of the time you put into this @Ravenwater . We definitely will be using this, since in initial tests it improves performance by around 2.5x. However, with the application I have, the results are not the same as with the non-specialized posit<16,2>, although they are similar. I'll try to check why this is happening, but if the exhaustive tests are succeeding it might be some bit-level difference between the non-specialized and the specialized algorithms.

@ghost

ghost commented Jan 9, 2024

The verification of both implementations happens with the same test, which follows posit arithmetic rules. If there is something inconsistent, the most likely place to look would be the special cases, NaR, NaN assignment, divide by zero.

Now that we have the posito environment, I can set up an exhaustive test between the two and let it run for a day.

@davidmallasen
Contributor

Perfect! Using the posito environment would definitely be helpful to see where the two diverge.
Looking again at the code, I'm not sure what the is_power_of_2 function is doing. Also, should we revise the integer_assign and float_assign functions?
