
GH-34918: [C++] Update vendored double-conversion 3.2.1 #34919

Merged 1 commit into apache:main on May 19, 2023

Conversation

@liujiacheng777 (Contributor) commented Apr 6, 2023

Modifications based on double-conversion version 3.2.1.

@github-actions bot commented Apr 6, 2023

Thanks for opening a pull request!

If this is not a minor PR, could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose

Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project.

Then could you also rename the pull request title in the following format?

GH-${GITHUB_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

or

MINOR: [${COMPONENT}] ${SUMMARY}

In the case of PARQUET issues on JIRA the title also supports:

PARQUET-${JIRA_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

@liujiacheng777 (Contributor, author):

#34918 (comment)

@kou kou changed the title ARROW-34918: [C++] Update vendored double-conversion GH-34918: [C++] Update vendored double-conversion Apr 6, 2023
@github-actions bot commented Apr 6, 2023

⚠️ GitHub issue #34918 has been automatically assigned in GitHub to PR creator.

@kou (Member) commented Apr 6, 2023

Could you use the pull request template content next time?

Comment on lines 239 to 247
vendored/double-conversion/bignum.cc
vendored/double-conversion/double-conversion.cc
vendored/double-conversion/string-to-double.cc
vendored/double-conversion/double-to-string.cc
vendored/double-conversion/bignum-dtoa.cc
vendored/double-conversion/fast-dtoa.cc
vendored/double-conversion/cached-powers.cc
vendored/double-conversion/fixed-dtoa.cc
vendored/double-conversion/diy-fp.cc
vendored/double-conversion/strtod.cc)
Member:

Could you sort this list in alphabetical order?

Contributor Author:

Hello, I have sorted the list alphabetically in the new commit.
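For reference, the requested ordering can be checked with a quick Python sketch (the file list is taken from the diff above; plain lexicographic order, which is what sorted lists in CMake files conventionally use):

```python
# A quick check: sort the vendored source list from the diff alphabetically.
sources = [
    "vendored/double-conversion/bignum.cc",
    "vendored/double-conversion/double-conversion.cc",
    "vendored/double-conversion/string-to-double.cc",
    "vendored/double-conversion/double-to-string.cc",
    "vendored/double-conversion/bignum-dtoa.cc",
    "vendored/double-conversion/fast-dtoa.cc",
    "vendored/double-conversion/cached-powers.cc",
    "vendored/double-conversion/fixed-dtoa.cc",
    "vendored/double-conversion/diy-fp.cc",
    "vendored/double-conversion/strtod.cc",
]
for path in sorted(sources):
    print(path)
```

Note that `-` sorts before `.`, so `bignum-dtoa.cc` comes before `bignum.cc`.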

@github-actions bot added the "awaiting review" and "awaiting changes" labels and removed the "awaiting review" label, Apr 6, 2023
@github-actions bot added the "awaiting review" label and removed the "awaiting review" and "awaiting changes" labels, Apr 7, 2023
@kou (Member) commented Apr 10, 2023

@anthonylouisbsb @projjal @praveenbingo You changed our vendored double-conversion in #9816. Is the change still needed?

Some Gandiva tests failed after upgrading the vendored double-conversion (the change has been reverted):

https://ci.appveyor.com/project/ApacheSoftwareFoundation/arrow/builds/46724699#L2520

[ RUN      ] TestGdvFnStubs.TestCastVARCHARFromFloat
C:/projects/arrow/cpp/src/gandiva/gdv_function_stubs_test.cc(400): error: Expected equality of these values:
  std::string(out_str, out_len)
    Which is: "1E-5"
  "1.0E-5"
[  FAILED  ] TestGdvFnStubs.TestCastVARCHARFromFloat (0 ms)
[ RUN      ] TestGdvFnStubs.TestCastVARCHARFromDouble
C:/projects/arrow/cpp/src/gandiva/gdv_function_stubs_test.cc(435): error: Expected equality of these values:
  std::string(out_str, out_len)
    Which is: "1E-5"
  "1.0E-5"
[  FAILED  ] TestGdvFnStubs.TestCastVARCHARFromDouble (0 ms)

If you really need the change, could you improve double-conversion upstream instead of changing the vendored double-conversion?
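For illustration only: the failing assertions above come down to whether the mantissa keeps a digit after the decimal point ("1.0E-5" vs. "1E-5"). A minimal Python sketch of that normalization, using a hypothetical helper name (this is not Gandiva's actual code, which is C++ built on the vendored double-conversion):

```python
def format_exponential(value: float) -> str:
    """Shortest scientific notation, but always keep a digit after the
    decimal point, e.g. 1e-05 -> "1.0E-5" rather than "1E-5"."""
    mantissa, _, exponent = f"{value:E}".partition("E")
    # Strip trailing zeros (and a bare trailing dot) to get the shortest form.
    mantissa = mantissa.rstrip("0").rstrip(".")
    if "." not in mantissa:
        mantissa += ".0"  # the behavior the Gandiva tests expect
    return f"{mantissa}E{int(exponent)}"

print(format_exponential(1e-5))  # -> 1.0E-5
```

With plain shortest-form output the same value renders as "1E-5", which is exactly the mismatch in the test log above.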

@ianmcook (Member) commented May 1, 2023

@liujiacheng777 As Kou says above, there were a few small changes made to the vendored double-conversion in #9816.

See these changes at https://github.com/apache/arrow/pull/9816/files#diff-d1cc5b70a5e980626bb70ae604a050d3393ac25a717a5a4c8dc40e8b5caf4b05
It looks like they were very small changes: only two files affected, fewer than 10 lines of code added, plus some comments.

Could you please push a commit to this PR that ports these changes to the vendored double-conversion 3.2.1?

As Kou says above, we would like to upstream these changes, but porting the changes here will help, and I think it will fix the failing CI.

@kou (Member) commented May 1, 2023

FYI: #35135 added an update script. We need to update the patches under https://github.com/apache/arrow/tree/main/cpp/src/arrow/vendored/double-conversion/patches .

FYI: https://lists.apache.org/thread/mzb2qj4vkwhtbc64d1yfloc4s3r2po7j is the mailing list discussion about upstreaming.

@liujiacheng777 (Contributor, author):

66% tests passed, 15 tests failed out of 44

Label Time Summary:
arrow-benchmarks    = 2663.27 sec*proc (44 tests)
benchmark           = 2663.27 sec*proc (44 tests)

Total Test time (real) = 980.08 sec

The following tests FAILED:
      6 - arrow-compute-function-benchmark (Failed)
      7 - arrow-compute-scalar-arithmetic-benchmark (Failed)
      8 - arrow-compute-scalar-boolean-benchmark (Failed)
     10 - arrow-compute-scalar-compare-benchmark (Failed)
     11 - arrow-compute-scalar-if-else-benchmark (Failed)
     12 - arrow-compute-scalar-random-benchmark (Failed)
     13 - arrow-compute-scalar-round-benchmark (Failed)
     14 - arrow-compute-scalar-set-lookup-benchmark (Failed)
     15 - arrow-compute-scalar-string-benchmark (Failed)
     16 - arrow-compute-scalar-temporal-benchmark (Failed)
     18 - arrow-compute-vector-sort-benchmark (Failed)
     19 - arrow-compute-vector-partition-benchmark (Failed)
     20 - arrow-compute-vector-topk-benchmark (Failed)
     21 - arrow-compute-vector-replace-benchmark (Failed)
     22 - arrow-compute-vector-selection-benchmark (Failed)
Errors while running CTest

@kou, I tested using `cmake .. -DARROW_BUILD_BENCHMARKS=ON && ctest -j8 --output-on-failure` as described in the official manual, and 15 tests failed. The failures all look like unregistered functions (Key error: No function registered with name: greater). Do I need to turn on some test switches?

@kou (Member) commented May 10, 2023

You may need to add `-DARROW_COMPUTE=ON`.
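Concretely, the configure-and-run sequence being discussed would look something like the following (build directory layout and job count are illustrative; this is a sketch, not the full set of flags Arrow's docs recommend):

```shell
# From a build directory inside the Arrow C++ source tree (illustrative).
# ARROW_COMPUTE registers the compute functions the benchmarks exercise,
# which resolves the "No function registered" errors above.
cmake .. -DARROW_BUILD_BENCHMARKS=ON -DARROW_COMPUTE=ON
cmake --build . -j8
ctest -j8 --output-on-failure
```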

liujiacheng777 added a commit to LoongArch-Python/arrow that referenced this pull request May 11, 2023
@liujiacheng777 (Contributor, author) commented May 11, 2023

@kou thanks, I turned on ARROW_COMPUTE and all the tests passed.

100% tests passed, 0 tests failed out of 44

Label Time Summary:
arrow-benchmarks    = 4639.70 sec*proc (44 tests)
benchmark           = 4639.70 sec*proc (44 tests)

Total Test time (real) = 691.67 sec

I have already integrated the changes from #9816 and #35135, and upgraded double-conversion from v3.1.5 to v3.2.1.

@kou (Member) commented May 11, 2023

Thanks.
I started CI jobs.

If nobody submits our patches to upstream this month, let's remove our patches and merge this.

@ianmcook (Member):

I opened a PR to upstream the double-conversion change at google/double-conversion#195

@kou (Member) commented May 17, 2023

@ursabot please benchmark

@ursabot commented May 17, 2023

Benchmark runs are scheduled for baseline = 9ef2f65 and contender = ed9907e. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Failed ⬇️0.44% ⬆️2.73%] test-mac-arm
[Finished ⬇️1.02% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️2.83% ⬆️0.63%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] ed9907e9 ec2-t3-xlarge-us-east-2
[Failed] ed9907e9 test-mac-arm
[Finished] ed9907e9 ursa-i9-9960x
[Finished] ed9907e9 ursa-thinkcentre-m75q
[Finished] 9ef2f655 ec2-t3-xlarge-us-east-2
[Finished] 9ef2f655 test-mac-arm
[Finished] 9ef2f655 ursa-i9-9960x
[Finished] 9ef2f655 ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@ianmcook (Member):

@kou do you want to merge this and then we can do #35669 after?

@ursabot commented May 18, 2023

['Python', 'R'] benchmarks have high level of regressions.
ursa-i9-9960x

@kou (Member) commented May 18, 2023

Yes.

But could you check the benchmark result before we merge this?

['Python', 'R'] benchmarks have high level of regressions. ursa-i9-9960x

It says that the TPC-H benchmarks were slow with this change.

(Should we re-run benchmark?)

@kou (Member) commented May 18, 2023

@ursabot please benchmark

@ursabot commented May 18, 2023

Commit ed9907e already has scheduled benchmark runs.

@ianmcook (Member):

@ursabot please benchmark

@ursabot commented May 18, 2023

Commit ed9907e already has scheduled benchmark runs.

@ianmcook (Member):

@ursabot please benchmark lang=R

@ursabot commented May 18, 2023

Commit ed9907e already has scheduled benchmark runs.

@ianmcook (Member):

It looks like we will have to wait for a while to re-run the benchmarks. It seems hard to tell whether the performance regressions are just noise.

@kou (Member) commented May 19, 2023

It seems that we can't run our benchmarks multiple times for one commit.

@liujiacheng777 Could you rebase on main to re-run our benchmarks? (Rebasing changes a commit ID.)

@ianmcook (Member):

@kou I am more confident now that these performance regressions are just noise. The biggest regressions are happening in 0.1 and 0.01 scale factor TPC-H benchmarks (which are known to be very noisy) and the other scale factors of the same benchmarks are not showing regressions.

@kou kou changed the title GH-34918: [C++] Update vendored double-conversion GH-34918: [C++] Update vendored double-conversion 3.2.1 May 19, 2023
@kou (Member) left a review:

+1

OK. I'll merge this.

@kou kou merged commit 00e6996 into apache:main May 19, 2023
@github-actions bot added the "awaiting merge" label and removed the "awaiting committer review" label, May 19, 2023
@ursabot commented May 22, 2023

Benchmark runs are scheduled for baseline = a4bcee3 and contender = 00e6996. 00e6996 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Finished ⬇️0.48% ⬆️0.09%] test-mac-arm
[Finished ⬇️0.33% ⬆️1.63%] ursa-i9-9960x
[Finished ⬇️0.21% ⬆️0.15%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] 00e6996d ec2-t3-xlarge-us-east-2
[Finished] 00e6996d test-mac-arm
[Finished] 00e6996d ursa-i9-9960x
[Finished] 00e6996d ursa-thinkcentre-m75q
[Finished] a4bcee3e ec2-t3-xlarge-us-east-2
[Finished] a4bcee3e test-mac-arm
[Finished] a4bcee3e ursa-i9-9960x
[Finished] a4bcee3e ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

Successfully merging this pull request may close these issues.

[C++] Update vendored double-conversion to 3.2.1
4 participants