Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(predictions): use string's unicode scalars view to compute indexes for Comprehend results #904

Merged
merged 1 commit into from
Dec 9, 2020

Conversation

diegocstn
Copy link
Contributor

Amazon Comprehend returns an array of Entity https://docs.aws.amazon.com/comprehend/latest/dg/API_Entity.html .
Each entity has a beginOffset and endOffset properties, which are positions of each UTF-8 "code point" in the string.

A code point is a value in the Unicode codespace, the range of integers from 0 to 10FFFF, not all code points are assigned to encoded chars.
We need to use Swift strings underlying scalar view to compute indexes correctly and avoid crashes at runtime.

fix #873

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@diegocstn diegocstn force-pushed the fix/issue-873 branch 2 times, most recently from 6142efe to 11082a5 Compare November 20, 2020 22:14
@codecov
Copy link

codecov bot commented Nov 20, 2020

Codecov Report

Merging #904 (af4b017) into main (a0d37bb) will increase coverage by 0.01%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #904      +/-   ##
==========================================
+ Coverage   66.87%   66.89%   +0.01%     
==========================================
  Files         874      874              
  Lines       34755    34755              
==========================================
+ Hits        23244    23248       +4     
+ Misses      11511    11507       -4     
Flag Coverage Δ
API_plugin_unit_test 62.56% <ø> (+0.07%) ⬆️
Analytics_plugin_unit_test 72.38% <ø> (ø)
Auth_plugin_unit_test 39.11% <ø> (ø)
DataStore_plugin_unit_test 83.03% <ø> (ø)
Predictions_plugin_unit_test 49.49% <100.00%> (ø)
Storage_plugin_unit_test 74.74% <ø> (ø)
build_test_amplify 63.16% <ø> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...Predictions/AWSPredictionsService+Comprehend.swift 96.47% <100.00%> (ø)
...Hub/DefaultPluginTests/DefaultHubPluginTests.swift 88.37% <0.00%> (+1.16%) ⬆️
...s/AWSHubPlugin/Internal/HubChannelDispatcher.swift 97.05% <0.00%> (+2.94%) ⬆️
...n/Operation/AWSGraphQLOperation+APIOperation.swift 54.23% <0.00%> (+3.38%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a0d37bb...af4b017. Read the comment docs.

@diegocstn diegocstn marked this pull request as ready for review November 20, 2020 22:53
@diegocstn diegocstn requested review from a team and royjit November 20, 2020 23:00
Copy link
Contributor

@royjit royjit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@diegocstn diegocstn merged commit c9c388f into main Dec 9, 2020
@diegocstn diegocstn deleted the fix/issue-873 branch December 9, 2020 01:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

AWS Predictions (AWSPredictionsService) Index out of Range due to Emojis
2 participants