ES Search Query Collect All Response #1631
TESTING - Tested both locally and in AWS for the following:
Looks good. I just want to confirm the new way we are dealing with the response of search.
response = {'hits': {'hits': [{'_id': 1, ...}, ...]}}
Before:
- we extract docs.get('hits', {}).get('hits', []) in the catalog indexer task from search
- we extract hits-hits in the FE from search, with the Catalog DataSearch props...
After:
- we directly get the hits in the catalog indexer task from search_all
- we extract hits-hits in the FE from search; nothing changes there, so we do not break the Catalog view
Correct! The FE component we use automatically handles all the pagination for us.
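For readers following the thread, here is a minimal sketch of the before/after handling discussed above. It is only an illustration: `extract_hits` is a hypothetical helper name, and the sample documents are made up. The only things taken from the discussion are the response shape and the idea that `search_all` hands back the hits directly.

```python
from typing import Any, Dict, List


def extract_hits(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Unwrap the nested hits list, tolerating missing keys (hypothetical helper)."""
    return response.get('hits', {}).get('hits', [])


# "Before": the caller receives the raw search response and unwraps hits.hits itself.
raw_response = {'hits': {'hits': [{'_id': 1}, {'_id': 2}]}}
ids_before = [hit['_id'] for hit in extract_hits(raw_response)]

# "After": the caller is assumed to receive a flat list of hits already,
# so no unwrapping is needed in the indexer task.
flat_hits = [{'_id': 1}, {'_id': 2}]
ids_after = [hit['_id'] for hit in flat_hits]

print(ids_before == ids_after)
```

Both paths yield the same document IDs; the difference is only where the unwrapping happens.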
### Feature or Bugfix
<!-- please choose -->
- Bugfix

### Detail
- For `catalog_indexer_task`, ensure we collect all hits from the query response for the `with_deletes` option
  - Up the query size to 1000 results (default is 10)
  - Add logic to continue querying to collect all hits if there are more than the query size limit (i.e. > 1000)

### Relates
- <URL or Ticket>

### Security
Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)?
  - Is the input sanitized?
  - What precautions are you taking before deserializing the data you consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires authorization?
  - How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use standard, proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
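The paging logic described under Detail could look roughly like the sketch below. This is not the code from the PR: the `search_page(query, size, offset)` callable, the offset-based paging, and the in-memory `make_fake_search` backend are all assumptions made so the example runs on its own. The real task may page differently (for example with a scroll or `search_after` style cursor).

```python
from typing import Any, Callable, Dict, List

Response = Dict[str, Any]
SearchPage = Callable[[Dict[str, Any], int, int], Response]


def search_all(search_page: SearchPage, query: Dict[str, Any], page_size: int = 1000) -> List[Dict[str, Any]]:
    """Collect every hit for `query`, fetching `page_size` results per request."""
    collected: List[Dict[str, Any]] = []
    offset = 0
    while True:
        response = search_page(query, page_size, offset)
        hits = response.get('hits', {}).get('hits', [])
        collected.extend(hits)
        # A short (or empty) page means there is nothing left to fetch.
        if len(hits) < page_size:
            return collected
        offset += len(hits)


def make_fake_search(total_docs: int) -> SearchPage:
    """Hypothetical in-memory stand-in for the search backend."""
    docs = [{'_id': str(i)} for i in range(total_docs)]

    def search_page(query: Dict[str, Any], size: int, offset: int) -> Response:
        # Mimic the nested response shape: {'hits': {'hits': [...]}}
        return {'hits': {'hits': docs[offset:offset + size]}}

    return search_page


# 2500 documents with a page size of 1000 takes three requests (1000 + 1000 + 500).
all_hits = search_all(make_fake_search(2500), {'query': {'match_all': {}}})
print(len(all_hits))
```

Stopping on a short page avoids needing a total count from the backend, at the cost of one extra empty request when the number of hits is an exact multiple of the page size.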
### Feature or Bugfix
- Security

### Detail
* get-parameter CloudfrontDistributionDomainName from us-east-1 (#1687)
* Added Token Validations (#1682)
* add warning to untrust data.all account when removing an environment (#1685)
* add custom domain support for apigw (#1679)
* Lambda Event Logs Handling (#1678)
* Upgrade Spark version to 3.3 (#1675) - a0c63a4
* ES Search Query Collect All Response (#1631)
* Extend Tenant Perms Coverage (#1630)
* Limit Response info dataset queries (#1665)
* Add Removal Policy Retain to Bucket Policy IaC (#1660)
* log API handler response only for LOG_LEVEL DEBUG. Set log level INFO for prod deployments (#1662)
* Add permission checks to markNotificationAsRead + deleteNotification (#1654)
* Added error view and unified utility to check tenant user (#1657)
* Userguide signout flow (#1629)

### Relates
- Security release

### Security
Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)?
  - Is the input sanitized?
  - What precautions are you taking before deserializing the data you consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires authorization?
  - How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
---------

Co-authored-by: Noah Paige
Co-authored-by: Petros Kalos