Configure Opensearch model to search single fields #516

Merged
merged 1 commit into from
May 20, 2022

Contributor

@jazairi jazairi commented May 19, 2022

Why these changes are being introduced:

The search model needs to be able to target any of the fields defined
as single-field searchable in v2.0 of the data model.

Relevant ticket(s):

https://mitlibraries.atlassian.net/browse/RDI-102

How this addresses that need:

This enables matching the following fields in the Opensearch model:

  • citation
  • contributors
  • funding_information
  • identifiers
  • locations
  • subjects

(Note that title was already made searchable as part of a previous
commit.)

Side effects of this change:

  • These fields are not yet enabled in GraphQL, so you'll need to use
    REST to confirm this behavior.
  • While this allows us to search a single nested subfield, it does not
    allow for searching multiple nested subfields. For example, searching
    contributors should search both 'value' and 'identifier', but right
    now we can only search 'value'.
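
To illustrate the limitation described above, here is a minimal sketch of a clause that targets a single nested subfield. It is not the code from this PR: the helper names `nested_value_clause` and `build_matches` are hypothetical, and the hash shape assumes the fields are mapped as `nested` in the index and that each has a `value` subfield.

```ruby
# Hypothetical helper (assumption, not taken from this PR): builds a nested
# clause that matches exactly one subfield of a nested field.
def nested_value_clause(field, term, subfield: 'value')
  {
    nested: {
      path: field.to_s,
      query: {
        match: { "#{field}.#{subfield}" => term }
      }
    }
  }
end

# Hypothetical helper: appends a clause only for fields present in the params,
# mirroring the "return unless the param is set" guard used in the model.
def build_matches(params, fields)
  fields.each_with_object([]) do |field, matches|
    next unless params[field]

    matches << nested_value_clause(field, params[field])
  end
end

params = { contributors: 'Curie', subjects: nil }
puts build_matches(params, %i[contributors subjects]).inspect
```

Because the clause names only one subfield, a term that appears only in `contributors.identifier` would not match, which is the gap noted above.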

Requires Database Migrations?

NO

Includes new or updated dependencies?

NO

@JPrevost JPrevost self-assigned this May 20, 2022
@JPrevost
Member

I've opened https://mitlibraries.atlassian.net/browse/RDI-116 to confirm what the expectations are for nested fields in regards to targeted searching.

Member

@JPrevost JPrevost left a comment

Solid approach to the problem. I noted a block of code that needs to be added back in; with that, I think this works for what we need now, and if we later understand more about how we want to target nested fields, we can adjust.

app/models/opensearch.rb (outdated)
match_single_field_nested(:funding_information, m)
match_single_field_nested(:identifiers, m)
match_single_field_nested(:locations, m)
match_single_field_nested(:subjects, m)
Member

I like this abstraction.

def match_single_field_nested(field, match_array)
  return unless @params[field]

  match_array << {
Member

I've been futzing with this a bit and it looks like this requires an exact match, so it works more like a filter than a query. I'm not going to propose we change it just yet because I'm not entirely sure I'm correct, and I figure we can dig into it a bit more when we start the GraphQL work or when we take on the ticket that clarifies what the expectations are for each field that can be directly searched.

Contributor Author

I think a fix for nested fields is to use multi_match. I've noted that in the rebased commit message. However, the exact-match behavior also seems to apply to simple fields, and I think multi_match only works for nested...
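
As a rough sketch of the `multi_match` idea, a nested clause covering several subfields might be built like this. The helper name `nested_multi_match_clause` is hypothetical, and the subfield names `value` and `identifier` are taken from the contributors example in the PR description; whether this resolves the exact-match behavior is untested here and may depend on how the fields are mapped in the index.

```ruby
require 'json'

# Hypothetical helper (assumption, not part of this PR): wraps a multi_match
# query in a nested clause so one term is checked against several subfields.
def nested_multi_match_clause(field, term, subfields)
  {
    nested: {
      path: field.to_s,
      query: {
        multi_match: {
          query: term,
          # Fully qualified names, e.g. "contributors.value"
          fields: subfields.map { |sub| "#{field}.#{sub}" }
        }
      }
    }
  }
end

clause = nested_multi_match_clause(:contributors, 'Curie', %w[value identifier])
puts JSON.pretty_generate(clause)
```

The resulting hash could be appended to the same match array that the single-subfield clauses use, so the rest of the query would not need to change.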

Member

Yeah, I think it gets really complicated really fast, and it may actually come down to how the fields are configured in the index, so we have a bunch of directions to investigate once we get all the core bits built out.

Why these changes are being introduced:

The search model needs to be able to target any of the fields defined
as single-field searchable in v2.0 of the data model.

Relevant ticket(s):

https://mitlibraries.atlassian.net/browse/RDI-102

How this addresses that need:

This enables matching on the following fields in the Opensearch model:

* citation
* contributors
* funding_information
* identifiers
* locations
* subjects

(Note that title was already made searchable as part of a previous
commit.)

Side effects of this change:

* These fields are not yet enabled in GraphQL, so you'll need to use
REST to confirm this behavior.
* While this allows us to search a single nested subfield, it does not
allow for searching multiple nested subfields. For example, searching
contributors should search both 'value' and 'identifier', but right
now we can only search 'value'. RDI-116 will confirm expectations
on which subfields must be searchable.
* The way this targets individual fields seems to require an exact match.
This may be resolvable for nested fields by using `multi_match`, which
would also allow us to search multiple subfields. We plan to explore
this more as part of the GraphQL work.
@jazairi jazairi force-pushed the rdi-102-specific-field-searching branch from 7f300a2 to 5fefc47 Compare May 20, 2022 14:46
@jazairi jazairi merged commit 0bd732c into main May 20, 2022
@jazairi jazairi deleted the rdi-102-specific-field-searching branch May 20, 2022 14:53