Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support of Elasticsearch Token Filters #16843

Merged
merged 17 commits into from
Oct 17, 2024

Conversation

denispetrische
Copy link
Contributor

@denispetrische denispetrische commented Oct 7, 2024

Fixes #16384

Also:

  1. Added displaying of "settings" segment of elastic search, to make sure, that analyzers with token filters were created.
  2. Analyzers that contain filters will be created with "default" name

Now it is available to create analyzers and filters like that:

"OrchardCore_Elasticsearch": {
      "ConnectionType": "SingleNodeConnectionPool",
      "Url": "http://localhost",
      "Ports": [ 9200 ],
      //"Username": "admin",
      //"Password": "admin",
      //"CloudId": "Orchard_Core_deployment:ZWFzdHVzMi5henVyZS5lbGFzdGljLWNsb3VkLmNvbTo0NDMkNmMxZGQ4YzAzN2=",
      //"CertificateFingerprint": "75:21:E7:92:8F:D5:7A:27:06:38:8E:A4:35:FE:F5:17:D7:37:F4:DF:F0:9A:D2:C0:C4:B6:FF:EE:D1:EA:2B:A7",
      //"EnableApiVersioningHeader": false,
      "Analyzers": {
        "default": {
          "type": "standard",
          "filter": [
            "lowercase",
            "english_stop",
            "english_stemmer"
          ]
        },
        "custom_analyzer": {
          "type": "standard",
          "filter": [
            "lowercase",
            "english_stop",
            "english_stemmer"
          ]
        }
      },
      "Filter": {
        "english_stop": {
          "type": "stop",
          "stopwords": "_english_"
        },
        "english_stemmer": {
          "type": "stemmer",
          "language": "english"
        }
      }
    }

Copy link
Contributor

github-actions bot commented Oct 7, 2024

Thank you for submitting your first pull request, awesome! 🚀 If you haven't already, please take a moment to review our contribution guide. This guide provides helpful information to ensure your contribution aligns with our standards. A core team member will review your pull request.

If you like Orchard Core, please star our repo and join our community channels.

@denispetrische
Copy link
Contributor Author

@dotnet-policy-service agree

@denispetrische denispetrische changed the title Issue16384 Issue16384 Added support of ElasticSearch Token Filters Oct 7, 2024
@Piedone
Copy link
Member

Piedone commented Oct 7, 2024

Could you please check this, @Skrypt?

@Piedone Piedone changed the title Issue16384 Added support of ElasticSearch Token Filters Add support of Elasticsearch Token Filters Oct 9, 2024
@Piedone
Copy link
Member

Piedone commented Oct 14, 2024

@Skrypt what do you think? No problem if not, just please reply.

@Skrypt
Copy link
Contributor

Skrypt commented Oct 15, 2024

I will take look.

@Piedone
Copy link
Member

Piedone commented Oct 15, 2024

Great, thank you!

@@ -432,14 +432,15 @@ public async Task<ActionResult> ForceDelete(ElasticIndexSettingsViewModel model)
return RedirectToAction(nameof(Index));
}

public async Task<IActionResult> Mappings(string indexName)
public async Task<IActionResult> IndexInfo(string indexName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this was named Mappings because that's how ElasticSearch calls it in its terminology.

https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html

Copy link
Contributor

@Skrypt Skrypt Oct 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, need to see what is the difference between GetIndexMappings and GetIndexInfo.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you send me a screenshot of the resulting textfield in the Admin UI so that I can compare?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image
image

Sure, here it is. The whole idea is to see that analyzers and filters were applied.

Copy link
Contributor Author

@denispetrische denispetrische Oct 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{
  "default_newind": {
    "aliases": {},
    "mappings": {
      "_meta": {
        "last_task_id": 0
      },
      "_source": {
        "excludes": [
          "Content.ContentItem.DisplayText.Analyzed"
        ]
      },
      "dynamic_templates": [
        {
          "*.Inherited": {
            "path_match": "*.Inherited",
            "match_mapping_type": "string",
            "mapping": {
              "type": "keyword"
            }
          }
        },
        {
          "*.Ids": {
            "path_match": "*.Ids",
            "match_mapping_type": "string",
            "mapping": {
              "type": "keyword"
            }
          }
        }
      ],
      "properties": {
        "Content": {
          "properties": {
            "ContentItem": {
              "properties": {
                "ContainedPart": {
                  "properties": {
                    "Ids": {
                      "type": "keyword"
                    },
                    "Order": {
                      "type": "float"
                    }
                  }
                },
                "ContentType": {
                  "type": "keyword"
                },
                "DisplayText": {
                  "properties": {
                    "Analyzed": {
                      "type": "text"
                    },
                    "Normalized": {
                      "type": "keyword"
                    },
                    "keyword": {
                      "type": "keyword"
                    }
                  }
                },
                "FullText": {
                  "type": "text"
                },
                "Owner": {
                  "type": "keyword"
                }
              }
            }
          }
        },
        "ContentItemId": {
          "type": "keyword"
        },
        "ContentItemVersionId": {
          "type": "keyword"
        }
      }
    },
    "settings": {
      "index": {
        "routing": {
          "allocation": {
            "include": {
              "_tier_preference": "data_content"
            }
          }
        },
        "number_of_shards": "1",
        "provided_name": "default_newind",
        "creation_date": "1728302642111",
        "analysis": {
          "filter": {
            "russian_stemmer": {
              "type": "stemmer",
              "language": "russian"
            },
            "english_stemmer": {
              "type": "stemmer",
              "language": "russian"
            },
            "russian_stop": {
              "type": "stop",
              "stopwords": "_russian_"
            },
            "russian_synonyms": {
              "type": "synonym_graph",
              "synonyms": [
                "\u0437\u043E\u043B\u043E\u0442\u043E, AU, XAU",
                "\u043F\u043E\u0437\u0438\u0446\u0438\u0438, \u0438\u043D\u0442\u0435\u0440\u0435\u0441"
              ]
            },
            "english_stop": {
              "type": "stop",
              "stopwords": "_russian_"
            }
          },
          "analyzer": {
            "default": {
              "filter": [
                "lowercase",
                "russian_stop",
                "russian_synonyms",
                "russian_stemmer",
                "english_stop",
                "english_stemmer"
              ],
              "type": "custom",
              "tokenizer": "standard"
            }
          }
        },
        "number_of_replicas": "1",
        "uuid": "V-oB0lgQRmWMiKQCJLLLTg",
        "version": {
          "created": "7160299"
        }
      }
    }
  }
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, need to see what is the difference between GetIndexMappings and GetIndexInfo.

The difference now, is that shows not only "mappings" section but full index information

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok for IndexInfo then

@@ -7,4 +7,6 @@ public class ElasticsearchOptions
public string IndexPrefix { get; set; }

public Dictionary<string, JsonObject> Analyzers { get; } = [];

public Dictionary<string, JsonObject> Filter { get; } = [];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filters

Copy link
Contributor Author

@denispetrische denispetrische Oct 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

I think we should leave the name as "Filter", because it is how this section called in elasticsearch terminology.

https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-custom-analyzer.html
Settings.Analysis.FIlter and there listed custom token filters

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, ok, that's weird but I guess that we can live with it.

@@ -0,0 +1,81 @@
using System.Text.Json;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I like this code. It is cleaner.

@Skrypt
Copy link
Contributor

Skrypt commented Oct 15, 2024

Looking good from first review. Did not test yet.

Copy link
Contributor

@Skrypt Skrypt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved but needs to be tested.

Copy link
Contributor

This pull request has merge conflicts. Please resolve those before requesting a review.

@hishamco
Copy link
Member

@denispetrische could you please fix the conflict to react to my recent changes in #16896

@denispetrische
Copy link
Contributor Author

@denispetrische could you please fix the conflict to react to my recent changes in #16896

Done

@hishamco hishamco enabled auto-merge (squash) October 17, 2024 21:52
@hishamco hishamco merged commit 99c8b77 into OrchardCMS:main Oct 17, 2024
11 checks passed
Copy link
Contributor

Congratulations on your first PR merge! 🎉 Thank you for your contribution! We're looking forward to welcoming other contributions of yours in the future. @all-contributors please add @denispetrische for code.

If you like Orchard Core, please star our repo and join our community channels.

Copy link
Contributor

@github-actions[bot]

I've put up a pull request to add @denispetrische! 🎉

@hishamco
Copy link
Member

Thanks for your contribution @denispetrische

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Orchard corrupts Elastic stemmer configuration
5 participants