Add support of Elasticsearch Token Filters #16843

Merged: 17 commits, Oct 17, 2024
@@ -432,14 +432,15 @@ public async Task<ActionResult> ForceDelete(ElasticIndexSettingsViewModel model)
return RedirectToAction(nameof(Index));
}

- public async Task<IActionResult> Mappings(string indexName)
+ public async Task<IActionResult> IndexInfo(string indexName)

Contributor

This was named Mappings because that is the term Elasticsearch uses for it.

https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html

Contributor @Skrypt, Oct 15, 2024

So we need to see what the difference is between GetIndexMappings and GetIndexInfo.

Contributor

Can you send me a screenshot of the resulting text field in the Admin UI so that I can compare?

Contributor Author

[Two screenshots of the resulting field in the Admin UI]

Sure, here it is. The whole idea is to confirm that the analyzers and filters were applied.

Contributor Author @denispetrische, Oct 16, 2024

{
  "default_newind": {
    "aliases": {},
    "mappings": {
      "_meta": {
        "last_task_id": 0
      },
      "_source": {
        "excludes": [
          "Content.ContentItem.DisplayText.Analyzed"
        ]
      },
      "dynamic_templates": [
        {
          "*.Inherited": {
            "path_match": "*.Inherited",
            "match_mapping_type": "string",
            "mapping": {
              "type": "keyword"
            }
          }
        },
        {
          "*.Ids": {
            "path_match": "*.Ids",
            "match_mapping_type": "string",
            "mapping": {
              "type": "keyword"
            }
          }
        }
      ],
      "properties": {
        "Content": {
          "properties": {
            "ContentItem": {
              "properties": {
                "ContainedPart": {
                  "properties": {
                    "Ids": {
                      "type": "keyword"
                    },
                    "Order": {
                      "type": "float"
                    }
                  }
                },
                "ContentType": {
                  "type": "keyword"
                },
                "DisplayText": {
                  "properties": {
                    "Analyzed": {
                      "type": "text"
                    },
                    "Normalized": {
                      "type": "keyword"
                    },
                    "keyword": {
                      "type": "keyword"
                    }
                  }
                },
                "FullText": {
                  "type": "text"
                },
                "Owner": {
                  "type": "keyword"
                }
              }
            }
          }
        },
        "ContentItemId": {
          "type": "keyword"
        },
        "ContentItemVersionId": {
          "type": "keyword"
        }
      }
    },
    "settings": {
      "index": {
        "routing": {
          "allocation": {
            "include": {
              "_tier_preference": "data_content"
            }
          }
        },
        "number_of_shards": "1",
        "provided_name": "default_newind",
        "creation_date": "1728302642111",
        "analysis": {
          "filter": {
            "russian_stemmer": {
              "type": "stemmer",
              "language": "russian"
            },
            "english_stemmer": {
              "type": "stemmer",
              "language": "russian"
            },
            "russian_stop": {
              "type": "stop",
              "stopwords": "_russian_"
            },
            "russian_synonyms": {
              "type": "synonym_graph",
              "synonyms": [
                "\u0437\u043E\u043B\u043E\u0442\u043E, AU, XAU",
                "\u043F\u043E\u0437\u0438\u0446\u0438\u0438, \u0438\u043D\u0442\u0435\u0440\u0435\u0441"
              ]
            },
            "english_stop": {
              "type": "stop",
              "stopwords": "_russian_"
            }
          },
          "analyzer": {
            "default": {
              "filter": [
                "lowercase",
                "russian_stop",
                "russian_synonyms",
                "russian_stemmer",
                "english_stop",
                "english_stemmer"
              ],
              "type": "custom",
              "tokenizer": "standard"
            }
          }
        },
        "number_of_replicas": "1",
        "uuid": "V-oB0lgQRmWMiKQCJLLLTg",
        "version": {
          "created": "7160299"
        }
      }
    }
  }
}

Contributor Author

> So, need to see what is the difference between GetIndexMappings and GetIndexInfo.

The difference is that GetIndexInfo returns the full index information (aliases, mappings, and settings), not only the "mappings" section.

Contributor

OK, IndexInfo is fine then.

{
- var mappings = await _elasticIndexManager.GetIndexMappings(indexName);
- var formattedJson = JNode.Parse(mappings).ToJsonString(JOptions.Indented);
- return View(new MappingsViewModel
+ var info = await _elasticIndexManager.GetIndexInfo(indexName);
+
+ var formattedJson = JNode.Parse(info).ToJsonString(JOptions.Indented);
+ return View(new IndexInfoViewModel
{
IndexName = _elasticIndexManager.GetFullIndexName(indexName),
- Mappings = formattedJson
+ IndexInfo = formattedJson
});
}

@@ -0,0 +1,83 @@
using System.Text.Json;

Contributor

OK, I like this code. It is cleaner.

using System.Text.Json.Nodes;
using Microsoft.Extensions.Configuration;
using OrchardCore.Environment.Shell.Configuration;

namespace OrchardCore.Search.Elasticsearch;

internal static class ElasticsearchOptionsExtensions
{
internal static ElasticsearchOptions AddAnalyzers(this ElasticsearchOptions options, IConfigurationSection configuration)
{
var jsonNode = configuration.GetSection(nameof(options.Analyzers)).AsJsonNode();
var jsonElement = JsonSerializer.Deserialize<JsonElement>(jsonNode);

var analyzersObject = JsonObject.Create(jsonElement, new JsonNodeOptions()
{
PropertyNameCaseInsensitive = true,
});

if (analyzersObject is not null)
{
if (jsonNode is JsonObject jAnalyzers)
{
foreach (var analyzer in jAnalyzers)
{
if (analyzer.Value is not JsonObject jAnalyzer)
{
continue;
}

options.Analyzers.Add(analyzer.Key, jAnalyzer);
}
}
}

if (options.Analyzers.Count == 0)
{
// When no analyzers are configured, we'll define a default analyzer.
options.Analyzers.Add(ElasticsearchConstants.DefaultAnalyzer, new JsonObject
{
["type"] = ElasticsearchConstants.DefaultAnalyzer,
});
}

return options;
}

internal static ElasticsearchOptions AddFilter(this ElasticsearchOptions options, IConfigurationSection configuration)
{
var jsonNode = configuration.GetSection(nameof(options.Filter)).AsJsonNode();
var jsonElement = JsonSerializer.Deserialize<JsonElement>(jsonNode);

var filterObject = JsonObject.Create(jsonElement, new JsonNodeOptions()
{
PropertyNameCaseInsensitive = true,
});

if (filterObject is not null)
{
if (jsonNode is JsonObject jFilters)
{
foreach (var filter in jFilters)
{
if (filter.Value is not JsonObject jFilter)
{
continue;
}

options.Filter.Add(filter.Key, jFilter);
}
}
}

return options;
}

internal static ElasticsearchOptions AddIndexPrefix(this ElasticsearchOptions options, IConfigurationSection configuration)
{
options.IndexPrefix = configuration.GetValue<string>(nameof(options.IndexPrefix));

return options;
}
}
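
As a rough illustration of what these extension methods read, the following appsettings.json sketch shows one possible shape for the configuration section. The `IndexPrefix`, `Filter`, and `Analyzers` keys follow from the `nameof(...)` lookups in the code above. The section name `OrchardCore_Elasticsearch`, the prefix value, and the analyzer and filter names are assumptions for illustration and are not taken from this PR. The `//` comments rely on the .NET JSON configuration provider skipping comments.

```json
{
  // Assumption: this section name stands in for the value of
  // ElasticConnectionOptionsConfigurations.ConfigSectionName.
  "OrchardCore_Elasticsearch": {
    // Hypothetical prefix, read by AddIndexPrefix.
    "IndexPrefix": "demo_",

    // Read by AddFilter: each child object becomes one entry in options.Filter.
    "Filter": {
      "english_stop": { "type": "stop", "stopwords": "_english_" },
      "english_stemmer": { "type": "stemmer", "language": "english" }
    },

    // Read by AddAnalyzers: each child object becomes one entry in options.Analyzers.
    // The analyzer name is illustrative only.
    "Analyzers": {
      "english_custom": {
        "type": "custom",
        "tokenizer": "standard",
        "filter": [ "lowercase", "english_stop", "english_stemmer" ]
      }
    }
  }
}
```

With a section like this, entries whose value is not a JSON object would be skipped by the `is not JsonObject` checks, and an empty or missing `Analyzers` section would fall back to the default analyzer added in `AddAnalyzers`.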
@@ -54,42 +54,9 @@ public override void ConfigureServices(IServiceCollection services)
{
var configuration = _shellConfiguration.GetSection(ElasticConnectionOptionsConfigurations.ConfigSectionName);

- o.IndexPrefix = configuration.GetValue<string>(nameof(o.IndexPrefix));
-
- var jsonNode = configuration.GetSection(nameof(o.Analyzers)).AsJsonNode();
- var jsonElement = JsonSerializer.Deserialize<JsonElement>(jsonNode);
-
- var analyzersObject = JsonObject.Create(jsonElement, new JsonNodeOptions()
- {
- PropertyNameCaseInsensitive = true,
- });
-
- if (analyzersObject != null)
- {
- o.IndexPrefix = configuration.GetValue<string>(nameof(o.IndexPrefix));
-
- if (jsonNode is JsonObject jAnalyzers)
- {
- foreach (var analyzer in jAnalyzers)
- {
- if (analyzer.Value is not JsonObject jAnalyzer)
- {
- continue;
- }
-
- o.Analyzers.Add(analyzer.Key, jAnalyzer);
- }
- }
- }
-
- if (o.Analyzers.Count == 0)
- {
- // When no analyzers are configured, we'll define a default analyzer.
- o.Analyzers.Add(ElasticsearchConstants.DefaultAnalyzer, new JsonObject
- {
- ["type"] = ElasticsearchConstants.DefaultAnalyzer,
- });
- }
+ o.AddIndexPrefix(configuration);
+ o.AddFilter(configuration);
+ o.AddAnalyzers(configuration);
});

services.AddElasticServices();
@@ -1,7 +1,7 @@
namespace OrchardCore.Search.Elasticsearch.ViewModels;

- public class MappingsViewModel
+ public class IndexInfoViewModel
{
public string IndexName { get; set; }
- public string Mappings { get; set; }
+ public string IndexInfo { get; set; }
}
@@ -69,7 +69,7 @@
<p>@entry.AnalyzerName</p>
</div>
<div class="float-end">
- <a asp-action="Mappings" asp-route-IndexName="@entry.Name" class="btn btn-primary btn-sm">@T["Mappings"]</a>
+ <a asp-action="IndexInfo" asp-route-IndexName="@entry.Name" class="btn btn-primary btn-sm">@T["Index Info"]</a>
<a asp-action="Query" asp-route-IndexName="@entry.Name" class="btn btn-success btn-sm">@T["Query"]</a>
<a asp-action="Reset" asp-route-id="@entry.Name" class="btn btn-primary btn-sm" data-title="@T["Reset Index"]" data-message="@T["This will restart the indexing of all content items. Continue?"]" data-ok-text="@T["Yes"]" data-cancel-text="@T["No"]" data-ok-class="btn-primary" data-url-af="UnsafeUrl">@T["Reset"]</a>
<a asp-action="Rebuild" asp-route-id="@entry.Name" class="btn btn-warning btn-sm" data-title="@T["Rebuild Index"]" data-message="@T["Your index will be rebuilt, which might alter some services during the process. Continue?"]" data-ok-text="@T["Yes"]" data-cancel-text="@T["No"]" data-url-af="UnsafeUrl">@T["Rebuild"]</a>
@@ -1,16 +1,16 @@
- @model MappingsViewModel
+ @model IndexInfoViewModel

- <h2>@T["Elasticsearch index mappings"]</h2>
+ <h2>@T["Elasticsearch index information"]</h2>

<div class="row">
<div class="col">
<div class="mb-4">
- <label asp-for="Mappings" class="form-label">@T["Mappings for: "]@Model.IndexName</label>
+ <label asp-for="IndexInfo" class="form-label">@T["Info for: "]@Model.IndexName</label>
<div class="form-control" style="min-height: 400px;">
- <div id="@Html.IdFor(x => x.Mappings)_editor" style="min-height: 385px;" dir="@Orchard.CultureDir()" data-schema='@Model.Mappings'></div>
+ <div id="@Html.IdFor(x => x.IndexInfo)_editor" style="min-height: 385px;" dir="@Orchard.CultureDir()" data-schema='@Model.IndexInfo'></div>
</div>
- <textarea asp-for="Mappings" hidden></textarea>
- <span class="hint">@T["The Elasticsearch index mapping. For reference only."]</span>
+ <textarea asp-for="IndexInfo" hidden></textarea>
+ <span class="hint">@T["The Elasticsearch index information. For reference only."]</span>
</div>
</div>
</div>
@@ -35,8 +35,8 @@
setTheme();

var modelUri = monaco.Uri.parse("x://orchardcore.search.elastic.mappings.json");
- var editor = document.getElementById('@Html.IdFor(x => x.Mappings)_editor');
- var textArea = document.getElementById('@Html.IdFor(x => x.Mappings)');
+ var editor = document.getElementById('@Html.IdFor(x => x.IndexInfo)_editor');
+ var textArea = document.getElementById('@Html.IdFor(x => x.IndexInfo)');
var schema = JSON.parse(editor.dataset.schema)
var model = monaco.editor.createModel(textArea.value, "json", modelUri);

@@ -7,4 +7,6 @@ public class ElasticsearchOptions
public string IndexPrefix { get; set; }

public Dictionary<string, JsonObject> Analyzers { get; } = [];

+ public Dictionary<string, JsonObject> Filter { get; } = [];

Contributor

Filters

Contributor Author @denispetrische, Oct 16, 2024

[Screenshot]

I think we should keep the name "Filter", because that is what this section is called in Elasticsearch terminology.

https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-custom-analyzer.html
See Settings.Analysis.Filter, where the custom token filters are listed.

Contributor

Hmm, OK, that's weird, but I guess we can live with it.

}