Apply the date histogram rewrite optimization to range aggregation #13865

Merged on Jun 19, 2024 (42 commits)

Commits
53cb70f
Refactor the ranges representation
bowenlan-amzn May 21, 2024
69730b1
Refactor try fast filter
bowenlan-amzn May 22, 2024
1e2d7f4
Main work finished; left the handling of different numeric data types
bowenlan-amzn May 23, 2024
95b04dd
buildRanges accepts field type
bowenlan-amzn May 28, 2024
8dd1dda
first working draft probably
bowenlan-amzn May 29, 2024
c5d2175
Merge branch 'main' into 13531-range-agg
bowenlan-amzn May 29, 2024
ed79e02
add change log
bowenlan-amzn May 29, 2024
c7043e4
accommodate geo distance agg
bowenlan-amzn May 29, 2024
90d6790
Fix test
bowenlan-amzn May 29, 2024
67c281c
Merge branch 'main' into 13531-range-agg
bowenlan-amzn May 29, 2024
c10c775
[Refactor] range is lower inclusive, right exclusive
bowenlan-amzn May 31, 2024
783b14a
Merge branch 'main' into 13531-range-agg
bowenlan-amzn Jun 2, 2024
06b3372
adding test
bowenlan-amzn Jun 5, 2024
c6b5a9c
Adding test and refactor
bowenlan-amzn Jun 5, 2024
d590081
Merge branch 'main' into 13531-range-agg
bowenlan-amzn Jun 5, 2024
58e5281
refactor
bowenlan-amzn Jun 5, 2024
37c6d84
add test
bowenlan-amzn Jun 6, 2024
e0ba84b
add test and update the compare logic in tree traversal
bowenlan-amzn Jun 6, 2024
4603ec0
fix test, add random test
bowenlan-amzn Jun 6, 2024
afbce0c
Merge branch 'main' into 13531-range-agg
bowenlan-amzn Jun 6, 2024
9359fc2
Merge branch 'main' into 13531-range-agg
bowenlan-amzn Jun 6, 2024
54bfe92
refactor to address comments
bowenlan-amzn Jun 6, 2024
6ae1a9b
Merge branch 'main' into 13531-range-agg
bowenlan-amzn Jun 7, 2024
cc92c44
Merge branch 'main' into 13531-range-agg
bowenlan-amzn Jun 7, 2024
6546736
small potential performance update
bowenlan-amzn Jun 8, 2024
328006b
fix precommit
bowenlan-amzn Jun 9, 2024
a290e1d
Merge branch 'main' into 13531-range-agg
bowenlan-amzn Jun 9, 2024
23bbcbb
Merge branch 'main' into 13531-range-agg
bowenlan-amzn Jun 10, 2024
1b586bb
Merge branch 'main' into 13531-range-agg
bowenlan-amzn Jun 10, 2024
65de090
refactor
bowenlan-amzn Jun 11, 2024
f3c07c7
refactor
bowenlan-amzn Jun 11, 2024
fc0aff5
Merge branch 'main' into 13531-range-agg
bowenlan-amzn Jun 11, 2024
bab28e6
set refresh_interval to -1
bowenlan-amzn Jun 11, 2024
e545b90
Merge branch 'main' into 13531-range-agg
bowenlan-amzn Jun 11, 2024
78b4d9d
address comment
bowenlan-amzn Jun 11, 2024
185ed4e
address comment
bowenlan-amzn Jun 12, 2024
910b66a
Merge branch 'main' into 13531-range-agg
bowenlan-amzn Jun 12, 2024
a2d50ce
address comment
bowenlan-amzn Jun 13, 2024
fe85ad3
Fix test
bowenlan-amzn Jun 13, 2024
48a03a4
Merge branch 'main' into 13531-range-agg
bowenlan-amzn Jun 18, 2024
07a5293
Merge branch 'main' into 13531-range-agg
bowenlan-amzn Jun 18, 2024
9764b23
Merge branch 'main' into 13531-range-agg
bowenlan-amzn Jun 19, 2024
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -7,6 +7,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
### Added
- Add fingerprint ingest processor ([#13724](https://github.com/opensearch-project/OpenSearch/pull/13724))
- [Remote Store] Rate limiter for remote store low priority uploads ([#14374](https://github.com/opensearch-project/OpenSearch/pull/14374/))
- Apply the date histogram rewrite optimization to range aggregation ([#13865](https://github.com/opensearch-project/OpenSearch/pull/13865))

### Dependencies
- Bump `org.gradle.test-retry` from 1.5.8 to 1.5.9 ([#13442](https://github.com/opensearch-project/OpenSearch/pull/13442))
modules/mapper-extras/src/main/java/org/opensearch/index/mapper/ScaledFloatFieldMapper.java
@@ -35,6 +35,7 @@
import com.fasterxml.jackson.core.JsonParseException;

import org.apache.lucene.document.Field;
import org.apache.lucene.document.LongPoint;
import org.apache.lucene.index.DocValues;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.index.NumericDocValues;
@@ -165,7 +166,7 @@

    public static final TypeParser PARSER = new TypeParser((n, c) -> new Builder(n, c.getSettings()));

-    public static final class ScaledFloatFieldType extends SimpleMappedFieldType {
+    public static final class ScaledFloatFieldType extends SimpleMappedFieldType implements NumericPointEncoder {

        private final double scalingFactor;
        private final Double nullValue;
@@ -188,6 +189,21 @@
            this(name, true, false, true, Collections.emptyMap(), scalingFactor, null);
        }

        @Override
        public byte[] encodePoint(Number value) {
            assert value instanceof Double;
            double doubleValue = (Double) value;
            byte[] point = new byte[Long.BYTES];
            if (doubleValue == Double.POSITIVE_INFINITY) {
                LongPoint.encodeDimension(Long.MAX_VALUE, point, 0);
            } else if (doubleValue == Double.NEGATIVE_INFINITY) {
                LongPoint.encodeDimension(Long.MIN_VALUE, point, 0);
            } else {
                LongPoint.encodeDimension(Math.round(scale(value)), point, 0);
            }
            return point;
        }

        public double getScalingFactor() {
            return scalingFactor;
        }

Codecov / codecov/patch warning: added lines #L195-L196, #L198, #L200, #L202 and #L204 (inside the new encodePoint method) of modules/mapper-extras/src/main/java/org/opensearch/index/mapper/ScaledFloatFieldMapper.java were not covered by tests.
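
The `encodePoint` method in this hunk turns a range bound into the same byte layout as the field's indexed points, so a bound can be compared byte-wise against point values. The sketch below illustrates the idea without Lucene on the classpath. It rests on two assumptions that are not shown in this diff: that `scale` amounts to multiplying by the scaling factor, and that `LongPoint.encodeDimension` writes the long big-endian with its sign bit flipped. `ScaledFloatPointSketch`, `encodeSortableLong`, `encodeScaled` and `compare` are hypothetical names used only for illustration.

```java
import java.util.Arrays;

final class ScaledFloatPointSketch {

    // Sign-flipped, big-endian layout: unsigned byte order then matches signed long order.
    // Assumption: this mirrors what Lucene does for long points; it is not copied from Lucene.
    static byte[] encodeSortableLong(long value) {
        long flipped = value ^ 0x8000000000000000L;
        byte[] out = new byte[Long.BYTES];
        for (int i = 0; i < Long.BYTES; i++) {
            out[i] = (byte) (flipped >>> (8 * (Long.BYTES - 1 - i)));
        }
        return out;
    }

    // Same branch structure as the PR's encodePoint: infinities clamp to the long extremes.
    // Assumption: scaling is plain multiplication by the scaling factor.
    static byte[] encodeScaled(double value, double scalingFactor) {
        if (value == Double.POSITIVE_INFINITY) {
            return encodeSortableLong(Long.MAX_VALUE);
        } else if (value == Double.NEGATIVE_INFINITY) {
            return encodeSortableLong(Long.MIN_VALUE);
        }
        return encodeSortableLong(Math.round(value * scalingFactor));
    }

    // Negative when a sorts before b under unsigned byte comparison.
    static int compare(byte[] a, byte[] b) {
        return Arrays.compareUnsigned(a, b);
    }

    public static void main(String[] args) {
        double factor = 100; // scaling_factor used by the test mapping in this PR
        byte[] negative = encodeScaled(-2.1, factor);
        byte[] one = encodeScaled(1.0, factor);
        byte[] unbounded = encodeScaled(Double.POSITIVE_INFINITY, factor);
        System.out.println(compare(negative, one) < 0);
        System.out.println(compare(one, unbounded) < 0);
    }
}
```

Clamping the infinities lets an open-ended range (`to` or `from` omitted) reuse the same encoding path as a bounded one.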
Original file line number Diff line number Diff line change
@@ -14,6 +14,9 @@ setup:
              date:
                type: date
                format: epoch_second
              scaled_field:
                type: scaled_float
                scaling_factor: 100

  - do:
      cluster.health:
@@ -528,3 +531,139 @@ setup:
  - is_false: aggregations.unsigned_long_range.buckets.2.to

  - match: { aggregations.unsigned_long_range.buckets.2.doc_count: 0 }

---
"Double range profiler shows filter rewrite info":
  - skip:
      version: " - 2.99.99"
      reason: debug info for filter rewrite added in 3.0.0 (to be backported to 2.15.0)

  - do:
      indices.create:
        index: test_profile
        body:
          settings:
            number_of_replicas: 0
            refresh_interval: -1
          mappings:
            properties:
              ip:
                type: ip
              double:
                type: double
              date:
                type: date
                format: epoch_second

  - do:
      bulk:
        index: test_profile
        refresh: true
        body:
          - '{"index": {}}'
          - '{"double" : 42}'
          - '{"index": {}}'
          - '{"double" : 100}'
          - '{"index": {}}'
          - '{"double" : 50}'

  - do:
      search:
        index: test_profile
        body:
          size: 0
          profile: true
          aggs:
            double_range:
              range:
                field: double
                ranges:
                  - to: 50
                  - from: 50
                    to: 150
                  - from: 150

  - length: { aggregations.double_range.buckets: 3 }

  - match: { aggregations.double_range.buckets.0.key: "*-50.0" }
  - is_false: aggregations.double_range.buckets.0.from
  - match: { aggregations.double_range.buckets.0.to: 50.0 }
  - match: { aggregations.double_range.buckets.0.doc_count: 1 }
  - match: { aggregations.double_range.buckets.1.key: "50.0-150.0" }
  - match: { aggregations.double_range.buckets.1.from: 50.0 }
  - match: { aggregations.double_range.buckets.1.to: 150.0 }
  - match: { aggregations.double_range.buckets.1.doc_count: 2 }
  - match: { aggregations.double_range.buckets.2.key: "150.0-*" }
  - match: { aggregations.double_range.buckets.2.from: 150.0 }
  - is_false: aggregations.double_range.buckets.2.to
  - match: { aggregations.double_range.buckets.2.doc_count: 0 }

  - match: { profile.shards.0.aggregations.0.debug.optimized_segments: 1 }
  - match: { profile.shards.0.aggregations.0.debug.unoptimized_segments: 0 }
  - match: { profile.shards.0.aggregations.0.debug.leaf_visited: 1 }
  - match: { profile.shards.0.aggregations.0.debug.inner_visited: 0 }
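
The expected counts in this test follow from the range semantics named in the commit list (lower bound inclusive, upper bound exclusive). As a rough illustration of how such counts can be derived from sorted values without collecting each document, the sketch below uses binary search over a sorted array. This is a deliberate simplification and not the PR's implementation, which, judging by the commit messages and the `leaf_visited` / `inner_visited` debug fields, works through a tree traversal. `SortedRangeCountSketch`, `lowerBound` and `count` are hypothetical names.

```java
final class SortedRangeCountSketch {

    // Index of the first element that is >= key (sorted.length if there is none).
    static int lowerBound(double[] sorted, double key) {
        int lo = 0;
        int hi = sorted.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted[mid] < key) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // Number of values v with from <= v < to, assuming `sorted` is ascending.
    static int count(double[] sorted, double from, double to) {
        return lowerBound(sorted, to) - lowerBound(sorted, from);
    }

    public static void main(String[] args) {
        double[] values = { 42, 50, 100 }; // the three documents indexed by the test, sorted
        System.out.println(count(values, Double.NEGATIVE_INFINITY, 50));
        System.out.println(count(values, 50, 150));
        System.out.println(count(values, 150, Double.POSITIVE_INFINITY));
    }
}
```

The value 50 lands in the second bucket rather than the first because the upper bound of `- to: 50` is exclusive.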

---
"Scaled Float Range Aggregation":
  - do:
      index:
        index: test
        id: 1
        body: { "scaled_field": 1 }

  - do:
      index:
        index: test
        id: 2
        body: { "scaled_field": 1.53 }

  - do:
      index:
        index: test
        id: 3
        body: { "scaled_field": -2.1 }

  - do:
      index:
        index: test
        id: 4
        body: { "scaled_field": 1.53 }

  - do:
      indices.refresh: { }

  - do:
      search:
        index: test
        body:
          size: 0
          aggs:
            my_range:
              range:
                field: scaled_field
                ranges:
                  - to: 0
                  - from: 0
                    to: 1
                  - from: 1
                    to: 1.5
                  - from: 1.5

  - length: { aggregations.my_range.buckets: 4 }

  - match: { aggregations.my_range.buckets.0.key: "*-0.0" }
  - is_false: aggregations.my_range.buckets.0.from
  - match: { aggregations.my_range.buckets.0.to: 0.0 }
  - match: { aggregations.my_range.buckets.0.doc_count: 1 }
  - match: { aggregations.my_range.buckets.1.key: "0.0-1.0" }
  - match: { aggregations.my_range.buckets.1.from: 0.0 }
  - match: { aggregations.my_range.buckets.1.to: 1.0 }
  - match: { aggregations.my_range.buckets.1.doc_count: 0 }
  - match: { aggregations.my_range.buckets.2.key: "1.0-1.5" }
  - match: { aggregations.my_range.buckets.2.from: 1.0 }
  - match: { aggregations.my_range.buckets.2.to: 1.5 }
  - match: { aggregations.my_range.buckets.2.doc_count: 1 }
  - match: { aggregations.my_range.buckets.3.key: "1.5-*" }
  - match: { aggregations.my_range.buckets.3.from: 1.5 }
  - is_false: aggregations.my_range.buckets.3.to
  - match: { aggregations.my_range.buckets.3.doc_count: 2 }
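
For `scaled_float`, range bounds go through `encodePoint`, so they are scaled and rounded to a long just as the branch structure of that method suggests. A minimal sketch of how the bucket counts expected by this test come out under that scheme, assuming scaling is plain multiplication by `scaling_factor` and that ranges are lower-inclusive and upper-exclusive. `ScaledRangeBucketsSketch`, `toScaled` and `countBuckets` are hypothetical names, and the sketch ignores the edge case of a document whose scaled value equals `Long.MAX_VALUE`.

```java
import java.util.Arrays;

final class ScaledRangeBucketsSketch {

    // Scale a double to its long representation; infinities clamp to the long extremes.
    // Assumption: scaling is value * scalingFactor, rounded to the nearest long.
    static long toScaled(double value, double scalingFactor) {
        if (value == Double.POSITIVE_INFINITY) {
            return Long.MAX_VALUE;
        } else if (value == Double.NEGATIVE_INFINITY) {
            return Long.MIN_VALUE;
        }
        return Math.round(value * scalingFactor);
    }

    // Count documents per [from, to) range, comparing in scaled-long space.
    static int[] countBuckets(double[] docs, double[][] ranges, double scalingFactor) {
        int[] counts = new int[ranges.length];
        for (double doc : docs) {
            long v = toScaled(doc, scalingFactor);
            for (int i = 0; i < ranges.length; i++) {
                long from = toScaled(ranges[i][0], scalingFactor);
                long to = toScaled(ranges[i][1], scalingFactor);
                if (v >= from && v < to) {
                    counts[i]++;
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        double[] docs = { 1, 1.53, -2.1, 1.53 }; // documents indexed by the test
        double[][] ranges = {
            { Double.NEGATIVE_INFINITY, 0 },
            { 0, 1 },
            { 1, 1.5 },
            { 1.5, Double.POSITIVE_INFINITY },
        };
        System.out.println(Arrays.toString(countBuckets(docs, ranges, 100)));
    }
}
```

With a scaling factor of 100 the documents become 100, 153, -210 and 153, which is why the value 1 falls in the `1.0-1.5` bucket and both 1.53 documents fall in `1.5-*`.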
Original file line number Diff line number Diff line change
@@ -348,7 +348,7 @@ public DateFieldMapper build(BuilderContext context) {
     *
     * @opensearch.internal
     */
-    public static final class DateFieldType extends MappedFieldType {
+    public static final class DateFieldType extends MappedFieldType implements NumericPointEncoder {
        protected final DateFormatter dateTimeFormatter;
        protected final DateMathParser dateMathParser;
        protected final Resolution resolution;
@@ -549,6 +549,13 @@ public static long parseToLong(
            return resolution.convert(dateParser.parse(BytesRefs.toString(value), now, roundUp, zone));
        }

        @Override
        public byte[] encodePoint(Number value) {
            byte[] point = new byte[Long.BYTES];
            LongPoint.encodeDimension(value.longValue(), point, 0);
            return point;
        }

        @Override
        public Query distanceFeatureQuery(Object origin, String pivot, float boost, QueryShardContext context) {
            failIfNotIndexedAndNoDocValues();