
projection segment merge fixes #17460

Conversation

@clintropolis (Member) commented Nov 7, 2024

changes:

  • fix an issue when merging projections from multiple incremental persists, which assumed that some 'dim conversion' buffers were still open when they had in fact already been closed (by the merging iterator). The fix selectively persists these conversion buffers to temp files in the segment write-out directory, maps them, and ties them to the segment-level closer so that they remain available beyond the lifetime of the parent merger
  • modify 'auto' column serializers to use the segment write-out directory for temp files instead of java.io.tmpdir
  • fix queryable index projections to not add the time-like column as a dimension, instead adding it only as __time

This is an alternative to #17388 that takes a nicer approach: persisting temporary files containing the id-mapping buffers of parent mergers for re-use when merging projections, similar to what 'auto' columns were already doing with their dictionaries.

I made a conservative but somewhat odd change to the DimensionHandler interface: a default implementation of the new makeMerger that calls the now-deprecated existing version of makeMerger. This makes the signature change a bit less disruptive in the event we want to do a 31.1 patch release (I would like to do this to fix another issue with the 31 release; I will start a thread soon). DimensionHandler is not officially an extension point, but it is the mechanism by which custom dimensions can be defined, so I wanted to be careful with this change. After this PR, I will open another one to delete the deprecated method and remove the default implementation. The interface was changed to add a parameter for the path where we are writing out the segment, in case a merger or serializer needs to create any temporary files; prior to this, the task-defined storage location was not directly available to the mergers, so this lets them write such files to a more correct place.
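For reference, a minimal sketch of that compatibility shim, with signatures abridged from the diff below; this illustrates the pattern rather than quoting the merged code verbatim:

  // Inside DimensionHandler (sketch; generics and javadoc elided).
  // The old overload is kept, deprecated, so existing extension impls keep compiling:
  @Deprecated
  DimensionMergerV9 makeMerger(
      String outputName,
      IndexSpec indexSpec,
      SegmentWriteOutMedium segmentWriteOutMedium,
      ColumnCapabilities capabilities,
      ProgressIndicator progress,
      Closer closer
  );

  // The new overload adds the segment write-out directory and, by default,
  // ignores it and falls through to the deprecated overload:
  default DimensionMergerV9 makeMerger(
      String outputName,
      IndexSpec indexSpec,
      SegmentWriteOutMedium segmentWriteOutMedium,
      ColumnCapabilities capabilities,
      ProgressIndicator progress,
      File segmentBaseDir,
      Closer closer
  )
  {
    return makeMerger(outputName, indexSpec, segmentWriteOutMedium, capabilities, progress, closer);
  }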

While I was here, I modified the 'auto' column merger to use this new segment write-out path argument for its temp files instead of java.io.tmpdir.

This PR also fixes an issue with projections on a QueryableIndex so that they no longer add the time-like column as a dimension, instead adding it only as __time, to ensure it doesn't get handled incorrectly.

Comment on lines +154 to +161
return makeMerger(
outputName,
indexSpec,
segmentWriteOutMedium,
capabilities,
progress,
closer
);

Check notice — Code scanning / CodeQL: Deprecated method or constructor invocation. Invoking DimensionHandler.makeMerger should be avoided because it has been deprecated.
@clintropolis mentioned this pull request Nov 12, 2024
/**
* Write a {@link Serializer} to a {@link Path} and then map it as a read-only {@link MappedByteBuffer}
*/
@SuppressFBWarnings("NP_NONNULL_PARAM_VIOLATION")
Contributor commented:

Serializer is in a package where EverythingIsNonnullByDefault applies. It seems the second arg to writeTo can actually be null, though. It would be nice to clean this up.

Please consider one of these things:

  • Add a @Nullable annotation to the second arg, update the javadoc describing what happens when the smoosher is null, and remove these warning suppressions (this one, and the DataFlowIssue one).
  • Or, replace Serializer here with a more specific subclass or sub-interface that already has @Nullable on the second arg; I see that some do.
  • Or, update the javadoc for mapSerializer to say that not all Serializer implementations are supported: the caller must take care to pass one whose writeTo supports a null smoosher.

I'm not sure which approach is best. It depends on whether all Serializer implementations are meant to support a null smoosher or not.
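To illustrate the first option, a sketch of what the annotated interface might look like, assuming Serializer's current two-method shape (getSerializedSize / writeTo); the javadoc wording here is hypothetical:

  public interface Serializer
  {
    long getSerializedSize() throws IOException;

    /**
     * Writes this serializer's data to the channel. When {@code smoosher} is null,
     * the serializer must write everything to the channel directly; implementations
     * that spill secondary "large" (>2GB) files require a non-null smoosher.
     */
    void writeTo(WritableByteChannel channel, @Nullable FileSmoosher smoosher) throws IOException;
  }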

Member Author (@clintropolis) replied:

The answer to this is a bit complicated. Yes, some serializers do not require a FileSmoosher (basically anything that does not support "large" files greater than 2GB does not need a smoosher). Technically none of the current callers need a smoosher, since the new caller is just taking an IntBuffer and persisting/mapping it, and the other callers are using FixedIndexed, which does not support large files yet; but accepting a Serializer here is kind of risky for future misuse.

There are two options here that I like.

One option is that mapSerializer makes its own FileSmoosher/SmooshedWriter, writes the serializer into a temporary smoosh under the same name as the temp file, and then returns the loaded smoosh instead of a mapped byte buffer directly. This is actually what DictionaryIdLookup (where I refactored this code from) is already doing for string dictionaries (https://github.com/apache/druid/blob/master/processing/src/main/java/org/apache/druid/segment/nested/DictionaryIdLookup.java#L235), and it would let this work with any serializer implementation; presumably the thing we want to read from the mapped file knows how to load other stuff from the smoosher.

The other option worth considering is, as you said, to pass in some other interface instead of Serializer: one that writes with just a WritableByteChannel and a length argument, with the expectation that it only writes to the channel, the result fits in a single buffer, and we then map the byte buffer. This solves the immediate use case, I think, but is somewhat limited as long as we are stuck with ByteBuffer.

I'm leaning towards the 'make a new smoosh' option since it seems the most flexible: it's "make a smoosh file that contains only the contents of this serializer". It does have a little extra overhead for things that fit in a single buffer, but that doesn't seem too bad. Thoughts?

Contributor replied:

IMO, a good approach is to adjust this to pass an actual FileSmoosher, and to return a SmooshedFileMapper rather than a MappedByteBuffer.
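For concreteness, a rough sketch of that direction — writing the serializer as the sole entry of a throwaway smoosh and handing back the mapper — assuming the existing FileSmoosher / SmooshedWriter / SmooshedFileMapper APIs; this is not the merged code:

  static SmooshedFileMapper writeToTempSmoosh(File tempDir, String fileName, Serializer serializer)
      throws IOException
  {
    // the smoosher transparently chunks "large" (>2GB) serializers across containers
    try (FileSmoosher smoosher = new FileSmoosher(tempDir)) {
      try (SmooshedWriter writer = smoosher.addWithSmooshedWriter(fileName, serializer.getSerializedSize())) {
        serializer.writeTo(writer, smoosher);
      }
    }
    // map the finished smoosh read-only; callers then use mapper.mapFile(fileName)
    return SmooshedFileMapper.load(tempDir);
  }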

@@ -101,6 +116,9 @@ public abstract class DictionaryEncodedColumnMerger<T extends Comparable<T>> imp
@Nullable
protected T firstDictionaryValue;

protected File segmentBaseDir;
@MonotonicNonNull
protected PersistedIdConversions persistedIdConversions;
Contributor commented:

Please add Javadoc noting that this is used for projections.

@@ -100,6 +102,25 @@ default MultiValueHandling getMultivalueHandling()
*/
DimensionIndexer<EncodedType, EncodedKeyComponentType, ActualType> makeIndexer(boolean useMaxMemoryEstimates);

/**
* @deprecated use {@link #makeMerger(String, IndexSpec, SegmentWriteOutMedium, ColumnCapabilities, ProgressIndicator, File, Closer)}
Contributor commented:

Please add Javadoc about why this still exists. I think it's to avoid breaking DimensionHandler impls that exist in extensions.

Member Author (@clintropolis) replied:

Yea, I'll add that. I do plan to remove this shortly after this PR is merged; I'm just trying not to be disruptive in the event we do a 31.1 (which I'd like to do, as mentioned in the PR description).

* state which might be required for the projection mergers to do their thing. This method MUST be called prior to
* performing any merge work
*/
default void markAsParent()
Contributor commented:

As to types that may exist in extensions: is this default impl ok? Let's say some type exists in an extension with its own dictionaries, and inherits this default impl. What happens then?

Member Author (@clintropolis) replied:

I found a missing piece needed for this to "just work" with default implementations: this method is fine doing nothing, but attachParent actually needs to be modified to something like

  default void attachParent(DimensionMergerV9 parent, List<IndexableAdapter> projectionAdapters) throws IOException
  {
    // by default fall through to writing merged dictionary
    writeMergedValueDictionary(projectionAdapters);
  }

since it is called as an alternative to writeMergedValueDictionary (https://github.com/apache/druid/blob/master/processing/src/main/java/org/apache/druid/segment/IndexMergerV9.java#L395). Will fix in this PR.

ColumnSerializerUtils.mapSerializer(longDictionaryFile, longDictionaryWriter, fileName)
);
try {
final ByteBuffer buffer = longBufferMapper.mapFile(fileName);
Contributor commented:

All of these seem to be assuming that there will be no secondary smoosh components. Is that a good assumption? Should it be checked? If this is a common pattern, maybe add a mapOnlyFile to SmooshedFileMapper that verifies that the mapped file is the only one?

Member Author (@clintropolis) replied Nov 15, 2024:

I think this is a fine assumption for now; FixedIndexed currently only writes to a single chunk and doesn't support "large" parts that are spread across multiple containers like GenericIndexed does. Once this changes, we would still map the entry-point file container the same way (assuming we are still stuck with ByteBuffer and have to chunk big stuff) and pass the smoosh mapper into FixedIndexed.read, similar to how we do for the string case:

...
        final ByteBuffer stringBuffer = stringBufferMapper.mapFile(fileName);
        stringDictionary = StringEncodingStrategies.getStringDictionarySupplier(
            stringBufferMapper,
            stringBuffer,
            ByteOrder.nativeOrder()
        ).get();
...

I could add some checks in case someone updates FixedIndexedWriter to allow spreading across chunks but forgets to update these read call sites, though I imagine that if that is done, all of the calls to FixedIndexed.read would be updated to take a mapper, if we do it in a similar way to GenericIndexed.
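For illustration, a hypothetical sketch of the suggested guard — a mapOnlyFile helper does not exist yet, and this assumes SmooshedFileMapper exposes its entry names (e.g. via getInternalFilenames):

  static ByteBuffer mapOnlyFile(SmooshedFileMapper mapper, String fileName) throws IOException
  {
    // fail fast if a future FixedIndexedWriter starts spilling secondary chunks
    // that this call site would otherwise silently ignore
    if (!mapper.getInternalFilenames().equals(Collections.singleton(fileName))) {
      throw new ISE("Expected [%s] to be the only smooshed file, found %s", fileName, mapper.getInternalFilenames());
    }
    return mapper.mapFile(fileName);
  }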

@clintropolis merged commit 24a1faf into apache:master Nov 16, 2024 (89 checks passed)
@clintropolis deleted the fix-projection-merge-persistent-idmapping branch Nov 16, 2024
@clintropolis added this to the 31.0.1 milestone Nov 22, 2024
clintropolis added a commit to clintropolis/druid that referenced this pull request Nov 22, 2024

changes:
* fix an issue when merging projections from multiple incremental persists, which assumed that some 'dim conversion' buffers were still open when they had already been closed (by the merging iterator); selectively persist these conversion buffers to temp files in the segment write-out directory, map them, and tie them to the segment-level closer so they remain available beyond the lifetime of the parent merger
* modify 'auto' column serializers to use the segment write-out directory for temp files instead of java.io.tmpdir
* fix queryable index projections to not add the time-like column as a dimension, instead adding it only as __time
* use smoosh for temp files so any Serializer can safely be written to a temp smoosh
clintropolis added a second commit to clintropolis/druid that referenced this pull request Nov 22, 2024, with the same message
clintropolis added a commit that referenced this pull request Nov 22, 2024, with the same message