NitriteCollection.update slow when unrelated properties in document are indexed #902

Closed
chris9182 opened this issue Feb 1, 2024 · 6 comments

@chris9182

We found that calling:

elementCollection.update(
        FluentFilter.where(ATTRIBUTE_PROPERTY_ID).eq(Long.toString(index)),
        Document.createDocument().put(somePropertyName, someValue));

is very slow when the database is large (>200000 entries). We have indexed (multiple) properties in this collection, but not the one changed in the update (so no index on somePropertyName in the example above).

Starting the debugger and pausing at a random time shows the following stack trace:

String.intern() line: not available [native method]
ObjectStreamField.<init>(String, String, boolean) line: 109
ObjectStreamClass.readNonProxy(ObjectInputStream) line: 714
ObjectInputStream.readClassDescriptor() line: 988
ObjectInputStream.readNonProxyDesc(boolean) line: 2034
ObjectInputStream.readClassDesc(boolean) line: 1909
ObjectInputStream.readOrdinaryObject(boolean) line: 2235
ObjectInputStream.readObject0(Class, boolean) line: 1744
ObjectInputStream.readObject(Class) line: 514
ObjectInputStream.readObject() line: 472
ObjectDataType.deserialize(byte[]) line: 377
ObjectDataType$SerializedObjectType.read(ByteBuffer, int) line: 1612
ObjectDataType.read(ByteBuffer) line: 256
ObjectDataType(BasicDataType).read(ByteBuffer, Object, int) line: 74
Page$Leaf<K,V>(Page<K,V>).read(ByteBuffer) line: 657
Page<K,V>.read(ByteBuffer, long, MVMap<K,V>) line: 262
SingleFileStore(FileStore).readPage(MVMap<K,V>, long) line: 1968
MVStore.readPage(MVMap<K,V>, long) line: 1021
MVMap<K,V>.readPage(long) line: 632
Page$NonLeaf<K,V>.getChildPage(int) line: 1117
Cursor<K,V>.hasNext() line: 64
MVMap$2$1.hasNext() line: 745
NitriteMVMap$1.hasNext() line: 141
IndexOperations.buildIndexInternal(IndexDescriptor, boolean) line: 188
IndexOperations.buildIndex(IndexDescriptor, boolean) line: 76
DocumentIndexWriter.removeIndexEntryInternal(IndexDescriptor, Document, NitriteIndexer) line: 105
DocumentIndexWriter.updateIndexEntry(Document, Document) line: 74
WriteOperations.update(Filter, Document, UpdateOptions) line: 178
CollectionOperations.update(Filter, Document, UpdateOptions) line: 102
DefaultNitriteCollection.update(Filter, Document, UpdateOptions) line: 125
DefaultNitriteCollection(NitriteCollection).update(Filter, Document) line: 131
BackendCollectionWrapper.update(Filter, Document) line: 78
... our code...

Our guess is that the update method does not take into account which properties were modified (those set via the .put method in the example above) and instead updates the indices for every indexed property present in the document retrieved by the filter. Our investigation leads us to believe that there are two possible solutions:

Either the modified properties should be taken into account starting from org.dizitart.no2.collection.operation.WriteOperations.update(Filter, Document, UpdateOptions), line 129, where they are available. The call documentIndexWriter.updateIndexEntry(oldDocument, processed); could be passed those properties so that indices on unmodified properties can be skipped.

Or, in org.dizitart.no2.collection.operation.DocumentIndexWriter.updateIndexEntry(Document, Document), only the properties that differ between oldDocument and newDocument should be indexed anew. This would have more overhead than the first suggestion.
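
For illustration, the two suggestions can be sketched in plain Java. This is a minimal sketch and not Nitrite's actual implementation or the fix that was later released: documents are modeled as Map<String, Object>, each index as the list of field names it covers, and the class and method names (IndexUpdateSketch, affectedIndexes, changedFields) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical helper, not part of Nitrite: shows how index maintenance
// could be limited to indices that cover a modified field.
public class IndexUpdateSketch {

    // Suggestion 1: the caller already knows which fields the update sets.
    // Returns only the indices (each a list of field names) that cover
    // at least one modified field; all others can be skipped.
    public static List<List<String>> affectedIndexes(
            List<List<String>> indexes, Set<String> modifiedFields) {
        List<List<String>> affected = new ArrayList<>();
        for (List<String> indexFields : indexes) {
            if (!Collections.disjoint(indexFields, modifiedFields)) {
                affected.add(indexFields);
            }
        }
        return affected;
    }

    // Suggestion 2: derive the modified fields by comparing the old and
    // new document. This costs an extra pass over both documents.
    public static Set<String> changedFields(
            Map<String, Object> oldDoc, Map<String, Object> newDoc) {
        Set<String> keys = new HashSet<>(oldDoc.keySet());
        keys.addAll(newDoc.keySet());
        Set<String> changed = new HashSet<>();
        for (String key : keys) {
            // A missing key and a null value are treated as equal here.
            if (!Objects.equals(oldDoc.get(key), newDoc.get(key))) {
                changed.add(key);
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        // Assumed example data: one single-field and one compound index.
        List<List<String>> indexes =
                List.of(List.of("id"), List.of("name", "city"));
        Map<String, Object> oldDoc = Map.of("id", "42", "color", "red");
        Map<String, Object> newDoc = Map.of("id", "42", "color", "blue");

        Set<String> changed = changedFields(oldDoc, newDoc);
        System.out.println("changed fields: " + changed);
        System.out.println("indices to update: "
                + affectedIndexes(indexes, changed));
    }
}
```

In this sketch an update that only touches an unindexed field yields an empty list of indices to update, which is the behavior both suggestions aim for.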

This slow-down is currently a really big problem for us, as updating properties in documents is essential for our workflow.

@anidotnet
Contributor

Thanks for the detailed analysis. I'll take a look at it.

@anidotnet
Contributor

@chris9182 the changes are in the latest 4.2.1-SNAPSHOT. Can you test it and report back?

@chris9182
Author

Thank you for the changes. Could you please release 4.2.1 to the Maven repository? This would make testing a lot more convenient.

@anidotnet
Contributor

I'll make a release once you confirm that the fix solves the issue you are facing. I know that testing with a snapshot version requires certain changes to your dependency management setup, but I really want to avoid releasing an unverified fix.

@chris9182
Author

I can confirm that the problems mentioned in both #902 and #901 no longer occur with the 4.2.1 snapshot version. Thank you for the fix!

@anidotnet
Contributor

4.2.1 has been released to Maven Central.
