Performance issues with large datasets #291
50k is not a large dataset; TNTSearch easily indexes millions of rows. Can you tell us a bit more about the table structure and how/where you index the data?
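For context, a typical TNTSearch indexing setup looks roughly like this (a sketch based on the library's documented API; the connection settings, table name, and column names are assumptions, not taken from this thread):

```php
<?php

use TeamTNT\TNTSearch\TNTSearch;

$tnt = new TNTSearch;

// Hypothetical database credentials and index storage path.
$tnt->loadConfig([
    'driver'   => 'mysql',
    'host'     => 'localhost',
    'database' => 'app',
    'username' => 'user',
    'password' => 'secret',
    'storage'  => __DIR__ . '/indexes/',
]);

// Build the index from a single SELECT; TNTSearch iterates the
// result set and tokenizes each row.
$indexer = $tnt->createIndex('companies.index');
$indexer->query('SELECT id, name FROM companies;');
$indexer->run();
```

If indexing 50k short rows takes 30+ minutes with a setup like this, the bottleneck is usually either the tokenizer or per-row I/O, not the dataset size itself.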
Thanks for the reply. I am using a simple dataset with 50,000+ company names. I am only using a custom tokenizer.
OK, can you show us the code for the tokenizer and the table structure?
Hi, this is my tokenizer:

```php
<?php

namespace Search;

use TeamTNT\TNTSearch\Support\AbstractTokenizer;
use TeamTNT\TNTSearch\Support\TokenizerInterface;

class Tokenizer extends AbstractTokenizer implements TokenizerInterface
{
    // ... (class body truncated in the original comment)
}
```

My query is super simple.
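Since the tokenizer body did not survive in the thread, here is a minimal sketch of what a custom TNTSearch tokenizer usually looks like (the split pattern below is an assumption for illustration, not the poster's actual code):

```php
<?php

namespace Search;

use TeamTNT\TNTSearch\Support\AbstractTokenizer;
use TeamTNT\TNTSearch\Support\TokenizerInterface;

class SketchTokenizer extends AbstractTokenizer implements TokenizerInterface
{
    // Assumed pattern: split on any run of non-letter, non-digit
    // characters (Unicode-aware).
    static public $pattern = '/[^\p{L}\p{N}]+/u';

    public function tokenize($text, $stopwords = [])
    {
        $text  = mb_strtolower($text, 'UTF-8');
        $terms = preg_split($this->getPattern(), $text, -1, PREG_SPLIT_NO_EMPTY);

        return array_diff($terms, $stopwords);
    }
}
```

The pattern choice matters for performance: a tokenizer that emits very many tiny tokens (for instance, one per character) inflates the inverted index dramatically and can turn a minutes-long indexing run into a 30+ minute one, which is one plausible explanation for the slowdown described here.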
I'm using this package with a result set of 1.5 million records. Indexing from scratch takes ~5 minutes.
That's crazy... Something is weird, anyway: indexing 10,000 rows takes about 20 seconds on my server. Could it be something to do with the size of the index file?
Hi,
I have performance issues when indexing large datasets with 50,000 records. It takes 30+ minutes.
The indexed content is not even long. It is approximately 50 characters per row.
This also happens with another dataset of only 500 rows with LONG text.
Any information on how to boost performance?