Option to save and reload index table #125

sysimm · 2021-12-08T09:15:27Z

Hello,
I was wondering whether it's possible to save the index table of k-mers generated from the input sequences to disk and reload it later, in order to speed up clustering. My use case is large datasets with cd-hit-2d: one input dataset would be provided by the user (so its index table would always be computed on the fly), and the other would come from a prepared selection of datasets. For the latter, I would like to precompute the index tables to speed up the overall comparison. I don't know how much of the total runtime is spent building the index tables, but I would expect it to be considerable for large datasets. Please correct me if I'm wrong.
Please let me know whether this is already possible, or whether it could be done by modifying the code.
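To make the idea concrete, here is a minimal Python sketch of the save/reload workflow I have in mind. It does not reflect CD-HIT's internal data structures or file formats: the function names (build_kmer_index, save_index, load_index), the dict-based table, and the pickle file layout are all assumptions made up for illustration.

```python
import pickle
from collections import defaultdict


def build_kmer_index(sequences, k):
    """Map each k-mer to the sorted list of sequence indices containing it.

    Hypothetical stand-in for an index table; not CD-HIT's actual structure.
    """
    index = defaultdict(set)
    for seq_id, seq in enumerate(sequences):
        # Slide a window of length k over the sequence.
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(seq_id)
    return {kmer: sorted(ids) for kmer, ids in index.items()}


def save_index(index, k, path):
    """Write the table to disk together with the word size it was built for."""
    with open(path, "wb") as fh:
        pickle.dump({"k": k, "index": index}, fh,
                    protocol=pickle.HIGHEST_PROTOCOL)


def load_index(path, k):
    """Reload a saved table, refusing it if the word size does not match."""
    with open(path, "rb") as fh:
        payload = pickle.load(fh)
    if payload["k"] != k:
        raise ValueError(
            f"index was built with k={payload['k']}, but k={k} was requested"
        )
    return payload["index"]
```

The prepared datasets would go through build_kmer_index and save_index once, and each later comparison would only call load_index for them, while the user-supplied dataset is still indexed on the fly. Storing the word size next to the table guards against reusing an index built with different parameters.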
Thank you
