Importing exported indexes doesn't populate store (#249) #384

Open
zanzlender opened this issue Apr 3, 2023 · 5 comments
Comments

@zanzlender

zanzlender commented Apr 3, 2023

I've been having some trouble implementing exporting and then importing those same indexes...

I found issue #249. Although it is marked as fixed, I still have the same problem.

I followed the linked Stack Overflow question and exported my indexes like so:

// Write to FlexSearch index file
const flexIndex = new Document({
  tokenize: "forward",
  document: {
    id: "id",
    index: ["id", "url", "transcript", "timestamp"],
    store: true,
  },
  context: {
    resolution: 5,
    depth: 3,
  },
  cache: true,
});


/*
type Transcript = {
  id: string;
  url: string;
  transcript: Array<{
    timestamp: string;
    transcript: string;
  }>;
};
*/ 
transcriptsJson.forEach((_video) => {
  _video.transcript.forEach((_transcript, _index) => {
    flexIndex.add({
      id: `${_video.id}-${_index}`,
      url: _video.url,
      transcript: _transcript.transcript,
      timestamp: _transcript.timestamp,
    });
  });
});
    
const searchIndexPath2 = path.join(cwd(), "/src/content/flex-search/");

const res = await flexIndex.export(function (key, data) {
  fs.writeFileSync(
    `${searchIndexPath2}${key}.json`,
    data !== undefined ? (data as string) : ""
  );
});

And later I try to import them like so:

const keys = fs
  .readdirSync(searchIndexPath, { withFileTypes: true })
  .filter((item) => !item.isDirectory() && item.name.includes(".json"))
  .map((item) => item.name.slice(0, -5));

for (let i = 0, key; i < keys.length; i += 1) {
  key = keys[i];
  const data = fs.readFileSync(
    `${searchIndexPath}${key ?? ""}.json`,
    "utf8"
  );

  await flexIndex.import(key as string, data ?? null);
}

And finally I can search like so:

const res = flexIndex.search("03", {
  index: ["transcript", "timestamp"],
  enrich: true,
});

const xy = res.find((x) => x.field === "timestamp")?.result;
console.log(xy);

Everything works fine up to this point and I get the results I wanted, but the doc object is undefined...

(screenshot: search results where doc is undefined)

However, when I skip the export/import step and search the index built directly as in the first code example, everything works as expected:

(screenshot: search results where doc is populated)

Does this mean #254 is not fixed yet or am I doing something wrong? Do I need to handle the data object while importing in a special way instead of just importing the whole data?

@zanzlender zanzlender changed the title Importing exported indexes doesn't populate store Importing exported indexes doesn't populate store (#254) Apr 3, 2023
@zanzlender zanzlender changed the title Importing exported indexes doesn't populate store (#254) Importing exported indexes doesn't populate store (#249) Apr 3, 2023
@zanzlender
Author

zanzlender commented Apr 4, 2023

I've also noticed that one of the saved files is named timestamp.store.json. I don't know how the name is decided, but it seems unintuitive: most of my data is in the transcript property, yet nothing is saved to a transcript.json or transcript.store.json.

(screenshot: list of exported files)

But I don't know if this plays any role in my problem.

@grimsteel

I'm also experiencing this.

JSFiddle Example: https://jsfiddle.net/tnx5qLzd/

@bcspragu

Chiming in with the same issue. My setup looks something like:

// Exporting
import flexsearch from 'flexsearch'

const docIndex = new flexsearch.Document({
  document: {
    id: 'id',
    index: ['title', 'description', 'source', 'tags', 'body'],
    store: ['title', 'description', 'tags'],
  },
});

documents.forEach((doc) => {
  docIndex.add(doc.slug, {
    title: doc.title,
    description: doc.description,
    source: doc.source,
    tags: doc.tags,
    body: doc.body,
  })
})

docIndex.export((key, data) => {
  // Line-delimited JSON objects, which play nicely with the async-ish nature of export
  stdout.write(JSON.stringify({key, data}) + '\n')
})

// Importing
import { Document } from 'flexsearch'

const docIndex = new Document({
  document: {
    id: 'id',
    index: ['title', 'description', 'source', 'tags', 'body'],
    store: ['title', 'description', 'tags'],
  },
}) as Document<Post, string[]>;

await readByLines('/flexsearch.json', (line: string) => {
  const imp = JSON.parse(line)
  docIndex.import(imp.key, imp.data);
})

const searchResults = docIndex.search({
  query: 'the query',
  enrich: true,
});

// searchResults[].result[].doc is undefined
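
The line-delimited format used in the export step above can be sketched in isolation. This is a minimal illustration, not FlexSearch code: `Chunk`, `encodeChunk`, and `decodeChunks` are hypothetical names introduced here. It shows one detail that may matter for this issue: if the export callback hands over `data` as a JSON string, that value is still a string after the round trip, so whatever receives it on import sees a string and not a parsed object.

```typescript
// Hypothetical shape of one exported chunk as written to a line.
type Chunk = { key: string; data?: unknown };

// Serialize one chunk as a single line of JSON.
// If data is undefined, JSON.stringify drops the property entirely.
function encodeChunk(key: string, data: unknown): string {
  return JSON.stringify({ key, data }) + "\n";
}

// Split the text into lines, skip blank ones, and parse each line.
function decodeChunks(text: string): Chunk[] {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as Chunk);
}

// A string payload stays a string; an object payload stays an object.
const text = encodeChunk("reg", '{"a":1}') + encodeChunk("cfg", { opt: 1 });
const chunks = decodeChunks(text);
console.log(typeof chunks[0].data, typeof chunks[1].data);
```

Whether a given FlexSearch version passes strings or objects to the export callback is worth checking by logging `typeof data` inside the callback.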

@maxhoffmann

I’m running into the same bug. It remains a problem in version 0.7.34.

@kgwosh

kgwosh commented May 12, 2024

I'm running into the same bug with code like this:

const keys = fs
  .readdirSync(searchIndexPath, { withFileTypes: true })
  .filter(item => !item.isDirectory())
  .map(item => item.name)

for (let i = 0, key; i < keys.length; i++) {
  key = keys[i];
  // Read the raw file contents as a string
  const data = fs.readFileSync(`${searchIndexPath}${key}`, 'utf8')
  // Strip the ".json" extension to recover the export key
  console.log(key.slice(0, -5), data);
  index.import(key.slice(0, -5), data);
}

But I found that it is fixed when I run it like this:

const keys = fs
  .readdirSync(searchIndexPath, { withFileTypes: true })
  .filter(item => !item.isDirectory())
  .map(item => item.name)

for (let i = 0, key; i < keys.length; i++) {
  key = keys[i];
  const data = fs.readFileSync(`${searchIndexPath}${key}`, 'utf8')
  // Parse the file contents before handing them to import
  const parsedData = JSON.parse(data);
  console.log(key.slice(0, -5), parsedData);
  index.import(key.slice(0, -5), parsedData);
}

Adding JSON.parse(data) before the import works for me. Give it a try.
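
The workaround above can be sketched as a small helper. This is a sketch under stated assumptions, not the library's documented usage: `Importable`, `keyFromFileName`, and `importParsedChunks` are hypothetical names, and the only thing assumed about the index is that it exposes `import(key, data)` as in the snippets in this thread. Whether parsing first restores the store in every setup is not confirmed here.

```typescript
// Hypothetical minimal shape of an index: anything exposing import(key, data).
interface Importable {
  import(key: string, data: unknown): unknown;
}

// Strip a trailing ".json" to recover the export key from a file name.
function keyFromFileName(fileName: string): string {
  return fileName.slice(-5) === ".json" ? fileName.slice(0, -5) : fileName;
}

// Parse each file's contents before importing it; empty files are skipped.
// `files` maps file names to their raw string contents.
function importParsedChunks(
  index: Importable,
  files: Record<string, string>
): string[] {
  const imported: string[] = [];
  for (const fileName of Object.keys(files)) {
    const raw = files[fileName];
    if (raw.trim() === "") continue; // nothing to parse or import
    const key = keyFromFileName(fileName);
    index.import(key, JSON.parse(raw));
    imported.push(key);
  }
  return imported;
}
```

With Node, `files` could be built from `fs.readdirSync` and `fs.readFileSync` as in the loops above. If `import` returns a promise in your version, the call can be awaited inside an async variant of the same loop.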
