Implement Hash-Based Session Caching in Genomeshader #11

bshifaw · 2024-06-20T17:03:38Z

Problem:

Currently, Genomeshader does not have a mechanism to reuse previously created sessions. This leads to unnecessary computation and storage usage when the same session is created multiple times.

Proposed Solution:

Implement a caching mechanism that uses hash-based identifiers for sessions. Genomeshader would generate a hash from the input used to create a session and use that hash as the name of the session's parquet file in the cache directory (local or cloud). When a user starts a new session, Genomeshader should:

Generate a hash from the provided input.
Check if a parquet file with a name matching the generated hash already exists in the cache directory.
If a match is found, reuse the existing parquet file to create the session.
If no match is found, create a new session and save the session's parquet file in the cache directory with the generated hash as its name.

This approach will allow Genomeshader to avoid unnecessary computations and storage usage by reusing previously created sessions when the same input is provided.
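
The check-then-reuse flow above could look roughly like the following. This is a minimal sketch for a local cache directory only, not Genomeshader's actual API: the function name `get_or_create_session_file`, the `build` callback, and the `.parquet` file suffix are all assumptions made for illustration, and a cloud-backed cache would need a different existence check and upload step.

```python
from pathlib import Path
from typing import Callable, Tuple


def get_or_create_session_file(
    cache_dir: Path,
    session_hash: str,
    build: Callable[[Path], None],
) -> Tuple[Path, bool]:
    """Return (path, reused) for the session file named after session_hash.

    `build` is a hypothetical callback that writes the session's parquet
    file to the path it is given. It is only called on a cache miss.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / f"{session_hash}.parquet"

    # Cache hit: a file named after the hash already exists, so reuse it.
    if target.exists():
        return target, True

    # Cache miss: write to a temporary name first, then rename, so an
    # interrupted write is never mistaken for a valid cached session.
    tmp = target.with_name(target.name + ".tmp")
    build(tmp)
    tmp.replace(target)
    return target, False
```

The second return value makes it easy for callers and tests to tell whether the expensive build step was skipped.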

Tasks:

Implement a function to generate a unique hash from the input used to create a session.
Modify the session creation process to check for an existing parquet file with a name matching the generated hash before creating a new session.
Modify the session creation process to reuse the existing parquet file when a match is found.
Modify the session creation process to save the new session's parquet file under the generated hash when no match is found.
Test the new functionality with various inputs to ensure it works as expected.
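
For the first task, one possible way to derive the hash is to serialize the session inputs in a canonical form and digest the result. The sketch below assumes the inputs can be represented as a JSON-serializable mapping; the function name `session_hash` and the example field names (`reads`, `loci`) are hypothetical and not taken from Genomeshader. SHA-256 is used here as one reasonable choice, not a requirement of the proposal.

```python
import hashlib
import json
from typing import Any, Mapping


def session_hash(inputs: Mapping[str, Any]) -> str:
    """Return a hex digest that is stable for logically equal inputs.

    Keys are sorted and whitespace is fixed so that the same mapping
    always serializes to the same bytes, regardless of insertion order.
    """
    canonical = json.dumps(
        inputs,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=True,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical inputs: the same values given in a different key order.
a = session_hash({"reads": ["s1.bam"], "loci": ["chr1:100-200"]})
b = session_hash({"loci": ["chr1:100-200"], "reads": ["s1.bam"]})
print(a == b)
```

Note that list order still affects the digest in this sketch. Whether two sessions with the same loci in a different order should share a cache entry is a design decision the issue leaves open.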

Acceptance Criteria:

Creating a session should generate a hash from its input and associate the session with that hash.
If a session with the same hash already exists, Genomeshader should reuse the existing session instead of creating a new one.
If a session with the same hash does not exist, Genomeshader should create a new session and save its parquet file with the generated hash as its name.
The new functionality should be covered by tests to ensure it works as expected.
