
[RFC] Memory cache for preheat tasks #3742

Open
SouthWest7 opened this issue Jan 2, 2025 · 0 comments
Labels: enhancement (New feature or request)
SouthWest7 commented Jan 2, 2025

Feature request:

Currently, Dragonfly downloads data directly to disk when processing preheat tasks. To enhance performance and reduce latency, I propose introducing a caching mechanism. This will optimize both download and upload efficiency by writing data to both memory and disk, allowing faster access from memory while ensuring data persistence on disk. Specifically, the caching mechanism will work as follows:

  • During preheat tasks: In addition to being persisted to disk, downloaded content is also written to the cache, enabling faster access in future operations.
  • During regular uploads: The system will first check the cache for the required content. If a cache miss occurs, it will then fall back to disk storage to retrieve the data.

This approach aims to reduce disk IO, improve overall system efficiency, and significantly lower the time spent retrieving data from remote peers during preheat tasks.
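The core of this mechanism is a capacity-bounded in-memory store keyed by piece ID. A minimal standalone sketch (not Dragonfly's actual code; the `PieceCache` name and the FIFO eviction policy are illustrative assumptions, and a production cache would likely use LRU):

```rust
use std::collections::{HashMap, VecDeque};

/// A minimal capacity-bounded cache for piece content, keyed by piece ID.
/// Eviction is FIFO for brevity; a real implementation would likely use LRU.
struct PieceCache {
    capacity: usize,
    entries: HashMap<String, Vec<u8>>,
    order: VecDeque<String>, // oldest inserted piece at the front
}

impl PieceCache {
    fn new(capacity: usize) -> Self {
        Self { capacity, entries: HashMap::new(), order: VecDeque::new() }
    }

    /// Insert piece content, evicting the oldest piece when the cache is full.
    fn put(&mut self, id: &str, content: Vec<u8>) {
        if !self.entries.contains_key(id) {
            if self.entries.len() >= self.capacity {
                if let Some(evicted) = self.order.pop_front() {
                    self.entries.remove(&evicted);
                }
            }
            self.order.push_back(id.to_string());
        }
        self.entries.insert(id.to_string(), content);
    }

    /// Look up piece content; `None` means the caller falls back to disk.
    fn get(&self, id: &str) -> Option<&[u8]> {
        self.entries.get(id).map(|v| v.as_slice())
    }
}
```

The bound on the number of entries corresponds to the `capacity` setting proposed in the configuration section below.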

Use case:

UI Example:

Scope:

  • The caching mechanism will only affect whether piece content is read/written from the cache or disk during downloads and uploads. Other functional modules are not impacted.

Design

Write to Cache

  • Goal: Store downloaded piece content into the local cache after retrieving it from a remote peer.
  • Design Details:
    • Extend the existing method to include cache-writing logic after processing the downloaded content.
    • Design a caching mechanism that ensures the downloaded pieces can be retrieved on demand.
      (figure: write flow diagram)
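The cache-writing step could hook in right after the existing piece-persistence call, gated by a per-task flag. A sketch with plain maps standing in for the real storage and cache layers (function and parameter names here are hypothetical):

```rust
use std::collections::HashMap;

// Sketch of the extended write path: the piece is persisted to disk as today,
// and additionally written to the memory cache when the task requested it.
// Plain HashMaps stand in for the real cache and local-storage layers.
fn on_piece_downloaded(
    cache: &mut HashMap<String, Vec<u8>>,
    disk: &mut HashMap<String, Vec<u8>>,
    piece_id: &str,
    content: Vec<u8>,
    load_to_cache: bool,
) {
    // Existing flow: persist the downloaded piece to local storage.
    disk.insert(piece_id.to_string(), content.clone());

    // New flow: also keep a copy in memory for fast future uploads.
    if load_to_cache {
        cache.insert(piece_id.to_string(), content);
    }
}
```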

Read from Cache

  • Goal: Retrieve piece content from the cache. If the cache does not contain the data, fall back to reading from local storage.
  • Design Details:
    • Add cache-reading logic to the download_piece method.
    • If the cache contains the corresponding piece, return it as the content for DownloadPieceResponse. If not, proceed with the current flow to read from local storage.
      (figure: read flow diagram)
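The fallback described for `download_piece` amounts to a cache-first lookup. Again a sketch with maps as stand-ins (the real method would build a `DownloadPieceResponse` rather than return raw bytes):

```rust
use std::collections::HashMap;

// Sketch of the read path: serve piece content from the memory cache when
// present, otherwise fall back to local storage. HashMaps stand in for the
// real cache and storage layers.
fn read_piece(
    cache: &HashMap<String, Vec<u8>>,
    disk: &HashMap<String, Vec<u8>>,
    piece_id: &str,
) -> Option<Vec<u8>> {
    if let Some(content) = cache.get(piece_id) {
        return Some(content.clone()); // cache hit: no disk IO
    }
    disk.get(piece_id).cloned() // cache miss: current flow, read from disk
}
```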

Configuration

storage:
  # cache defines configuration settings for the cache, used to store piece content for preheat tasks.
  cache:
    # enable determines whether the cache is enabled. Set to true to enable caching of piece content, false to disable it.
    enable: true
    
    # capacity: Specifies the maximum number of entries the cache can hold. The default value is 100 entries.
    # Adjust this value based on the expected number of piece content entries for preheat tasks that need to be cached.
    capacity: 100
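On the client side this section could map to a small config struct. The defaults below simply echo the sample values above; the struct and field names are assumptions, not a final schema:

```rust
/// Sketch of the proposed `storage.cache` section as a config struct.
#[derive(Debug, Clone)]
struct CacheConfig {
    /// Whether the memory cache for preheat piece content is enabled.
    enable: bool,
    /// Maximum number of piece-content entries held in memory.
    capacity: usize,
}

impl Default for CacheConfig {
    fn default() -> Self {
        // Defaults mirror the sample YAML: enabled, 100 entries.
        Self { enable: true, capacity: 100 }
    }
}
```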

API Definition

message Download {
  // load_to_cache indicates whether the content downloaded will be stored in the storage cache.
  // Storage cache is designed to store downloaded piece content from preheat tasks, 
  // allowing other peers to access the content from memory instead of disk.
  bool load_to_cache = 21;
}

Actions:

  • protocol definition & configuration: w1
  • implementation: w1
  • full process: w2
  • unit tests, E2E, performance testing: w3